[FLOW3-general] Possible Memory Problem within Flow3

Chris Zepernick {SwiftLizard} chris at swift-lizard.com
Thu Dec 1 11:24:26 CET 2011


Hi Karsten,


> No, the identifier is needed when you create a link that somehow needs
> to address your new object, e.g..
>

Sounds reasonable to me.

>> Second Question is why is each instance of an entity stored into
>> \TYPO3\FLOW3\Core\Bootstrap::$staticObjectManager ?
>
> You mean the call to registerNewObject() that is done? That is so you
> can actually find objects not yet persisted in a consistent way.
>
> Believe me, both things are really needed, we learned that over time.
>

This one causes a major memory problem in the two scenarios I have
tested so far.

The first scenario is a bulk insert of 60,000 simple entities that each
have just a name attribute. Memory grows because on each iteration a new
entity is stored in \TYPO3\FLOW3\Core\Bootstrap::$staticObjectManager,
which is fine as long as you use the standard FLOW3 repository, because
in that case a call to detach() should help once it has been implemented.

But we have a scenario where we use a custom repository so that we can
use an additional persistence layer. The expected behaviour for me would 
be that the new entity would be stored in that persistence manager
but instead it is stored in the default one.

This is what causes the major memory leak, because in that case the
object is stored in two persistence managers.

At the moment I have found this workaround:
$globalPersistence = new \TYPO3\FLOW3\Persistence\Doctrine\PersistenceManager();
$globalPersistence->clearState();

which keeps the memory usage constant even over 60,000+ records.

The code would be:
for ($i = 0; $i < 60000; $i++) {
   $model = new \SwiftLizard\Blog\Domain\Model\Blog();
   $model->setName('Chris Evil Test ' . $i);

   $this->repository->add($model);
   $this->repository->persistAll();
   $this->repository->detach($model);

   // workaround: clear the state of the default persistence manager
   $globalPersistence = new \TYPO3\FLOW3\Persistence\Doctrine\PersistenceManager();
   $globalPersistence->clearState();
}

Without the two $globalPersistence lines the memory usage increases no
matter what.

The second scenario mirrors the first, but this time we read those
60,000 records and display them line by line.

The sad part about Doctrine is that it delivers a query object, and if
I call execute() and then iterate over the result set, at
$iterator->current() Doctrine fetches the whole result set and converts
it into a huge iterator with all 60,000 objects. Boom, there goes my
memory, and with this new registerNewObject() call at construct time,
even twice as fast.

The workaround I discovered for that looks similar to this:
$iterator = $this->repository->findAll();
$counter  = $iterator->count();
for ($i = 0; $i < $counter; $i++) {
   // at every 1000th index, fetch and process the next batch of 1000
   if ($i % 1000 == 0) {
      $itemIterator = $iterator->getQuery()
         ->setLimit(1000)
         ->setOffset($i)
         ->execute();

      foreach ($itemIterator as $item) {
         $name = $item->getName();
         ...
      }

      // workaround: clear the state of the default persistence manager
      $globalPersistence = new \TYPO3\FLOW3\Persistence\Doctrine\PersistenceManager();
      $globalPersistence->clearState();
   }
}

Ugly, but it works and keeps the memory usage low. If I strip the two
$globalPersistence lines, the memory usage increases with every batch
that is read.

In our framework on the TYPO3 side we discovered something similar a
while back and came up with the solution of putting the database's
result pointer into the iterator instead of the whole transformed
data set.

That way we keep the memory consumption low even on huge data sets,
because the objects are only created when needed, and not in bulk at
the first call to current().
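
To illustrate the idea, here is a minimal sketch of such an iterator. It
is neither our TYPO3 code nor anything Doctrine ships: FakeResultPointer,
LazyResultIterator and the simplified Blog class are hypothetical names
made up for this example, and the "result pointer" is simulated with an
array so the sketch runs without a database (PHP 8 syntax). The point is
only that an object is built when the loop asks for it, one row at a time.

```php
<?php
// Hypothetical stand-in for a database result pointer:
// hands out one raw row per fetch() call, then false.
final class FakeResultPointer
{
    private int $position = 0;

    public function __construct(private array $rows) {}

    public function fetch(): array|false
    {
        return $this->rows[$this->position++] ?? false;
    }
}

// Simplified entity for this sketch; counts how many instances were built.
final class Blog
{
    public static int $created = 0;

    public function __construct(private string $name)
    {
        self::$created++;
    }

    public function getName(): string
    {
        return $this->name;
    }
}

// Forward-only iterator that holds the result pointer and turns
// a row into an object only when the loop advances to it.
final class LazyResultIterator implements Iterator
{
    private int $index = -1;
    private ?Blog $current = null;

    public function __construct(private FakeResultPointer $result) {}

    public function rewind(): void
    {
        // A result pointer cannot be replayed, so only load the
        // first row if iteration has not started yet.
        if ($this->index === -1) {
            $this->next();
        }
    }

    public function valid(): bool
    {
        return $this->current !== null;
    }

    public function current(): mixed
    {
        return $this->current;
    }

    public function key(): mixed
    {
        return $this->index;
    }

    public function next(): void
    {
        $row = $this->result->fetch();
        // The previous object is dropped here, so only one is held at a time.
        $this->current = ($row === false) ? null : new Blog($row['name']);
        $this->index++;
    }
}

// Five raw rows are available, but the loop stops after two.
$rows = [];
for ($i = 0; $i < 5; $i++) {
    $rows[] = ['name' => 'Chris Evil Test ' . $i];
}

$names = [];
foreach (new LazyResultIterator(new FakeResultPointer($rows)) as $item) {
    $names[] = $item->getName();
    if (count($names) === 2) {
        break;
    }
}

echo Blog::$created, " objects created\n"; // prints "2 objects created"
```

The iterator never holds more than one object, so memory stays flat as
long as nothing else (such as a persistence manager) keeps a reference
to the objects it hands out.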

Perhaps this would be an option here too. I think it might make sense
to propose this to the Doctrine guys. What do you think?


> Do you think you could write a functional test for a massive import
> scenario?

I think we can provide something like that; either Peter or I will
come up with something.

One last thing we discovered yesterday: there is no option in the
repository to create an updateQuery or deleteQuery.
Is there a reason for that, or will it be implemented later?

Cheers

Chris



