[TYPO3-performance] Re: Re: DataHandler high memory consumption

Lukas Krieger lukas.krieger at me.com
Tue Nov 4 01:11:38 CET 2014


Ok, next try

http://typo3.org/api/typo3cms/backend_2_classes_2_utility_2_backend_utility_8php_source.html#l01153

I commented out line 1187 and the memory usage dropped to 8MB. That is still above the 5MB I measured without any cache clearing for those pages, but
the functions involved build a lot of temporary structures.

This finding leads us to the following class:
http://typo3.org/api/typo3cms/_ts_config_parser_8php_source.html

The TSConfigParser

$TStext is the TSConfig to be parsed: the default config plus custom changes.
It is a very long string, 31045 characters.

$TStext should be the same for all pages, because they are all siblings on the same rootline.
I printed all generated hashes and they really are identical:

hash db183e3701e8d371728c141af8aa416f
hash db183e3701e8d371728c141af8aa416f
hash db183e3701e8d371728c141af8aa416f
..
hash db183e3701e8d371728c141af8aa416f

So the parsed TSConfig should be stored (and it really is stored: the if clause on line 60 is true).

The result array has a flag indicating whether the content was retrieved from the cache.
So let's look at that flag for all 280 parsed TSConfigs, which are identical, have the same hash from line 58, and are stored.

All 280 parsed TSConfigs have the cache flag set to 1.
And for every execution the if clause on line 66 is true!

So I printed the memory usage before and after line 64:
$storedData = $this->matching($storedData);

pre matching 50655352
post matching 50655352
pre matching 50949456
post matching 50949456
pre matching 51243656
post matching 51243656
pre matching 51537968
post matching 51537968

As you can see, the memory usage does not increase inside that function.
It does grow by roughly 294KB from one call to the next (about 82MB over 280 pages), so the increase must happen either before or after that line.
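
The before/after measurement used here can be wrapped in a small helper. This is only a minimal sketch: measureMemoryDelta() is a hypothetical name and not part of TYPO3, and the closure merely allocates a large string as a stand-in for the code under inspection.

```php
<?php
// Hypothetical helper (not part of TYPO3): run $work and report how much
// the script's memory usage changed while it ran.
function measureMemoryDelta(callable $work): array {
    $before = memory_get_usage();
    $result = $work();
    $after = memory_get_usage();
    return ['result' => $result, 'delta' => $after - $before];
}

// Stand-in workload: allocate a string of one million characters and keep it.
$measured = measureMemoryDelta(function () {
    return str_repeat('x', 1000000);
});
echo 'delta ' . $measured['delta'] . " bytes\n";
```

A delta close to zero, as with matching() above, means the memory is being taken somewhere else in the call chain.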

If I replace the whole function body and just return an empty TSconfig array:
public function parseTSconfig($TStext, $type, $id = 0, array $rootLine = array()) {
		
		$this->type = $type;
		$this->id = $id;
		$this->rootLine = $rootLine;
		
		$res = array('TSconfig'=>array(),'cached'=>1);
		return $res; 
}
the memory usage drops from 85MB to 10MB.


So let's look at the caching of the TSConfig for each page.
After clearing all caches in the TYPO3 backend, I get the following output from this statement placed before line 60:
echo "id ".$id." hash ".$hash." count storedData ".count($cachedContent[0])." stored MD5 ".$cachedContent[1]."<br>";

id 2540 hash db183e3701e8d371728c141af8aa416f count storedData 0 stored MD5 
id 2539 hash db183e3701e8d371728c141af8aa416f count storedData 3 stored MD5 246b1096c55387fc8f31eea4f3d7c6d6
id 2536 hash db183e3701e8d371728c141af8aa416f count storedData 3 stored MD5 246b1096c55387fc8f31eea4f3d7c6d6
id 2535 hash db183e3701e8d371728c141af8aa416f count storedData 3 stored MD5 246b1096c55387fc8f31eea4f3d7c6d6
..

As expected, the first parsed TSConfig is not in the cache after I cleared everything.
All subsequent pages have the same hash, the same storedData and the same stored MD5.

If I replace the TSconfig value on line 68 with an empty value:
$res = array(
     'TSconfig' => ''/*$storedData['TSconfig']*/,
     'cached' => 1
);

The memory usage drops to 8MB.

=>
If this function returns an empty result, there is no high memory usage.
So the problem must lie in how the calling function uses the cached, parsed TSConfig. The caching and parsing of the TSConfig itself
works just fine!

As soon as I looked at the function getPagesTSconfig in BackendUtility again, I found the problem!
http://typo3.org/api/typo3cms/backend_2_classes_2_utility_2_backend_utility_8php_source.html#l01153

On lines 1197-1199:

static $pagesTSconfig_cache = array();
....
 1197 if ($useCacheForCurrentPageId) {
 1198                                 $pagesTSconfig_cache[$id] = $TSconfig;
 1199                         }

The function stores the same parsed TSConfig (combined with the Backend User TSConfig) in a new array entry for every page!
That makes 280 different keys (the page uids), each holding the same big parsed TSConfig content.
Keeping that many big arrays in PHP memory is really expensive!
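
Why 280 identical entries cost so much can be illustrated outside TYPO3. The following is a minimal sketch under one assumption: each page's config reaches the static cache as a freshly built array (for instance after being unserialized from a cache backend), so PHP cannot share it between keys through copy-on-write. buildSampleConfig() and measureCacheGrowth() are hypothetical helpers, and the sample data is far smaller than a real parsed TSConfig.

```php
<?php
// Hypothetical stand-in for a parsed TSconfig tree; the real one is much larger.
function buildSampleConfig(): array {
    $config = [];
    for ($i = 0; $i < 200; $i++) {
        $config['key' . $i . '.'] = [
            'value' => str_repeat('x', 50) . $i,
            'label' => 'Label ' . $i,
        ];
    }
    return $config;
}

// Memory growth in bytes caused by filling a cache with $count entries from $producer.
function measureCacheGrowth(callable $producer, int $count): int {
    $before = memory_get_usage();
    $cache = [];
    for ($id = 1; $id <= $count; $id++) {
        $cache[$id] = $producer();
    }
    $after = memory_get_usage();
    return $after - $before;
}

$serialized = serialize(buildSampleConfig());
$shared = unserialize($serialized);

// Every entry is unserialized separately, so every key owns a full copy.
$independent = measureCacheGrowth(function () use ($serialized) {
    return unserialize($serialized);
}, 280);

// Every entry is the same array value; copy-on-write keeps a single copy.
$sharedGrowth = measureCacheGrowth(function () use ($shared) {
    return $shared;
}, 280);

printf("independent copies: %d bytes\nshared array: %d bytes\n", $independent, $sharedGrowth);
```

If this assumption holds for getPagesTSconfig, the cost comes from the per-page copies rather than from the number of keys.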

I can think of two patches immediately:

1. Store a serialized array on line 1198 and return the unserialized array on line 1161.
With this change the memory usage drops to 18MB, which is still big, but not as big as before.
Keep in mind that we are still storing 280 strings of ~36KB each in the array => ~10MB memory usage, and every string is the same.
That's still bad!

2. Use the CacheManager, which is already used for caching the parsed TSConfig in the TSConfigParser.
The parsed TSConfig on line 1186 is already cached in the CacheManager, and it would be easy to add
a cache for the combined parsed TSConfig + User TSconfig based on the uid of each page.
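
The first patch could look roughly like the sketch below. It is not the actual core change: SerializedPageConfigCache is a hypothetical class that only shows the idea of serializing on write and unserializing on read, and the sample config is made up.

```php
<?php
// Hypothetical helper, not TYPO3 core code: a per-request cache that stores
// each page's TSconfig as a serialized string instead of a nested array.
final class SerializedPageConfigCache {
    /** @var string[] serialized configs keyed by page uid */
    private $entries = [];

    public function set(int $pageId, array $config): void {
        $this->entries[$pageId] = serialize($config);
    }

    public function has(int $pageId): bool {
        return isset($this->entries[$pageId]);
    }

    public function get(int $pageId): array {
        if (!isset($this->entries[$pageId])) {
            throw new OutOfBoundsException('No cached config for page ' . $pageId);
        }
        // The nested array only exists while the caller holds on to it.
        return unserialize($this->entries[$pageId]);
    }
}

$cache = new SerializedPageConfigCache();
$config = ['TCEMAIN.' => ['clearCacheCmd' => 'pages']];
$cache->set(2540, $config);
$roundTrip = $cache->get(2540);
```

The trade-off is an unserialize() call on every read in exchange for a compact string per page; it does not remove the duplication between pages.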


Please let me know what you think and whether a Forge issue should be created for this.


best,
Lukas

PS: Philipp, I wrote this post this morning and am only now able to publish it. You were right!

Quote: Lukas Krieger (lkrieger) wrote on Mon, 03 November 2014 12:44
----------------------------------------------------
> After commenting out the two lines 
>  7001 // Get Page TSconfig relavant:
>  7002                                 list($tscPID) = BackendUtility::getTSCpid($table, $uid, '');
>  7003                                 $TSConfig = $this->getTCEMAIN_TSconfig($tscPID);
> 
> The memory usage of my script drops from ~ 85MB to just 5MB!
> We have found the place where the high memory usage occurs!
> 
> 
> For every page uid in "static::$recordsToClearCacheFor"  the script gets the TSConfig of each page
> 
> Line 7002
> BackendUtility::getTSCpid($table, $uid, '');
> http://typo3.org/api/typo3cms/class_t_y_p_o3_1_1_c_m_s_1_1_backend_1_1_utility_1_1_backend_utility.html#a4a5108691e86b91f24617c203b5b9451
> Returns the REAL pid of the record, if possible. If both $uid and $pid is strings, then pid=-1 is returned as an error indication.
> 
> 
> Line 7003
> $TSConfig = $this->getTCEMAIN_TSconfig($tscPID);
> http://typo3.org/api/typo3cms/class_t_y_p_o3_1_1_c_m_s_1_1_core_1_1_data_handling_1_1_data_handler.html#a5ca44413411e7b1ea925da3254296b55
> Return TSconfig for a page id
> 
> =>
> 
>  6546 public function getTCEMAIN_TSconfig($tscPID) {
>  6547                 if (!isset($this->cachedTSconfig[$tscPID])) {
>  6548                         $this->cachedTSconfig[$tscPID] = $this->BE_USER->getTSConfig('TCEMAIN', BackendUtility::getPagesTSconfig($tscPID));
>  6549                 }
>  6550                 return $this->cachedTSconfig[$tscPID]['properties'];
>  6551         }
> 
> 
> At first I thought it would be the caching of the TSConfig for each page, but my cachedTSconfig array is practically empty.
> It just contains 280 keys, each with a new, empty array:
> Array ( [2540] => Array ( [value] => [properties] => ) [2539] => Array ( [value] => [properties] => ) [2536] =>...
> 
> Of course that increases the memory usage, but only by a few KB for 280 pages!
> 
> So I commented out line 6548 (I do not want to fetch my empty TSConfig) and the memory usage dropped to 5MB again!
> That's the problem!
> 
> It also explains why there is no benefit in chunking the pages and letting the DataHandler process only e.g. 10 pages at a time.
> The DataHandler calls external functions and does not use that much memory itself.
> (Therefore creating and destroying the DataHandler is no solution to the problem, as I tested before.)
> 
> I have to leave now, but I will dig deeper into the system and look at the functions on line 6548 later:
> $this->BE_USER->getTSConfig('TCEMAIN', BackendUtility::getPagesTSconfig($tscPID));
> 
> 
> 
> 
> Quote: Tymoteusz Motylewski (tmotylewski) wrote on Mon, 03 November 2014 08:37
> ----------------------------------------------------
> > Hi Lukas,
> > Nice findings.
> > Can you make a forge issue for that? It would be nice if you can
> > investigate a little bit further, what is the root cause for the high
> > memory consumption.
> > Is it just the size of "recordsToClearCacheFor" array, or the processing of
> > this array, or do we have a memory leak somewhere?
> > 
> > Would the memory consumption be better if you set UserTS clearCache_disable
> > to 1 ?
> > This should help and doesn't require modifying the core.
> > 
> > Cheers
> > Tymoteusz
> > 
> > 
> > > Conclusion: it is the function processClearCacheQueue
> > > http://typo3.org/api/typo3cms/_data_handler_8php_source.html#l06992
> > >
> > >
> > > So after all that command execution (updating 150 pages)
> > > My script only uses 6MB memory and is much faster than before
> > >
> > > Updating 280 pages was done by using just 4.8MB memory.
> > >
> > > I will debug the processClearCacheQueue function later and post my results
> > > in this thread.
> > >
> > > Hope it will help someone :-)
> > >
> > > _______________________________________________
> > > TYPO3-performance mailing list
> > > TYPO3-performance (at) lists.typo3.org
> > > http://lists.typo3.org/cgi-bin/mailman/listinfo/typo3-performance
> > >
> ----------------------------------------------------
> 
----------------------------------------------------


