[TYPO3-v4] autoloader and path

Tue Jan 4 22:18:54 CET 2011

Hey,

On 01/04/2011 09:13 PM, Christian Kuhn wrote:
> I'll try to come up with a better analysis soon, but can't promise too
> much for the upcoming two weeks.

Here is a kickstart if someone wants to step in for this kind of analysis:

- Basically I've installed a 4.4 and a 4.5 introduction package on the 
same system and requested a page from each (some page with some magic, 
e.g. many links or many cObjects).

- Run some benchmark programs against both pages to get a basic feeling, 
for example:
siege -c1 -b -t1 'http://introductiontrunk.domain.foo/examples/'
siege -c1 -b -t1 'http://introduction44.domain.foo/examples/'

- Do some tests with ab (apache bench) as well.

- Repeat the tests with ?no_cache=1 appended to the URL

- Do further tests with xcache and/or apc enabled
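
If siege or ab are not at hand, the same kind of measurement (one 
client, back-to-back requests) can be roughly approximated with a few 
lines of Python. This is only a sketch: the function names and the 
default of 50 requests are my own assumptions, and it measures the full 
round trip as seen by the client, not the TYPO3 parsetime.

```python
import statistics
import time
import urllib.request


def fetch(url):
    """Fetch a URL and return the response body (does real network access)."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def bench(url, count=50, fetcher=fetch, clock=time.perf_counter):
    """Send `count` sequential requests and return timing stats in ms."""
    samples = []
    for _ in range(count):
        start = clock()
        fetcher(url)
        samples.append((clock() - start) * 1000.0)
    return {
        "requests": len(samples),
        "median_ms": statistics.median(samples),
        "min_ms": min(samples),
        "max_ms": max(samples),
    }


def variants(base_url):
    """Return the cached and the ?no_cache=1 variant of a page URL."""
    return [base_url, base_url + "?no_cache=1"]
```

Calling bench() for each entry of 
variants('http://introductiontrunk.domain.foo/examples/') and again for 
the 4.4 host gives four sets of numbers to compare. The median is less 
sensitive to a single slow request than the mean.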

- Next install xdebug with some ini settings like:
zend_extension=/usr/lib/php5/20090626/xdebug.so
xdebug.max_nesting_level=200
[debug]
; Profiling
xdebug.profiler_append=0
xdebug.profiler_enable=0
xdebug.profiler_enable_trigger=1
xdebug.profiler_output_dir=/tmp

- Enable $TYPO3_CONF_VARS['FE']['debug'] = 1
- Take the getMilliseconds() implementation from 
t3lib/class.t3lib_timetrack.php and copy it to 
t3lib/class.t3lib_timetracknull.php. Together with the FE debug setting, 
this outputs the parsetime as an FE comment (for fully cached pages, 
too). This gives you some feeling for whether a call to a page was a 
typical call, or whether something weird happened on the system during 
the call. If you want reasonable results, use a typical call for the 
profiling session.
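
One possible way to make 'typical' a bit less of a gut feeling: note the 
parsetimes of a handful of calls and only accept a profiling run whose 
parsetime is close to their median. The 15% tolerance below is an 
arbitrary assumption of mine, not a TYPO3 value.

```python
import statistics


def is_typical(parsetime_ms, samples_ms, tolerance=0.15):
    """True if parsetime_ms is within `tolerance` (a fraction) of the
    median of the previously observed parsetimes."""
    median = statistics.median(samples_ms)
    return abs(parsetime_ms - median) <= tolerance * median
```

Keep in mind that a profiled call is slower than a normal one, so 
compare it only against other profiled calls.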

- Call a page to get a cachegrind profile in /tmp:
http://introductiontrunk.domain.foo/examples/?XDEBUG_PROFILE=1
Check the page source to verify that this was a 'typical' call (xdebug 
typically triples your parsetime to gather the profile data). Do some 
subsequent calls to make sure that xcache / apc are fully in action for 
this php process.

- Open the cachegrind profile with kcachegrind (linux required, though 
I've seen a port running on a Mac). Analyze times, compare two 
'typical' 4.4 and 4.5 profiles. I've added a typical kcachegrind picture 
in link [1]. Warning: A cachegrind gives a lot of data, it's not really 
easy to draw correct conclusions, on the other hand it's sexy and 
addictive ;)
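
To keep the comparison of two profiles manageable, it may help to write 
down the self times of the most expensive functions from each profile 
and rank the differences. A small sketch, assuming the numbers were 
copied out of kcachegrind by hand into two dicts (it does not parse 
cachegrind files):

```python
def biggest_regressions(old, new, top=5):
    """Rank functions by how much more time they take in `new` than in `old`.

    Both arguments map a function name to its self time in any consistent
    unit. Functions missing from one profile count as 0 there.
    """
    names = set(old) | set(new)
    deltas = [(name, new.get(name, 0) - old.get(name, 0)) for name in names]
    # Largest increase first; ties are ordered by name for stable output.
    deltas.sort(key=lambda item: (-item[1], item[0]))
    return deltas[:top]
```

Functions that only exist in the newer profile show up with their full 
time as the difference, which is usually what you want to see first.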

- After some basic tests I have seen that extbase raises the parsetime 
of fully cached pages by ~5-10% between 4.4 and 4.5. I decided to remove 
these extensions from the introduction package to get more reliable data 
(I wanted to bench the core and not extbase): introduction, extbase, 
workspaces, tt_news, fluid.

- Compare again: 4.5 will be a bit quicker after removing extbase, but 
still not at the speed of 4.4 for fully cached pages:

...

Analyze data

...

TODO after all this data gathering:
Draw some conclusions and maybe optimize some calls: isolate single code 
fragments, benchmark them, optimize, benchmark again, write unit tests 
with high code coverage that pass both before and after the optimization 
to make sure it introduced no regression, then send the result to the 
core list.
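
The isolate / benchmark / verify loop could look roughly like this. It 
is written in Python only to illustrate the workflow; the two string 
joining functions are made-up stand-ins for an 'old' and an 'optimized' 
code fragment and have nothing to do with the TYPO3 core.

```python
import timeit


def join_concat(parts):
    """Hypothetical original fragment: repeated string concatenation."""
    result = ""
    for part in parts:
        result += part
    return result


def join_builtin(parts):
    """Hypothetical optimized fragment: a single join call."""
    return "".join(parts)


def same_behaviour(old, new, cases):
    """Regression check: both variants must agree on every test case."""
    return all(old(case) == new(case) for case in cases)


def compare(old, new, arg, number=1000):
    """Return (old_seconds, new_seconds) for `number` calls of each variant."""
    return (
        timeit.timeit(lambda: old(arg), number=number),
        timeit.timeit(lambda: new(arg), number=number),
    )
```

Run same_behaviour() first and compare() afterwards: a faster fragment 
that returns something different is not an optimization.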

Further information:

Keep in mind that caches (like the extbase caches) and other things like 
the preparedStatement stuff in 4.5 are double-edged: They can make 
fully cached pages some milliseconds slower, but speed up uncached pages 
by some hundred milliseconds. This is also one reason why the caching 
framework is under discussion: It typically raises parsetime by 1-2 
milliseconds for fully cached pages (which can be 10% of the whole 
parsetime), but it can easily save *much* more time if used correctly 
(e.g. with big cache tables in production).
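
The trade-off can be put into a small back-of-the-envelope model. This 
is a sketch under my own simplifying assumptions: every fully cached 
request pays a fixed overhead, every uncached request saves a fixed 
amount, and nothing else changes.

```python
def mean_change_ms(cached_share, overhead_ms, saving_ms):
    """Change of the mean parsetime per request (negative means faster).

    cached_share: fraction of requests served fully cached (0.0 to 1.0)
    overhead_ms:  extra time added to each fully cached request
    saving_ms:    time saved on each uncached request
    """
    return cached_share * overhead_ms - (1.0 - cached_share) * saving_ms


def break_even_share(overhead_ms, saving_ms):
    """Share of fully cached requests at which overhead and saving cancel."""
    return saving_ms / (overhead_ms + saving_ms)
```

With 2 ms overhead and 200 ms saving (numbers picked from the ranges 
above, purely for illustration), break_even_share(2, 200) is about 0.99, 
so in this model the cache pays off on average unless roughly 99% of 
all requests are already fully cached.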

Another thing to think about: A simple site without USER_INTs which is 
fully cached can be sent to a user in 10-15ms on a recent system. On a 
typical server with 4 cores, that allows a few hundred requests *per 
second* (4 cores / 10-15ms each). This is usually *enough*. The problem 
is a page that is *not* fully cached. Even without much magic, a 
non-cached page typically takes 200ms and easily more. This drops the 
maximum to 20 reqs/s if only non-cached pages are accessed. So 
optimizing TYPO3 is not only about reducing the parsetime of frequently 
used or expensive methods, but *much* more about proper caching and a 
quick cache system. If this then takes a millisecond or two to 
initialize on every request ... I can live with it, as long as it scales 
for big caches and drops my parsetime for expensive things.
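
The arithmetic behind these numbers, as a sketch. It assumes that each 
request keeps exactly one core busy for its whole parsetime and ignores 
I/O waits, the web server and the database, so treat the result as a 
rough upper bound rather than a prediction.

```python
def max_requests_per_second(cores, ms_per_request):
    """Upper bound on throughput if every request occupies one core
    for ms_per_request milliseconds."""
    return cores * 1000.0 / ms_per_request


# 4 cores, 200 ms per non-cached page
print(max_requests_per_second(4, 200))  # prints 20.0
```

The same function with 10 or 15 ms per request gives the bound for 
fully cached pages.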

And yet another thing: For *real* projects the core is not that much of 
a problem performance-wise: If some extension does a lot of nasty things 
on a big table or something like that, the core only plays a minor role. 
Tip: If you have some really high parsetimes you should just throw a 
cachegrind at it, it's often very easy to identify the really expensive 
things.

Regards
Christian

[1] http://schwarzbu.ch/fileadmin/img/cachegrind-example.png