[TYPO3-core] RFC: T3-speedup preg instead of ereg

Bernhard Kraft kraftb at kraftb.at
Wed Oct 26 23:52:32 CEST 2005


Hello,


This is a CVS patch request.

Type: Improvement

Description:
preg functions are faster than their ereg counterparts (don't ask why :) test it).
strpos is faster than strcspn in cases where you only look for one character.
I modified t3lib_parsehtml.php so that each ereg, split(i) and similar
slower function is replaced by an equivalent preg statement. The regexes were
carried over almost unchanged. Only the get/substitute subpart methods received bigger
changes, as the existing eregs didn't work properly in all cases (at least in
my opinion).
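
To illustrate the kind of conversion described above, here is a minimal sketch. The patterns and sample strings are made up for illustration and are not taken from t3lib_parsehtml.php or from the patch. The POSIX calls appear only in comments, because the ereg family was removed in PHP 7 and can no longer be run.

```php
<?php
// Hypothetical sample input, not taken from the patch.
$html = '<p class="a">Hello</p><br /><P>World</P>';

// Old: eregi('<p[^>]*>', $html)
// New: add delimiters, and the "i" modifier for the case-insensitive variants.
$hasParagraph = preg_match('/<p[^>]*>/i', $html) === 1;

// Old: split('<br[^>]*>', $html)
$parts = preg_split('/<br[^>]*>/', $html);

// Old: ereg_replace('[[:space:]]+', ' ', $text)
// POSIX character classes keep working inside a PCRE bracket expression.
$collapsed = preg_replace('/[[:space:]]+/', ' ', "a \n\t b");

// Looking for a single character: strcspn() returns the length of the
// leading segment that does not contain '<', which is the same number
// as the offset strpos() reports when the character is present.
$text = 'abc<def';
$viaStrcspn = strcspn($text, '<');
$viaStrpos = strpos($text, '<');

// The two differ when the character is absent: strcspn() returns the
// string length, strpos() returns false, so compare with === false.
$missing = strpos('abc', '<');
```

One thing to watch when converting by hand: a pattern that contains the chosen delimiter (here "/") has to escape it or use a different delimiter.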

I wrote a test suite for the new library to ensure that all methods behave exactly the
same. It calls each method of the old and the new class with the same input and option parameters
and compares the outputs. Except for the already mentioned "broken" get/substitute subpart
methods, those outputs must match completely.
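
The comparison idea can be sketched roughly as follows. This is not the code from the attached test suite; findMismatches() and the two stand-in closures are hypothetical names chosen for illustration, and the sketch assumes a current PHP version (type declarations, argument unpacking).

```php
<?php
// Call the old and the new implementation with identical arguments and
// collect every case where the results are not strictly identical.
function findMismatches(callable $old, callable $new, array $cases): array
{
    $mismatches = [];
    foreach ($cases as $name => $args) {
        $expected = $old(...$args);
        $actual = $new(...$args);
        if ($expected !== $actual) {
            $mismatches[$name] = ['old' => $expected, 'new' => $actual];
        }
    }
    return $mismatches;
}

// Hypothetical stand-ins for an "old" and a "new" method: two ways of
// splitting a string on commas with a limit option.
$old = function (string $s, int $limit): array {
    return explode(',', $s, $limit);
};
$new = function (string $s, int $limit): array {
    return preg_split('/,/', $s, $limit);
};

// Each case is the argument list passed to both implementations.
$cases = [
    'plain' => ['a,b,c', 3],
    'limited' => ['a,b,c', 2],
];

$mismatches = findMismatches($old, $new, $cases);
echo count($mismatches) === 0 ? "all outputs match\n" : "outputs differ\n";
```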

You can run each test tool (one per method) like this:
cd t3lib_parsehtml_pregtestsuite/testing/parsehtml/
./test_HTMLcleaner.php

The script runs the $this->HTMLclean() method a predefined number of times and compares the
outputs. You can set a different number of runs by passing an integer as the first argument.

At the end the time consumption gets printed. Normally the new versions take between
8% (!!!) and 93% of the time the original methods took.
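
A rough sketch of how such a timing figure can be produced is shown below. It is not the script from the test suite: timeRuns(), relativeTime() and the default of 1000 runs are assumptions made for illustration, and the timed calls are the strcspn/strpos pair rather than the real class methods.

```php
<?php
// Time $runs calls of $fn and return the elapsed wall-clock seconds.
function timeRuns(callable $fn, int $runs): float
{
    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        $fn();
    }
    return microtime(true) - $start;
}

// Express the new time as a percentage of the old time.
// Returns 0.0 if the old time is zero, to avoid a division by zero.
function relativeTime(float $oldSeconds, float $newSeconds): float
{
    return $oldSeconds > 0.0 ? 100.0 * $newSeconds / $oldSeconds : 0.0;
}

// Number of runs: first command-line argument if it is a positive
// integer, otherwise an assumed default.
$runs = (isset($argv[1]) && (int)$argv[1] > 0) ? (int)$argv[1] : 1000;

// Hypothetical workload: find the first '<' in a sample string.
$subject = str_repeat('abc<def', 100);
$oldTime = timeRuns(function () use ($subject) {
    return strcspn($subject, '<');
}, $runs);
$newTime = timeRuns(function () use ($subject) {
    return strpos($subject, '<');
}, $runs);

printf(
    "old: %.4fs, new: %.4fs, new/old: %.0f%%\n",
    $oldTime,
    $newTime,
    relativeTime($oldTime, $newTime)
);
```

Timings this short are noisy, so a larger run count gives a more stable percentage.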

You can change the test data by editing "data_parsehtml.php" in that directory or by exchanging the
test[1-3].html test files.

This improvement reduces the page rendering time of a non-cached page. On my AMD64 1.8GHz laptop a
simple page took 1200-1300 ms to render (no accelerator installed, PHP profiling active);
with the patch it took about 1100-1200 ms (overall a 50-100 ms improvement).


Branches:
HEAD, 3.7(?), 3.8(?)

References:
http://bugs.typo3.org/view.php?id=1685

Files:
preg_t3lib_parsehtml.diff
t3lib_parsehtml_pregtestsuite.tar.gz



greets,
Bernhard
-- 
----------------------------------------------------------------------
"Freedom is always also the freedom of those who think differently"
Rosa Luxemburg, 1871 - 1919
----------------------------------------------------------------------
-------------- next part --------------
A non-text attachment was scrubbed...
Name: preg_t3lib_parsehtml.diff
Type: text/x-patch
Size: 19229 bytes
Desc: not available
Url : http://lists.netfielders.de/pipermail/typo3-team-core/attachments/20051026/2204495b/attachment.bin 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: t3lib_parsehtml__pregtestsuite.tar.gz
Type: application/x-gzip
Size: 82737 bytes
Desc: not available
Url : http://lists.netfielders.de/pipermail/typo3-team-core/attachments/20051026/2204495b/attachment-0001.bin 

