[TYPO3-core] RFC Feature #15141: [Caching framework] Add compress data options to DbBackend

Christian Kuhn lolli at schwarzbu.ch
Thu Jul 15 23:31:31 CEST 2010


This is an SVN patch request.

Type: Feature, performance improvement

Branches: trunk

BT: http://bugs.typo3.org/view.php?id=15141

Problem:
RDBMSs tend to slow down once tables become too large to fit into memory.
This is especially a problem with big cache tables holding big chunks of data.

Solution:
Implement options for the db backend to compress cache data with zlib (a
minimal sketch of the idea follows below this list):
- The content field of the caching framework tables must be changed to blob
instead of text, to be able to store binary data
- Add a "compression" option to the DbBackend
- Add a "compressionLevel" option to the DbBackend to allow tuning the
tradeoff between data size and CPU usage
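
To make the idea concrete, here is a minimal sketch of how compression could
hook into the backend's set()/get() data path. Class and method names are
illustrative only; the real implementation is in the attached 15141_01.diff:

<?php
// Simplified sketch only - illustrative names, not the actual patch code.
class tx_example_CompressedDbBackendSketch {
    protected $compression = FALSE;
    protected $compressionLevel = -1; // -1 = zlib default, otherwise 0 (none) to 9 (max)

    // Applied to the data right before it is inserted in set()
    protected function compressData($data) {
        if ($this->compression) {
            // gzcompress() returns binary data - this is why the content
            // field has to be a (medium)blob instead of text
            $data = gzcompress($data, $this->compressionLevel);
        }
        return $data;
    }

    // Applied to the fetched row content in get()
    protected function uncompressData($data) {
        if ($this->compression) {
            $data = gzuncompress($data);
        }
        return $data;
    }
}
?>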

How to test:
- Apply the patch and change the db fields to mediumblob
- The additional unit tests show that compression is actually done right
(an ad-hoc check is also sketched below the configuration example)
- localconf.php:
$TYPO3_CONF_VARS['SYS']['useCachingFramework'] = '1';
$TYPO3_CONF_VARS['SYS']['caching']['cacheConfigurations']['cache_pages'] = array(
   'frontend' => 't3lib_cache_frontend_StringFrontend',
   'backend' => 't3lib_cache_backend_DbBackend',
   'options' => array(
     'cacheTable' => 'cachingframework_cache_pages',
     'tagsTable' => 'cachingframework_cache_pages_tags',
     'compression' => TRUE,
   ),
);
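
As a quick sanity check besides the unit tests, an ad-hoc snippet along these
lines fetches one cache row and verifies that it really uncompresses (plain
PDO is used here instead of the TYPO3 DB API; credentials and table name are
assumptions to be adjusted to your installation):

<?php
// Ad-hoc check: is the stored cache content really zlib compressed?
$db = new PDO('mysql:host=localhost;dbname=typo3', 'user', 'password');
$row = $db->query(
    'SELECT identifier, content FROM cachingframework_cache_pages LIMIT 1'
)->fetch(PDO::FETCH_ASSOC);

$plain = @gzuncompress($row['content']);
if ($plain === FALSE) {
    echo 'Row ' . $row['identifier'] . ' is NOT zlib compressed' . PHP_EOL;
} else {
    printf(
        "Row %s: %d bytes compressed, %d bytes uncompressed (%.1f%%)\n",
        $row['identifier'],
        strlen($row['content']),
        strlen($plain),
        100 * strlen($row['content']) / strlen($plain)
    );
}
?>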

Notes:
This is a performance improvement for the DbBackend if cache tables like
cache_pages grow large (lots of rows with big data chunks). The data
table typically shrinks to about 20% of its original size, which is a great
benefit when dealing with multiple GB of cache tables.
The patch is a rip-off of the compressed db backend that has been
delivered successfully with enetcache since 4.3. It frees a lot of RAM
on the DBMS, to be used elsewhere on the server.

Numbers:
- On a production system with 4-5 GB of cache tables we saw a lot of
slow insert queries (~8 GB of RAM given to mysql, together with a pretty
well optimized mysql innodb setup). With the compressed backend the main
cache table shrank to < 1 GB, and all slow inserts were immediately
gone. The additional CPU overhead is marginal.
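
To get a feeling for the size/CPU tradeoff of the compressionLevel option on
your own data, a standalone script along these lines can be used (the payload
is just made-up sample data, not part of the patch):

<?php
// Rough size/CPU comparison of zlib compression levels on a sample payload.
// The payload is repeated HTML-ish text as a stand-in for real cache_pages content.
$payload = str_repeat('<div class="content">Lorem ipsum dolor sit amet.</div>', 2000);

foreach (array(1, 6, 9) as $level) {
    $start = microtime(TRUE);
    $compressed = gzcompress($payload, $level);
    $time = microtime(TRUE) - $start;
    printf(
        "level %d: %6d -> %6d bytes (%.1f%%) in %.4f s\n",
        $level,
        strlen($payload),
        strlen($compressed),
        100 * strlen($compressed) / strlen($payload),
        $time
    );
}
?>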

Graphics:
Attached is a graphic from the enetcacheanalytics performance test suite
(mysql running on localhost):
- GetByIdentifier get()'s an increasing number of previously set entries
from the cache and measures the time taken. -> The overhead of (un)compressing
data is not that big, zlib is pretty fast.
- SetKiloBytesOfData set()'s a fixed number of cache entries with
growing data size. With small data sizes, the timings for the compressed and
uncompressed backends are nearly the same, but the compressed backend
speeds up a lot with bigger data chunks. -> The compression overhead is much
smaller than pushing big data chunks to mysql.
- Play with enetcacheanalytics to get more numbers and tests for your
system; the compressed backend is always quicker as the number of rows and
the data size grow.

Regards
Christian
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 15141_01.diff
Type: text/x-patch
Size: 13085 bytes
Desc: not available
URL: <http://lists.typo3.org/pipermail/typo3-team-core/attachments/20100715/9aaa00cc/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 15141_performance_difference.png
Type: image/png
Size: 70876 bytes
Desc: not available
URL: <http://lists.typo3.org/pipermail/typo3-team-core/attachments/20100715/9aaa00cc/attachment-0001.png>

