[TYPO3-dev] Missing feature in t3lib_cache_backend_MemcachedBackend
Christian Kuhn
lolli at schwarzbu.ch
Wed Aug 11 22:00:59 CEST 2010
Hey.
Ugh, pretty long post, hope it's still interesting :)
Chris Zepernick {SwiftLizard} wrote:
> There is no way to check if a server is connected, so we can not check
> on initialization if connect really worked.
This doesn't really fit the interface, and such a change should go to
FLOW3 first.
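If you need such a check in your own setup code, outside the backend
interface, something along these lines could work with the PECL memcache
extension (just a sketch, not part of the backend):

  // Sketch only: probe the configured memcache servers.
  // getExtendedStats() reports FALSE for servers that cannot be reached.
  $memcache = new Memcache();
  $memcache->addServer('localhost', 11211);
  foreach ($memcache->getExtendedStats() as $server => $serverStats) {
      if ($serverStats === FALSE) {
          // Server is not reachable; log or bail out here.
      }
  }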
FYI: The current state of memcache backend:
- There is a pending FLOW3 issue which simplifies and speeds up the
implementation a bit, see [0] for details. The patch will be backported
to v4 as soon as it has been merged into FLOW3.
- There is a systematic problem with the memcache backend: memcache is
just a key-value store, there are no relations between keys. But we need
to put some structure into it to store the identifier-data-tags
relations. So, for each cache entry, there is an identifier->data entry,
an identifier->tags entry and a tag->identifiers entry. This is in
principle a *bad* idea with memcache, for the following reasons:
-- If memcache runs out of memory but must store new entries, it will
toss *some* other entry out of the cache (this is called an eviction in
memcache-speak).
-- If you are running a cluster of memcache servers and one server
fails, the key-value pairs on that server simply vanish from the cache.
Both situations lead to a corrupt cache: if e.g. a tag->identifiers
entry is lost, dropByTag() will not be able to find the corresponding
identifier->data entries that should be removed, so they will not be
deleted and your cache might deliver stale data on get() after that.
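To illustrate the structure (the key names here are made up, the real
backend uses its own prefixes), a set() with one tag roughly ends up as
three entries:

  // Illustration only: invented key names, not the backend's real prefixes.
  $memcache->set('ident_page_123', $data);               // identifier -> data
  $memcache->set('tags_page_123', array('pageId_123'));  // identifier -> tags
  $memcache->set('tag_pageId_123', array('page_123'));   // tag -> identifiers
  // dropByTag('pageId_123') reads 'tag_pageId_123' to find what to remove.
  // If that entry was evicted, 'ident_page_123' survives and get() happily
  // returns stale data.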
I have an implementation of collectGarbage() in mind that would at
least detect and clean up such inconsistencies (I haven't actually
implemented it, so I'm not sure it works).
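Roughly what I have in mind (method and key names made up; since
memcache cannot enumerate its keys, this can only repair tag entries
that are still reachable):

  // Sketch only: drop identifiers from a tag entry whose data entry has
  // vanished due to eviction or a failed server.
  protected function cleanUpTagEntry($tag) {
      $tagKey = 'tag_' . $tag;
      $identifiers = $this->memcache->get($tagKey);
      if (!is_array($identifiers)) {
          return;
      }
      $stillValid = array();
      foreach ($identifiers as $identifier) {
          if ($this->memcache->get('ident_' . $identifier) !== FALSE) {
              $stillValid[] = $identifier;
          }
      }
      $this->memcache->set($tagKey, $stillValid);
  }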
But the important thing is: if you are running a cluster, your cache
*will* be corrupt if a server fails, and if one of the memcache servers
begins to evict data, your cache *will* be corrupt as well.
Keep these things in mind if you choose to use memcache. BTW: There is
an extension called memcached_reports in TER [4], which shows some
memcache server stats within the reports module.
An implementation without those problems is the redis backend; it's
already pending in FLOW3 as issue [1] and it scales *very* well, even
better than memcache. redis [2] is a young project, though, and as such
still a bit experimental.
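Roughly why it fits: redis has native set types, so the tag relations
can live in sets that are updated in place instead of being
re-serialized arrays. A sketch with the phpredis extension (key names
made up):

  // Sketch only: invented key names, assuming the phpredis extension.
  $redis = new Redis();
  $redis->connect('127.0.0.1', 6379);
  $redis->set('ident:page_123', $data);          // identifier -> data
  $redis->sAdd('tags:page_123', 'pageId_123');   // identifier -> tags (set)
  $redis->sAdd('tag:pageId_123', 'page_123');    // tag -> identifiers (set)
  // dropByTag(): read the tag set, delete the listed entries.
  foreach ($redis->sMembers('tag:pageId_123') as $identifier) {
      $redis->delete('ident:' . $identifier);
  }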
The best bet is currently still the db backend, maybe with the newly
added compression (very useful for bigger data sets like the page
cache). We're running several multi-gigabyte caches without problems;
it just won't scale *much* further: the db backend typically slows down
once you can no longer give mysql enough RAM to keep the cache table
fully in memory. Be sure to tune your mysql innodb settings if you have
big tables!
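The single most important knob is usually the InnoDB buffer pool; the
values below are only an example and depend on your machine:

  # my.cnf -- illustrative values, adjust to the RAM you can spare
  [mysqld]
  innodb_buffer_pool_size = 2G
  innodb_log_file_size    = 256M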
If you are really interested in the performance of the different
backends, you could also give enetcacheanalytics [3] a shot; it comes
with a BE module that runs performance test cases against the different
backends.
I have started writing documentation about the caching framework, but
it's not finished yet. As a start, here is a summary of the current
backend alternatives (a configuration sketch follows the list):
* apcBackend: Lightning fast with get() and set(), but doesn't fit
bigger caches; only worth using if you're using apc anyway. I've seen
heavy memory leaks with php 5.2.
* dbBackend: Mature. Best bet for all usual cases. Scales well until
you run out of memory. For 4.3 the insertMultipleRows() patch is
recommended if you're adding many tags to an entry (delivered with 4.4).
* fileBackend: Very fast with get() and set(), but scales only O(n) with
the number of cache entries on flushByTag(). This makes it pretty much
unusable for page caches. FLOW3 uses it for AOP caches, where it fits
perfectly well.
* pdoBackend: Alternative to the dbBackend, *might* be neat with a db
like Oracle, but I haven't tested that combination yet (I only tested
with sqlite, where it sucks, but that's down to sqlite).
* memcachedBackend: OK performance-wise, but has the drawbacks
mentioned above.
* redisBackend: Experimental, but the architecture fits our needs
perfectly. Pretty much every operation scales O(1) with the number of
cache entries, and I can give the O-notation for every operation,
depending on the number of input parameters and the number of cache
entries.
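Switching a cache to one of these backends happens in localconf.php;
roughly like this for the page cache with the memcached backend (key
and option names from memory, double-check them against your TYPO3
version):

  // Sketch for TYPO3 4.3/4.4 style configuration -- verify against your
  // version before using it.
  $TYPO3_CONF_VARS['SYS']['useCachingFramework'] = '1';
  $TYPO3_CONF_VARS['SYS']['caching']['cacheConfigurations']['cache_pages'] = array(
      'frontend' => 't3lib_cache_frontend_VariableFrontend',
      'backend' => 't3lib_cache_backend_MemcachedBackend',
      'options' => array(
          'servers' => array('localhost:11211'),
      ),
  );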
Regards
Christian
[0] http://forge.typo3.org/issues/8918
[1] http://forge.typo3.org/issues/9017
[2] http://code.google.com/p/redis/
[3] http://forge.typo3.org/projects/show/extension-enetcacheanalytics
[4] http://typo3.org/extensions/repository/view/memcached_reports/current/