[TYPO3-core] Caches and Locking

Markus Klein klein.t3 at mfc-linz.at
Sat Mar 8 20:20:10 CET 2014


Hi Philipp,

thanks for your thoughts.

> 
> Hi Markus,
> 
> Markus Klein wrote:
> 
> > The access to the caches still needs proper locking for concurrency,
> > though. As outlined in the blueprint, we have one more major problem
> here:
> > Any implementation of the readers-writers-problem requires at least a
> > shared counter variable for the number of active readers. Such a
> > shared counter may be stored in a file, but this is rather slow and I
> > would not consider this an appropriate IPC utility for resources with
> > high access frequency like caches. Therefore we need to use shared
> > memory. The problem here is that no PHP library for such a purpose is
> compiled in by default.
> > (There are two of them, but only one being Win compatible as well.)
> 
> Isn't that what a semaphore is supposed to do? A shared counter that can be
> incremented or decremented.

Yes, a semaphore can be initialized to a certain number, and each call to sem_acquire then decrements it.
By default it is initialized to 1, so it works as a mutex.
But this is not what we need here. We can have unlimited parallel read access to the caches. We actually just need to know which process is the first reader (it locks the write mutex) and which process is the last one done reading (it unlocks the write mutex again).
(This is the first readers-writers problem.)
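To make the counter's role concrete, here is a minimal sketch of the first readers-writers pattern (in Python threading terms for brevity; the class name is mine, and in PHP the reader counter would have to live in shared memory between processes, which is exactly the problem described above):

```python
import threading

class ReadWriteLock:
    """First readers-writers problem: readers share access; the first
    reader locks out writers, the last reader lets them back in."""

    def __init__(self):
        self._counter_lock = threading.Lock()  # protects the reader counter
        self._write_lock = threading.Lock()    # held while anyone reads, or by a writer
        self._readers = 0                      # the shared counter discussed above

    def acquire_read(self):
        with self._counter_lock:
            self._readers += 1
            if self._readers == 1:             # first reader blocks writers
                self._write_lock.acquire()

    def release_read(self):
        with self._counter_lock:
            self._readers -= 1
            if self._readers == 0:             # last reader releases writers
                self._write_lock.release()

    def acquire_write(self):
        self._write_lock.acquire()

    def release_write(self):
        self._write_lock.release()
```

Any number of readers proceed in parallel; a writer only gets in once the counter drops back to zero.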

> 
> Keep in mind that many shared hosters do not have shared memory between
> two requests.
> We should not kick shared hosting sites as they are the majority of TYPO3
> users.
I fully agree. But then we're not able to synchronize access and race conditions are "guaranteed".

> 
> 
> IMHO we face several problems that exists since longer, but pop up here
> now:
> 
> 1) Many operations are not performed as atomically as they need to be.
> That happens a lot with SQL queries which should actually run as a single
> statement (e.g. deleting cache entries in the DB backend or creating a new record
> in the backend). We can either rewrite large parts of the core (the current
> implementation of the DB class does not easily support combined
> statements, unless you build the query yourself) or we can add transaction
> support that solves the whole concurrency issue inside the DB. The latter
> case might be easy to implement (we just need to wrap the
> relevant code parts with start and stop transaction) and this should work on
> many shared hosting providers as well.
> https://dev.mysql.com/doc/refman/5.0/en/commit.html
Transactions only work with InnoDB, right? (Also not available on many shared hosters.)
Don't focus on cache entries in the DB; we have to tackle all possible cache backends.
Only the application knows what kind of synchronized access is necessary, so you can never fully centralize this.
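For illustration, this is what Philipp's "wrap the relevant code parts with start and stop transaction" buys for a multi-statement cache flush (sketched with SQLite so it is self-contained; the table names are made up, and with MySQL this requires InnoDB, as noted):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cache (identifier TEXT PRIMARY KEY, content TEXT)")
conn.execute("CREATE TABLE cache_tags (identifier TEXT, tag TEXT)")
conn.execute("INSERT INTO cache VALUES ('pages_1', 'data')")
conn.execute("INSERT INTO cache_tags VALUES ('pages_1', 'pageId_1')")

# Without a transaction, a parallel request could observe the entry removed
# from one table but still present in the other. Wrapped in a transaction,
# the two deletes become atomic: both commit, or neither does.
with conn:  # BEGIN ... COMMIT (ROLLBACK on exception)
    conn.execute("DELETE FROM cache WHERE identifier = 'pages_1'")
    conn.execute("DELETE FROM cache_tags WHERE identifier = 'pages_1'")
```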

> 
> 2) Write-always vs DOS and the readers-writers problem
> Old problem, new skin. The whole problem comes from the need to be able
> to selectively delete individual cache entries instead of dropping the
> whole cache if it is not needed any more.
> Even worse, certain caches, like the class cache, need to mess with the data
> while needing to maintain the cache entry as such.
> 
> Normally one would let the last-write win. However for expensive
> operations (such as image manipulation, etc) one does not want to trigger
> the expensive processes again to prevent DOS vectors.
> We should decide on a case-by-case basis whether TYPO3 actually needs to provide
> synchronization or whether it makes sense to allow skipping the whole
> generation (e.g. via a config option) and rely on external processes to
> generate the data for us. I guess this is already done for media manipulation
> with FAL, but it might need some fine-tuning of the API throughout the
> core. I am not sure about the current state.
I agree we need to assess, for each access to a shared resource, whether we can "skip" the locking hassle and "rely" on a low probability of parallel-execution problems.

> 
> The other problem here is the original stumbling block, the core code caches,
> especially the class loader.
Actually it was the class loader issues that brought to my attention that synchronization had been neglected all along.
Now that we have a cache at this very low level, we need to tackle this, as a simple call to the BE (with the page module as default module) already causes several parallel requests (frames, Ajax).

> 
> I suggest to use a semi-transaction by using exclusive read-write locks,
> either via flock (process synchronization issues) or, even better, via a
> lock file. Processes that fail to acquire a lock should continue without writing
> for those kinds of caches (unless the file is older than xx seconds).
> This should be fairly robust with only small overhead.
That's the first pitfall. File creation, time checking and all these operations are not atomic as a block.
Hence there's always a window for a race condition. (Actually this is the problem with the 'simple' locking of the Locker class currently.)

> 
> Thus we have three cases here:
> a) data is not used to calculate data and calculation is cheap
>    --> always write on cache miss, concurrency does not matter
> b) operation is expensive (e.g. media manipulation), data not used in
>    calculation process
>    --> try locking if available, fall back to write-always (for shared
>    hosting) or allow to skip generation (for custom solutions)
> c) data is required for calculation of the data (readers-writers problem)
>    --> rely on OS or database where possible (use transactions), otherwise
>    fall back to lock files or shared memory locks (if supported)
> 
> 3) Missing stacked cache concept
> You already described this very well. We need to deal with the different
> level of persistence of caches, and that users might use a non-persistent
> cache for performance reasons to hold should-be-persistent data.
> The question here is to what extent implementations can expect cache data
> to exist. IMHO they should always work with the NULL backend, meaning they
> should never assume that a cache entry they just wrote can be read
> immediately afterwards or at any time later.
Correct, never assume a cache read will be a cache hit!
Moreover, we don't need stacked caches everywhere. We have to investigate where it makes sense.
(e.g. CPU has cache levels, DRAM sometimes one, HDDs one or two, etc.)
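A two-level stack in that spirit could look like this (a hypothetical sketch, not the CF API: a fast transient front, a persistent slot that may be the NULL backend, and get() that may always miss, as stated above):

```python
class TransientBackend:
    """In-memory front level: fast, gone after the request."""
    def __init__(self):
        self._data = {}
    def get(self, identifier):
        return self._data.get(identifier)
    def set(self, identifier, value):
        self._data[identifier] = value

class NullBackend:
    """The persistent slot may be a NULL backend: set() discards, get() misses."""
    def get(self, identifier):
        return None
    def set(self, identifier, value):
        pass

class StackedCache:
    """Check levels front to back; populate faster levels on a hit."""
    def __init__(self, *levels):
        self._levels = levels
    def get(self, identifier):
        for i, level in enumerate(self._levels):
            value = level.get(identifier)
            if value is not None:
                for faster in self._levels[:i]:   # promote to the faster levels
                    faster.set(identifier, value)
                return value
        return None                               # callers must handle a miss
    def set(self, identifier, value):
        for level in self._levels:
            level.set(identifier, value)
```

With a NULL backend anywhere in the stack, a set followed by a get in a later request is a miss, which is exactly the assumption implementations must survive.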

> 
> 4) Boilerplate code instead of central function with callback
> Currently all CF-using code does something like in the documentation:
> http://docs.typo3.org/typo3cms/CoreApiReference/CachingFramework/Dev
> eloper/Index.html#caching-developer-access
> The problem here is that this puts too much logic in the hand of the
> developer and that more advanced functions like stacked caches cannot be
> easily implemented.
> I agree with the concept of a high-level API that reads a cache entry and
> gets a generator function to generate and (possibly) store a missing cache
> entry, reducing the boilerplate to:
> $data = $highLvl->getCache('cache')->get($identifier, 'generator');
> ...
> function generator($identifier) {...}
> The code can then work by assuming that a cache entry *always* exists
> (either coming from cache or being generated on-the-fly).
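The suggested high-level get could be sketched like this (names are hypothetical, not the actual CF API; Python for brevity):

```python
class HighLevelCache:
    """Wrap a low-level cache so that get() always yields data,
    generating and (possibly) storing it on a miss."""
    def __init__(self, backend):
        self._backend = backend  # anything with get()/set(); may be a NULL backend
    def get(self, identifier, generator):
        data = self._backend.get(identifier)
        if data is None:                         # miss: generate on the fly
            data = generator(identifier)
            self._backend.set(identifier, data)  # the store may silently be a no-op
        return data
```

The caller never sees a miss, and the generator only runs when the backend could not deliver, so stacked backends or locking can later be hidden entirely behind this one method.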
> 
> 
> I suggest to only fix the classloader/code caches and leave the high level
> synchronization issues for either 6.2+1 or ship this as a major update for
> 6.2 (in the sense of a service pack or a RHEL X Update Y).
> I fear we are running short on time otherwise.
Yes, that's my intention too. But I'd still like to have a well-thought-out concept that we can implement for the class loader as a first step.

> 
> Best regards
> --
> Philipp Gampe – PGP-Key 0AD96065 – TYPO3 UG Bonn/Köln
> Documentation – Active contributor TYPO3 CMS
> TYPO3 .... inspiring people to share!
> 


Kind regards
Markus

------------------------------------------------------------
Markus Klein
TYPO3 CMS Active Contributors Team Member


