[TYPO3-core] RFC #12232: Bug: md5_file() to check if a file has been changed is very expensive [performance]

Ernesto Baschny [cron IT] ernst at cron-it.de
Tue Jan 18 00:20:50 CET 2011


Vladimir Podkovanov wrote on 18.01.2011 00:15:
> On 18.01.2011 1:44, Ernesto Baschny [cron IT] wrote:
>> It's true that image size caching (w / h) has nothing to do with the
>> decision to "regenerate" certain parts of it.
>>
>> So your conclusion (to try to sum up) is that we don't need an
>> upgrade wizard because cleaning the cache_imagesizes table will happen
>> during normal operation anyway, is that right? Or the upgrade wizard
>> could simply TRUNCATE the mentioned cache table?
> 
> We don't need an Upgrade Wizard because cache_imagesizes will rebuild
> itself, and this rebuild doesn't affect performance as much as I
> thought before.
> Cleaning cache_imagesizes could be suggested just to speed up the
> rebuild (no time spent on checking old hashes, just create them from
> scratch), but it is not a prerequisite, so we can skip the cleaning.
> 
>>
>> Now my fear is that this md5 hashing might indeed break some things
>> because it relies solely on filemtime and filesize. So if we have
>> 1.000.000 files, the probability that we have different files with the
>> same mtime *and* the same size is pretty big if we consider that we
>> might have lots of files with the same tstamp (because someone might
>> have batch-uploaded tons of files at the "same time" or because all
>> files have the same tstamp because they were restored from some backup
>> of "whatever"): You end up with crazy behaviour of wrong resizes being
>> generated.
>>
>> Maybe adding the "file-path" to the hashing would help?
>>
> 
> It is not a problem, as md5hash is used only to check whether a file
> has changed, not as an index. The table is indexed by md5filename,
> which is a hash of the file path.
> If you are worried about deleting rows (currently done by md5hash and
> not by md5filename), that is another bug (RFC #16685); I have already
> sent a patch.

Ok, I think I've got it! Thanks for the pointer, will take a look at
16695 too.
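
To check my understanding, here is a minimal sketch of the lookup as
described above. It is written in Python rather than PHP purely for
illustration; the class name, the in-memory dict standing in for
cache_imagesizes, and the way mtime and size are combined into md5hash
are my assumptions, not the actual patch.

```python
import hashlib


def md5(text: str) -> str:
    """Hex MD5 of a string (a cheap hash of metadata, not of file contents)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


class ImageSizeCache:
    """Hypothetical in-memory stand-in for the cache_imagesizes table."""

    def __init__(self) -> None:
        # Rows are keyed by md5filename (hash of the file path), so two
        # different files never share a row, whatever their mtime or size.
        self.rows = {}

    def store(self, path, mtime, size, width, height):
        # md5hash is built from mtime and size only; it is a change marker.
        self.rows[md5(path)] = (md5(f"{mtime}:{size}"), width, height)

    def lookup(self, path, mtime, size):
        row = self.rows.get(md5(path))
        if row is None:
            return None  # no cached dimensions for this path
        md5hash, width, height = row
        if md5hash != md5(f"{mtime}:{size}"):
            return None  # file changed on disk, caller must re-measure
        return (width, height)


cache = ImageSizeCache()
# Two different files with identical mtime and size (e.g. a batch upload).
cache.store("fileadmin/a.jpg", 1295306450, 2048, 800, 600)
cache.store("fileadmin/b.jpg", 1295306450, 2048, 640, 480)
print(cache.lookup("fileadmin/a.jpg", 1295306450, 2048))
print(cache.lookup("fileadmin/b.jpg", 1295306450, 2048))
```

Both lookups return their own dimensions even though the two md5hash
values are identical, because the row is selected by the path hash
first. That is why adding the file path to md5hash is not needed.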

Cheers,
Ernesto

