[TYPO3-core] RFC #12232: Bug: md5_file() to check if a file has been changed is very expensive [performance]

André Stösel andre at stoesel.de
Sun Dec 12 17:47:52 CET 2010


> > b) by the way we don't need new DB field then, we can use old
> > md5hash field, so no DB upgrade needed:
> > 'md5hash' => md5(filesize($identifyResult[3]) .
> > filemtime($identifyResult[3])),
> yes, please use this solution

I disagree.
This would issue two filesystem queries plus an md5 hash for each
file/image, even though one (filemtime) is enough.
I'm not an expert on filesystem behavior, but in the worst case this
would mean something like this:

  PHP: filemtime('filename')
   FS: look up the inode for `filename`
   FS: update atime for `filename`
  PHP: filesize('filename')
   FS: look up the inode for `filename`
   FS: update atime for `filename`
   
I would prefer to check the filesize only if the mtime differs from the
mtime stored in the DB.
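
A minimal sketch of that check, not the actual core code. The record
keys 'tstamp' and 'filesize' are hypothetical names, not taken from the
cache_imagesizes schema, and treating "mtime moved but size identical"
as unchanged is my assumption about the intent:

```php
<?php
/**
 * Returns TRUE if the file appears to have changed since it was cached.
 * $cached is a hypothetical record with 'tstamp' and 'filesize' keys.
 */
function fileHasChanged($filename, array $cached) {
    // First query: the modification time only.
    $mtime = filemtime($filename);
    if ($mtime === FALSE) {
        // Missing or unreadable file: treat as changed.
        return TRUE;
    }
    if ($mtime === (int)$cached['tstamp']) {
        // mtime matches, so the size is never looked at.
        return FALSE;
    }
    // mtime differs: only now compare the size (assumed tie-breaker).
    return filesize($filename) !== (int)$cached['filesize'];
}
```

In the common case (file untouched) this costs a single filemtime()
call and no hashing at all.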

> > c) AND make an extension that will parse cache_imagesizes table and
> > recalculates all hashes, it will be fast and no rebuilds needed
> > then.
> Vladimir, could you provide an Upgrade Wizard for the Install Tool, 
> which recalculates the hashes based on md5(filesize() . filemtime())
> and writes them to DB?

Or just keep using md5_file() as long as a record's filesize and
filemtime are still empty, and update the record at that point.
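
Roughly like this, as a sketch under assumptions: 'tstamp' and
'filesize' are again hypothetical column names, 'md5hash' is the
existing field holding the old md5_file() value, and writing the
returned record back to the DB is left to the caller:

```php
<?php
/**
 * Validates a cached record and upgrades legacy records in passing.
 * Returns the (possibly upgraded) record if it is still valid, or NULL
 * if the file changed and the caller has to rebuild the entry.
 */
function validateCachedRecord($filename, array $record) {
    if (empty($record['tstamp']) && empty($record['filesize'])) {
        // Legacy record: pay for md5_file() one last time.
        if (md5_file($filename) !== $record['md5hash']) {
            return NULL;
        }
        // Unchanged: store the cheap attributes for the next check.
        $record['tstamp'] = filemtime($filename);
        $record['filesize'] = filesize($filename);
        return $record;
    }
    // Upgraded record: the cheap mtime comparison is enough.
    return filemtime($filename) === (int)$record['tstamp'] ? $record : NULL;
}
```

This way old records are migrated lazily on first access, so neither a
separate extension nor a full rebuild of the table would be needed.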

