[TYPO3-core] RFC: #10201: Duplicate cHash Values

Bernhard Kraft kraftb at kraftb.at
Tue Apr 28 23:42:20 CEST 2009


Steffen Kamper wrote:

> with realurl you don't see the cHash in url. md5 is unique, shortmd5 
> not. So it's no real decision, it's a must imho.

Hello Steffen.

I have to correct you. Mathematically, neither md5 nor the "short-md5" in
TYPO3 can be unique. With 16 bytes (32 hex chars) there are
256^16 == 16^32 possible md5 values (about 3.4 * 10^38).

If you use only the first 10 characters (5 bytes) of the md5 sum, then
there are only 256^5 =~ 10^12 possibilities. As every byte multiplies
the number of possible values by 256, it's of course a drastic
difference.
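
To put rough numbers on this, here is a minimal Python sketch. It
assumes that md5 output is uniformly distributed and uses the usual
birthday-bound approximation; the function name collision_probability
and the example count of one million cached page variants are made up
for illustration and are not taken from TYPO3.

```python
import math

# Size of the value space for the full md5 (32 hex chars) and for the
# shortened variant discussed above (first 10 hex chars).
full_space = 16 ** 32
short_space = 16 ** 10


def collision_probability(n_items: int, hex_chars: int) -> float:
    """Approximate chance that at least two of n_items hashes collide.

    Birthday-bound approximation: P ~= 1 - exp(-n * (n - 1) / (2 * N)),
    where N is the number of possible hash values. Assumes hashes are
    uniformly distributed over the value space.
    """
    space = 16 ** hex_chars
    exponent = -n_items * (n_items - 1) / (2 * space)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)


if __name__ == "__main__":
    n = 1_000_000  # hypothetical number of cached page variants
    print("short md5:", collision_probability(n, 10))
    print("full md5: ", collision_probability(n, 32))
```

Under these assumptions the shortened hash has a noticeable chance of
at least one collision at that scale, while the full md5 does not.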

Let's think of all variations of a file with, let's say, 50 bytes. It's
clear that if you reduce the "amount of data" in this file to 16 bytes by
calculating the md5 sum (a kind of digit sum), some variations must
result in the same value.

So although very improbable, it's still possible that the mentioned
"md5 not unique" error occurs even with the full md5 value.
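
The same effect can be made visible by shortening the hash even
further. The following is only an illustrative sketch: the helper names
short_md5 and find_collision and the "page-<n>" inputs are hypothetical,
and 4 hex chars are used instead of 10 only so that the search finishes
quickly. With 4 hex chars there are just 16^4 = 65536 possible values,
so a repeat must show up after at most 65537 different inputs.

```python
import hashlib


def short_md5(data: bytes, hex_chars: int) -> str:
    """Return the first hex_chars characters of the md5 hex digest."""
    return hashlib.md5(data).hexdigest()[:hex_chars]


def find_collision(hex_chars: int) -> tuple[bytes, bytes]:
    """Return two different inputs sharing the same shortened md5.

    Brute force over made-up inputs "page-0", "page-1", ... until a
    shortened hash repeats. Only practical for small hex_chars.
    """
    seen: dict[str, bytes] = {}
    i = 0
    while True:
        data = f"page-{i}".encode()
        key = short_md5(data, hex_chars)
        if key in seen:
            return seen[key], data
        seen[key] = data
        i += 1


if __name__ == "__main__":
    a, b = find_collision(4)
    print(a, b, short_md5(a, 4))
```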


AFAIR the "duplicate cHash" SQL error message, when putting a page into
cache, wasn't caused by two content pages resulting in the same cHash.
I do not exactly remember what the cause was (or whether I even did some
research when I stumbled upon the problem) ... of course it could be
different in your case with that many pages.



greets,
Bernhard
-- 
Freedom is always the freedom of those who think differently.
Rosa Luxemburg, 1871-1919
--------------------------------------------------
www.think-open.at

