[TYPO3-dev] [TYPO3-core] RFC: #10201: Duplicate cHash Values
Ries van Twisk
typo3 at rvt.dds.nl
Wed Apr 29 00:07:12 CEST 2009
On Apr 28, 2009, at 4:42 PM, Bernhard Kraft wrote:
> Steffen Kamper wrote:
>
>> With RealURL you don't see the cHash in the URL. md5 is unique, shortmd5
>> is not. So it's not really a decision, it's a must imho.
>
> Hello Steffen.
>
> I have to correct you. Mathematically, neither md5 nor the "short-md5" in
> TYPO3 can be unique. If you have 16 bytes (32 hex chars), then there are
> 256^16 == 16^32 possibilities for the md5 value (which is about 10^38).
>
> If you use only the first 10 characters (5 bytes) of the md5 sum, then
> there are only 256^5 =~ 10^12 possibilities. As every byte increases
> the number of possible variations by a factor of 256, it's of course a
> drastic difference.
>
> Let's think of all variations of a file of, let's say, 50 bytes. It's clear
> that if you reduce the "amount of data" in this file to 16 bytes by
> calculating the md5 sum (a kind of digit sum), some variations must result
> in the same value.
>
> So although very improbable, it's still possible that the mentioned
> "md5 not unique" error occurs even with the full md5 value.
>
>
> AFAIR the "duplicate cHash" SQL error message, when putting a page into
> the cache, wasn't caused by two content pages resulting in the same cHash.
> I do not remember exactly what the cause was (or if I even did some
> research when I stumbled upon the problem) ... of course it could be
> different in your case with that many pages.
>
>
Basically, what Bernhard is saying is that if you just add 8 hex characters
(4 bytes) to the short MD5, the probability of hitting the same value twice
will be 256*256*256*256 = 4294967296 times lower; if you make it 4 chars
longer, it's 65536 times lower.

Maybe an Install Tool setting would be an option? You don't hear too
often anyway that people have cHash clashes.
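
For anyone who wants to check the numbers above, here is a rough sketch in
Python (not TYPO3 code). The helper names short_md5, keyspace and
collision_probability are made up for illustration, the 10-character
default just mirrors the "first 10 characters" mentioned in the quoted mail,
and the probability uses the usual birthday approximation, which assumes
the hash values are spread uniformly.

```python
import hashlib
import math


def short_md5(data: str, length: int = 10) -> str:
    """Hypothetical helper: first `length` hex characters of the MD5 digest."""
    return hashlib.md5(data.encode("utf-8")).hexdigest()[:length]


def keyspace(hex_chars: int) -> int:
    """Number of distinct values representable with `hex_chars` hex digits."""
    return 16 ** hex_chars


def collision_probability(n_items: int, hex_chars: int) -> float:
    """Birthday approximation: chance that at least two of `n_items`
    uniformly distributed hashes share the same truncated value."""
    exponent = n_items * (n_items - 1) / (2 * keyspace(hex_chars))
    # -expm1(-x) == 1 - exp(-x), but stays accurate for very small x.
    return -math.expm1(-exponent)


if __name__ == "__main__":
    # Factors gained by lengthening the 10-character hash.
    print("8 more chars:", keyspace(18) // keyspace(10))
    print("4 more chars:", keyspace(14) // keyspace(10))

    # Assumed workload of 100,000 cached page variants (an arbitrary figure).
    for chars in (10, 14, 18, 32):
        print(chars, "hex chars:", collision_probability(100_000, chars))
```

The "times lower" factors are exact for the number of possible values; for
the collision probability they hold only approximately, and only while that
probability is still small.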
Ries