[TYPO3-dev] MySQL, search and utf8

ries van Twisk typo3 at rvt.dds.nl
Tue Oct 9 14:14:33 CEST 2007


On Oct 9, 2007, at 4:23 AM, Steffen Kamper wrote:

>
> "Steffen Kamper" <steffen at sk-typo3.de> schrieb im Newsbeitrag
> news:mailman.1.1191866513.10281.typo3-dev at lists.netfielders.de...
>> Hi,
>>
>> i have a non-TYPO3-question.
>>
>> I perform a search in MySQL and for example look for the word  
>> "schön". So
>> my query looks like:
>>
>> Select .... where ... and (body like "schön%" or body like "%schön 
>> %" or
>> body like "%schön" or body = "schön")
>>
>> looking to the results i see that MySQL shows up result with the word
>> "schon" which is different. Thoug it is utf8, the letter ö is  
>> saved in
>> utf8 as 2-Byte, so why gives MySQL me the result with o?
>>
>> And how can i provide this to get only exact matches with "schön"?
>>
>> thx, vg  Steffen
>>
>
> it seems that like is not binary safe in utf8-context.
>
> I got the right results by using LIKE BINARY, so my query works  
> like this:
>
> SELECT * , count( * ) anz
> FROM tx_drwiki_pages
> WHERE hidden =0
> AND deleted =0
> AND (
> body LIKE binary "schön%"
> OR body LIKE binary "%schön%"
> OR body LIKE binary "%schön"
> OR body = "schön"
> )
> GROUP BY keyword
> ORDER BY anz DESC
>
> vg  Steffen
>
>
> _______________________________________________
> TYPO3-dev mailing list
> TYPO3-dev at lists.netfielders.de
> http://lists.netfielders.de/cgi-bin/mailman/listinfo/typo3-dev

Hey Steffen and all,

I think changing the column to binary is not the correct solution
and will lead to not finding some words under some conditions.
Remember with UTF8 each character can be 1 or more bytes!!

Steffen, can you set the column to latin_german2_ci and see what the  
results are?

Ries









More information about the TYPO3-dev mailing list