[TYPO3-dev] MySQL, search and utf8

Stefan Geith typo3dev2007 at geithware.de
Tue Oct 9 08:48:53 CEST 2007


Steffen Kamper schrieb:
> ...
> Select .... where ... and (body like "schön%" or body like "%schön%" or body 
> like "%schön" or body = "schön")
> 
> looking to the results i see that MySQL shows up result with the word 
> "schon" which is different. Thoug it is utf8, the letter ö is saved in utf8 
> as 2-Byte, so why gives MySQL me the result with o?

I have a similar Problem:
I try to make an ABC-List for a glossary with links on only these
characters, that have entries in the database.

For that I use something like
     SELECT right(keyword,1) AS firstchar, count(*) AS cnt
     FROM   tx_sgglossary
     WHERE  deleted=0 AND hidden=0
     GROUP BY firstchar
     ORDER BY firstchar

All is set to UTF8 and the default collation in UTF-general

This works, but *only* if the first character is no 'Umlaut'.
Example:
Lets assume these records:
    Apple,Ärger,Bar,Öl,Ozz,Überzug
Then I get
    A 2
    B 1
    Ö 2
    Ü 1
So I never get the letters O and U linked in my ABC-List.

My first workaround was, to do change the query to
     SELECT CONVERT(RIGHT(keyword,1) USING `ascii`) AS ...
But this leads to
    ? 3
    A 1
    B 1
    O 1
So I never get the Letter U (that contains Ü*) listed :(

-----------

So is there any solution for this problem ?

Or do I realy have to do a huge manual REPLACE(REPLACE(...)),
that replaces each Umlaut with its replacement in the query ?

Maybe there is a function in mysql that converts a string to
its 'collation-equivilant' ? Something like soundex() but simply
converts the string to the form that is used for comparison ?

Same problem should exist with accents that are used in french -
so it should not only be a 'german problem' ...

-

Stefan






More information about the TYPO3-dev mailing list