[TYPO3-dev] MySQL, search and utf8

Steffen Kamper steffen at sk-typo3.de
Tue Oct 9 09:59:36 CEST 2007


"Stefan Geith" <typo3dev2007 at geithware.de> schrieb im Newsbeitrag 
news:mailman.1040.1191912746.20721.typo3-dev at lists.netfielders.de...
Steffen Kamper schrieb:
> ...
> Select .... where ... and (body like "schön%" or body like "%schön%" or 
> body like "%schön" or body = "schön")
>
> looking to the results i see that MySQL shows up result with the word 
> "schon" which is different. Thoug it is utf8, the letter ö is saved in 
> utf8 as 2-Byte, so why gives MySQL me the result with o?

I have a similar Problem:
I try to make an ABC-List for a glossary with links on only these
characters, that have entries in the database.

For that I use something like
     SELECT right(keyword,1) AS firstchar, count(*) AS cnt
     FROM   tx_sgglossary
     WHERE  deleted=0 AND hidden=0
     GROUP BY firstchar
     ORDER BY firstchar

All is set to UTF8 and the default collation in UTF-general

This works, but *only* if the first character is no 'Umlaut'.
Example:
Lets assume these records:
    Apple,Ärger,Bar,Öl,Ozz,Überzug
Then I get
    A 2
    B 1
    Ö 2
    Ü 1
So I never get the letters O and U linked in my ABC-List.

My first workaround was, to do change the query to
     SELECT CONVERT(RIGHT(keyword,1) USING `ascii`) AS ...
But this leads to
    ? 3
    A 1
    B 1
    O 1
So I never get the Letter U (that contains Ü*) listed :(

-----------

So is there any solution for this problem ?

Or do I realy have to do a huge manual REPLACE(REPLACE(...)),
that replaces each Umlaut with its replacement in the query ?

Maybe there is a function in mysql that converts a string to
its 'collation-equivilant' ? Something like soundex() but simply
converts the string to the form that is used for comparison ?

Same problem should exist with accents that are used in french -
so it should not only be a 'german problem' ...

-

Stefan

Hi Stefan,

good to knopw that i'm not alone with this :-)

You could try following::
ASCII(UPPER(SUBSTRING(keyword,1,1)))

maybe this conversion helps, but maybe you have to replace Ü with U before.


vg  Steffen








More information about the TYPO3-dev mailing list