[TYPO3-dev] MySQL, search and utf8

Pierre Rouwens pierre_rouwens at dmr.ca
Tue Oct 9 04:38:49 CEST 2007


Steffen Kamper a écrit :
> "ries van Twisk" <typo3 at rvt.dds.nl> schrieb im Newsbeitrag 
> news:mailman.1019.1191882880.20721.typo3-dev at lists.netfielders.de...
> 
> On Oct 8, 2007, at 5:11 PM, Steffen Kamper wrote:
> 
>> "Steffen Kamper" <steffen at sk-typo3.de> schrieb im Newsbeitrag
>> news:mailman.1.1191880845.21662.typo3-dev at lists.netfielders.de...
>>> "ries van Twisk" <typo3 at rvt.dds.nl> schrieb im Newsbeitrag
>>> news:mailman.1001.1191868206.20721.typo3-dev at lists.netfielders.de...
>>>
>>> On Oct 8, 2007, at 1:05 PM, Martin Kutschker wrote:
>>>
>>>> Steffen Kamper schrieb:
>>>>> Hi,
>>>>>
>>>>> i have a non-TYPO3-question.
>>>>>
>>>>> I perform a search in MySQL and for example look for the word 
>>>>> "schön".
>>>>> So my
>>>>> query looks like:
>>>>>
>>>>> Select .... where ... and (body like "schön%" or body like "% schön %" 
>>>>> or
>>>>> body
>>>>> like "%schön" or body = "schön")
>>>>>
>>>>> looking to the results i see that MySQL shows up result with the  word
>>>>> "schon" which is different. Thoug it is utf8, the letter ö is   saved 
>>>>> in
>>>>> utf8
>>>>> as 2-Byte, so why gives MySQL me the result with o?
>>>> Mysql knows three character sets: the one the clients sends  (queries),
>>>> the one the server uses (data storage) and the one the the  server 
>>>> sends
>>>> (query results).
>>>>
>>>> Seems like the last settings is for some reason not UTF8. See  the 
>>>> Mysql
>>>> docs on how to set (and check) those settings.
>>> It has something todo what collation is used internally for sorting
>>> and filtering...
>>> In this case MySQL did a good thing and did stick to standards... see
>>> below..
>>>
>>>> Masi
>>>
>>> Hey steffen,
>>>
>>> this document will explain it to you..
>>> http://dev.mysql.com/doc/refman/5.0/en/charset-collation-effect.html
>>>
>>> and this one how to solve your specific problem..
>>>
>>> http://dev.mysql.com/doc/refman/5.0/en/charset-collate.html
>>>
>>> Ries
>>>
>>> indeed it seems to work proper with
>>>
>>> SELECT ... FROM tx_drwiki_pages COLLATE latin1_german2_ci ...
>>>
>>> So i don't really understand what MySQL internally does because i  only 
>>> see
>>> collate as a ORDER BY issue but it seems to be more ...
>>>
>>> vg  Steffen
>>>
>>>
>> sry, it doesn't work - i only didn't saw a mysql error.
>> My complete query is
>>
>> SELECT *,count(*) anz FROM tx_drwiki_pages
>> WHERE hidden=0 and deleted=0 and (body like "schön%" or body like "% 
>> schön%"
>> or body like "%schön" or body = "schön")
>> GROUP BY keyword
>> ORDER BY anz desc
>>
>> vg  Steffen
>>
>>
>> _______________________________________________
>> TYPO3-dev mailing list
>> TYPO3-dev at lists.netfielders.de
>> http://lists.netfielders.de/cgi-bin/mailman/listinfo/typo3-dev
> 
> 
> what about this:
> 
> SELECT *,count(*) anz FROM tx_drwiki_pages
> WHERE hidden=0 and deleted=0 and (body like "schön%" COLLATE
> latin1_german2_c or body like "%schön%"  COLLATE latin1_german2_c
> or body like "%schön" COLLATE latin1_german2_c or body = "schön"
> COLLATE latin1_german2_c)
> GROUP BY keyword
> ORDER BY anz desc
> 
> What version of MySQL are you using? You seem to use an aggregate
> function
> without using a aggregate on your field names in your SELECT clause
> which is not valid...
> 
> Ries
> 
>  #1253 - COLLATION 'latin1_german2_ci' is not valid for CHARACTER SET 'utf8'
> 
> i'm using MySQL-Version: 5.0.44-log
> 
> vg  Steffen
> 
> 
> 
> 
> 
> 
> 
Hi,

This post has nothing to do with your original question ... but instead 
of using :
SELECT *,count(*) anz FROM tx_drwiki_pages
 > WHERE hidden=0 and deleted=0 and (body like "schön%" or body like "%
 > schön%"
 > or body like "%schön" or body = "schön")
 > GROUP BY keyword
 > ORDER BY anz desc

Why not just use mysql in boolean mode ???

where body match against ('body','schön')

with a full index on your body field...

Maybe i've misunderstood your query in this case just forget my post :)

Pierre




More information about the TYPO3-dev mailing list