[TYPO3-english] How can I convert my database to utf8?
Jigal van Hemert
jigal at xs4all.nl
Fri Jun 4 22:32:18 CEST 2010
Hi Jörg,
Jörg Klein wrote:
> "Jörg Klein" <joerg at klein-family.com> wrote in message
> news:mailman.1.1275528474.13822.typo3-english at lists.typo3.org...
>>> Furthermore, if you use the following settings in the Install Tool you
>>> should have a UTF-8 installation:
>>>
>>> $TYPO3_CONF_VARS['SYS']['setDBinit'] = 'SET NAMES utf8;';
>>> $TYPO3_CONF_VARS['BE']['forceCharset'] = 'utf-8';
>> I just tried that and it works! Thank you so much!
>
> My problem arose from another setting in
> $TYPO3_CONF_VARS['SYS']['setDBinit'].
> My provider wrote that the following should be set there:
> $TYPO3_CONF_VARS['SYS']['setDBinit'] = 'SET NAMES utf8;' . chr(10) .
>     'SET CHARACTER SET utf8;' . chr(10) .
>     'SET SESSION character_set_server=utf8;';
I looked up all those settings a few times before and also posted a
lengthy explanation on one of the TYPO3 lists a few weeks ago. You can
look the commands up in the online MySQL manual if you want :-)
Bottom line is that SET NAMES utf8; sets the correct variables for
charsets and collation (it uses the default collation for utf8:
utf8_general_ci) to make sure that the MySQL client (= the functions
in PHP), the connection and the result set of a query all use utf-8.
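
For reference, the MySQL manual describes SET NAMES as shorthand for
three session assignments. A sketch of the equivalent statements, plus
two checks you can run in any MySQL session to see what is in effect:

```sql
-- What SET NAMES utf8 amounts to, according to the MySQL manual:
SET character_set_client     = utf8;  -- charset of statements sent by the client
SET character_set_results    = utf8;  -- charset of result sets sent back
SET character_set_connection = utf8;  -- also sets collation_connection to
                                      -- the default collation of utf8

-- Inspect the current session values:
SHOW VARIABLES LIKE 'character_set_%';
SHOW VARIABLES LIKE 'collation_connection';
```

By contrast, the manual describes SET CHARACTER SET as taking the
connection collation from the database default instead of from the
given charset. Issued after SET NAMES, it can therefore undo part of
SET NAMES when the database default is not utf8. Whether that is what
happened in this case is only an assumption.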
>> I just tried to understand what exactly your code does:
>
> For the conversion your code temporarily changes the type of some columns:
> char to binary and text to blob.
> I never did that. Why is that needed?
The temporary changes to binary/blob types (also varchar to varbinary,
mediumtext to mediumblob, etc.) are needed if there is utf-8 encoded
data in, for example, latin-1 fields.
Changing the type to binary/blob does nothing with the data itself, but
MySQL will now see the data as binary and does not interpret it as
having a charset and collation.
Changing the type back to char/text again and setting the charset and
collation does nothing with the data itself, but MySQL will now treat
the data as string data with the defined charset and collation.
The total effect is that the already utf-8 encoded data is now seen by
MySQL as utf-8 data.
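
A minimal sketch of that round trip for a single column. The table name
my_table, the column name title and the VARCHAR(255) definition are
made up for illustration; MODIFY replaces the whole column definition,
so in a real table any NULL/DEFAULT attributes have to be repeated:

```sql
-- Step 1: remove the charset label. The stored bytes are not touched.
ALTER TABLE my_table MODIFY title VARBINARY(255);

-- Step 2: put a new label on the very same bytes.
ALTER TABLE my_table MODIFY title VARCHAR(255)
    CHARACTER SET utf8 COLLATE utf8_general_ci;

-- Optional check: an 'ö' stored as utf-8 shows up as C3B6 in the hex
-- dump, both before step 1 and after step 2.
SELECT title, HEX(title) FROM my_table LIMIT 5;
```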
If the data is correctly encoded in non-utf-8 columns you can simply
comment out the lines which perform the first query. The script will
then just set each column to utf-8, and MySQL converts the stored data
to utf-8 as part of that change.
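
For comparison, a sketch of the direct change without the binary step,
again with the hypothetical names my_table and title and an assumed
VARCHAR(255) column that holds correctly encoded latin-1 data:

```sql
-- Going straight from latin1 to utf8 makes MySQL convert the data:
-- an 'ö' stored as F6 (latin-1) becomes C3B6 (utf-8).
ALTER TABLE my_table MODIFY title VARCHAR(255)
    CHARACTER SET utf8 COLLATE utf8_general_ci;
```

Running this on a column that already contains utf-8 bytes under a
latin-1 label would encode them a second time, which is why the binary
step exists for that case.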
The rest of the script gives some visual feedback (also useful to keep
the connection to the server open!) and sets the default
charset/collation for all tables and the database itself.
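
The statements for those defaults would look roughly like this; my_table
and my_typo3_db are placeholder names, not taken from the script:

```sql
-- Default for columns added to this table later on.
-- Existing columns keep their own charset.
ALTER TABLE my_table DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;

-- Default for tables created in this database later on.
ALTER DATABASE my_typo3_db DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
```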
You can do this 'conversion' by hand, but doing this for dozens of
tables each with several columns to handle is not something I would look
forward to.
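
To get an idea of how much work that would be, a query along these
lines lists every text column in the current database that is not utf8
yet. This is only a sketch and not part of the script:

```sql
SELECT TABLE_NAME, COLUMN_NAME, COLUMN_TYPE,
       CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()            -- the currently selected database
  AND CHARACTER_SET_NAME IS NOT NULL       -- skips numeric, date and binary columns
  AND CHARACTER_SET_NAME <> 'utf8'
ORDER BY TABLE_NAME, ORDINAL_POSITION;
```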
--
Jigal van Hemert
skype:jigal.van.hemert
msn: jigal at xs4all.nl
http://twitter.com/jigalvh