[TYPO3] typo3 database utf-8 conversion

ries van Twisk typo3 at rvt.dds.nl
Wed Jan 16 13:34:50 CET 2008


On Jan 15, 2008, at 11:34 PM, Andreas Becker wrote:

> Thanks Ries
>
> > @ Steffen
> > What about mysqldumper.
> > It offers the export to utf8 - does it do any conversion?
> > Won't it be possible to dump a file directly to utf8, then create a
> > new database with charsets and collations set to utf8, and restore
> > your dumped utf8 file into it?
> > What actually is mysqldumper doing when it offers to export to utf8?
> >
> > mysqldumper would also be a very useful tool for performing such a
> > conversion, as it can store and upload BIG data files without timeout
> > problems.
> > Is there a chance to implement this utf8 conversion there if it
> > doesn't already exist?
>
> 2) newer versions ALWAYS dump in utf-8 by default.
>
> That's why I asked Steffen whether the last part, the
> "conversion" (search/replace) of the charset and collation settings,
> could be easily integrated into mysqldumper.

ok...
>
>
> That is, you simply load each record and each field of each table.
> Check if it's a serialized array.
> If so, de-serialize it and then re-load it (using another DB
> connection) into a new utf-8 database.
> I have done something like that to migrate a mysql database to
> postgresql and it works perfectly.
>
> Can't this part be automated by a tool like mysqldumper? Checking
> if an array is serialized and, if so, de-serializing it?
> I guess there will be a need for such a tool, as in the next year
> lots of people and sites will want or have to convert to utf8.

I don't know what mysqldumper is. If it's this thing: http://www.mysqldumper.de/en/
then it can be baked into that, I guess.
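
The per-field procedure quoted above (detect a serialized value, take it
apart, write it back for the new encoding) could look roughly like the
Python sketch below. It is only an illustration under assumptions: the
function names transcode_serialized and convert_field are made up for this
example, only PHP's N, b, i, d, s and a types are handled (objects are
not), and reading from the old database and writing to the new one is
left out entirely.

```python
def _convert(data: bytes, pos: int, src: str, dst: str):
    """Re-emit one serialized value starting at pos; return (bytes, next_pos)."""
    tag = data[pos:pos + 1]
    if tag == b"N":                      # null: N;
        if data[pos:pos + 2] != b"N;":
            raise ValueError("malformed null")
        return b"N;", pos + 2
    if tag in (b"i", b"b", b"d"):        # scalars: copied through unchanged
        end = data.index(b";", pos)
        token = data[pos:end + 1]
        token.decode("ascii")            # raises ValueError if not plain ASCII
        if token[1:2] != b":":
            raise ValueError("malformed scalar")
        return token, end + 1
    if tag == b"s":                      # string: s:<byte length>:"<bytes>";
        colon = data.index(b":", pos + 2)
        length = int(data[pos + 2:colon])
        start = colon + 2                # skip the colon and opening quote
        raw = data[start:start + length]
        if data[start + length:start + length + 2] != b'";':
            raise ValueError("string length does not match payload")
        new = raw.decode(src).encode(dst)
        # The length prefix is recomputed from the re-encoded payload.
        return b's:%d:"%s";' % (len(new), new), start + length + 2
    if tag == b"a":                      # array: a:<count>:{<key><value>...}
        colon = data.index(b":", pos + 2)
        count = int(data[pos + 2:colon])
        pos = colon + 2                  # skip the colon and opening brace
        parts = []
        for _ in range(count * 2):       # each entry is a key plus a value
            part, pos = _convert(data, pos, src, dst)
            parts.append(part)
        if data[pos:pos + 1] != b"}":
            raise ValueError("array not closed")
        return b"a:%d:{%s}" % (count, b"".join(parts)), pos + 1
    raise ValueError("not a supported serialized value")


def transcode_serialized(data: bytes, src: str = "latin-1", dst: str = "utf-8") -> bytes:
    out, pos = _convert(data, 0, src, dst)
    if pos != len(data):
        raise ValueError("trailing data after serialized value")
    return out


def convert_field(data: bytes, src: str = "latin-1", dst: str = "utf-8") -> bytes:
    """Treat the field as serialized if it parses, otherwise as plain text."""
    try:
        return transcode_serialized(data, src, dst)
    except ValueError:
        return data.decode(src).encode(dst)


old = 'a:1:{s:4:"name";s:6:"Müller";}'.encode("latin-1")
new = convert_field(old)
```

The parser walks the value by its declared byte lengths rather than
searching for quotes, because a string payload may itself contain `";`.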

>
>
> During the conversion from latin-x to utf-8 there is a chance that
> single-byte characters get converted to characters of 2 or more
> bytes, so that the length indicator of a serialized object is wrong.
>
> For sure it will be wrong, but don't you think exporting the
> database as binary would solve these problems, since binary data
> won't get converted? Or am I wrong?

I don't see how you can export any database as binary...
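
To make the length problem from the quoted paragraph concrete, here is a
small Python illustration. It assumes PHP's serialize() string format,
s:<length in bytes>:"<value>";, and the helper php_serialize_str is a
hypothetical stand-in written for this example, not a real library call.

```python
def php_serialize_str(value: str, encoding: str) -> bytes:
    """Mimic PHP's serialize() for one string: the prefix counts bytes."""
    raw = value.encode(encoding)
    return b's:%d:"%s";' % (len(raw), raw)


# Stored in a latin-1 database: "ü" is one byte, so the prefix says 6.
stored = php_serialize_str("Müller", "latin-1")

# A plain charset conversion of the whole field re-encodes the payload
# but leaves the length prefix alone.
converted = stored.decode("latin-1").encode("utf-8")

declared = int(converted.split(b":")[1])
payload = converted[converted.index(b'"') + 1:converted.rindex(b'"')]
actual = len(payload)
```

After the conversion the prefix still declares 6 bytes while the payload
is 7 bytes long, which is the mismatch that makes de-serializing fail.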

>
>
> The idea came up after reading these articles here:
> http://www.mysqlperformanceblog.com/2007/12/18/fixing-column-encoding-mess-in-mysql/

Well, again the problem is with serialized arrays...

>
>
> In WordPress they now have a plugin doing the conversion and it seems
> to work fine
> http://wordpress.org/support/topic/117955
> http://g30rg3x.com/utf8-database-converter/

I don't know WordPress, but probably they didn't store any data as
serialized arrays in the database.
Then the conversion is simple.

Ries

>
>
>
> Andi
>

--
Ries van Twisk
Freelance TYPO3 Developer
email: ries at vantwisk.nl
web:   http://www.rvantwisk.nl/
skype: callto://r.vantwisk
Phone: + 1 810-476-4193








