[TYPO3-english] TYPO3 and character encoding problems
Pero Peric
pperic at mail.com
Thu Jul 10 11:54:35 CEST 2014
On 10.7.2014. 11:10, Jigal van Hemert wrote:
> Hi,
>
> On 9-7-2014 17:08, Pero Peric wrote:
> >> The UTF-8 encoding of character Č is C48C (hex). Because
> >> character_set_client was set to latin1, MySQL read these bytes not as
> >> one UTF-8 character but as two latin1 characters. Because the DB, table
> >> and fields were set to UTF-8, MySQL then converted them: C4 (Ä in
> >> latin1) became C384 (Ä in UTF-8) and 8C (a control character in latin1)
> >> became C28C (presumably also a control character in UTF-8).
>
> Interesting. This is a variation of what my script tests for. The most
> common problem is that the table is latin1 and utf-8 data is stored in
> it. Character Č is then displayed as Ä + control code if you look in the
> database.
>
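To make the byte arithmetic above concrete, here is a minimal Python sketch of the double encoding. It assumes that Python's latin-1 codec (ISO-8859-1) behaves like the latin1 described in this thread; MySQL's own latin1 may map some bytes in the 80-9F range differently, so this only illustrates the mechanism and is not a reproduction of what the server does.

```python
# Sketch of the double encoding described above.
# Assumption: "latin-1" here is Python's ISO-8859-1 codec, not MySQL itself.

original = "\u010c"                         # the character Č

# The client sends UTF-8 bytes ...
sent_bytes = original.encode("utf-8")       # C4 8C

# ... but the connection is declared latin1, so each byte is read
# as one latin1 character: Ä followed by control character U+008C.
misread = sent_bytes.decode("latin-1")

# The column is utf8, so those two characters are converted to UTF-8.
stored_bytes = misread.encode("utf-8")      # C3 84 C2 8C

print(sent_bytes.hex())     # c48c
print(stored_bytes.hex())   # c384c28c
```

The two bytes of one character end up as four bytes of two characters, which is the "Ä + control code" that shows up in the database.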
> What could be done is:
> - first convert table and fields to latin1. C384 C28C => C48C
Can you give me a hint on how to do this conversion? Something like mysqldump
--default-character-set=latin1, or do you have something else in mind?
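As far as I understand it, at the byte level this first step should amount to the following. This is only a Python sketch of the idea, assuming Python's latin-1 codec stands in for MySQL's latin1 (which may differ for some bytes); it does not touch the database.

```python
# Sketch of step one: convert the stored utf8 data to latin1.
# Assumption: Python's "latin-1" codec stands in for MySQL's latin1.

stored = bytes.fromhex("c384c28c")            # what is in the utf8 column now

as_characters = stored.decode("utf-8")        # two characters: Ä + U+008C
converted = as_characters.encode("latin-1")   # one byte per character

print(converted.hex())                        # c48c

# The resulting bytes are exactly the UTF-8 encoding of Č (U+010C).
print(converted.decode("utf-8") == "\u010c")  # True
```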
> - second convert fields to binary data type and convert that to utf-8
Hm, this could be painful. I suppose I need a script for this.
> The last step does two sneaky things. First MySQL will use the binary
> content and mark the column as a binary field. The second part lets
> MySQL interpret the binary data as utf-8 (without converting anything in
> the actual data).
>
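If I read the two steps correctly, a script for them could start from something like the sketch below, which only builds the ALTER TABLE statements and prints them. The helper name conversion_statements is made up for illustration, and the column types VARCHAR(255) and VARBINARY(255) for pages.title are assumptions; the real column definition (NOT NULL, DEFAULT and so on) would have to be repeated in each MODIFY, and this should be tried on a copy of the database first.

```python
# Hypothetical helper: builds the statements for the two-step repair.
# It does not connect to MySQL; table, column and types are assumptions.

def conversion_statements(table, column, text_type, binary_type):
    """Return the ALTER TABLE statements for one text column."""
    return [
        # Step 1: real conversion utf8 -> latin1 (C384 C28C becomes C48C).
        f"ALTER TABLE `{table}` MODIFY `{column}` {text_type} CHARACTER SET latin1;",
        # Step 2a: switch to a binary type; the bytes are kept as they are.
        f"ALTER TABLE `{table}` MODIFY `{column}` {binary_type};",
        # Step 2b: label the same bytes as utf8, again without converting.
        f"ALTER TABLE `{table}` MODIFY `{column}` {text_type} CHARACTER SET utf8;",
    ]

statements = conversion_statements("pages", "title", "VARCHAR(255)", "VARBINARY(255)")
for statement in statements:
    print(statement)
```

Going through the binary type is what keeps MySQL from converting the data a second time on the way back to utf8.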
> The simpler version of my converter could be changed to make both
> changes. I'll have a look.
By the way, I found a PHP script that backs up the database in pure PHP. I
thought it could work because it would be the reverse process.
Unfortunately it didn't. I then went further and wrote a small PHP script
to see how data from the title field of the TYPO3 pages table is displayed
in the browser. I expected to get the right characters on the screen,
because TYPO3 4.4.0 displays them correctly, but I was wrong. I tried
with set names utf-8, without it, and with latin1, but nothing worked: only
garbage characters are displayed. How the hell does TYPO3 display this correctly?
Regards.