[TYPO3-english] TYPO3 and character encoding problems

Pero Peric pperic at mail.com
Mon Jul 7 12:01:12 CEST 2014


On 7.7.2014. 11:15, bernd wilke wrote:
> Am 07.07.14 10:50, schrieb Pero Peric:
>> Hi,
>>
>> I tried to upgrade my TYPO3 4.4.0 installation to 4.7.19 but ran into
>> character encoding problems (in 4.4.0 everything works fine). I would
>> really appreciate it if someone could explain to me what is going on here.
>>
>> In 4.4.0 I have:
>>
>> [BE][forceCharset] = utf-8
>>
>> [SYS][setDBinit] is empty; I don't have SET NAMES utf8 here.
>>
>> DB and all tables are created as UTF-8.
>>
>> My MySQL character encoding variables look like this:
>>
>> | character_set_client     | utf8                       |
>> | character_set_connection | utf8                       |
>> | character_set_database   | latin2                     |
>> | character_set_filesystem | binary                     |
>> | character_set_results    | utf8                       |
>> | character_set_server     | latin2                     |
>> | character_set_system     | utf8                       |
>> | character_sets_dir       | /usr/share/mysql/charsets/ |
>>
>> In 4.4.0 I get all characters displayed correctly.
>>
>> So this is what I have in the MySQL DB (I will use one character as an
>> example).
>>
>> The character Č is stored as hex C384, which is the UTF-8 encoding of the
>> character Ä. Now what I would really like to know: how the hell does
>> something that is the UTF-8 character Ä become the character Č in 4.4.0?
>> I suppose this has something to do with the forceCharset directive, so
>> does anybody know what this directive actually does?
>>
>> When I upgrade to 4.7.19 this character is displayed as Ä (correct, I
>> would say, going by its hex code, but not correct for me :-) ). If I
>> create a page called Č in 4.7.19, it is stored with the right UTF-8 hex
>> code for Č, which is C48C.
>>
>> So to summarize: in 4.4.0 the character Č is stored in the DB as hex C384
>> and displayed correctly on screen, while in 4.7.19 it is displayed as the
>> character Ä.
>>
>> If someone could explain to me what 4.4.0 is doing here, maybe I could
>> convert this properly for 4.7.19. Thank you!
>>
>
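
A note on the C384 puzzle above: one plausible explanation is double encoding. This is only a sketch under assumptions not confirmed in the thread: that without SET NAMES the TYPO3 connection used the server default (latin2, which Python calls iso8859-2), while TYPO3 itself sent and expected UTF-8 bytes.

```python
# Sketch of the suspected double encoding; the charsets are assumptions.
SERVER_CHARSET = "iso8859-2"  # Python's name for what MySQL calls latin2

sent = "Č".encode("utf-8")             # bytes TYPO3 sends: C4 8C
misread = sent.decode(SERVER_CHARSET)  # server reads each byte as one latin2 character
stored = misread.encode("utf-8")       # and converts those characters for the utf8 column

# Reading through the same connection reverses the conversion,
# so TYPO3 4.4.0 would get its original bytes back and show "Č".
read_back = stored.decode("utf-8").encode(SERVER_CHARSET)

print(sent.hex().upper())    # C48C
print(stored.hex().upper())  # C384C28C
print(read_back == sent)     # True
```

The first two stored bytes are C384, i.e. Ä, which matches what the original post reports. A connection that really talks utf8 skips the conversion on the way out and therefore shows Ä instead of Č.
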
> Let's look at the history:
> since 4.6 it has been necessary to have a clean database with everything
> encoded in utf-8.
> Up to 4.5 it was possible to have a latin encoding for individual fields
> but, with setDBinit and forceCharset, use utf-8 for real.
>
> The problem is that a single character can take up a different number of
> bytes, so you have to reserve additional space for 3-byte utf-8
> characters to fit properly into string fields.
>
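
To illustrate the size point: a minimal sketch that counts how many bytes a few sample characters need in UTF-8. The helper name `utf8_length` and the sample characters are made up for this illustration.

```python
def utf8_length(text: str) -> int:
    """Return the number of bytes `text` occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))

# "A" needs 1 byte, "Č" needs 2 and "€" needs 3,
# whereas a latin charset uses one byte per character.
for sample in ["A", "Č", "€"]:
    print(ascii(sample), utf8_length(sample))
```
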
> There were a lot of scripts for cleaning up your database down to the
> field level (e.g. [1]).
>
>
> As I state on that page (in German):
> afterwards it is preferable to have no entries for setDBinit and
> forceCharset at all, or just the default values, as empty values would
> break things.
>
> And be careful with new fields (added by new extensions): if your
> database or your tables are not set to utf-8 by default, the fields may
> be created as latin!
>
>
>
> [1] http://pi-phi.de/293.html
>
> bernd

Bernd, thank you for your help. I also found this thread:

http://www.typo3forum.net/forum/typo3-installation-updates/55842-umstellung-utf8-update-typo3-4-7-a.html 


so I will try those methods/scripts.
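
For reference, the kind of repair such a conversion has to perform on each affected value can be sketched as below. This is not taken from the linked scripts; `undo_double_encoding` is a hypothetical helper, and it assumes the wrong charset was latin2 (iso8859-2 in Python). A real migration would be done in the database itself, on a backup first.

```python
def undo_double_encoding(text: str, wrong_charset: str = "iso8859-2") -> str:
    """Reverse one round of 'UTF-8 bytes misread as wrong_charset'.

    Values that do not look double encoded are returned unchanged.
    """
    try:
        # Turn the misread characters back into the original bytes,
        # then interpret those bytes as the UTF-8 they always were.
        return text.encode(wrong_charset).decode("utf-8")
    except UnicodeError:
        # Not representable in wrong_charset, or not valid UTF-8 afterwards:
        # assume the value was stored correctly.
        return text

broken = "\u00c4\u008c"  # what a clean utf8 connection reads for a double-encoded "Č"
print(undo_double_encoding(broken) == "Č")  # True
print(undo_double_encoding("Č") == "Č")     # True, already correct and left alone
```

Falling back to the unchanged value matters when, as after a partial upgrade, old double-encoded rows and new correctly stored rows sit in the same table.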

Regards.

