[TYPO3-core] RFC: Bug #5088: cache is not saved properly because of charset conflict in the database

Martin Kutschker Martin.Kutschker at n0spam-blackbox.net
Fri Mar 30 12:45:17 CEST 2007


Michael Stucki wrote:
> Hi Martin,
> 
> 
>>>How to reproduce:
>>>| Here is a list of prerequisites:
>>>| - The backend must run in latin1 (no forceCharset=utf8 set!)
>>>| - setDBinit=SET NAMES utf8;
>>>| - cache_hash.content = utf8_general_ci
>>>| - Your template contains two page types (typeNum)
>>>| - Your template contains special characters
>>>|
>>>| Changing a single one of these requirements makes the bug go away.
>>
>>But with SET NAMES utf8 you set the mysql client encoding to UTF8.
> 
> 
> While discussing this with people who reported the bug, we found out that they 
> had been using ISO-8859-1 once, switched to UTF-8, but did not convert the 
> sys_templates table. So the database is storing UTF-8 data, but the content 
> (charsets of sys_template contents are never converted) was still 
> ISO-8859-1...
> 
>>Mysql client encoding and BE encoding must match.
> 
> That's the point. Since it doesn't match, we should change the field back to 
> mediumblob.

But that's a stupid setup. MySQL (and other DBs) will translate the content 
from the server charset to the client charset. That's fine and good and 
will work for sane setups.

The described setup sends latin1 data into the DB declared as utf8, and the 
DB stores it as utf8 (at least for cache_hash.content). This is nonsense.
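
To illustrate the mismatch outside the database, here is a rough Python 
sketch. It assumes the server checks incoming bytes against the declared 
client charset and cuts the value at the first invalid byte, which is how I 
understand MySQL to behave in non-strict mode. The function name 
store_as_utf8 and the sample string are made up for this example.

```python
def store_as_utf8(raw: bytes) -> bytes:
    """Hypothetical model of a utf8 TEXT column.

    Assumption: the value is kept only up to the first byte sequence
    that is not valid UTF-8.
    """
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError as err:
        # err.start is the offset of the first invalid byte
        return raw[:err.start]


template = "Grüße aus Zürich"

# Broken setup: BE produces latin1, connection claims utf8
latin1_bytes = template.encode("latin-1")
stored_broken = store_as_utf8(latin1_bytes)

# Sane setup: BE produces utf8, connection claims utf8
utf8_bytes = template.encode("utf-8")
stored_sane = store_as_utf8(utf8_bytes)

print(stored_broken)              # value is cut before the first umlaut
print(stored_sane == utf8_bytes)  # value survives unchanged
```

In this model the latin1 value loses everything from the first special 
character onwards, while the utf8 value is stored completely.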

What I see here is that you must follow one rule: if you change the charset 
somewhere - clear ALL caches! And read the docs before you fiddle with SET 
NAMES.

But I agree that we should take more care when we decide whether to store 
data as TEXT (charset dependent) or as BLOB (charset independent). I have to 
admit I was part of the let's-get-rid-of-the-BLOB crowd, but I think we 
should think more before simply reverting those changes.
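
The difference between the two column types can be sketched like this. It 
is a simplified Python model, not MySQL code: read_text and read_blob are 
hypothetical helpers, and the assumption is that a TEXT value is converted 
from the column charset to the client charset on the way out (with "?" for 
characters the client charset cannot represent), while a BLOB is handed 
over byte for byte.

```python
def read_text(column_bytes: bytes, column_charset: str, client_charset: str) -> bytes:
    """Hypothetical TEXT read: transcode from column charset to client charset."""
    text = column_bytes.decode(column_charset)
    # Characters missing from the client charset become "?"
    return text.encode(client_charset, errors="replace")


def read_blob(column_bytes: bytes) -> bytes:
    """Hypothetical BLOB read: bytes are returned untouched."""
    return column_bytes


payload = "Preis: 5 €".encode("utf-8")

as_text = read_text(payload, "utf-8", "latin-1")
as_blob = read_blob(payload)

print(as_text)             # the euro sign does not exist in latin1
print(as_blob == payload)  # BLOB content is independent of any charset
```

So TEXT is the right choice when the content really is text in a known 
charset, and BLOB when the application wants back exactly the bytes it 
wrote.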

The case above is no reason for me to change anything, because the setup is 
broken. Of course we need to check what the impacts are for correct setups.

Masi

