[TYPO3-core] RFC: Bug #5088: cache is not saved properly because of charset conflict in the database

Michael Stucki michael at typo3.org
Mon Apr 2 13:59:09 CEST 2007


Hi Martin,

> And if you choose Russian it will be something else. I'm aware that a
> multi-charset installation is a problem as the http connection will use
> different charsets, but the db connection will always use the same (set
> per my.cnf or SET NAMES).

Yes, so in this case the database connection uses UTF-8, but the data being
sent comes from a Latin1 browser. This is no problem when saving this data,
because sys_template.config (the "setup" field) is defined as blob and is
therefore stored without any charset conversion.

Changing this field to "text" without using a migration tool would probably
strip some important(?) data from the template, so I don't think it is the
preferred solution, although it should also fix the problem.

> In this case the data of cache_hash will indeed be in different
> charsets. So it seems to me that this data is in fact binary, as it is
> not stored plainly but as a serialized array.

sys_template.config is not a serialized array, but it is still defined as blob.
The main difference is that cache_hash.content is mediumtext, so its content
will be converted regardless of whether the data is serialized or not.
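
To illustrate the difference between the two column types, here is a rough
Python model. It is only a sketch and not how MySQL works internally: the
function names store_blob and store_text are made up for this example, the
sample value is invented, and replacing invalid bytes with "?" is an
assumption about the outcome (the server may truncate or warn instead).

```python
# Hypothetical model: a BLOB column keeps bytes as-is, while a text column
# is interpreted in the connection charset and converted to the column charset.

def store_blob(raw: bytes) -> bytes:
    """Binary column: the bytes are stored exactly as they were sent."""
    return raw


def store_text(raw: bytes, connection_charset: str, column_charset: str) -> bytes:
    """Text column: bytes are read as connection_charset, then re-encoded."""
    # Bytes that are invalid in the connection charset cannot be mapped;
    # this model replaces them, which loses the original characters.
    text = raw.decode(connection_charset, errors="replace")
    return text.encode(column_charset, errors="replace")


# Latin1 bytes as a Latin1 browser would send them (0xF6 = o-umlaut, 0xDF = sharp s).
browser_bytes = b"title = Gr\xf6\xdfe"

blob_value = store_blob(browser_bytes)
text_value = store_text(browser_bytes, "utf-8", "latin-1")

print("blob:", blob_value)
print("text:", text_value)
```

In this model the blob value is identical to what the browser sent, while the
text value has lost the two non-ASCII characters, because the Latin1 bytes are
not valid UTF-8.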

> You said they use latin1 for the http connection, but utf8 for the db
> connection. They may store the data in utf8, but the db connection must
> be in latin1. In this case Mysql will transparently convert the data.
>
> But after reading the posts again I don't know what happens. The reports
> vary (some say even they have problem with latin1 only running TYPO3
> 4.0?!?).

Yes, this also seems weird to me. I tend to think it's the user's fault...

> Anyway, I come to the conclusion that a serialized PHP array is to be
> seen as binary data and therefore to be stored in a BLOB column.

Again, see above. It has nothing to do with the serialization of the data; the
main point is that during save, the template setup was not converted because
it is treated as binary data.

But obviously we both agree that cache_hash.content can be changed back now.

> So not only cache_hash.content but also fields like
> cache_pages.cache_data and cache_pagesection.content or session data
> fields may be affected in some way.

It depends where the data comes from.

- michael

PS: Can you explain to me why $TYPO3_CONF_VARS[SYS][setDBinit] = "SET NAMES
utf8;" changes the connection charset to UTF8 (that's expected), but appending
"CHARACTER SET utf8" after it makes it remain Latin1?