[TYPO3-core] RFC: Bug #5088: cache is not saved properly because of charset conflict in the database

Martin Kutschker martin.kutschker-n0spam at no5pam-blackbox.net
Mon Apr 2 17:46:09 CEST 2007


Michael Stucki wrote:
> Hi Martin,
> 
>> Dmitry Dulepov wrote:
>>> Michael Stucki wrote:
>>>> PS: Can you explain me why $TYPO3_CONF_VARS[SYS][setDBinit] = "SET
>>>> NAMES utf8;" changes the connection charset to UTF8 (that's expected)
>>>> but adding "CHARACTER SET utf8" behind makes it remain as Latin1?
>>> Probably a bug... If I remember correctly, SET NAMES is equal to several
>>> "SET character_set_xxx=" directives.
>> SET NAMES x is the same as these three statements:
>>
>> SET character_set_client = x;
>> SET character_set_results = x;
>> SET character_set_connection = x;
>>
>> "CHARACTER SET utf8" means nothing at all and is no valid command.
> 
> This was a typo of course. I was using "SET ..." as it is also described
> here: http://wiki.typo3.org/index.php/UTF-8_support

So? This is a wiki and it might contain errors. And of course I can err too.

> I do not understand why this causes the connection to be reset to the
> server default, but I assume it could be one more problem adding trouble
> here...
> 
> I made a test for this on my console. MySQL is still using Latin1 by
> default, so UTF8 must be set for every connection:

Depends on the installation. I recall that the installer on Windows 
asked me if I wanted to use utf-8 as default. Anyway, the default charset 
for the server may be set with a parameter of mysqld or in my.ini.

> mstucki@debian:~$ echo "SET NAMES utf8; SET CHARACTER SET utf8; SHOW VARIABLES;" | mysql typo3_test | grep -i character_set_
> character_set_client        utf8
> character_set_connection    latin1 <== watch this!
> character_set_database      latin1
> character_set_filesystem    binary
> character_set_results       utf8
> character_set_server        latin1
> character_set_system        utf8
> 
> Weird, isn't it?

Yes, but it's documented in 
http://dev.mysql.com/doc/refman/4.1/en/set-option.html

"As of MySQL 4.1.1, SET CHARACTER SET sets three session system 
variables: character_set_client  and character_set_results are set to 
the given character set, and character_set_connection to the value of 
character_set_database. See Section 10.4, “Connection Character Sets and 
Collations”."

SET CHARACTER SET will always set character_set_connection to the value 
of character_set_database, which in your case is latin1. And using the 
same charset for db and connection does make sense, doesn't it?

What I suggest is that if you want to use utf8 you create the db with 
CREATE DATABASE db_name CHARACTER SET utf8. This will give you a 
character_set_database of utf8 for any connection to that DB.
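
To make the difference between the two statements concrete, here is a small sketch in Python. It is only a toy model of the three session variables as the manual describes them, not a real MySQL client; the class name Session and the helper run() are made up for illustration, and the latin1 starting values simply mirror the server default in the test above.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """Toy model of the MySQL session charset variables (hypothetical, not a real client)."""
    character_set_database: str  # charset of the default database
    character_set_client: str = "latin1"
    character_set_results: str = "latin1"
    character_set_connection: str = "latin1"

    def set_names(self, charset: str) -> None:
        # SET NAMES x: all three session variables become x.
        self.character_set_client = charset
        self.character_set_results = charset
        self.character_set_connection = charset

    def set_character_set(self, charset: str) -> None:
        # SET CHARACTER SET x: client and results become x, but the
        # connection charset is copied from character_set_database.
        self.character_set_client = charset
        self.character_set_results = charset
        self.character_set_connection = self.character_set_database


def run(database_charset: str) -> Session:
    """Replay "SET NAMES utf8; SET CHARACTER SET utf8;" against the model."""
    session = Session(character_set_database=database_charset)
    session.set_names("utf8")
    session.set_character_set("utf8")
    return session


if __name__ == "__main__":
    for db_charset in ("latin1", "utf8"):
        s = run(db_charset)
        print(db_charset, "->", s.character_set_connection)
```

With a latin1 database the model ends up with character_set_connection = latin1, as in the console output quoted above; with a utf8 database it stays utf8.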

Note that you can also change the connection charset in my.ini. In this 
case you don't have to use SET NAMES, which in turn should not be used 
together with SET CHARSET.

> Anyway, this is probably too deep in detail, so if you
> don't have the solution for this at hand, I suggest to stop this discussion
> and I will post it again in typo3.dev.

I have said more than my 2 cents. Changing it to a BLOB again won't be a 
problem on MySQL, but it should not be necessary.

Masi

