[Typo3-dev] Utf8 problems

Ludwik Górski ludwik at iceage.pl
Thu Mar 25 10:21:25 CET 2004


> We know about this problem and did so all the way. We just didn't
> solve it since it mostly concerns visual representation in the browser
> and not the data integrity. We still intend to solve it but we have to
> do it right or not at all in my opinion and therefore it is subject to
> priorities of everything.

OK, I see. I can help with, for example, writing functions for
cropping, searching, counting chars, etc. Please tell me if that's needed.
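
To show what I mean by multi-byte-safe cropping and counting, here is a
rough sketch. It is written in Python only for illustration, it is not
TYPO3 code, and the function names utf8_char_count and utf8_crop are
made up for this example. It assumes the input is valid UTF-8 and relies
on the fact that continuation bytes always have the bit pattern 10xxxxxx.

```python
def utf8_char_count(data: bytes) -> int:
    """Count characters in UTF-8 bytes by skipping continuation bytes (10xxxxxx)."""
    return sum(1 for b in data if (b & 0xC0) != 0x80)


def utf8_crop(data: bytes, max_chars: int) -> bytes:
    """Keep at most max_chars characters, never cutting inside a multi-byte sequence."""
    seen = 0
    for i, b in enumerate(data):
        if (b & 0xC0) != 0x80:  # this byte starts a new character
            if seen == max_chars:
                return data[:i]
            seen += 1
    return data


name = "Górski".encode("utf-8")
print(len(name), utf8_char_count(name))    # 7 bytes, 6 characters
print(utf8_crop(name, 2).decode("utf-8"))  # Gó
```

The same byte test should carry over to a language with byte-oriented
strings, since it only looks at the top two bits of each byte.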
 
> BTW; the far greater problem is that content stored in the database
> might be corrupted. For instance if a 100 char string (in utf-8, say it
> will be 163 bytes) stored in a varchar(100) will... be cropped! I don't
> know if MySQL 4 has a solution for this or what. In any case TYPO3 will
> issue a warning if cropping occurs right after saving the content...

MySQL 4.1 supports utf8. You have to define the character set for the
database/table/column. The manual includes this note:

"Tip: To save space with UTF8, use VARCHAR instead of CHAR. Otherwise,
MySQL has to reserve 30 bytes for a CHAR(10) CHARACTER SET utf8 column,
because that's the maximum possible length."

On
http://www.mysql.com/documentation/mysql/bychapter/manual_Charset.html#Charset-Unicode

So strings will not be cropped, but we'll have to review some database
definitions (CHAR columns that could be VARCHAR) to avoid this wasted
space.
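
To make the numbers concrete, here is a small sketch of the arithmetic,
in Python for illustration only. The sample string is invented so that
it matches the 100 chars / 163 bytes case from the quote, and the
3-bytes-per-character limit is taken from the manual tip above. The
helper names are hypothetical.

```python
MAX_BYTES_PER_CHAR = 3  # per the manual tip: CHAR(10) utf8 reserves 30 bytes


def char_column_reserved_bytes(n_chars: int) -> int:
    """Bytes a fixed-width CHAR(n) utf8 column has to reserve."""
    return n_chars * MAX_BYTES_PER_CHAR


def crops_cleanly(data: bytes, limit: int) -> bool:
    """True if cutting data at `limit` bytes still leaves valid UTF-8."""
    try:
        data[:limit].decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# 37 one-byte chars + 63 two-byte chars = 100 chars, 163 bytes
text = "a" * 37 + "ó" * 63
encoded = text.encode("utf-8")

print(len(text), len(encoded))         # 100 163
print(crops_cleanly(encoded, 100))     # False: the cut lands inside a character
print(char_column_reserved_bytes(10))  # 30
```

A column whose length is counted in bytes cuts this string in the
middle of a character, while a column whose length is counted in
characters keeps all 100 of them.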

Ludwik






