[TYPO3-50-general] UTF-16

Martin Kutschker Martin.Kutschker at n0spam-blackbox.net
Thu Nov 9 09:26:39 CET 2006


Robert Lemke wrote:
> 
>> IIRC UTF-16 uses two bytes for each char, which raises the problem of 
>> endianness. Is there any preference for this?
> 
> No, not that I know of. As far as I could see, all the examples given 
> for PHP6 were Little Endian, but we might ask for what endianness makes 
> most sense.

No problem if we declare that UTF-16 LE is THE charset. Any UTF-16 BE input 
then has to be converted like any other charset.
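
To make the endianness point concrete, here is a minimal sketch in Python 
(not PHP6, purely for illustration; the helper name to_utf16le is made up). 
It shows how the same character is serialized in both byte orders, and how 
UTF-16 BE input would be normalized to LE exactly like any other charset:

```python
def to_utf16le(data: bytes, source_charset: str) -> bytes:
    """Hypothetical helper: decode bytes from any source charset and
    re-encode them as UTF-16 LE without a byte order mark."""
    return data.decode(source_charset).encode("utf-16-le")


text = "\u00e4"  # LATIN SMALL LETTER A WITH DIAERESIS (U+00E4)

le = text.encode("utf-16-le")  # low byte first:  E4 00
be = text.encode("utf-16-be")  # high byte first: 00 E4

# UTF-16 BE is treated as just another source charset ...
from_be = to_utf16le(be, "utf-16-be")
# ... in the same way as latin1 would be.
from_latin1 = to_utf16le(text.encode("latin-1"), "latin-1")
```

Both from_be and from_latin1 end up as the same two bytes as le, so code 
behind the conversion step only ever sees one byte order.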

> The Unicode support of PHP6 is based on the International Components for 
> Unicode (ICU), which is an IBM project [2].

I know ICU and I think it is a very good decision to use it.

>>> Use PHP6?
>>
>>
>> Given that TYPO3 5 is to me still "only" a vision, it makes sense. Zend 
>> will probably deliver a stable PHP6 faster than the TYPO3 community 
>> can rewrite TYPO3.
> 
> Absolutely. Given the fact that they plan the release for next spring or 
> so, PHP6 will be stable enough when TYPO3 5.0 comes out.

Maybe it will be TYPO3 6.0 by then :-)

>> The same question comes up when we talk about West European sites. Do I 
>> really want to store UTF-16 in my DB? Maybe TYPO3 doesn't need to 
>> handle this. At least on MySQL I can have different charsets for 
>> client and server. So MySQL could transparently deliver UTF-16 but 
>> store in UTF-8.
> 
> As we were told, these conversions can really hit performance, so why not 
> avoid them? I don't see any drawback in storing the data as UTF-16 in 
> the database or XML files. Or is space really an issue?

It might be for some users. But in fact we can only make suggestions. If a 
user creates his DB in latin1 there is little TYPO3 can do for him.
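
For a rough feel of the space question, a small Python sketch (illustrative 
only; the function name encoded_sizes and the sample word are assumptions) 
that compares the byte length of a typical West European string in the 
three encodings discussed here:

```python
def encoded_sizes(text: str) -> dict:
    """Hypothetical helper: byte length of text in each encoding."""
    return {
        "latin1": len(text.encode("latin-1")),
        "utf8": len(text.encode("utf-8")),
        "utf16": len(text.encode("utf-16-le")),  # no byte order mark
    }


sample = "Gr\u00f6\u00dfe"  # German "Groesse": 5 characters, 2 of them non-ASCII
sizes = encoded_sizes(sample)
# latin1: 1 byte per character
# utf8:   1 byte per ASCII character, 2 for each of the two non-ASCII ones
# utf16:  2 bytes per character here
```

For text that is mostly ASCII, UTF-16 comes out at roughly twice the size 
of latin1 or UTF-8, which is the overhead a space-conscious user would be 
weighing.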

Masi