[TYPO3-dev] RFC: Unicode with preg_replace
David Bruchmann
typo3-dev at bruchmann-web.de
Tue Mar 23 21:08:03 CET 2010
From: Peter Russ <peter.russ at 4many.net>
Sent: Tuesday, 23 March 2010 20:45:29
> --- Original Message ---
> From: Martin Kutschker
> Date: 23.03.2010 20:33:
>> David Bruchmann wrote:
>>> And utf-8 is perfect for western languages but never for eastern ones.
>>
>> Because it uses three bytes for each glyph, or for "political"
>> reasons? I have the impression that all the major character sets
>> have been included completely in the Unicode character set.
>>
>> Masi
>
> And it depends on the definition of "western".
> @David: what is your approach for a site that has to support Chinese,
> Japanese, Russian and German? OK, it may work in the BE and FE with
> character set switches. BUT what is the character set for the DB?
> Latin 2 bytes?
>
I have a page with multi-language support, including many different
Asian languages, and I use only UTF-8 (variable length, one byte by
default) because I grab my content from Microsoft.
But I have read many documents about charsets and know that a solution
like that may display some characters wrongly even if the text stays
readable.
Furthermore, what I think about any solution is not important, because
I can neither speak nor write any Asian language. More important is to
know that UTF-8 isn't accepted by everyone, and until there is, perhaps
some day, a really global charset, we have to live with different ones.
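As a rough illustration of the byte lengths discussed in this thread (one
byte by default, three bytes for many Asian glyphs), here is a minimal
Python sketch. The sample characters and the helper name utf8_len are
assumptions chosen for illustration; they are not taken from the thread
or from TYPO3.

```python
# Sample characters are illustrative picks, not taken from the thread.
SAMPLES = {
    "Latin a": "a",
    "German ä": "ä",
    "Russian Я": "Я",
    "Chinese 中": "中",
    "Japanese あ": "あ",
}


def utf8_len(ch: str) -> int:
    """Return the number of bytes the string occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


# Print the encoded length of each sample character.
for label, ch in SAMPLES.items():
    print(f"{label}: {utf8_len(ch)} byte(s)")
```

With these samples, ASCII takes one byte, the German umlaut and the
Cyrillic letter take two, and the Chinese and Japanese characters take
three, which seems to be the "three bytes for each glyph" cost mentioned
in the quoted message.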
By the way: just to display some African languages you have to
download extra fonts, where charset and font are nearly the same thing
because fonts for those languages are rare. I haven't verified how
characters are defined in those charsets, but it shows again that UTF-8
can't meet all requirements.
I used the word "western" only as shorthand, in contrast to "eastern".
It isn't meant to define anything apart from separating languages with
different charsets and requirements. Concerning Hebrew or Arabic, I
have never read about any problems, and I think they are accepted in
UTF-8.
David