[TYPO3-dev] RFC: Unicode with preg_replace
masi-no at spam-typo3.org
Wed Mar 24 15:13:32 CET 2010
David Bruchmann wrote:
> From: Martin Kutschker <masi-no at spam-typo3.org>
> Sent: Wednesday, 24 March 2010 09:37:45
>> David Bruchmann wrote:
>>> More important is
>>> to know that utf-8 isn't accepted by all people and until there is
>>> perhaps sometime a really global charset we've to live with different
>> Accepted means the users make a deliberate choice to shun content
>> presented in utf-8 (as "foreign")
>> or they use software that cannot deal with utf-8 (which I think is
>> highly improbable).
> Can you imagine that some people (agencies, admins, developers) simply
> don't think they need Unicode, just as you think you never need ISO charsets?
I have never stated that I don't need the ISO charsets. In fact I oppose the utf8-as-only-supported
charset faction! I just wanted to know what you mean by "accepted". Did you use it in a technical
sense (missing software support) or as an expression of user preference?
What I do say is that IMHO Unicode (and therefore utf-8) is perfectly suited for CJK. If it isn't
used, then it must be for historical reasons (e.g. a connection with legacy data), custom ("we have
used our charset for so long and we don't need utf-8") or technical restrictions (utf-8 uses more
disk storage for CJK text than the legacy double-byte charsets).
But I'm fine with that choice. Each site should use the encoding it thinks is the best choice for it.
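
To make the storage point concrete, here is a minimal sketch that compares how many bytes the same CJK text needs in utf-8 and in a legacy charset. The sample string, the helper name `encoded_sizes` and the choice of Shift_JIS as the legacy charset are assumptions made for illustration only; they do not come from this thread.

```python
# Illustrative only: compare the encoded size of the same text.
# The sample string and the choice of Shift_JIS are assumptions for this sketch.

def encoded_sizes(text, encodings=("utf-8", "shift_jis")):
    """Return the number of bytes `text` occupies in each encoding."""
    return {name: len(text.encode(name)) for name in encodings}

sample = "日本語"  # three kanji
print(encoded_sizes(sample))
```

For this sample utf-8 needs 9 bytes (three per character) and Shift_JIS needs 6 (two per character); for plain ASCII text both encodings need one byte per character.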
>>> By the way: Just for displaying some african languages you have to
>>> download extra fonts where charset and font is nearly the same because
>>> fonts for those languages are rare.
>> And so? This just shows that the fonts you currently use do not have all
>> of the possible Unicode
>> characters embedded.
> AFAIK they don't use the Unicode table. If my knowledge is
> correct, your interpretation is wrong.
As I don't know the fonts in question, I have to admit I cannot make any claims about them. As the
list of languages supported by Unicode is long and its goal is to support all scripts on earth, I
thought it probable that the encoding of the fonts is Unicode. But of course this could be otherwise
for various reasons.