[Typo3-dev] help needed: charset tables (Asian languages)

Martin T. Kutschker Martin.T.Kutschker at blackbox.net
Wed Mar 31 09:44:42 CEST 2004


Kasper Skårhøj wrote:
> Hi Martin.
> 
> 
> In fact, the way I got hold of the gb2312 charset table was ... by
> requesting a URL at Microsoft 256 times with a PHP script and parsing the
> content... :-)

I see *sigh*

Anyway, I got hold of some tables. I'll use the ones Microsoft 
provides if available, otherwise what I've found elsewhere.

The trouble is that a) the Asian standards have all been updated at least 
twice (without being renamed) and b) there are no official mappings to 
Unicode. The mapping is done independently by the vendors (so different 
vendors => different mappings).
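
To see what "different vendors => different mappings" means in practice, here is a minimal Python sketch (not TYPO3 code) that decodes the same byte sequences with two tables and lists where they disagree. The function names are made up for this illustration, and the choice of Python's "shift_jis" and "cp932" codecs as stand-ins for a "standard" and a "Microsoft" table is an assumption.

```python
def double_byte_range(leads, trails):
    """Yield every two-byte sequence for the given lead/trail byte ranges."""
    return (bytes([lead, trail]) for lead in leads for trail in trails)


def mapping_differences(codec_a, codec_b, byte_sequences):
    """Return (raw, text_a, text_b) for sequences both codecs decode differently."""
    diffs = []
    for raw in byte_sequences:
        try:
            text_a = raw.decode(codec_a)
            text_b = raw.decode(codec_b)
        except UnicodeDecodeError:
            continue  # undefined in at least one table, nothing to compare
        if text_a != text_b:
            diffs.append((raw, text_a, text_b))
    return diffs


def code_points(text):
    """Format a string as U+XXXX code points for printing."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


if __name__ == "__main__":
    # Assumed ranges: a slice of the Shift_JIS double-byte area.
    pairs = double_byte_range(range(0x81, 0xA0), range(0x40, 0xFD))
    for raw, a, b in mapping_differences("shift_jis", "cp932", pairs):
        print(raw.hex(), code_points(a), "vs", code_points(b))
```

Whatever this prints depends on the tables shipped with the Python build at hand, so it is a way to inspect the disagreement, not a statement about which mapping is right.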

And what's more, it seems that you cannot round-trip these charsets through 
Unicode (there is a Microsoft issue article which mentions about 30 
characters that do not round-trip by design).
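
A round-trip check of this kind can be sketched in a few lines of Python (an illustration only; the function name is invented here, and "cp932" in the usage part is just an assumed example of a vendor table). A byte sequence fails if decoding it to Unicode and encoding it back does not give the original bytes, which is what happens when a table maps two byte sequences to the same Unicode character.

```python
def round_trip_failures(codec, byte_sequences):
    """Return (raw, text, back) for sequences that do not survive bytes -> Unicode -> bytes."""
    failures = []
    for raw in byte_sequences:
        try:
            text = raw.decode(codec)
        except UnicodeDecodeError:
            continue  # not a defined character in this table
        try:
            back = text.encode(codec)
        except UnicodeEncodeError:
            back = None  # decodable but not encodable at all
        if back != raw:
            failures.append((raw, text, back))
    return failures


if __name__ == "__main__":
    # Assumed ranges for a Shift_JIS-style double-byte area.
    pairs = (bytes([lead, trail])
             for lead in range(0x81, 0xFD)
             for trail in range(0x40, 0xFD))
    for raw, text, back in round_trip_failures("cp932", pairs):
        shown = back.hex() if back is not None else "unencodable"
        print(raw.hex(), "->", f"U+{ord(text[0]):04X}", "->", shown)
```

The count it reports need not match the figure in the Microsoft article, since the tables in a given Python build may differ from the ones the article describes.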

I guess it will work out all right for most cases, apart from some "edge cases".

BTW, iso-2022-jp (JIS) is no fun. Conversion from JIS to Unicode is more 
or less easy, but Unicode to JIS requires some work. I'll postpone it 
for now *). Anyone speaking Japanese who objects?
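
One reason the Unicode-to-JIS direction takes more work is presumably that iso-2022-jp is stateful: the encoder has to emit escape sequences whenever it switches between ASCII and JIS X 0208. The Python sketch below illustrates only that bookkeeping. It is not a proposed TYPO3 implementation: the function name is invented, it borrows Python's "euc_jp" codec as the reverse lookup table (an assumption made to keep the example short), and it ignores JIS X 0201 and the 1978 character set.

```python
ESC_ASCII = b"\x1b(B"     # escape sequence: switch to ASCII
ESC_JISX0208 = b"\x1b$B"  # escape sequence: switch to JIS X 0208-1983


def encode_iso2022jp(text):
    """Encode text as iso-2022-jp, tracking the current character set by hand."""
    out = bytearray()
    in_kanji = False
    for ch in text:
        if ord(ch) < 0x80:
            if in_kanji:  # leave double-byte mode before writing ASCII
                out += ESC_ASCII
                in_kanji = False
            out.append(ord(ch))
            continue
        # EUC-JP stores a JIS X 0208 character as its two JIS bytes + 0x80.
        euc = ch.encode("euc_jp")  # raises UnicodeEncodeError if unmappable
        if len(euc) != 2 or euc[0] < 0xA1:
            raise ValueError(f"U+{ord(ch):04X} is not in JIS X 0208")
        if not in_kanji:
            out += ESC_JISX0208
            in_kanji = True
        out += bytes(b & 0x7F for b in euc)
    if in_kanji:  # the text must end in ASCII mode
        out += ESC_ASCII
    return bytes(out)


if __name__ == "__main__":
    data = encode_iso2022jp("TYPO3 \u65e5\u672c\u8a9e")
    print(data)
    print(data.decode("iso2022_jp"))
```

The decoding direction needs the same state, but there the escape sequences are already in the input and only have to be followed.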

Masi

*) That is for "native" Typo3 support; mbstring, recode and iconv handle 
these charsets.




