[Typo3-dev] Japanese Backend

Martin T. Kutschker Martin.T.Kutschker at blackbox.net
Fri Mar 26 18:10:56 CET 2004


Kasper Skårhøj wrote:
> Hm, I probably didn't add it yet. will happen before final launch when I
> begin to merge languages again.
> 
> We still need to solve the shift-jis encoding - that has not been
> implemented yet. Maybe Martin Kutschker will do that for us???

I had a look at shift-jis (SJIS). It seems to belong, like gb2312, to 
the group of charsets that use ASCII (a single 7-bit byte) for Latin 
characters and a two-byte sequence for any other character (at least the 
8th bit of the first byte is set to distinguish it from ASCII).

So the conversion should work as-is for shift-jis, given we have access 
to a Unicode mapping. I have not run a real conversion test yet, though.

jis (ISO-2022-JP) is different, as it uses special escape sequences to 
shift into and out of "multi-byte" mode. We could do conversions, but 
string functions on such data are no fun. So I argue against supporting 
it for the BE. Conversion OTOH seems to be necessary for mail, as it is 
the de facto standard for SMTP and NNTP. This would require some rework 
of the current conversion code.
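
To make the "shift in and out" behaviour concrete, here is a small Python sketch (not TYPO3 code; it relies on Python's own iso2022_jp codec, and the variable names are only illustrative). It shows that the encoded bytes are all 7-bit and that the kanji are wrapped in escape sequences:

```python
ESC_TO_KANJI = b"\x1b$B"   # ESC $ B: switch to two-byte JIS X 0208
ESC_TO_ASCII = b"\x1b(B"   # ESC ( B: switch back to ASCII

kanji_only = "日本語".encode("iso2022_jp")
mixed = "TYPO3 日本語".encode("iso2022_jp")

# The kanji run is bracketed by the two escape sequences.
print(kanji_only.startswith(ESC_TO_KANJI), kanji_only.endswith(ESC_TO_ASCII))

# Every byte is 7-bit, which is why the charset suits mail transport.
print(all(b < 0x80 for b in mixed))

# The meaning of a byte depends on the current mode, so a string
# function cannot look at a single byte in isolation.
print(mixed.index(ESC_TO_KANJI))
```

Because the interpretation of each byte depends on the escape sequence that came before it, cutting or searching such a string needs a stateful scan from the start.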

What's needed to prevent data corruption are charset-aware string 
truncation functions (cut at a specific byte length without splitting a 
multi-byte character). For output we need the crop functions (cut at a 
given number of characters). Other output functions may be necessary, 
but I don't know what "transformations" the BE uses (e.g. case 
conversion).
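
The difference between the two kinds of functions can be sketched as follows. This is Python, not the PHP the BE would need, and truncate_bytes and crop_chars are hypothetical names; the sketch leans on Python's incremental decoders instead of hand-written per-charset rules:

```python
import codecs


def truncate_bytes(data: bytes, max_bytes: int, charset: str) -> bytes:
    """Cut to at most max_bytes without splitting a multi-byte character."""
    if len(data) <= max_bytes:
        return data
    decoder = codecs.getincrementaldecoder(charset)()
    # final=False keeps an incomplete trailing sequence buffered
    # instead of returning it, so only whole characters come back.
    text = decoder.decode(data[:max_bytes], final=False)
    return text.encode(charset)


def crop_chars(data: bytes, max_chars: int, charset: str) -> bytes:
    """Cut to at most max_chars characters, whatever their byte length."""
    return data.decode(charset)[:max_chars].encode(charset)


sjis = "TYPO3 日本語".encode("shift_jis")

# A plain byte cut at 7 would keep half of the first kanji.
print(truncate_bytes(sjis, 7, "shift_jis"))

# A crop at 7 characters keeps the whole first kanji (8 bytes).
print(len(crop_chars(sjis, 7, "shift_jis")))
```

The truncation variant is the one that matters when a value has to fit a database field with a fixed byte size; the crop variant is for display.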

We also need testers. Oliver, I guess Kasper is more ready to include 
"foreign" code if there are reports that it is actually working. I'd 
be happy to get some peer review for the code. So anybody willing to do 
some testing, just drop me a line by mail.

Masi




