[Typo3-dev] Japanese Backend
Martin T. Kutschker
Martin.T.Kutschker at blackbox.net
Fri Mar 26 18:10:56 CET 2004
Kasper Skårhøj wrote:
> Hm, I probably didn't add it yet. will happen before final launch when I
> begin to merge languages again.
>
> We still need to solve the shift-jis encoding - that has not been
> implemented yet. Maybe Martin Kutschker will do that for us???
I had a look at shift-jis (SJIS). Like gb2312, it seems to belong to
the group of charsets that use ASCII (a single 7-bit byte) for Latin
characters and a two-byte sequence for any other character (at least the
8th bit of the first byte is set, to distinguish it from ASCII).
So the conversion should work as-is for shift-jis, provided we have
access to a Unicode mapping table. I have not run a real conversion test
yet, though.
jis (ISO-2022-JP) is different, as it uses special escape sequences to
shift into and out of "multi-byte" mode. We could do conversions, but
string functions on such data are no fun, so I argue against supporting
it for the BE. Conversion, OTOH, seems to be necessary for mail, as
ISO-2022-JP is the de facto standard for Japanese text over SMTP and
NNTP. This would require some rework of the current conversion code.
To prevent data corruption we need charset-aware string truncation
functions (cut at a specific byte length). For output we need the crop
functions (cut at a given number of characters). Other output functions
may be necessary, but I don't know what "transformations" the BE uses
(e.g. case conversion).
We also need testers. Oliver, I guess Kasper will be more willing to
include "foreign" code if there are reports that it actually works. I'd
be happy to get some peer review for the code. So anybody willing to do
some testing, please drop me a line by mail.
Masi