[Typo3-dev] help needed: charset tables (Asian languages)
Martin T. Kutschker
Martin.T.Kutschker at blackbox.net
Wed Mar 31 21:09:41 CEST 2004
Kasper Skårhøj wrote:
> TYPO3 only needs simple string functions to support multibyte charsets.
> I think the greatest problem is strtoupper and strtolower.
That is partially solved already. For UTF-8 I have code on my hard disk.
I could rework it to work on all charsets (via UTF-8), of course using a
caching mechanism.
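
A minimal sketch of that idea, in Python rather than PHP and not the code mentioned above: decode the string from its charset into Unicode (standing in for the UTF-8 pivot), change case there, encode it back, and cache the result. The names conv_case and _convert_char are hypothetical, and the cache size is an arbitrary assumption.

```python
from functools import lru_cache


def _convert_char(ch: str, charset: str, to_upper: bool) -> str:
    """Change the case of one character, keeping it if the result
    cannot be represented in the target charset."""
    mapped = ch.upper() if to_upper else ch.lower()
    try:
        mapped.encode(charset)
    except UnicodeEncodeError:
        return ch  # e.g. a letter whose uppercase form is missing from the charset
    return mapped


@lru_cache(maxsize=4096)  # assumed cache size; repeated strings are converted once
def conv_case(text: bytes, charset: str, to_upper: bool) -> bytes:
    """Case-convert a byte string in any supported charset via Unicode."""
    decoded = text.decode(charset)
    converted = "".join(_convert_char(ch, charset, to_upper) for ch in decoded)
    return converted.encode(charset)


if __name__ == "__main__":
    latin1 = "äöü".encode("iso-8859-1")
    print(conv_case(latin1, "iso-8859-1", True).decode("iso-8859-1"))
```

Because the case change happens on decoded characters, the trail bytes of multi-byte sequences are never mistaken for ASCII letters.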
> All regex usage is mostly on tokens in text, nothing else.
Nonetheless, it should be checked.
> Then of course support for UTF-8 in the database is needed if you want
> search results and ordering to work right.
Sorting and Unicode is a chapter in itself. Mind that correct sorting is
language dependent. There are even two or more sort algorithms per
language. *)
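
To illustrate the point about several sort orders per language, here is a rough Python sketch. It assumes the two German conventions commonly called dictionary order (umlauts sorted as their base vowel) and phone book order (umlauts sorted as vowel plus "e"); the original message does not say which languages or algorithms were meant. The key functions are hypothetical simplifications, not a full collation implementation.

```python
# Simplified German collation keys (assumption: only umlauts and sharp s matter).
DICTIONARY = str.maketrans({"ä": "a", "ö": "o", "ü": "u", "ß": "ss"})
PHONEBOOK = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})


def dictionary_key(word: str) -> str:
    """Sort key treating umlauts like their base vowels."""
    return word.lower().translate(DICTIONARY)


def phonebook_key(word: str) -> str:
    """Sort key expanding umlauts to vowel + 'e'."""
    return word.lower().translate(PHONEBOOK)


words = ["Ort", "Öl", "Ofen"]

by_codepoint = sorted(words)                       # ['Ofen', 'Ort', 'Öl']
by_dictionary = sorted(words, key=dictionary_key)  # ['Ofen', 'Öl', 'Ort']
by_phonebook = sorted(words, key=phonebook_key)    # ['Öl', 'Ofen', 'Ort']

if __name__ == "__main__":
    print(by_codepoint, by_dictionary, by_phonebook)
```

The same three words come out in three different orders, which is why a plain byte or code point comparison is not enough for correct ordering.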
Anyway, UTF-8 is strstr-safe. It's impossible to get a false hit (i.e. a
result that is misaligned with the multi-byte sequences).
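
The property can be checked with a short Python sketch (an illustration, not TYPO3 code). It compares a byte-level search with a character-level search. Shift_JIS is used as the counterexample on the assumption that it is representative of the Asian charsets under discussion; the function names are hypothetical.

```python
def false_hit(text: str, needle: str, charset: str) -> bool:
    """True if a byte-level search finds the needle although the
    character-level search does not (a misaligned match)."""
    found_in_bytes = needle.encode(charset) in text.encode(charset)
    found_in_chars = needle in text
    return found_in_bytes and not found_in_chars


def utf8_find(haystack: str, needle: str) -> int:
    """Byte-level search in UTF-8; returns the byte offset or -1."""
    data = haystack.encode("utf-8")
    offset = data.find(needle.encode("utf-8"))
    if offset >= 0:
        # Decoding the prefix raises UnicodeDecodeError if the hit
        # started in the middle of a multi-byte sequence.
        data[:offset].decode("utf-8")
    return offset


if __name__ == "__main__":
    # The second byte of this character in Shift_JIS equals an ASCII backslash.
    print(false_hit("表", "\\", "shift_jis"))  # True
    print(false_hit("表", "\\", "utf-8"))      # False
    print(utf8_find("日本語テキスト", "テキスト"))
```

In UTF-8 every byte of a multi-byte sequence is 0x80 or above, and lead bytes and continuation bytes use disjoint ranges, so a valid needle can only match at a character boundary.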
> And extensions like indexed_search would need modification as well.
I have noticed the bug reports on the list.
Masi
*) MySQL has seen great improvements in this area. AFAIK in the dev
releases.