[Typo3-dev] help needed: charset tables (Asian languages)

Martin T. Kutschker Martin.T.Kutschker at blackbox.net
Wed Mar 31 21:09:41 CEST 2004


Kasper Skårhøj wrote:
> TYPO3 only needs simple string functions to support multibyte charsets.
> I think the greatest problem is strtoupper and strtolower.

That is partially solved already. For UTF-8 I have code on my hard disk. 
I could rework it to work on all charsets (via UTF-8), of course using a 
caching mechanism.
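
To make the idea concrete, here is a minimal sketch in Python (not the PHP code mentioned above, and not TYPO3 API). The function name conv_case, the per-character fallback and the use of lru_cache as the "caching mechanism" are all assumptions made for illustration. Python's Unicode str plays the pivot role that UTF-8 plays in the proposal: decode from the source charset, change case, encode back.

```python
from functools import lru_cache


def _convert_char(ch: str, charset: str, to_upper: bool) -> str:
    """Change the case of one character, keeping it as-is when the
    result cannot be represented in the target charset."""
    converted = ch.upper() if to_upper else ch.lower()
    try:
        converted.encode(charset)
    except UnicodeEncodeError:
        return ch  # e.g. the uppercase of U+00FF does not exist in ISO-8859-1
    return converted


@lru_cache(maxsize=4096)  # hypothetical cache: memoizes whole-string results
def conv_case(data: bytes, charset: str, to_upper: bool = True) -> bytes:
    """Case-convert a byte string in any charset via a Unicode pivot."""
    text = data.decode(charset)
    converted = "".join(_convert_char(ch, charset, to_upper) for ch in text)
    return converted.encode(charset)


latin1 = "größe".encode("iso-8859-1")
print(conv_case(latin1, "iso-8859-1"))  # b'GR\xd6SSE'
```

Converting character by character is a simplification: context-dependent case rules are ignored here.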

 > All regex usage is mostly on tokens in text, nothing else.

Nonetheless, it should be checked.

> Then of course support for UTF-8 in the database is needed if you want
> search results and ordering to work right.

Sorting and Unicode is a chapter of its own. Mind that correct sorting is 
language-dependent. There can even be two or more sort orders per 
language. *)
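
As an illustration only, the following Python sketch sorts the same four words with three hand-written rules. The choice of German dictionary order (umlauts treated as their base vowel), German phone book order (umlauts expanded to vowel + "e") and Swedish order (å, ä, ö after z) is my own example and is not taken from the thread. The key functions are deliberately simplified assumptions, not a full collation implementation.

```python
# Simplified collation keys; real collation has far more rules than this.
DICTIONARY = str.maketrans({"ä": "a", "ö": "o", "ü": "u", "ß": "ss"})
PHONEBOOK = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"


def german_dictionary_key(word: str) -> str:
    """Umlauts sort like their base vowel."""
    return word.lower().translate(DICTIONARY)


def german_phonebook_key(word: str) -> str:
    """Umlauts sort like base vowel followed by 'e'."""
    return word.lower().translate(PHONEBOOK)


def swedish_key(word: str) -> tuple:
    """å, ä, ö are separate letters placed after z."""
    return tuple(
        SWEDISH_ALPHABET.index(ch) if ch in SWEDISH_ALPHABET
        else len(SWEDISH_ALPHABET) + ord(ch)  # unknown characters go last
        for ch in word.lower()
    )


words = ["Ozon", "Öl", "Ofen", "Ohr"]
print(sorted(words, key=german_dictionary_key))  # ['Ofen', 'Ohr', 'Öl', 'Ozon']
print(sorted(words, key=german_phonebook_key))   # ['Öl', 'Ofen', 'Ohr', 'Ozon']
print(sorted(words, key=swedish_key))            # ['Ofen', 'Ohr', 'Ozon', 'Öl']
```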

Anyway, UTF-8 is strstr-safe. As long as needle and haystack are both 
valid UTF-8, it's impossible to get a false hit (i.e. a result that is 
misaligned with the multi-byte sequences).
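
A small Python sketch of what "strstr-safe" means in practice. It does a plain byte search and reports hits that do not start on a character boundary. The helper names and the comparison with Shift_JIS (Python's cp932 codec) are assumptions chosen for illustration; the boundary computation assumes a stateless encoding.

```python
from __future__ import annotations


def char_boundaries(text: str, encoding: str) -> set[int]:
    """Byte offsets at which a character starts in the encoded text."""
    offsets, pos = set(), 0
    for ch in text:
        offsets.add(pos)
        pos += len(ch.encode(encoding))
    return offsets


def byte_hits(text: str, needle: str, encoding: str) -> list[int]:
    """All offsets found by a plain byte-wise search (like strstr)."""
    haystack, pattern = text.encode(encoding), needle.encode(encoding)
    hits, start = [], haystack.find(pattern)
    while start != -1:
        hits.append(start)
        start = haystack.find(pattern, start + 1)
    return hits


def false_hits(text: str, needle: str, encoding: str) -> list[int]:
    """Hits that begin in the middle of a multi-byte character."""
    boundaries = char_boundaries(text, encoding)
    return [h for h in byte_hits(text, needle, encoding) if h not in boundaries]


# In Shift_JIS the second byte of a character can equal an ASCII byte:
# U+8868 is encoded as 0x95 0x5C, and 0x5C is the backslash.
print(false_hits("表示", "\\", "cp932"))  # [1]
print(false_hits("表示", "\\", "utf-8"))  # []
```

The UTF-8 result is empty by design: lead bytes, continuation bytes and ASCII bytes occupy disjoint ranges, so a valid needle can only match at a character boundary.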

 > And extensions like indexed_search would need modification as well.

I have noticed the bug reports on the list.

Masi

*) MySQL has seen great improvements in this area, AFAIK in the dev 
releases.




