[TYPO3-core] RFC: Fix indexing of files that contain special characters

Michael Stucki michael at typo3.org
Sat Jan 28 18:34:06 CET 2006


Hi Martin,

> Well, it is the best guess for all West European languages. Maybe it makes
> sense to create something like forceCharset: fileSystemCharset. Default is
> windows-1252 (to make Windows users happy), but can be set to anything.

I thought about that but hope there is a better solution. Maybe there is a
way to detect it automatically?

> fileSystemCharset is used whenever a file is processed the way you
> describe. Still, I suggest a test for UTF-8 before doing the conversion to
> avoid double encoding.

Do we have such a function somewhere?
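
For illustration, such a test could look roughly like the sketch below. It is
written in Python only to show the idea (TYPO3 itself is PHP), and the names
is_utf8, filename_to_utf8 and fs_charset are hypothetical, not existing TYPO3
functions. The windows-1252 default is taken from the suggestion quoted above.

```python
def is_utf8(raw: bytes) -> bool:
    """Return True if raw is a well-formed UTF-8 byte sequence."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def filename_to_utf8(raw: bytes, fs_charset: str = "windows-1252") -> str:
    """Decode a filename, converting from fs_charset only if it is not UTF-8.

    fs_charset stands in for the proposed fileSystemCharset setting
    (an assumption, not an existing option).
    """
    if is_utf8(raw):
        # Already UTF-8: decoding it directly avoids double encoding.
        return raw.decode("utf-8")
    # Not valid UTF-8: assume the configured file system charset.
    # Bytes that are undefined in that charset become U+FFFD.
    return raw.decode(fs_charset, errors="replace")
```

Note that this is only a heuristic: pure ASCII names pass the test trivially,
and a few windows-1252 byte sequences happen to be valid UTF-8 as well, so a
configurable charset would still be needed as a fallback.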

> I'm curious where you expect non-ASCII filenames in a TYPO3 context. TYPO3
> ASCII-fies everything.

In fileadmin/

TYPO3 usually displays the files without any conversion. The File->Filelist
module for example displays files containing German umlauts correctly.

In Indexed Search, it is the other way around: when the document filename is
used as the page title, it is expected to be utf8-encoded, because all titles
are. Upon display (e.g. in Web->Info) the title is utf8-decoded for the same
reason.

Since our filename was never utf8-encoded, it is now utf8-decoded although
this was not needed. The result is a broken file title in the output.
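
The effect can be reproduced with a small sketch. Again this is Python for
illustration only, and display_title is a hypothetical stand-in for the
utf8-decode step on output, not the actual TYPO3 code. The example filename
and the ISO-8859-1 file system charset are assumptions.

```python
def display_title(stored: bytes) -> str:
    """Approximate the output step: treat the stored title as UTF-8.

    Bytes that are not valid UTF-8 are replaced with U+FFFD.
    """
    return stored.decode("utf-8", errors="replace")


# Filename taken from disk as-is (ISO-8859-1 here), never utf8-encoded.
raw_title = "Bücher.pdf".encode("iso-8859-1")

# Title that was utf8-encoded before it was stored, as expected.
encoded_title = "Bücher.pdf".encode("utf-8")

broken = display_title(raw_title)    # the umlaut is lost
intact = display_title(encoded_title)  # the umlaut survives
```

Only the title that was encoded before storage survives the decode step; the
raw filename loses its umlaut.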

Interesting: If the BE is forced to be UTF-8, the Filelist output will be
wrong and filenames with umlauts are destroyed again, probably because no
conversion happened here...

- michael
-- 
Use a newsreader! Check out
http://typo3.org/community/mailing-lists/use-a-news-reader/


