[TYPO3-content-rendering] Illegal SGML characters in output

Martin Kutschker Martin.Kutschker at n0spam-blackbox.net
Fri Dec 16 13:08:54 CET 2005


Ernesto Baschny [cron IT] wrote:
> 
> We need to find out in which character sets this is a problem. If I set
> my site to "forceCharSet=utf-8", the problem doesn't exist, because all
> pasted input will have corresponding UTF-8 entities which are valid. So
> maybe some charset expert around could tell us a bit about it, and if
> no one is available, I would do some research on it. I suspect every
> ISO-Latin-x variant has this problem.

windows-1252 is the only MS encoding that is a superset of a charset of the
iso-8859-* series (namely iso-8859-1).

The other MS encodings usually have some characters in different places 
(besides some additional chars).

I have not tested it but I think it's only windows-1252 and iso-8859-1 
that are treated as synonyms. So probably only iso-8859-1 has the problem.
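
To make the difference concrete, here is a minimal Python sketch (not TYPO3 code; the function name differing_bytes is made up for illustration) that lists the byte values on which the two charsets disagree. It relies on Python's own codec tables, which leave a few windows-1252 positions unassigned; those are counted as differing too.

```python
def differing_bytes():
    """Return the byte values that windows-1252 and iso-8859-1 decode differently."""
    diffs = []
    for value in range(256):
        raw = bytes([value])
        latin1 = raw.decode("iso-8859-1")
        try:
            cp1252 = raw.decode("windows-1252")
        except UnicodeDecodeError:
            # Unassigned in Python's windows-1252 codec table.
            cp1252 = None
        if cp1252 != latin1:
            diffs.append(value)
    return diffs


diffs = differing_bytes()
print(hex(min(diffs)), hex(max(diffs)), len(diffs))

# The same byte is a typographic quote in one charset and a C1 control
# character in the other.
print(repr(b"\x93".decode("windows-1252")))
print(repr(b"\x93".decode("iso-8859-1")))
```

Under these codec tables the two charsets only disagree in the range 0x80 to 0x9F, which is where pasted Windows text puts its curly quotes, dashes and similar characters.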

AFAIK, IE sends characters SGML-encoded when form input and form charset 
don't match. So you get entities in your database whether you want them or not.
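
A rough way to picture this, as a Python sketch only: the helper simulate_form_submit is hypothetical and merely imitates the behaviour described above by using Python's "xmlcharrefreplace" error handler. It is not what the browser actually runs.

```python
import html


def simulate_form_submit(text, form_charset):
    """Encode text for a form; unrepresentable characters become numeric entities.

    Assumption: this mimics the browser behaviour described above, it is not
    taken from any browser implementation.
    """
    return text.encode(form_charset, errors="xmlcharrefreplace")


# Curly quotes do not exist in iso-8859-1, so they arrive as entities.
submitted = simulate_form_submit("\u201cquoted\u201d", "iso-8859-1")
print(submitted)

# What ends up in the database is the entity text, not the characters.
stored = submitted.decode("iso-8859-1")
print(stored)

# Turning the entities back into real characters when reading the value.
print(html.unescape(stored))
```

With a utf-8 form the same input needs no entities at all, which fits the observation in the quoted message that forceCharSet=utf-8 avoids the problem.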

Masi


