[Typo3-dev] Character set inquiry

"Kasper Skårhøj" kasper at typo3.com
Tue Sep 9 12:12:13 CEST 2003


very useful, thanks!



God bless

- kasper

*********** REPLY SEPARATOR  ***********

On 09-09-2003 at 07:54 Sacha Vorbeck wrote:

>Hi Kasper,
>
>> Does any of you have some valuable insights here?
>
>I asked Björn Höhrmann (http://bjoern.hoehrmann.de/) who is working 
>for/with the W3C for a statement. He was so kind to write the following 
>message:
>
><quote>
>The detection of the character encoding of text/html resources is
>defined in section 5.2.2 of HTML 4.01,
>
>   http://www.w3.org/TR/html4/charset.html#h-5.2.2
>
>Basically
>
>   if (HTTP Content-Type header has a charset parameter)
>   {
>     use it;
>   }
>   elsif (document starts with a byte order mark)
>   {
>     use it; /* i.e., UTF-x depending on the BOM */
>   }
>   elsif (document has <meta http-equiv=Content-Type with charset)
>   {
>     use it;
>   }
>   else
>   {
>     do some magic; /* probability analysis, user defaults, ... */
>   }
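
[Editorial sketch, not part of Björn's message.] The precedence order in the pseudocode above can be written as a small Python function. This is only an illustration under stated assumptions: the name `detect_encoding`, the `(encoding, source)` return value, and the ISO-8859-1 fallback (standing in for the "do some magic" branch) are all hypothetical, and the `<meta>` pattern is a simplified regular expression that expects `http-equiv` before `charset`, not a real HTML parser.

```python
import codecs
import re
from email.message import Message

# BOM signatures; UTF-32 is listed first because its little-endian BOM
# begins with the same two bytes as the UTF-16 little-endian BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# Simplified pattern: <meta ... http-equiv=Content-Type ... charset=NAME
_META_CHARSET = re.compile(
    rb"<meta[^>]+http-equiv\s*=\s*[\"']?content-type[\"']?"
    rb"[^>]*charset\s*=\s*[\"']?([\w.:-]+)",
    re.IGNORECASE,
)


def detect_encoding(content_type, body, fallback="iso-8859-1"):
    """Return (encoding, source) following the precedence quoted above."""
    # 1. A charset parameter in the HTTP Content-Type header wins.
    if content_type:
        msg = Message()
        msg["Content-Type"] = content_type
        charset = msg.get_content_charset()
        if charset:
            return charset, "http-header"

    # 2. Otherwise a byte order mark at the start of the document.
    for bom, name in _BOMS:
        if body.startswith(bom):
            return name, "bom"

    # 3. Otherwise a <meta http-equiv=Content-Type> with a charset.
    match = _META_CHARSET.search(body[:1024])
    if match:
        return match.group(1).decode("ascii").lower(), "meta"

    # 4. Otherwise guess; the fixed fallback here is an assumption.
    return fallback, "fallback"
```

Called with a header of `text/html; charset=ISO-8859-1` and a body whose `<meta>` element says `utf-8`, this sketch reports the header value, which is the behaviour described for browsers in the following paragraph.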
>
>Common web browsers implement this to some extent. I am not aware of any
>browser where the <meta> element takes precedence over the encoding
>specified in the HTTP header. If there is, it is broken and it is likely
>that a number of web sites break in this browser.
>
>Note that the XML declaration <?xml version='1.0' encoding='...'?>
>Robert mentions is ignored as it is a processing instruction from an HTML
>point of view and the W3C HTML WG wants user agents to treat text/html
>resources only as tag soup, not as XHTML and thus XHTML/XML rules do not
>apply to the document.
>
>I am not sure how Apache, PHP, and Typo3 interact here. As the HTML
>Recommendation states, the character encoding should be specified in the
>HTTP header and the meta element is only to be used as a last resort if
>you cannot configure the web server properly. In PHP you could do
>
>   header('Content-Type: text/html;charset=utf-8');
>
>which should override any default encoding set by the default_charset
>config option for PHP or the AddDefaultCharset in the Apache
>configuration. If the document relies on the <meta> element to specify
>the character encoding of the document and the web server or PHP
>configuration specifies a default encoding, you run into the problem
>Robert talks about.
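
[Editorial sketch, not part of Björn's message.] The mismatch described above is easy to reproduce: the bytes of the document are in one encoding, but the server-declared charset tells the browser to decode them as another. The helper name `decode_as_declared` is hypothetical, and UTF-8 versus ISO-8859-1 is just one example pairing.

```python
def decode_as_declared(text, actual_encoding, declared_encoding):
    """Encode text the way the document is really stored, then decode it
    the way the HTTP header claims it is stored."""
    raw = text.encode(actual_encoding)
    return raw.decode(declared_encoding, errors="replace")


if __name__ == "__main__":
    name = "Bj\u00f6rn"  # "Björn"
    # Header and document agree: the text survives unchanged.
    print(decode_as_declared(name, "utf-8", "utf-8"))
    # Document is UTF-8 but the server default says ISO-8859-1:
    # the two UTF-8 bytes of "ö" show up as two separate characters.
    print(decode_as_declared(name, "utf-8", "iso-8859-1"))
```

A `<meta>` element inside the document cannot repair the second case, because the header value is consulted first.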
>
>If you cannot modify the HTTP header as in the example above, this is a
>documentation issue and there is nothing more you can do about it: the
>HTTP header always takes precedence, and if the web server is configured
>to default to, say, ISO-8859-1, the document cannot use any other
>character encoding without modifying the web server configuration.
>
>regards.
></quote>
>
>-- 
>all the best,
>
>Sacha
>


- kasper
-------------------- o ---------------------
>>>    In God I trust - others pay cash!     <<<
Check www.typo3.com