[TYPO3-dev] UTF8 problem when parsing XML data...

Dmitry Dulepov typo3 at accio.lv
Thu Jul 13 14:12:33 CEST 2006


Hi!

Jigal van Hemert wrote:
> What is "plain Unicode value"? Unicode is a collection of characters
> divided into groups ("planes" IIRC). These characters can be
> represented in a number of ways. In the example there was an 'entity'
> &#233;. This is not a Unicode value, but a numerical representation
> of a character. 233 is in Latin-1 (often encoded in ISO-8859-1) an 'e
> with accent aigu'.

You answered yourself :) There are Unicode characters and there are
representations for them. &#233; refers to the Unicode character with
code 233 (U+00E9), and that character is represented in UTF-8 as the
two bytes 0xC3 0xA9. Numeric entities always give the character number;
they are independent of the encoding.
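
To make this concrete, here is a minimal sketch in Python, for
illustration only (the variable names are made up for this example). It
decodes the numeric entity &#233; to a character, reads off its code
number, and then encodes that same character in two different encodings:

```python
import html

entity = "&#233;"

# The entity names a character by its number, not by its bytes.
char = html.unescape(entity)
code_point = ord(char)

# The same character has different byte representations per encoding.
utf8_bytes = char.encode("utf-8")
latin1_bytes = char.encode("iso-8859-1")

print(code_point, hex(code_point))  # 233 0xe9
print(utf8_bytes.hex(" "))          # c3 a9
print(latin1_bytes.hex(" "))        # e9
```

The number 233 stays the same in both cases; only the bytes differ.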

Dmitry.
-- 
"It is our choices, that show what we truly are,
far more than our abilities." (A.P.W.B.D.)



