[TYPO3-core] RFC: problem with php4 and xml data with byte order mark

Martin Kutschker Martin.Kutschker at n0spam-blackbox.net
Tue Nov 21 10:00:02 CET 2006


Martin Kutschker wrote:
> Hi!
> 
> This is a SVN patch request.
> 
> Problem:
> PHP4 doesn't like a Unicode byte order mark (BOM) at the beginning of 
> XML files in UTF-8. If a BOM is present, the parsed data is no longer 
> in UTF-8. A BOM is valid at the beginning of an XML file. Usually it's 
> added by text editors on Windows.
> 
> Solution:
> Look for the BOM and set the charset to UTF-8 if a UTF-8 BOM is found.
> 
> Note:
> Additionally, I have changed the comment concerning PHP's behaviour when 
> parsing XML data and replaced ereg with preg_match.
> 
> Test:
> include('./class.t3lib_div.php');
> $string = "\xEF\xBB\xBF".
>  '<?xml version="1.0" encoding="utf-8" ?><val>Välué</val>';
> 
> Branches: TYPO3_4-0 and Trunk
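
For illustration, the check described under "Solution" could look roughly like the sketch below. This is not the committed patch: the function name detectXmlCharset() is hypothetical, and the 'iso-8859-1' fallback is an assumption about the parser default, not something taken from the patch.

```php
<?php
// Hypothetical helper illustrating the BOM check; not the code from the patch.
// Assumption: callers want 'iso-8859-1' when nothing else can be detected.
function detectXmlCharset($xml, $default = 'iso-8859-1')
{
    // A UTF-8 byte order mark is the three bytes EF BB BF.
    if (strncmp($xml, "\xEF\xBB\xBF", 3) === 0) {
        return 'utf-8';
    }

    // No BOM: fall back to the encoding attribute of the XML declaration.
    // Only the first 200 bytes are inspected, the declaration must come first.
    $pattern = '/^\s*<\?xml[^>]*\sencoding\s*=\s*["\']([^"\']+)["\']/i';
    if (preg_match($pattern, substr($xml, 0, 200), $match)) {
        return strtolower($match[1]);
    }

    // Neither a BOM nor a declared encoding was found.
    return $default;
}

// Usage with the test string from the request above.
$string = "\xEF\xBB\xBF" .
    '<?xml version="1.0" encoding="utf-8" ?><val>Välué</val>';
$charset = detectXmlCharset($string);
```

The detected charset would then be handed to the XML parser as its target encoding, so that the parsed values stay in UTF-8.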

I hope it doesn't interfere with 4.0.3 now, but since nothing happened 
yesterday I have committed the patch right now.

I hope to get all my other stuff into 4.0.4, which could be a Christmas 
release :-)

Masi


