[TYPO3-core] RFC: problem with php4 and xml data with byte order mark
Michael Stucki
michael at typo3.org
Thu Nov 9 12:08:36 CET 2006
Hi Martin,
> Problem:
> PHP4 doesn't like a Unicode byte order mark at the beginning of XML files
> in UTF-8. If a BOM is present, the parsed data is no longer in UTF-8. A
> BOM is valid at the beginning of an XML file. Usually it's added by text
> editors on Windows.
>
> Solution:
> Look for the BOM and set the charset to UTF8 if a UTF8 BOM is found.
There are some problems with this patch:
- You have renamed $ereg_result to $match, but one line further down you
didn't change the name accordingly where $theCharset is set.
- The preg_match does not work anymore. Tested with PHP 5.1.
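To make the idea under discussion concrete, here is a rough sketch of the check: look for a UTF-8 BOM first, otherwise read the encoding from the XML declaration with preg_match. This is not the code from the attached patch. The function name detectXmlCharset() and the 'iso-8859-1' fallback are assumptions made up for this illustration.

```php
<?php
// Hypothetical helper for illustration only; not taken from t3lib_div.
// The default charset is an assumption, not TYPO3's actual fallback.
function detectXmlCharset($xml, $default = 'iso-8859-1')
{
    // A UTF-8 byte order mark is the three bytes EF BB BF.
    if (substr($xml, 0, 3) === "\xEF\xBB\xBF") {
        return 'utf-8';
    }

    // No BOM: look for encoding="..." in the XML declaration near the start.
    $match = array();
    $pattern = '/^\s*<\?xml[^>]*\sencoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']/';
    if (preg_match($pattern, substr($xml, 0, 200), $match)) {
        // Use the same variable name for the result as for the match array.
        return strtolower($match[1]);
    }

    // Neither a BOM nor a declared encoding was found.
    return $default;
}

$string = "\xEF\xBB\xBF" .
    '<?xml version="1.0" encoding="utf-8" ?><val>Välué</val>';
echo detectXmlCharset($string), "\n";
```

With the test string from the original mail this should report utf-8 because of the BOM, regardless of what the XML declaration says.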
> Test:
> include('./class.t3lib_div.php');
> $string = "\xEF\xBB\xBF".
> '<?xml version="1.0" encoding="utf-8" ?><val>Välué</val>';
Hmm, that doesn't look finished, right?
But alright. I have added some more changes and made a new patch, including
a new test script.
The patch works, I tried it with PHP4 and PHP5.1.
- michael
--
Use a newsreader! Check out
http://typo3.org/community/mailing-lists/use-a-news-reader/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: t3lib_div-BOM_v2.diff
Type: text/x-diff
Size: 2540 bytes
Desc: not available
Url : http://lists.netfielders.de/pipermail/typo3-team-core/attachments/20061109/916f5b60/attachment.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.php
Type: application/x-php
Size: 184 bytes
Desc: not available
Url : http://lists.netfielders.de/pipermail/typo3-team-core/attachments/20061109/916f5b60/attachment-0001.bin