[TYPO3-core] RFC: Bug #13972: cropHTML uses faulty reg exp for HTML entities

Jigal van Hemert jigal at xs4all.nl
Thu Apr 15 20:30:31 CEST 2010


Ralf Hettinger wrote:
> as one character. The search pattern as used in the current preg_match
> currently always crops after the first semicolon and won't recognize
> such entities reliably.

Sorry, but this pattern isn't correct either.
- valid entities can be longer than 7 characters (e.g. &thetasym;) [1]
- not everything between & and ; is a valid entity
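
To illustrate both points, here is a minimal sketch in Python (not the TYPO3 PHP code). CAPPED is a hypothetical stand-in for any length-capped "anything between & and ;" pattern, not the regexp from the patch. entity_length() is likewise a made-up helper name. It checks named references against the HTML 4 entity list that ships with Python's standard library and accepts numeric references without a code point range check.

```python
import re
from html.entities import name2codepoint  # HTML 4 named entities, e.g. "thetasym"

# Hypothetical length-capped pattern, for illustration only
# (an assumption, not the pattern proposed in the patch).
CAPPED = re.compile(r"&[A-Za-z0-9#]{1,7};")

# Candidate character reference: decimal, hexadecimal, or named.
CANDIDATE = re.compile(
    r"&(?:#(\d+)|#[xX]([0-9A-Fa-f]+)|([A-Za-z][A-Za-z0-9]*));"
)


def entity_length(text, pos=0):
    """Return the length of a valid character reference at pos, or 0."""
    m = CANDIDATE.match(text, pos)
    if m is None:
        return 0
    name = m.group(3)
    # A named reference only counts if the name is a known entity.
    if name is not None and name not in name2codepoint:
        return 0
    return m.end() - pos


# The capped pattern misses a valid long entity ...
print(CAPPED.fullmatch("&thetasym;"))           # no match
# ... and accepts something that is not an entity at all.
print(CAPPED.fullmatch("&foo;") is not None)    # matches
# The lookup-based check handles both cases.
print(entity_length("&thetasym; etc."), entity_length("&foo;"))
```

A cropping routine could treat a non-zero result as "count these characters as one" and fall back to counting a bare & as a single character otherwise.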

The original regexp is indeed incorrect, but the proposed replacement has problems of its own.

-1 for now.

[1] http://www.w3.org/TR/html4/sgml/entities.html

-- 
Jigal van Hemert
skype:jigal.van.hemert
msn: jigal at xs4all.nl
http://twitter.com/jigalvh

