[TYPO3-core] RFC: #15812: Add backend maintenance for login news

Jigal van Hemert jigal at xs4all.nl
Thu Oct 7 21:25:47 CEST 2010


Hi,

On 7-10-2010 18:56, Helmut Hummel wrote:
> Why I don't like RemoveXSS:
>
> 1. It is kind of unmaintained

True. It needs probably a combination of someone who is very much into 
XSS problems and a programmer who can convert this knowledge in code.

> 1a. The code is difficult (because of the regex) which makes it hard (at
> least for me) to understand what it really does.

I can only advise to invest some time in learning PCRE. As we've seen in 
some performance related RFCs the code in the PCRE routines often 
performs as well as or better than PHP functions which seem much simpler.

It could be split up in separate functions to improve readability (and 
to increase "code quality").

> 2. It uses a blacklist approach, tampering some (probably all?)
> problematic keywords, mostly unaware of the context where they are used.

It was completely unaware of context about two years ago (hence all the 
false positives people complained about). The purpose of the regular 
expressions is to provide a (small) check for the context. It requires 
too much processing to check context completely, so 'evil' things inside 
normal text (e.g. as illustration) will trigger the check, but this is 
to be expected with the rather simple approach.

> EDIT: It's much better than I thought, but still blacklisting is error
> prone and needs someone keeping the blacklist up to date.

There is an identical problem with white lists: a list of allowed 
tags/keywords/properties must be updated when HTML5 support is 
introduced, or with newer versions of JavaScript, CSS, etc.

> 3. The tampering it does might be goodfor some (most?) cases, but the
> result looks pretty ugly and it breaks the HTML output, produces invalid
> HTML.

True. The tampering is very simple and can easily be modified. I guess 
it was the choice between XSS and invalid HTML.

> 4. Having said all this one last thing: It teaches a wrong lesson which
> is: "Yeah, let's use RemoveXSS and wer're done with security".

Judging by the number of people who still inject external data directly 
into a SQL query, I'd say we would be lucky if someone actually used 
RemoveXSS :-/

> What we really need is HTML Purifier[1], which uses a whitelist
> approach, always returns cleaned up and valid! HTML as a result and is
> an open source project which is actively developed. I have it on my list
> to get it into 4.5, but had no time to do it until now.

I am very curious about its performance. The main reason why I modified 
RemoveXSS in the first place was performance. It was used in an 
extension and the thing was incredibly slow. Soon it was clear that 
RemoveXSS contributed in a large way to the lack of speed.

-- 
Kind regards / met vriendelijke groet,

Jigal van Hemert
skype:jigal.van.hemert
msn: jigal at xs4all.nl
http://twitter.com/jigalvh


More information about the TYPO3-team-core mailing list