[TYPO3-dev] checking form inputs with a spamfilter

Peter Guhl peter.guhl at win-lux.ch
Wed Jul 25 09:11:46 CEST 2007


Hi

Daniel Pötzinger wrote:
> The idea there was to have a lib for other extensions and call it like:
> $spamdetect->init($conf);
> $spamdetect->isSpam($rowtocheck);
>   
That sounds a lot like my spfgblib, except that mine is not (yet?) 
object-oriented.
> The solution suggested by Martin (to check _POST / _GET based on some 
> rules) is a good idea too.
>
> But the more transparent way would be that a spamdetection is called 
> within an extension in my opinion.
>   
Maybe you are right. Even more so since spamprobe, spamassassin etc. 
are unlikely to be available on web hosting packages for small websites 
(and are not needed if somebody does not want any forms at all).
> Maybe the first version of the extension "spamdetection" has something 
> to use. (based on blackword and blackip and some extrarules)
>
> To have a spamprobe check is a good idea and much better than updating a 
> blacklist.
>   
IMHO simple blacklists are a lot of work and too easy to get around. 
Regexp-based rules are a bit more effective, and since PHP deals well 
with regular expressions I would consider them first. Of course a 
spamassassin interface would do the whole job, since nearly every filter 
type we could think of is available there.
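
To make the regexp idea concrete, here is a minimal sketch of a weighted 
rule check. The patterns, weights, threshold and function names are all 
made up for illustration; they are not taken from spfgblib or from any 
existing extension.

```php
<?php
// Hypothetical rule set: regular expression => weight.
// Patterns and weights are illustrative assumptions, not a tested rule base.
$rules = array(
    '/\[url=/i'                => 2, // BBCode-style links
    '/<a\s+href\s*=/i'         => 2, // raw HTML links
    '/\b(?:viagra|casino)\b/i' => 3, // typical junk keywords
);

// Sum the weights of all rules whose pattern matches the text.
function spamScore($text, array $rules)
{
    $score = 0;
    foreach ($rules as $pattern => $weight) {
        if (preg_match($pattern, $text) === 1) {
            $score += $weight;
        }
    }
    return $score;
}

// Treat the text as spam once the summed weight reaches the threshold.
function looksLikeSpam($text, array $rules, $threshold = 3)
{
    return spamScore($text, $rules) >= $threshold;
}
```

Unlike a plain blackword list, a single weak hit (one link, say) does not 
block the entry on its own; only the combined score does.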

Any trainable filter needs a backend interface for the training, so 
the extension has to keep everything it has seen somewhere. In my 
guestbook I simply flag the entries and put them into the database (or 
at least I did at the beginning; by now the bayes filter is trained 
well enough that I could switch over to blocking the known junk). IMHO 
the extension should only say "this text is spam", if possible through 
a method like the one you described above. Then the other extension has 
to act. Modifying the arrays (_POST, _GET, _FILES) is more dangerous as 
long as the filters are not well trained. Even though it's easy to put 
all blocked spam into a training database, it's hard or even impossible 
to recover false positives into the original extension. At least I 
can't think of a simple way to do that right now.
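
A rough sketch of that split, assuming the init()/isSpam() calls from 
Daniel's mail. The class name, the 'patterns' config key, the is_spam 
field and the storeGuestbookEntry() helper are hypothetical, and the 
array stands in for a real database table.

```php
<?php
// Hypothetical detector: it only answers "is this row spam?" and never
// modifies $_POST, $_GET or $_FILES.
class SpamDetectSketch
{
    private $patterns = array();

    // $conf['patterns'] is an assumed config key holding regular expressions.
    public function init(array $conf)
    {
        $this->patterns = isset($conf['patterns']) ? $conf['patterns'] : array();
    }

    // Returns true if any configured pattern matches the joined field values.
    public function isSpam(array $row)
    {
        $text = implode("\n", $row);
        foreach ($this->patterns as $pattern) {
            if (preg_match($pattern, $text) === 1) {
                return true;
            }
        }
        return false;
    }
}

// The calling extension acts on the verdict: it stores every entry and only
// flags suspected spam, so a false positive can be unflagged later.
function storeGuestbookEntry(array $row, SpamDetectSketch $detector, array &$storage)
{
    $row['is_spam'] = $detector->isSpam($row) ? 1 : 0;
    $storage[] = $row; // stand-in for a database insert
    return $row['is_spam'] === 0; // true if the entry may be displayed
}
```

Because flagged rows stay in the extension's own storage, they can serve 
as training data and can be restored by clearing the flag.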

NB: apparently form spammers already use bayes poisoning and other 
tricks learnt from mail spamming. Just to point out that they are 
already using state-of-the-art technology in every field they work in.

Regards
    Peter
