[TYPO3-dev] Spamfilter-Service

Peter Guhl peter.guhl at win-lux.ch
Tue Jul 15 09:26:13 CEST 2008


Hi

Dmitry Dulepov [typo3] wrote:
> It is from 0 to infinity, integer numbers, but completely arbitrary in meaning :) To give you an idea: currently the comments extension gives one spam point for each three "http://" in the text and one spam point per "[url". These two are mostly used by spammers. So, if you have certain spam evaluation criteria, you can construct something from them.
>   
Well, that's probably what the bayes filter internally does.
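
For reference, the counting rule quoted above can be sketched in a few lines. This is only an illustration in Python, not the comments extension's actual code: the function name is hypothetical, and I assume case-sensitive matching and integer division for the "one point per three" part.

```python
def link_spam_points(text: str) -> int:
    """Spam points from link markers, following the rule quoted above.

    Assumptions (not confirmed by the extension's source): matching is
    case-sensitive, and "one point for each three" means integer division.
    """
    http_count = text.count("http://")   # plain links
    url_tag_count = text.count("[url")   # BBCode-style link tags
    return http_count // 3 + url_tag_count
```

A text with three "http://" and one "[url" would get 2 points under these assumptions; a text with two plain links and no tag would get 0.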

Now, how should I achieve that? I get values from 0.00000000 to
0.99999999. Harmless text ranges around 0.5 or a bit above. If I
"normalize" that to 1..10, I get points a bit higher than SpamAssassin's
(from what I know, a score of 3 there is normally spam, and above 10 it
always is).
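
To make the mismatch concrete, here is a minimal sketch of the naive linear mapping, assuming "normalize to 1..10" means stretching 0..1 linearly onto 1..10. The function name is hypothetical and the code is Python only for brevity.

```python
def to_ten_point_scale(probability: float) -> float:
    """Map a filter result in [0, 1] linearly onto the range 1..10."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return 1.0 + 9.0 * probability


# Harmless text at around 0.5 lands at 5.5 on this scale.
harmless_score = to_ten_point_scale(0.5)
```

With this mapping, harmless text already scores above the value of 3 mentioned for SpamAssassin, so a plain linear stretch does not give comparable numbers.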

No matter what I do, the behaviour will differ from SpamAssassin's.
I am seriously considering the extension "spamdetection", which might be
a good place to level several sources so that together they give a
reliable scoring system. That's the way SpamAssassin works. The Bayes
filter there outputs a percentage, and SA seems to assign points based
on percentage ranges (e.g. 0 to 10% = -1 point, 10 to 30% = 0.1 point,
30 to 50% = 0.5 points...).
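
A banded lookup of that kind could look roughly like the sketch below. It contains only the three ranges named above; nothing is filled in for 50% and over, so the function returns None there. The names are hypothetical, and treating the upper bound of each band as exclusive is my assumption.

```python
from typing import Optional

# (exclusive upper bound as a fraction, points) for the three bands
# named in this mail; higher bands are not given, so none are listed.
BANDS = [(0.10, -1.0), (0.30, 0.1), (0.50, 0.5)]


def bayes_points(probability: float) -> Optional[float]:
    """Translate a Bayes result in [0, 1] into points via range lookup."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    for upper_bound, points in BANDS:
        if probability < upper_bound:
            return points
    return None  # no band defined here for 50% and above
```

Keeping the bands in a table means other sources (rules, blacklists) can contribute points the same way, and the totals can simply be added up.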

Of course, any extension might access spfgblib directly, as long as it
takes into account the way it behaves.

I installed spamdetection. At first glance it apparently uses rules
and blacklists. That's a good idea and can be combined with external
sources. Does it already do scoring? I think my first step will be to
use the ASP spfgblib somewhere with that extension. The result
will probably be a PHP snippet that should be usable elsewhere too.
> There is also a concept of a "cut off" point (obtainable from $pObj->conf['spamProtect.']['spamCutOffPoint']). If you supply a number greater than this point, the comment will be silently dropped.
>   
If you work with scores, you need that, of course.
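
The cut-off check itself is a simple comparison. Below is a minimal sketch, assuming the configuration is available as a nested dictionary shaped like the $pObj->conf array quoted above and that the value arrives as a string, as TypoScript values usually do. The function name is hypothetical.

```python
def should_drop(spam_points: int, conf: dict) -> bool:
    """Return True when the points exceed the configured cut-off point."""
    # Assumed layout, mirroring the quoted PHP array access.
    cutoff = int(conf["spamProtect."]["spamCutOffPoint"])
    return spam_points > cutoff  # "more than this point" means strictly greater


example_conf = {"spamProtect.": {"spamCutOffPoint": "5"}}
```

With this example configuration, 6 points would drop the comment while exactly 5 would not.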

Regards
      Peter




