[TYPO3-dev] Spamfilter-Service
Peter Guhl
peter.guhl at win-lux.ch
Tue Jul 15 09:26:13 CEST 2008
Hi
Dmitry Dulepov [typo3] wrote:
> It ranges from 0 to infinity: integer numbers whose meaning is entirely up to you :) To give you an idea: currently the comments extension gives one spam point for every three "http://" in the text and one spam point per "[url". These two are mostly used by spammers. So, if you have certain spam evaluation criteria, you can construct something from them.
>
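The counting rule quoted above can be sketched roughly as follows. This
is only an illustration in Python, not the comments extension's actual
code: the function name is hypothetical, and both the integer division
(two "http://" give zero points) and the case-insensitive matching are
assumptions about how the rule is applied.

```python
import re


def link_spam_points(text: str) -> int:
    """Hypothetical helper: score a comment by counting link markers."""
    # Count occurrences of "http://" (case-insensitive is an assumption).
    http_count = len(re.findall(r"http://", text, flags=re.IGNORECASE))
    # Count occurrences of "[url" (BBCode-style link tags).
    url_tag_count = len(re.findall(r"\[url", text, flags=re.IGNORECASE))
    # One point per three "http://" (assumed to round down),
    # plus one point per "[url".
    return http_count // 3 + url_tag_count
```

A comment with three "http://" links and one "[url" tag would get two
points under these assumptions.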
Well, that's probably what the Bayes filter does internally.
Now... how should I achieve that? I get scores from 0.00000000 to
0.99999999. Harmless text ranges around 0.5 or a bit above. If I
"normalize" that to 1..10, I get points a bit higher than SpamAssassin's
(from what I know, a score of 3 there is normally spam, and anything
above 10 always is). No matter what I do, the behaviour will differ
from SpamAssassin's.
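To make the mismatch concrete, here is a minimal sketch of a linear
"normalization" from the filter's 0..1 output to a 1..10 scale. The
function name and the choice of a plain linear mapping are assumptions
for illustration; spfgblib itself is not involved here.

```python
def normalize_to_scale(probability: float,
                       low: float = 1.0,
                       high: float = 10.0) -> float:
    """Hypothetical helper: map a 0..1 Bayes score linearly onto low..high."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    # Linear interpolation between the two ends of the target scale.
    return low + probability * (high - low)
```

With this mapping a harmless text scoring 0.5 lands at 5.5, which is
already above a threshold of 3.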
I am seriously considering the extension "spamdetection", which might
be a good place to level several sources so that together they give a
reliable scoring system. That's the way SpamAssassin works. Its Bayes
filter outputs a percentage, and SA seems to assign points based on
percentage ranges (e.g. 0 to 10% = -1 point, 10 to 30% = 0.1 point,
30 to 50% = 0.5 points...).
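A band lookup of that kind might look like the sketch below. Only the
three bands named above are included; the higher bands are not given
here, so the function returns None for them rather than guessing. The
names are hypothetical, and treating each upper bound as exclusive is
an assumption.

```python
from typing import Optional

# (exclusive upper bound of the band, points) for the three known bands.
BANDS = [
    (0.10, -1.0),
    (0.30, 0.1),
    (0.50, 0.5),
]


def bayes_points(probability: float) -> Optional[float]:
    """Hypothetical helper: translate a Bayes probability into points."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    for upper_bound, points in BANDS:
        if probability < upper_bound:
            return points
    # Bands above 50% are not specified, so no score is invented for them.
    return None
```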
Of course any extension might access spfgblib directly, as long as it
takes into account the way it behaves.
I installed spamdetection. At first glance it apparently uses rules
and blacklists. That's a good idea and can be combined with external
sources. Does it already do scoring? I think I will take my first steps
using the ASP spfgblib somewhere with that extension. The result
will probably be a PHP snippet that should be usable elsewhere too.
> There is also a concept of a "cut off" point (obtainable from $pObj->conf['spamProtect.']['spamCutOffPoint']). If you supply a number greater than this point, the comment will be silently dropped.
>
If you work with scores you need that, of course.
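As a rough illustration of the cut-off check, written in Python rather
than the extension's PHP: the dictionary only mimics the shape of the
quoted configuration path, the value 15 is a made-up example, and the
function name is hypothetical.

```python
# Mimics $pObj->conf['spamProtect.']['spamCutOffPoint']; 15 is a made-up value.
conf = {"spamProtect.": {"spamCutOffPoint": 15}}


def should_drop(points: int, conf: dict) -> bool:
    """Hypothetical helper: True if the comment would be silently dropped."""
    cut_off = int(conf["spamProtect."]["spamCutOffPoint"])
    # Dropped only when the points are strictly greater than the cut-off.
    return points > cut_off
```

A comment scoring exactly the cut-off value is kept under this reading
of "more than this point".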
Regards
Peter