spamassassin-dev October 2011 archive

[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

From: <bugzilla-daemon_at_nospam>
Date: Mon Oct 03 2011 - 17:53:51 GMT
To: dev@spamassassin.apache.org

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

Darxus <Darxus@ChaosReigns.com> changed:

           What    |Removed |Added
----------------------------------------------------------------------------
                 CC|        |Darxus@ChaosReigns.com

--- Comment #3 from Darxus <Darxus@ChaosReigns.com> 2011-10-03 17:53:51 UTC ---
(In reply to comment #1)
> I would personally veto this immediately. We are not an advertising service
> for RBLs.

I find that statement kind of interesting, given that shutting off network
tests (many of which depend on services that charge money above some usage
threshold, often around 100,000 queries a day) makes SpamAssassin roughly five
times less accurate: 5.35x the false positives and 4.25x the false negatives,
based on the 2011-03-24 score generation. And that's when SA *knows* the
network tests aren't working. What if it expects the tests to work, and the
major ones are failing because the install has gone over their free-use
thresholds? Probably bad.

I'm not happy about it, but SA seems pretty dependent on things like RBLs
which, under some circumstances, charge money.
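For what it's worth, an admin who does hit these limits can already turn the
network tests off. A minimal sketch, assuming a stock 3.3.x install (the
option name is from the Mail::SpamAssassin::Conf documentation; the config
path varies by distro):

```
# /etc/spamassassin/local.cf (path varies by distro)
# Disable DNS blocklist (RBL) lookups entirely:
skip_rbl_checks 1
```

A one-off scan with only local tests is also possible by running
spamassassin with the -L (--local) flag.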

From the Ubuntu SpamAssassin 3.3.1 package:

/usr/share/doc/spamassassin/rules/STATISTICS-set0.txt.gz (no bayes, no net)
# SUMMARY for threshold 5.0:
# False positives: 238 1.12%
# False negatives: 9678 21.93%

/usr/share/doc/spamassassin/rules/STATISTICS-set1.txt.gz (no bayes, net
enabled)
# SUMMARY for threshold 5.0:
# False positives: 30 0.14%
# False negatives: 1381 3.13%

That's 7.93x the false positives and 7.01x the false negatives without network
tests.
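Those ratios can be recomputed directly from the two STATISTICS summaries; a
quick sketch, with the counts copied from the files quoted above:

```python
# Recomputing the "without network tests" ratios from the two
# STATISTICS summaries above (threshold 5.0, no bayes).
fp_no_net, fn_no_net = 238, 9678   # set0: net disabled
fp_net, fn_net = 30, 1381          # set1: net enabled

print(f"{fp_no_net / fp_net:.2f}x the false positives")   # 7.93x
print(f"{fn_no_net / fn_net:.2f}x the false negatives")   # 7.01x
```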

> If an RBL is submitted for inclusion for SA, it should not have policies that
> would affect anything but the most extreme cases. Any URLs should point to an
> SA page such as a wiki letting them know to disable the rules.

I think the cases where DNSWL has done this are likely to qualify as "most
extreme".

> > Also, I think it's really irresponsible for SpamAssassin to expose users to
> > this kind of punitive activity without actually warning them of the usage
> > thresholds of the services involved, as Warren lists here:
> > http://www.spamtips.org/2011/01/usage-limits-of-spamassassin-network.html
>
> I agree. What RBLs have this issue and I will immediate work to disable them
> in a default SA installation for the 3.4.0 release?

According to Michael Scheidell, Spamhaus's (providers of ZEN, SBL, PBL, XBL,
included in SA by default) policy of blocking queries results in "10 and 20 min
delays in inbound email" - bug #6220. You could call that DoSing email
providers rather than disabling spam filtration; both have the same goal of
getting the provider to disable the relevant network tests. Which is worse?

Should the Spamhaus rules be removed from the default SA rule set because
Spamhaus will DoS email providers that query it for more than 100,000 emails
per day?

SEM (bug #6220) is the only one I know of that affects scores. And by a
mechanism that seemed to have the approval of SpamAssassin folks. Should that
bug be closed, and the rules not included in SA by default, because of that
mechanism?

I think it would be great if SpamAssassin, by default, didn't include any
network rules that have limits on free use. Although making up the lost
accuracy would probably require more work, which I don't really see happening.

--
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.