spamassassin-dev September 2011 archive

Re: High ham rate in darxus corpora for URIBL_WS_SURBL Re: ham scores

From: <darxus_at_nospam>
Date: Mon Sep 19 2011 - 21:37:45 GMT
To: Mark <>,

On 09/19, Mark wrote:
> wrote:
> > The other 38 were notifications from, nothing spam
> > related, from 2011-08-02 to 2011-08-11. It looks like you just had
> > listed as a spammer for those 10 days. Those emails
> > are not hitting this rule now.
> has been whitelisted for years, so it's certainly not expected
> behaviour.

Any SA dev folks have opinions on this? I'm up for assuming there was
somehow a problem on my end and removing these from my corpora if that's
what you devs think I should do.

Mark, I encourage you to include in your

> Perhaps you were using a DNS server that returned bad results. Some
> governments (e.g. China) intercept DNS requests and return their own IP. Some
> ISPs think they can do that too for NXDOMAIN results.

It seems unlikely. I'm using a local bind server with two forwarders to my
hosting provider,, which is very open-source oriented, and
seems unlikely to pull something like that. Although I'm happy to ask
them via a support request if there was a related incident during this
time period.

The relevant rule is:
urirhssub URIBL_WS_SURBL A 4

Does that mean it could've matched anything ending in .4, or only

Man page is Mail::SpamAssassin::Plugin::URIDNSBL
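For reference, here is a minimal Python sketch of how the subtest forms appear to be evaluated. It assumes the semantics described in the Mail::SpamAssassin::Plugin::URIDNSBL man page as I read it: a plain decimal or hex value is a bitmask, a quad-dot value is an exact match, n1-n2 is a range, and n/m is a masked compare. The function names and the 203.0.113.4 test address are illustrative only, and the plugin version in use in 2011 may behave differently, so check the man page for the installed release.

```python
import ipaddress


def to_int(value: str) -> int:
    """Parse a decimal number, a 0x hex number, or a quad-dot IPv4 address."""
    if "." in value:
        return int(ipaddress.IPv4Address(value))
    if value.lower().startswith("0x"):
        return int(value, 16)
    return int(value)


def subtest_hits(answer: str, subtest: str) -> bool:
    """Model of a urirhssub subtest applied to one returned A record.

    Assumed semantics (not taken from the plugin source):
      n/m      -> (r & m) == (n & m)
      n1-n2    -> n1 <= r <= n2
      a.b.c.d  -> r == n
      number   -> (r & n) != 0   (bitmask)
    """
    r = int(ipaddress.IPv4Address(answer))
    if "/" in subtest:
        n, m = (to_int(part) for part in subtest.split("/", 1))
        return (r & m) == (n & m)
    if "-" in subtest:
        low, high = (to_int(part) for part in subtest.split("-", 1))
        return low <= r <= high
    if "." in subtest:
        return r == to_int(subtest)
    return (r & to_int(subtest)) != 0


if __name__ == "__main__":
    # Under the bitmask reading, "4" is not tied to 127/8 at all.
    print(subtest_hits("127.0.0.4", "4"))            # True
    print(subtest_hits("203.0.113.4", "4"))          # True
    # An exact quad-dot subtest rejects the non-127/8 answer.
    print(subtest_hits("203.0.113.4", "127.0.0.4"))  # False
```

If that reading is right, a bare "4" would hit on any answer whose value has that bit set, whatever the leading octets are.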

> That should be preventable to a large extent by checking if the return code is
> within the 127/8 IP range.

Devs, if urirhssub with a value of "4" does not constrain to 127/8,
we should change the rules to match only, for example,
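The 127/8 check suggested in the quoted text can be sketched in Python, independent of any rule syntax. This is only an illustration of the idea, not SpamAssassin code; the function name and the sample addresses are made up.

```python
import ipaddress

# DNSBL answers conventionally live in the loopback range.
DNSBL_RANGE = ipaddress.ip_network("127.0.0.0/8")


def plausible_dnsbl_answers(answers):
    """Keep only A-record strings that fall inside 127.0.0.0/8.

    Anything else (a hijacked NXDOMAIN pointing at an ISP web server,
    an IPv6 address, unparsable text) is dropped before any subtest
    would be applied.
    """
    kept = []
    for answer in answers:
        try:
            addr = ipaddress.ip_address(answer)
        except ValueError:
            continue  # not an IP address at all
        if isinstance(addr, ipaddress.IPv4Address) and addr in DNSBL_RANGE:
            kept.append(answer)
    return kept


if __name__ == "__main__":
    print(plausible_dnsbl_answers(["127.0.0.4", "203.0.113.4", "bogus"]))
```

As the quoted text goes on to say, this only filters out answers outside 127/8; a resolver that returns a wrong answer inside 127/8 would still get through.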

> We don't control external DNS servers of course, so if one of them decides to
> return a 127/8 code due to whatever cause (e.g. cache poisoning), it will
> cause a false detection signal.


> Another possibility is DNS client error. That is known to occur with
> multithreaded and asynchronous dns clients. Typical is a race condition while
> accessing memory, causing a mix up of query returns.

Seems unlikely, mostly because of the time frame.

> Did the hits have specific subdomains?

I just looked for notifications from livejournal that didn't hit this rule
in the same time frame - there were none. Everything I got from from August 2nd to August 11th hit URIBL_WS_SURBL. And all
included these urls:
Other URLs were generally of a subdomain <user>

> Also, I would expect that there would not be any query to SURBL for a domain
> that is on SA's internal frequently queried whitelist. should
> be on that list. Can you see if there were any changes/updates to SA that
> could have caused this?

The rules currently include:

Certainly looks to me like that shouldn't allow to be
looked up against SURBL.
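The expected effect of a skip list can be modelled with a short Python sketch. This is a simplified model, not the plugin's implementation: it assumes the domains have already been reduced to their registered form, that matching is exact and case-insensitive, and it uses example.com and example.org as stand-in names.

```python
def domains_to_query(uri_domains, skip_domains):
    """Return the domains that would still be looked up against a URI DNSBL.

    Simplified model: drop anything on the skip list, normalise case and
    a trailing dot, and query each remaining domain only once.
    """
    skip = {d.lower().rstrip(".") for d in skip_domains}
    seen = set()
    result = []
    for domain in uri_domains:
        key = domain.lower().rstrip(".")
        if key in skip or key in seen:
            continue  # skipped by configuration, or already queued
        seen.add(key)
        result.append(key)
    return result


if __name__ == "__main__":
    found = ["Example.com", "example.org", "example.org."]
    print(domains_to_query(found, ["example.com"]))  # ['example.org']
```

Under this model, a domain on the skip list should never generate a lookup at all, so a debug log can be used to check whether one was started.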

Closest backup of those config files I have is 2011-08-23, and that file
has an md5 checksum identical to my current Same as the
backup from 2011-07-01:

# md5sum panic-2011-07-01/var/lib/spamassassin/3.004000/updates_spamassassin_org/
64a27859c0a7cdafbd856dce3461c2f3 panic-2011-07-01/var/lib/spamassassin/3.004000/updates_spamassassin_org/

$ md5sum /var/lib/spamassassin/3.004000/updates_spamassassin_org/
64a27859c0a7cdafbd856dce3461c2f3 /var/lib/spamassassin/3.004000/updates_spamassassin_org/
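The same comparison can be scripted when several backups need checking. This is a small Python sketch using only the standard library; the helper names are made up, and the paths you pass in are whatever backup and live files you want to compare.

```python
import hashlib


def md5_of(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_contents(path_a, path_b):
    """True if both files have the same MD5 digest."""
    return md5_of(path_a) == md5_of(path_b)


if __name__ == "__main__":
    import sys

    # Usage: python compare.py BACKUP_FILE CURRENT_FILE
    backup, current = sys.argv[1], sys.argv[2]
    print(md5_of(backup), backup)
    print(md5_of(current), current)
    print("identical" if same_contents(backup, current) else "different")
```

MD5 is good enough here for spotting accidental changes between backups; it should not be relied on to detect deliberate tampering.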

So it shouldn't be possible for to hit URIBL_WS_SURBL.
I've removed the examples from my corpora. I'd still like to know how it
happened. Here's the simplest example I can find:
Only URLs that could hit URIBL_WS_SURBL are and, right? Yep.

spamassassin -D 2>&1 | grep multi.surbl | grep starting | less

Sep 19 17:22:39.564 [9037] dbg: async: starting: URI-DNSBL, (timeout 15.0s, min 3.0s)
Sep 19 17:22:39.569 [9037] dbg: async: starting: URI-DNSBL, (timeout 15.0s, min 3.0s)

That's current trunk output, so there's a bug causing uridnsbl_skip_domain
to not work? Opened bug:

Even without uridnsbl_skip_domain I still can't explain why this rule hit,
and that still bothers me.

> Thanks, feedback on fp's is always very welcome with SURBL.

-- 
"Life is either a daring adventure or it is nothing at all." - Helen Keller