spamassassin-dev September 2011 archive

Re: High ham rate in darxus corpora for URIBL_WS_SURBL Re: ham scores

From: Axb <axb.lists_at_nospam>
Date: Tue Sep 20 2011 - 14:32:48 GMT
To: dev@spamassassin.apache.org

On 2011-09-20 16:20, darxus@chaosreigns.com wrote:
> On 09/20, Axb wrote:
>> from what I'm seeing:
>>
>> livejournal.com is in 20_aux_tlds.cf
>>
>> util_rb_2tld livejournal.com
>
> I saw that, but didn't think it was relevant. How is it relevant? It also
> doesn't seem like it makes sense. "2TLDs include things like co.uk,
> fed.us, etc." Livejournal.com isn't one of those.

have you ever looked into 20_aux_tlds.cf and what it contains?

the registered 2TLDs are listed in RegistrarBoundaries.pm
If we need to add domains to be treated as a pseudo TLD (extra power for
URI BL lookups) we use 20_aux_tlds.cf (in sync with the URI BL engines)

for example: http://rss.uribl.com/hosters/hosters.txt

# This file replaces the SARE http://www.rulesemporium.com/rules/90_2tld.cf
# which will be deprecated as from 2010-05-01

# util_rb_2tld 2tld-1.tld 2tld-2.tld ...
# This option allows the addition of new 2nd-level TLDs (2TLD) to
# the RegistrarBoundaries code. Updates to the list usually happen
# when new versions of SpamAssassin are released, but sometimes
# it's necessary to add in new 2TLDs faster than a release can
# occur.
#
# util_rb_3tld is supported by SA 3.3.x , eg: foo.bay.livefilestore.com
#
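To illustrate why a util_rb_2tld entry matters for URI BL lookups, here is a
minimal Python sketch. It is not SpamAssassin's RegistrarBoundaries code; the
function name lookup_domain and the tiny TWO_LEVEL_TLDS set are hypothetical,
chosen only to show how treating livejournal.com as a pseudo TLD changes the
domain that gets queried.

```python
# Hypothetical illustration only; the real logic lives in
# RegistrarBoundaries.pm and uses a much larger list.
TWO_LEVEL_TLDS = {"co.uk", "fed.us", "livejournal.com"}


def lookup_domain(host, two_level_tlds=TWO_LEVEL_TLDS):
    """Return the domain a URI blocklist query would be built from."""
    labels = host.lower().rstrip(".").split(".")
    # If the last two labels form a listed 2TLD, keep one extra label
    # so the individual subdomain is looked up, not the shared parent.
    if len(labels) >= 3 and ".".join(labels[-2:]) in two_level_tlds:
        return ".".join(labels[-3:])
    return ".".join(labels[-2:])


if __name__ == "__main__":
    print(lookup_domain("www.example.co.uk"))
    print(lookup_domain("someuser.livejournal.com"))
    # Without the 2TLD entry, only the parent domain would be queried.
    print(lookup_domain("someuser.livejournal.com", two_level_tlds=set()))
```

With the entry present, each user's subdomain is looked up on its own; without
it, every livejournal.com URL collapses to the parent domain.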

>
>> the uridnsbl_skip_domain rule applies to parent domain, not to subdomains.
>
> I wondered about that, but the standard rules don't include *any*
> subdomains, and... these are URLs, they are generally subdomains.

maybe there wasn't any need to include rules for subdomains coz some of
these may be WLd @ URI lists and forcing a skip on others is generally a
source of huge discussions (Hi Warren!)

>> You are trusting a third party DNS (as your forwarder) which *could*
>> be manipulating your queries.
>
> Yes, it's possible. As I said, I'd be happy to ask them (linode) if
> something like that happened if I could get confirmation on what exactly
> the query response had to be (*.*.*.4, or exactly 127.0.0.4?).
>
>> If you have a local resolver, why do the extra query hop?
>> or am I missing something?
>
> Seemed like a good idea to reduce load on the root servers. Do you
> disagree?

whatever... I prefer to have my resolver and cache under control and not
be caught by latency or third party quirks... at least, not with my load.
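
On the quoted question of whether the response has to be *.*.*.4 or exactly
127.0.0.4, the two readings can be spelled out in a short Python sketch. It
only contrasts the interpretations and does not assert which one the rule
uses; the function names and the mask value 4 are assumptions taken from the
question as asked.

```python
import ipaddress


def matches_exact(answer, expected="127.0.0.4"):
    """True only if the whole A record equals the expected address."""
    return ipaddress.IPv4Address(answer) == ipaddress.IPv4Address(expected)


def matches_bitmask(answer, mask=4):
    """True if the given bit is set in the last octet, whatever else is set."""
    last_octet = int(ipaddress.IPv4Address(answer)) & 0xFF
    return (last_octet & mask) != 0


if __name__ == "__main__":
    for answer in ("127.0.0.4", "127.0.0.6", "127.0.0.2"):
        print(answer, matches_exact(answer), matches_bitmask(answer))
```

An answer such as 127.0.0.6 passes the bitmask test but fails the exact test,
which is why the distinction matters when asking a forwarder's operator what
was returned.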