spamassassin-dev September 2011 archive

bernie-it_batt ham 61% DKIM_ADSP_ALL and other fun in the corpora

From: <darxus_at_nospam>
Date: Wed Sep 28 2011 - 20:29:35 GMT
To: dev@spamassassin.apache.org

I wrote a script to read in scores of all the rules from
/var/lib/spamassassin/3.003002/updates_spamassassin_org/*.cf, then read
in the corpora from the last mass-check. It adds up the score of each of
the emails, and outputs the hits for emails that scored on the wrong side
of a threshold of 5.
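
For anyone who wants to try something similar, here is a minimal sketch of that kind of script. It is not the script I actually ran. It assumes "score" lines of the form "score NAME value" or "score NAME v0 v1 v2 v3", and mass-check log lines whose fourth whitespace-separated field is the comma-separated list of rules hit. The function names, the choice of score set 3, and counting unscored rules as 0 are all assumptions made for illustration.

```python
def parse_scores(cf_lines, score_set=3):
    """Collect rule scores from 'score NAME v [v v v]' lines.

    score_set picks one of the four columns when four are given;
    using index 3 here is an assumption, not a recommendation.
    """
    scores = {}
    for line in cf_lines:
        fields = line.split("#", 1)[0].split()  # drop trailing comments
        if len(fields) < 3 or fields[0] != "score":
            continue
        try:
            values = [float(v) for v in fields[2:]]
        except ValueError:
            continue  # skip anything that is not a plain number
        scores[fields[1]] = values[score_set] if len(values) == 4 else values[0]
    return scores


def misclassified(log_lines, scores, is_ham, threshold=5.0):
    """Return (total, rules) for messages on the wrong side of threshold.

    Assumes the rules are in the fourth field of each log line.
    Rules with no known score count as 0.0 (a simplification).
    """
    results = []
    for line in log_lines:
        if line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue
        rules = sorted(r for r in fields[3].split(",") if r)
        total = sum(scores.get(r, 0.0) for r in rules)
        wrong = total >= threshold if is_ham else total < threshold
        if wrong:
            results.append((total, rules))
    results.sort(key=lambda item: item[0], reverse=True)
    return results


def format_hits(results):
    """One line per message: the score, then the list of rules."""
    return [f"{round(total, 3)} {' '.join(rules)}" for total, rules in results]
```

To run it for real, feed parse_scores() the lines of the *.cf files and misclassified() the lines of a ham or spam log, with is_ham set accordingly.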

I noticed DKIM_ADSP_ALL, meaning "No valid author signature, domain signs
all mail", shows up a lot:
http://ruleqa.spamassassin.org/20110924-r1175130-n/DKIM_ADSP_ALL/detail
Bernie's ham hits it 61% of the time, so I'm betting he has something
screwed up there. I'm the second worst offender on that rule, at 11%. It
looks like *all* of my hits were on mail that originated on my own server.
I'm not entirely sure what's going on, but I changed my ADSP setting from
"all" to "unknown", so my mail won't trigger this rule anymore. I guess I
should scrape all that gunk out of my corpora :(
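
For reference, ADSP (RFC 5617) publishes the policy as a TXT record at _adsp._domainkey.<domain>, with the value dkim=unknown, dkim=all or dkim=discardable. The helper below is only an illustrative sketch that prints such a record in zone-file style. The function name and "example.com" are placeholders, not my actual setup.

```python
# The three signing practices defined by RFC 5617.
VALID_ADSP_POLICIES = {"unknown", "all", "discardable"}


def adsp_record(domain, policy):
    """Build a zone-file style ADSP TXT record (hypothetical helper)."""
    if policy not in VALID_ADSP_POLICIES:
        raise ValueError(f"unsupported ADSP policy: {policy!r}")
    return f'_adsp._domainkey.{domain}. IN TXT "dkim={policy}"'


if __name__ == "__main__":
    # "example.com" is a placeholder domain.
    print(adsp_record("example.com", "unknown"))
```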

zmi has a hit on GTUBE in his ham.

I bet there are a number of other things in here that aren't right. The
directions for corpus cleaning might be useful for finding them:
http://wiki.apache.org/spamassassin/CorpusCleaning

These were all the hams that scored 5 or more and didn't include
DKIM_ADSP_ALL (sorted by score; each line is the score, then the list of
rules hit):

1000.794 DKIM_ADSP_CUSTOM_MED GTUBE RDNS_NONE
20.591 FRT_APPROV NORMAL_HTTP_TO_IP NUMERIC_HTTP_ADDR RCVD_IN_DNSWL_MED SPF_PASS SPOOF_COM2COM SPOOF_COM2OTH URIBL_AB_SURBL URIBL_BLACK URIBL_DBL_SPAM URIBL_JP_SURBL URIBL_WS_SURBL URI_HEX URI_OBFU_WWW
18.37 FORGED_MUA_OUTLOOK FROM_MISSPACED FROM_MISSP_DKIM FROM_MISSP_EH_MATCH FROM_MISSP_MSFT FROM_MISSP_REPLYTO FROM_MISSP_TO_UNDISC FSL_UA FSL_XM_419 KHOP_DYNAMIC OBFU_ATTACH_MISSP RCVD_IN_BRBL_LASTEXT RCVD_IN_PSBL TVD_RCVD_SPACE_BRACKET UNPARSEABLE_RELAY
15.558 FRT_APPROV NORMAL_HTTP_TO_IP NUMERIC_HTTP_ADDR RCVD_IN_DNSWL_MED SPF_PASS SPOOF_COM2OTH URIBL_BLACK URIBL_DBL_SPAM URIBL_JP_SURBL URIBL_RHS_DOB URIBL_WS_SURBL URI_HEX URI_OBFU_WWW
11.65 DRUGS_ERECTILE DRUGS_ERECTILE_OBFU NORMAL_HTTP_TO_IP RCVD_IN_DNSWL_MED RP_MATCHES_RCVD URIBL_AB_SURBL URIBL_BLACK URIBL_DBL_SPAM URIBL_JP_SURBL URIBL_SC_SURBL URIBL_WS_SURBL
11.58 FRT_APPROV HK_LOTTO IP_LINK_PLUS NORMAL_HTTP_TO_IP RCVD_IN_DNSWL_MED SPF_PASS SPOOF_COM2OTH URIBL_BLACK URIBL_DBL_SPAM URI_HEX URI_NOVOWEL
11.506 DRUGS_ERECTILE DRUGS_ERECTILE_OBFU NORMAL_HTTP_TO_IP RCVD_IN_DNSWL_MED RP_MATCHES_RCVD URIBL_AB_SURBL URIBL_BLACK URIBL_JP_SURBL URIBL_PH_SURBL URIBL_RHS_DOB URIBL_WS_SURBL
11.292 NORMAL_HTTP_TO_IP RCVD_IN_DNSWL_MED RP_MATCHES_RCVD URIBL_AB_SURBL URIBL_BLACK URIBL_DBL_SPAM URIBL_JP_SURBL URIBL_SBL URIBL_SC_SURBL URIBL_WS_SURBL URI_HEX
11.227 HTTP_ESCAPED_HOST NORMAL_HTTP_TO_IP RCVD_IN_DNSWL_MED RP_MATCHES_RCVD URIBL_AB_SURBL URIBL_BLACK URIBL_DBL_SPAM URIBL_JP_SURBL URIBL_SBL URIBL_WS_SURBL URI_NOVOWEL
11.081 DRUGS_ERECTILE DRUGS_ERECTILE_OBFU RCVD_IN_DNSWL_MED RP_MATCHES_RCVD URIBL_AB_SURBL URIBL_BLACK URIBL_DBL_SPAM URIBL_JP_SURBL URIBL_WS_SURBL
10.504 DRUGS_ERECTILE DRUGS_ERECTILE_OBFU NORMAL_HTTP_TO_IP RCVD_IN_DNSWL_MED RP_MATCHES_RCVD URIBL_AB_SURBL URIBL_BLACK URIBL_JP_SURBL URIBL_WS_SURBL URI_HEX
10.334 DATE_IN_PAST_06_12 HTML_FONT_SIZE_HUGE HTML_IMAGE_RATIO_06 HTML_MESSAGE LOTS_OF_MONEY MARKETING_PARTNERS MILLION_USD MIME_HTML_ONLY NA_DOLLARS RDNS_NONE URIBL_GREY
10.172 HTTP_ESCAPED_HOST NORMAL_HTTP_TO_IP RCVD_IN_DNSWL_MED RP_MATCHES_RCVD URIBL_AB_SURBL URIBL_BLACK URIBL_DBL_SPAM URIBL_JP_SURBL URIBL_SC_SURBL URIBL_WS_SURBL URI_NOVOWEL
9.95 DRUGS_ERECTILE DRUGS_ERECTILE_OBFU NORMAL_HTTP_TO_IP RCVD_IN_DNSWL_MED RP_MATCHES_RCVD URIBL_AB_SURBL URIBL_BLACK URIBL_JP_SURBL URIBL_SC_SURBL URIBL_WS_SURBL
9.944 FRT_APPROV NORMAL_HTTP_TO_IP RCVD_IN_DNSWL_MED SPF_PASS SPOOF_COM2OTH URIBL_BLACK URIBL_DBL_SPAM URI_HEX URI_OBFU_WWW
8.406 ADVANCE_FEE_2_NEW_MONEY ADVANCE_FEE_3_NEW ADVANCE_FEE_3_NEW_MONEY ADVANCE_FEE_4_NEW ADVANCE_FEE_4_NEW_MONEY ADVANCE_FEE_5_NEW ADVANCE_FEE_5_NEW_MONEY FORM_FRAUD_3 FORM_FRAUD_5 LOTS_OF_MONEY MILLION_USD MONEY_FRAUD_3 MONEY_FRAUD_5 MONEY_FRAUD_8 RISK_FREE US_DOLLARS_3
8.065 DRUGS_MUSCLE NORMAL_HTTP_TO_IP RCVD_IN_DNSWL_MED RP_MATCHES_RCVD URIBL_AB_SURBL URIBL_BLACK URIBL_JP_SURBL URIBL_WS_SURBL URI_HEX URI_NOVOWEL
6.622 DNS_FROM_RFC_DSN HTML_MESSAGE MIME_HEADER_CTYPE_ONLY MIME_HTML_ONLY MSGID_FROM_MTA_HEADER RAZOR2_CF_RANGE_51_100 RAZOR2_CF_RANGE_E4_51_100 RAZOR2_CHECK RDNS_NONE SINGLE_HEADER_1K SPF_FAIL URIBL_BLACK

And the ones that did include DKIM_ADSP_ALL:

13.351 DKIM_ADSP_ALL DOS_RCVD_IP_TWICE_B FH_HELO_EQ_D_D_D_D FORGED_RELAY_MUA_TO_MX HELO_DYNAMIC_IPADDR2 RCVD_IN_BRBL_LASTEXT RCVD_IN_PBL RDNS_DYNAMIC
11.902 DKIM_ADSP_ALL DOS_RCVD_IP_TWICE_B FH_HELO_EQ_D_D_D_D FORGED_RELAY_MUA_TO_MX HELO_DYNAMIC_IPADDR2 RCVD_IN_PBL RDNS_DYNAMIC
11.902 DKIM_ADSP_ALL DOS_RCVD_IP_TWICE_B FH_HELO_EQ_D_D_D_D FORGED_RELAY_MUA_TO_MX HELO_DYNAMIC_IPADDR2 RCVD_IN_PBL RDNS_DYNAMIC
11.902 DKIM_ADSP_ALL DOS_RCVD_IP_TWICE_B FH_HELO_EQ_D_D_D_D FORGED_RELAY_MUA_TO_MX HELO_DYNAMIC_IPADDR2 RCVD_IN_PBL RDNS_DYNAMIC
11.902 DKIM_ADSP_ALL DOS_RCVD_IP_TWICE_B FH_HELO_EQ_D_D_D_D FORGED_RELAY_MUA_TO_MX HELO_DYNAMIC_IPADDR2 RCVD_IN_PBL RDNS_DYNAMIC
5.026 DKIM_ADSP_ALL DOS_RCVD_IP_TWICE_B DOS_RCVD_IP_TWICE_C FORGED_RELAY_MUA_TO_MX LOTS_OF_MONEY RCVD_IN_PBL RDNS_NONE
5.025 DKIM_ADSP_ALL DOS_RCVD_IP_TWICE_B DOS_RCVD_IP_TWICE_C FORGED_RELAY_MUA_TO_MX RCVD_IN_PBL RDNS_NONE
5.025 DKIM_ADSP_ALL DOS_RCVD_IP_TWICE_B DOS_RCVD_IP_TWICE_C FORGED_RELAY_MUA_TO_MX RCVD_IN_PBL RDNS_NONE
5.025 DKIM_ADSP_ALL DOS_RCVD_IP_TWICE_B DOS_RCVD_IP_TWICE_C FORGED_RELAY_MUA_TO_MX RCVD_IN_PBL RDNS_NONE
5.025 DKIM_ADSP_ALL DOS_RCVD_IP_TWICE_B DOS_RCVD_IP_TWICE_C FORGED_RELAY_MUA_TO_MX RCVD_IN_PBL RDNS_NONE
5.025 DKIM_ADSP_ALL DOS_RCVD_IP_TWICE_B DOS_RCVD_IP_TWICE_C FORGED_RELAY_MUA_TO_MX RCVD_IN_PBL RDNS_NONE
5.025 DKIM_ADSP_ALL DOS_RCVD_IP_TWICE_B DOS_RCVD_IP_TWICE_C FORGED_RELAY_MUA_TO_MX RCVD_IN_PBL RDNS_NONE
5.025 DKIM_ADSP_ALL DOS_RCVD_IP_TWICE_B DOS_RCVD_IP_TWICE_C FORGED_RELAY_MUA_TO_MX RCVD_IN_PBL RDNS_NONE
5.025 DKIM_ADSP_ALL DOS_RCVD_IP_TWICE_B DOS_RCVD_IP_TWICE_C FORGED_RELAY_MUA_TO_MX RCVD_IN_PBL RDNS_NONE
5.025 DKIM_ADSP_ALL DOS_RCVD_IP_TWICE_B DOS_RCVD_IP_TWICE_C FORGED_RELAY_MUA_TO_MX RCVD_IN_PBL RDNS_NONE
5.025 DKIM_ADSP_ALL DOS_RCVD_IP_TWICE_B DOS_RCVD_IP_TWICE_C FORGED_RELAY_MUA_TO_MX RCVD_IN_PBL RDNS_NONE

-- 
"It is the first responsibility of every citizen to question authority."
- Benjamin Franklin
http://www.ChaosReigns.com