spamassassin-users October 2010 archive

Re: rule to catch subject spamming

From: Lawrence _at_nospam <_at_nospam>
Date: Sun Oct 24 2010 - 18:35:36 GMT
To: users@spamassassin.apache.org

On 23/10/2010 5:47 PM, RW wrote:
> On Sat, 23 Oct 2010 14:28:38 -0230
> "Lawrence @ Rogers"<lawrencewilliams@nl.rogers.com> wrote:
>
>> Hello all,
>>
>> I noticed recently that our users are getting spam with the subject
>> similar to the following:
>>
>> SehxpyNaturalRedheaddFayeReaganHasHerFirstLesbianExperienceWithBrunet
>>
> I got some of these a while ago. They were pretty hard to catch because
> they came through Hotmail and had little to work with in the body.
> I added:
>
>
> header SUBJ_LONG_WORD Subject =~ /\b[^[:space:][:punct:]]{30}/
> describe SUBJ_LONG_WORD Longwordinsubjectlikethis
> score SUBJ_LONG_WORD 2.0
>
> header SUBJ_JOIN_CAP_WORD Subject =~ /([[:upper:]]+[[:lower:]]+){5}/
> describe SUBJ_JOIN_CAP_WORD JoinedCapitalizedWordsRuntogether
> score SUBJ_JOIN_CAP_WORD 1.5
>
>
> They are missing some "?:", but for single header rules I don't really
> care.
>
Thanks, but some testing showed that your rules FP on URLs in the
Subject line.
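
For anyone who wants to reproduce this, here is a rough Python sketch of the two quoted rules, not the SpamAssassin engine itself. Python's re module has no POSIX classes, so [[:upper:]], [[:lower:]], [[:space:]] and [[:punct:]] are approximated with ASCII equivalents, and the subjects and URLs are made-up examples rather than real mail.

```python
import re
import string

# ASCII stand-ins for the POSIX classes in the quoted rules (an approximation).
PUNCT = re.escape(string.punctuation)
SUBJ_LONG_WORD = re.compile(r"\b[^\s" + PUNCT + r"]{30}")
SUBJ_JOIN_CAP_WORD = re.compile(r"(?:[A-Z]+[a-z]+){5}")

def quoted_rule_hits(subject):
    """Return the names of the quoted rules that match the subject."""
    hits = []
    if SUBJ_LONG_WORD.search(subject):
        hits.append("SUBJ_LONG_WORD")
    if SUBJ_JOIN_CAP_WORD.search(subject):
        hits.append("SUBJ_JOIN_CAP_WORD")
    return hits

# Hypothetical ham subjects that merely contain a URL.
CAMEL_URL = "New reply: http://example.com/ViewTopicByUserName"
TOKEN_URL = "Reset link: http://example.com/abcdefghijklmnopqrstuvwxyz0123456789"
PLAIN = "Lunch on Friday?"

for subject in (CAMEL_URL, TOKEN_URL, PLAIN):
    print(subject, "->", quoted_rule_hits(subject))
```

A CamelCase path is enough to trip SUBJ_JOIN_CAP_WORD, and a long alphanumeric token in the path is enough to trip SUBJ_LONG_WORD, while an ordinary subject hits neither.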

I have settled on the following, as it's more specific and less prone to
FPs (I can't think of any right now):

# Matches a new technique used by spammers in the Subject line:
# running a bunch of pornographic words together (with no spaces) to
# evade spam filters.
# The message itself is generally malformed HTML with one or more
# unusually long lines.
# This is a meta rule. Its sub-rule tests for a Subject made up only of
# letters, numbers, and the characters , . +
# The string must be at least 42 characters long and contain no spaces.
# The meta must hit at least 3 SA rules (__LOCAL_SUBJECT_SPAMMY and 2
# others, usually HTML_MESSAGE and MIME_QP_LONG_LINE).

header __LOCAL_SUBJECT_SPAMMY Subject =~ /^[0-9a-zA-Z,.+]{42,}$/
meta LOCAL_SUBJECT_SPAMMY1 ((__LOCAL_SUBJECT_SPAMMY + HTML_MESSAGE + MIME_QP_LONG_LINE + MPART_ALT_DIFF + TRACKER_ID) > 2)
describe LOCAL_SUBJECT_SPAMMY1 Subject looks spammy (contains a lot of characters, and no spaces)
score LOCAL_SUBJECT_SPAMMY1 5.0
tflags LOCAL_SUBJECT_SPAMMY1 noautolearn
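
As a sanity check, the Subject test and the meta arithmetic can be modelled in a few lines of Python. This is only a sketch: it assumes the character class behaves the same in Python's re module as in Perl (it should for this ASCII-only pattern), the sample subjects are made up, and the set of rule names stands in for whatever SpamAssassin actually hit on a message.

```python
import re

# Same pattern as __LOCAL_SUBJECT_SPAMMY.
SUBJECT_SPAMMY = re.compile(r"^[0-9a-zA-Z,.+]{42,}$")

# Rules summed by the meta expression.
META_TERMS = (
    "__LOCAL_SUBJECT_SPAMMY",
    "HTML_MESSAGE",
    "MIME_QP_LONG_LINE",
    "MPART_ALT_DIFF",
    "TRACKER_ID",
)

def subject_is_spammy(subject):
    """True if the subject is one unbroken run of 42+ allowed characters."""
    return SUBJECT_SPAMMY.search(subject) is not None

def meta_fires(rules_hit):
    """Model of the meta, assuming each hit rule counts as 1: the sum must exceed 2."""
    return sum(1 for name in META_TERMS if name in rules_hit) > 2

# Hypothetical subjects, not taken from real mail.
RUN_TOGETHER = "CheapWatchesGreatPricesOrderTodayFreeShippingWorldwide"
WITH_URL = "See http://example.com/ViewTopicByUserNameAndDateAndForum"

print(subject_is_spammy(RUN_TOGETHER), subject_is_spammy(WITH_URL))
print(meta_fires({"__LOCAL_SUBJECT_SPAMMY", "HTML_MESSAGE", "MIME_QP_LONG_LINE"}))
```

The run-together subject passes the Subject test, while the one with a URL does not, because it contains a space as well as ":" and "/". One thing the model makes visible: since the meta is a plain sum, three of the other four rules would also push it over 2 without the Subject test hitting, so requiring __LOCAL_SUBJECT_SPAMMY explicitly may be worth considering.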

Cheers,
Lawrence Williams
LCWSoft