spamassassin-users October 2010 archive

rule to catch subject spamming

From: Lawrence _at_nospam <_at_nospam>
Date: Sat Oct 23 2010 - 16:58:38 GMT
To: users@spamassassin.apache.org

Hello all,

I noticed recently that our users are getting spam with subjects
similar to the following:

SehxpyNaturalRedheaddFayeReaganHasHerFirstLesbianExperienceWithBrunet

SpamAssassin seems to be having a hard time determining whether it is
spam or not because it appears as one long word.

In all cases, the subject contains no spaces (to prevent detection I
would think) and is longer than 62 characters (not sure why they do
this, but it is true in every sample I've seen so far).

I would like to create a rule to pick up on this, but I'm having a bit of
difficulty with the regex for the rule. This is what I've come up with so far:

header CR_SUBJECT_SPAMMY Subject =~ /.{62}/
describe CR_SUBJECT_SPAMMY Subject looks spammy (contains a lot of characters, and no spaces)
score CR_SUBJECT_SPAMMY 2.5

I just need to modify the regex to check that the Subject contains no
spaces.
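
For illustration, here is a minimal sketch of what such a rule might look like. It assumes the whole Subject must be a single run of non-whitespace characters and that "longer than 62" means 63 or more. The rule name CR_SUBJECT_NOSPACE is hypothetical, and the score is simply carried over from the rule above; neither has been tested against real mail.

```
# Hypothetical variant: fires only when the entire Subject is one
# unbroken run of 63 or more non-whitespace characters.
# ^ and $ anchor the match to the whole header value, and \S excludes
# spaces and tabs, so any subject containing whitespace will not match.
header   CR_SUBJECT_NOSPACE Subject =~ /^\S{63,}$/
describe CR_SUBJECT_NOSPACE Subject is one long run of characters with no spaces
score    CR_SUBJECT_NOSPACE 2.5
```

The anchors are what make the difference: /.{62}/ on its own matches any subject of 62 or more characters, spaces included. A subject that starts with a prefix such as "Re: " contains a space and would not be caught by this sketch.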

I've done some research, and the longest non-coined word in a major
dictionary is 30 characters long, meaning that if it were used twice in a
subject, the total length would still only be 60 characters. There may
be some FPs if the sender used punctuation such as commas, but the
chance of someone using that word twice, with punctuation and no
spaces, is probably extremely remote.

Any assistance or advice would be greatly appreciated.

Regards,

Lawrence Williams
LCWSoft