spamassassin-users October 2010 archive

Re: rule to catch subject spamming

From: Karsten Bräckelmann <guenther_at_nospam>
Date: Sat Oct 23 2010 - 19:36:54 GMT

On Sat, 2010-10-23 at 15:36 -0230, Lawrence @ Rogers wrote:
> On 23/10/2010 2:28 PM, Lawrence @ Rogers wrote:
> > In all cases, the subject contains no spaces (to prevent detection I
> > would think) and is longer than 62 characters (not sure why they do
> > this, but it is true in every sample I've seen so far).
> >
> > I would like to create a rule to pick up on this, but having a bit of
> > difficulty with the regex for the rule. This is what I've come up with
> > so far
> >
> > header CR_SUBJECT_SPAMMY Subject =~ /.{62}/

> > I just need to modify the regex to check that the Subject contains no
> > spaces.

There are a number of possible solutions to match something like this.
Ultimately, it depends on how strict you want it to be.

The dot matches any char, whitespace included, so it doesn't do it. One
variant would be, as requested, to match a very long string with no
whitespace chars -- the uppercase \S inverts the \s whitespace.
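
As a sketch of that variant, reusing the rule name and the length of 62
from the quoted post (both are assumptions carried over from the
question, not tuned values):

```
# Sketch only: rule name and length taken from the quoted post.
# Hits when the Subject contains a run of 62 non-whitespace chars.
header CR_SUBJECT_SPAMMY Subject =~ /\S{62}/
```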


This *does* allow whitespace and merely looks for a long (sub-)string
with no whitespace. To match only when the entire Subject does not
contain any whitespace, we would need to anchor the match at the
beginning and end, and check for "60 or more".
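
A sketch of the anchored form, again reusing the rule name from the
quoted post and taking the threshold of 60 as an assumed value:

```
# Sketch only: rule name from the quoted post, threshold assumed.
# Hits only when the whole Subject is 60+ non-whitespace chars.
header CR_SUBJECT_SPAMMY Subject =~ /^\S{60,}$/
```

The ^ and $ anchors tie the match to the start and end of the Subject,
so a single space anywhere in it prevents a hit.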


If, however, you want to strictly match such NoWhiteSpaceMultiWords and,
unlike the above, not match when there is punctuation, you could use a
char class (case insensitive with /i).
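
One way this could look, as a sketch: the letters-only class and the
threshold of 60 are assumptions chosen for illustration, and the rule
name is reused from the quoted post.

```
# Sketch only: class contents and threshold are assumed values.
# Hits only when the whole Subject is 60+ letters, upper or lower case.
header CR_SUBJECT_SPAMMY Subject =~ /^[a-z]{60,}$/i
```

Widening the class, for example to [a-z0-9], would also let digits
through without allowing punctuation or whitespace.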


> This is the rule I've come up with now

> header CR_SUBJECT_SPAMMY Subject =~ /^[0-9a-zA-Z,.+]{42,}$/

That would work, too, similar to the second one above. :)

-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}