spamassassin-dev April 2011 archive

Lack of sorting in sought Re: svn commit: r1088313 - /spamassassin/trunk/rulesrc/sandbox/jm/20_sought.cf

From: <darxus_at_nospam>
Date: Sun Apr 03 2011 - 16:27:31 GMT
To: SpamAssassin Dev <dev@spamassassin.apache.org>

Over half of the changed lines in this commit are due entirely to a lack of sorting.

Number of body rule changes:

$ cat sought.txt | grep "^.body __" | wc -l
455

Number of body rule changes that weren't just removing and re-adding
the same rule:

$ cat sought.txt | grep "^.body __" | awk '{print $2}' | sort | uniq -c | sort -nr | grep -c '1 __'
199
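
The two counts above can also be produced in one pass. This is only a
sketch: count_rule_changes is a hypothetical helper, and it assumes the
diff is saved in a file whose changed lines start with "-body __" or
"+body __". It matches only +/- lines and compares counts exactly, so its
numbers may differ slightly from the one-liners above.

```shell
#!/bin/sh
# Hypothetical helper (not part of SpamAssassin): summarise body-rule
# changes in a saved diff.
#   total = number of added/removed "body __" lines
#   once  = rule names that appear exactly once, i.e. really added or
#           really removed rather than removed and re-added
count_rule_changes() {
    # $1: path to the diff file (assumed to be a unified diff)
    grep -E '^[-+]body __' "$1" | awk '
        { seen[$2]++ }                      # $2 is the rule name
        END {
            for (name in seen) {
                total += seen[name]
                if (seen[name] == 1) once++
            }
            printf "total=%d once=%d\n", total, once
        }'
}
```

Usage would be something like: count_rule_changes sought.txt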

On 04/03, jm@apache.org wrote:
> score JM_SOUGHT_1 4.0
> describe JM_SOUGHT_1 Body contains frequently-spammed text patterns

> -body __SEEK_2GEMSF /United Parcel Service notification /
> +body __SEEK_2GEMSF /United Parcel Service notification /

> -body __SEEK_2NAEPI / them pass\?hover\?dancetheirlanguage\? /
> +body __SEEK_2NAEPI / them pass\?hover\?dancetheirlanguage\? /

> -body __SEEK_3VWDKG /Limited time offer \x{96} 555USD Bonus/
> +body __SEEK_3VWDKG /Limited time offer \x{96} 555USD Bonus/

> -body __SEEK_6ZO_TB /Thank you for attention\. Post Express/
> +body __SEEK_6ZO_TB /Thank you for attention\. Post Express/

> -body __SEEK_CWKVHY / the sameinstant:Why,thosesigns\!Yes,thehentracks\! /
> +body __SEEK_CWKVHY / the sameinstant:Why,thosesigns\!Yes,thehentracks\! /

Etc.
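
To see what the diff would look like if the rules were written out in a
stable order, both versions can be sorted before comparing. A minimal
sketch, assuming the old and new versions of the file are available
locally; diff_sorted is a hypothetical helper name, and sorting whole
files like this is only meant for inspecting changes, not for producing
a usable .cf file.

```shell
#!/bin/sh
# Hypothetical helper: diff two files after sorting each one, so that
# lines which only moved do not show up as remove/re-add pairs.
diff_sorted() {
    # $1: old file, $2: new file
    old_sorted=$(mktemp)
    new_sorted=$(mktemp)
    # LC_ALL=C gives a byte-wise order that does not depend on locale.
    LC_ALL=C sort "$1" > "$old_sorted"
    LC_ALL=C sort "$2" > "$new_sorted"
    diff "$old_sorted" "$new_sorted"
    status=$?
    rm -f "$old_sorted" "$new_sorted"
    return "$status"    # 0 = no real changes, 1 = differences found
}
```

With this, a rule that merely changed position produces no output; only
rules that were really added or removed remain.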

-- 
"Of course there's strength in numbers. But there's strength in sharp
weaponry too. Ironically, this led to what we call 'civilization'." - spore
http://www.ChaosReigns.com