spamassassin-users March 2010 archive

Re: A possibly suspect idea

From: Martin Gregorie <martin_at_nospam>
Date: Fri Mar 12 2010 - 13:52:01 GMT
To: users@spamassassin.apache.org

On Fri, 2010-03-12 at 08:15 +0200, Henrik K wrote:

> Why don't you simply maintain your wordlists in some files and use a script
> to generate portmanteau.cf? You could use Regexp::Assemble module to
> optimize also. Who cares what the actual rules look like? The more words
> (simple alternations) there are in a single RE, the better it performs. If
> you want clarity in the cf, keep the original words listed in a comment
> block.
>
....because that didn't occur to me.

It's a good idea. Better yet, my rule development & test environment can
be easily extended to incorporate it. Thanks.
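
For illustration, a minimal sketch of the generate-from-wordlist approach
might look like the following. It is written in Python rather than Perl and
does not attempt the kind of optimisation Regexp::Assemble does. The function
name build_rule, the rule name PORTMANTEAU_WORDS and the score are
assumptions made up for this example; only the body/describe/score
directives are standard SpamAssassin rule syntax.

```python
import re


def build_rule(words, rule_name="PORTMANTEAU_WORDS", score=1.0):
    """Return .cf text for one body rule that matches any listed word.

    Hypothetical helper: the rule name and score are placeholders.
    """
    # De-duplicate, lower-case and sort so the generated file is reproducible.
    unique = sorted({w.strip().lower() for w in words if w.strip()})
    if not unique:
        raise ValueError("word list is empty")
    # Escape regex metacharacters. Words containing '/' would need extra
    # handling because '/' delimits the rule's regex; not covered here.
    alternation = "|".join(re.escape(w) for w in unique)
    lines = [
        # Keep the original words in a comment for clarity in the .cf file.
        "# words: " + ", ".join(unique),
        f"body {rule_name} /\\b(?:{alternation})\\b/i",
        f"describe {rule_name} Matches a word from the generated list",
        f"score {rule_name} {score}",
    ]
    return "\n".join(lines) + "\n"
```

Calling build_rule() on the lines of a wordlist file and writing the result
to portmanteau.cf would give one rule per list, with the source words kept
in the leading comment.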

Your comment about a single regex containing many alternations being
more efficient than several smaller ones raises two questions:

- what is the maximum line length for such a rule?
- does the order of alternations have any effect on performance or
  is alphabetic order good enough? It would certainly make rule
  generation simpler.
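
On the ordering question, one thing that can be checked cheaply is whether
reordering changes which messages hit at all. The sketch below compares an
alphabetically ordered alternation with the reverse order on some sample
strings. It uses Python's re module, not Perl's engine, and it says nothing
about speed; the helper names and the sample data are invented for this
example.

```python
import re


def compile_alternation(words):
    """Compile a case-insensitive, word-bounded alternation of literal words."""
    body = "|".join(re.escape(w) for w in words)
    return re.compile(r"\b(?:" + body + r")\b", re.IGNORECASE)


def same_verdicts(words, samples):
    """True if alphabetic and reverse-alphabetic orderings agree on every sample."""
    alpha = compile_alternation(sorted(words))
    reverse = compile_alternation(sorted(words, reverse=True))
    return all(
        bool(alpha.search(s)) == bool(reverse.search(s)) for s in samples
    )
```

If the two orderings agree on a representative corpus, alphabetic order is
at least safe for hit/no-hit purposes; any performance difference would
still have to be measured separately in Perl.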

 
Martin