spamassassin-users June 2010 archive

Re: A few questions

From: Jari Fredriksson <jarif_at_nospam>
Date: Thu Jun 10 2010 - 16:31:17 GMT
To: users@spamassassin.apache.org

On 10.6.2010 19:10, Adam Moffett wrote:
> These issues came up when I was trying to address performance problems.
> I hope they aren't major RTFM items.
>
> 1) I used sa-compile as suggested by the FAQ and the CPU load dropped
> *dramatically*. The question is do I have to run that every time I
> sa-update or will it happen automatically?

Yes, you have to run sa-compile every time after sa-update. It does not
happen automatically.
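
One way to chain the two steps, for example from a cron job, is sketched
below. This is only a sketch under assumptions: the function name
update_and_compile and the injectable run parameter are made up for
illustration, and it assumes sa-update and sa-compile are on the PATH and
that sa-update exits with 0 when new rules were installed and 1 when there
was nothing new.

```python
import subprocess


def update_and_compile(run=subprocess.call):
    """Run sa-update, then sa-compile only if new rules were installed.

    `run` is a hypothetical hook (defaults to subprocess.call) so the
    logic can be exercised without SpamAssassin installed.
    """
    status = run(["sa-update"])
    if status == 0:
        # Assumption: exit code 0 means updates were downloaded and installed,
        # so the compiled rules are stale and must be rebuilt.
        return run(["sa-compile"]) == 0
    # Assumption: exit code 1 means no fresh updates; anything else is an
    # error. Either way there is nothing to recompile.
    return False


# Usage (needs sa-update and sa-compile installed):
#   recompiled = update_and_compile()
```

spamd usually needs a restart afterwards to pick up the new rules; that
part is not shown here.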

>
> 2) I disabled the auto whitelist module, and got scan times down from
> 200+ secs to ~40 secs. The AWL db file was over 2.5Gig. The FAQ
> implies that I don't really need AWL; is this the general consensus? If
> I keep using it, is there an easy automatic way to prune the AWL db for
> old or seldom-used entries?
>

If you use the SQL back end, you can add a timestamp column to the awl
table and prune old entries based on it. I think this is described
somewhere in the SQL howto on the wiki, or someone will post it later...
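
Until someone posts the real recipe, here is a rough sketch of what the
pruning could look like once such a timestamp exists. Everything specific
here is an assumption: the column name lastupdate, the 180 day cutoff, and
the helper names prune_awl and demo_connection are made up for
illustration, and sqlite3 is used only so the example runs on its own (with
MySQL the placeholder syntax and the timestamp type would differ).

```python
import sqlite3
import time


def demo_connection():
    """Create an in-memory stand-in for the awl table (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE awl (username TEXT, email TEXT, ip TEXT, "
        "count INTEGER, totscore REAL, lastupdate REAL)"
    )
    return conn


def prune_awl(conn, max_age_days=180, now=None):
    """Delete AWL rows not updated within max_age_days; return rows removed."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # lastupdate is stored as epoch seconds here
    cur = conn.execute("DELETE FROM awl WHERE lastupdate < ?", (cutoff,))
    conn.commit()
    return cur.rowcount
```

Run from cron, something like this would keep the table from growing
without bound. A similar DELETE on the count column could drop seldom-used
entries.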

> 3) I disabled Bayes and now scan times are down to 1 or 2 secs. That's
> great, but I think bayes really helps so I'd rather keep it. The
> bayes_toks db is 162MB...that seems like a pretty big db to scan for
> every message. I know it does auto expire because I have a multitude of
> bayes_toks.expire files ranging from 40-80MB in size. Can I tune what
> gets expired to reduce the size of the db? Is there another solution?
> We are definitely I/O bound when bayes is enabled because we have long
> scan times but CPU usage stays in the 8-10% range.
>

If you have more than one spamd instance, a separate SQL database would
be good. I use MySQL, even though this is still basically a one-user
system.
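
For reference, moving Bayes to MySQL comes down to a few lines in local.cf.
The sketch below just renders those lines. The helper name
bayes_sql_config and the DSN, user name and password in the usage comment
are placeholders, not values from my setup, and the database itself still
has to be created from the schema shipped with SpamAssassin.

```python
def bayes_sql_config(dsn, username, password):
    """Return local.cf lines that point Bayes at a MySQL database."""
    return [
        "bayes_store_module Mail::SpamAssassin::BayesStore::MySQL",
        "bayes_sql_dsn " + dsn,
        "bayes_sql_username " + username,
        "bayes_sql_password " + password,
    ]


# Usage with placeholder values:
#   print("\n".join(bayes_sql_config(
#       "DBI:mysql:spamassassin:localhost", "sa_user", "secret")))
```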

-- 
http://www.iki.fi/jarif/

I use PGP. If there is an incompatibility problem with your mail client,
please contact me.

You own a dog, but you can only feed a cat.