spamassassin-users March 2011 archive

Re: One thing about bug 6558

From: David F. Skoll <dfs_at_nospam>
Date: Wed Mar 30 2011 - 14:59:39 GMT
To: users@spamassassin.apache.org

On Wed, 30 Mar 2011 16:51:57 +0200
Marcin Mirosław <marcin@mejor.pl> wrote:

> I'm using postgresql, but machine isn't quick... Any db is slowly
> there.

Using PostgreSQL for Bayes data will be really slow. We don't use the
SpamAssassin Bayes implementation; for our own, we went through three
iterations of storage back-ends before finding one we liked.

1) PostgreSQL: Convenient but slow.

2) Berkeley DB: Faster than PostgreSQL, but still slow and
occasionally flaky.

3) CDB: Very fast, but cannot be incrementally updated. You need to rebuild
the entire DB and then atomically rename it.
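
For anyone unfamiliar with the rebuild-and-rename step in (3), here is a
minimal sketch in Python. It is not our actual code: the function name
rebuild_atomically is made up for illustration, and json.dump is only a
stand-in for whatever CDB writer you use. The point is the pattern: write
the complete new file next to the old one, then rename it over the old one.

```python
import json
import os
import tempfile


def rebuild_atomically(path, records):
    """Write a complete new database file, then swap it into place.

    Hypothetical helper: json.dump is a placeholder for a real CDB writer.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the target directory so the final rename
    # stays on one filesystem (a rename across filesystems is not atomic).
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".rebuild-")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(records, tmp)  # placeholder: build the whole DB here
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before the swap
        # On POSIX this replaces the old file in one step; readers that
        # already have the old file open keep reading the old contents.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the partial temp file if anything failed before the swap.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Readers just open the path whenever they need it; they see either the old
file or the new one, never a half-written one.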

In our implementation, it's not a problem to have a read-only DB, so we went
with CDB. It's dramatically faster than Berkeley DB:

     http://www.dmo.ca/blog/benchmarking-hash-databases-on-large-data/

Regards,

David.