On 6/21/2011 7:23 AM, David F. Skoll wrote:
> On Tue, 21 Jun 2011 07:06:11 -0700
> Marc Perkel <firstname.lastname@example.org> wrote:
>> Trying to get MySQL Bayes working in a high-volume environment.
>> Dedicated MySQL server with SSD drives. Can someone send me a sample
>> my.cnf file and make other suggestions to keep it running without
>> database corruption and other MySQL "features"? Or - should I be
>> using some other DB?
> We've tried various ways of storing Bayes data (we have our own Bayes
> implementation, so this discussion may not correspond exactly with the
> SA implementation.) After trying Berkeley DB files and PostgreSQL---we
> would never use MySQL for any data we care about---we finally settled
> on Dan Bernstein's CDB format. It has by far the best performance.
> See: http://www.dmo.ca/blog/benchmarking-hash-databases-on-large-data/
> Take a look at the "Random Reads" timings. CDB is 6 times faster than
> Berkeley DB!
> CDB is read-only, which means when you want to do Bayes training, you
> have to rewrite the entire database. This is not an issue for our
> system because of how we do Bayes training, but it may be an issue
> with the standard sa-learn.
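The rewrite-on-train constraint David describes can be sketched roughly as follows. This is only an illustration of the general "rebuild the whole file, then swap it in" pattern, not David's implementation and not SpamAssassin's. It assumes a JSON file as a stand-in for a real CDB file, and the names rebuild_store, load_store and train are hypothetical.

```python
import json
import os
import tempfile


def rebuild_store(path, token_counts):
    """Write a complete new snapshot, then atomically swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(token_counts, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Readers see either the old file or the new one, never a partial write.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_store(path):
    """Read the current snapshot; readers never modify it."""
    with open(path) as f:
        return json.load(f)


def train(path, tokens, is_spam):
    """Training cannot update in place: load everything, update, rewrite everything."""
    try:
        counts = load_store(path)
    except FileNotFoundError:
        counts = {}
    for tok in tokens:
        spam, ham = counts.get(tok, [0, 0])
        counts[tok] = [spam + 1, ham] if is_spam else [spam, ham + 1]
    rebuild_store(path, counts)
    return counts
```

Every call to train rewrites the full file, so the cost grows with the size of the database rather than with the size of the update. That is acceptable for batch training but a poor fit for frequent, per-message updates.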
Thanks, David, but I need real-time updating, and the data is spread across
multiple servers. So I need PostgreSQL or MySQL.
-- Marc Perkel - Sales/Support email@example.com http://www.junkemailfilter.com Junk Email Filter dot com 415-992-3400