spamassassin-users April 2012 archive

Bayes for large heterogeneous user-base (was Re: use_bayes=0 completely disables report function)

From: David F. Skoll <dfs_at_nospam>
Date: Sat Apr 21 2012 - 21:25:51 GMT
To: users@spamassassin.apache.org

On Sat, 21 Apr 2012 14:40:42 -0500 (CDT)
Dave Funk <dbfunk@engineering.uiowa.edu> wrote:

> > Bayes does not make much sense in a multi-user, diverse community
> > such as my university department. It makes sense here (small company;
> > small user base)

> I'll have to disagree with that, as a person running a mail server
> for a university college (organizational unit bigger than a
> department) which has thousands of users. Bayes may not be as deadly
> accurate as it would be in a totally homogeneous environment, but it
> is still worthwhile.

+1. Our (commercial) solution includes a nightly-updated Bayes corpus
built from the tokens of a couple of million messages from more than
a million end users, and it's still extremely accurate at picking out
spam. As you wrote, ham is all over the place, but spam tends to look
the same.

Regards,

David.
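
To illustrate the point that ham is all over the place while spam tends
to look the same, here is a minimal sketch of a shared token-based
classifier. It is a toy naive Bayes with add-one smoothing, not
SpamAssassin's actual Bayes implementation, and the sample messages and
the names train() and spam_probability() are made up for the example.

```python
import math
from collections import Counter

def train(messages):
    """Count, per token, how many messages contain that token."""
    counts = Counter()
    for text in messages:
        counts.update(set(text.lower().split()))
    return counts

def spam_probability(text, spam_counts, ham_counts, n_spam, n_ham):
    """Toy naive Bayes score in log space with add-one smoothing.

    Only tokens present in the message contribute; this is a
    simplification, not SpamAssassin's combining method.
    """
    log_odds = math.log(n_spam / n_ham)
    for token in set(text.lower().split()):
        p_spam = (spam_counts[token] + 1) / (n_spam + 2)
        p_ham = (ham_counts[token] + 1) / (n_ham + 2)
        log_odds += math.log(p_spam / p_ham)
    return 1 / (1 + math.exp(-log_odds))

# Hypothetical shared corpus: spam looks alike, ham comes from
# unrelated users with unrelated vocabularies.
SPAM = [
    "cheap pills online order now",
    "order cheap pills now",
    "cheap watches order now online",
]
HAM = [
    "thesis draft attached for review",
    "lab meeting moved to friday",
    "invoice for the compiler license",
    "lunch on friday",
]

spam_counts = train(SPAM)
ham_counts = train(HAM)

def score(text):
    return spam_probability(text, spam_counts, ham_counts, len(SPAM), len(HAM))

if __name__ == "__main__":
    for msg in ("order cheap pills online", "draft of the lab report for friday"):
        print(f"{score(msg):.3f}  {msg}")
```

Even though the ham samples share almost no vocabulary with each
other, the spam tokens recur often enough that the spam-like message
scores high and the ham-like message scores low.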