Oops, further investigation indicates that Bayes is "on"; I thought the
default was "off" for my config. I would be inclined to turn it off, as I
have no decent way of teaching it beyond mass-config into the
On 10/17/10 10:37 PM, Jerry Pape wrote:
> Wow, I am grateful for the prompt answers, but I must say they have
> confused me.
> Bayes should not be on in my config, and a subsequent check of the GUI
> says it's not, though this may be wrong.
> Further, what are the "scoreset" indexes?
> I don't use Bayes because all of my clients are POP mail and they are
> neither smart nor committed enough to mail back ham/spam to educate the
> Additionally, when I used Bayes way back when (without manual
> population) and simply allowed auto-population to occur, I ended up
> with enormous .spamassassin sub-files that rapidly eclipsed 50% of
> the client's disk
> I am certain that I am missing critical configuration understanding
> and optimizations, but until you lot kindly educate me, it is what it
> is and my initial dilemma remains unresolved.
> On 10/17/10 7:01 PM, John Hardin wrote:
>> On Sun, 17 Oct 2010, Jerry Pape wrote:
>>> [Not sure if this is the right place to send this--please correct me
>>> if I am in error]
>> This is the place.
>>> Assessment of this header at
>>> http://www.futurequest.net/docs/SA/decode/ yields:
>>> Test                   Score  Description
>>> BAYES_40               0.000  Bayesian spam probability is 20 to 40%
>>> HTML_IMAGE_RATIO_02    0.550  HTML has a low ratio of text to image area
>>> HTML_MESSAGE           0.001  HTML included in message
>>> HTML_MIME_NO_HTML_TAG  1.052  HTML-only message, but there is no HTML tag
>>> MIME_HTML_ONLY         1.672  Message only has text/html MIME parts
>>> RDNS_NONE              0.100  Delivered to trusted network by a host with no rDNS
>>> URIBL_BLACK            1.961  Contains an URL listed in the URIBL blacklist
>>> Total:                 5.336
>>> Clearly 5.336 does not equal 3.8.
>> There are four score sets to choose from based on what options you
>> have enabled. The above is for scoreset 1, no BAYES + net tests.
>> Scoreset 3, BAYES + net tests, gives:
>> HTML_MIME_NO_HTML_TAG   0.097
>> MIME_HTML_ONLY_MULTI    0.001
>> HTML_IMAGE_RATIO_02     0.383
>> HTML_MESSAGE            0.001
>> MIME_HTML_ONLY          1.457
>> BAYES_40               -0.185
>> URIBL_BLACK             1.955
>> RDNS_NONE               0.100
>> These are all of the default scores, and match what you're seeing.
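To make the arithmetic concrete, here is a minimal Python sketch that adds up both lists quoted in this thread. The `scoreset_index` helper is a hypothetical name; it follows the numbering given in the SpamAssassin `score` documentation (0 = neither Bayes nor net tests, 1 = net tests only, 2 = Bayes only, 3 = both). The dictionaries simply copy the numbers posted above, and `Decimal` is used so the totals come out exact.

```python
from decimal import Decimal


def scoreset_index(bayes_enabled, net_enabled):
    """Hypothetical helper: score set index as numbered in the SpamAssassin docs."""
    return (2 if bayes_enabled else 0) + (1 if net_enabled else 0)


# Scores reported by the decoder page (Bayes off, net tests on).
DECODER_SCORES = {
    "BAYES_40": "0.000",
    "HTML_IMAGE_RATIO_02": "0.550",
    "HTML_MESSAGE": "0.001",
    "HTML_MIME_NO_HTML_TAG": "1.052",
    "MIME_HTML_ONLY": "1.672",
    "RDNS_NONE": "0.100",
    "URIBL_BLACK": "1.961",
}

# Scores listed in the reply (Bayes on, net tests on).
BAYES_NET_SCORES = {
    "HTML_MIME_NO_HTML_TAG": "0.097",
    "MIME_HTML_ONLY_MULTI": "0.001",
    "HTML_IMAGE_RATIO_02": "0.383",
    "HTML_MESSAGE": "0.001",
    "MIME_HTML_ONLY": "1.457",
    "BAYES_40": "-0.185",
    "URIBL_BLACK": "1.955",
    "RDNS_NONE": "0.1",
}


def total(scores):
    """Sum a {rule: score} mapping exactly."""
    return sum((Decimal(value) for value in scores.values()), Decimal("0"))


if __name__ == "__main__":
    print(scoreset_index(False, True), total(DECODER_SCORES))
    print(scoreset_index(True, True), total(BAYES_NET_SCORES))
```

The second total is what rounds to the 3.8 shown in the message header, which suggests the difference comes from which score set was in effect rather than from a scoring bug.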
>>> I have no idea how to regress and resolve this problem.
>> First off, you need to review your Bayes training. An obviously
>> spammy message shouldn't be hitting BAYES_40. Properly-trained Bayes,
>> hitting BAYES_99, would have scored 7.494 on that message.
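The 7.494 figure can be reproduced by swapping the BAYES_40 hit for a BAYES_99 hit in the Bayes + net total. This is only a sketch: the 3.5 used for BAYES_99 is an assumption about the default score in that release, so check the score files on your own installation; `swap_rule` is a hypothetical helper name.

```python
from decimal import Decimal

BAYES_NET_TOTAL = Decimal("3.809")  # sum of the Bayes + net scores listed above
BAYES_40 = Decimal("-0.185")        # score quoted in the thread
BAYES_99 = Decimal("3.5")           # ASSUMED default; verify against your score files


def swap_rule(total, old_score, new_score):
    """Return the message total after one rule hit is replaced by another."""
    return total - old_score + new_score


if __name__ == "__main__":
    print(swap_rule(BAYES_NET_TOTAL, BAYES_40, BAYES_99))
```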
>> For analysis in general...
>> This will put the individual rule scores into the headers:
>> add_header all Status "_YESNO_, score=_SCORE_ required=_REQD_ tests=_TESTSSCORES_ autolearn=_AUTOLEARN_ version=_VERSION_"
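Once that header is in place, the per-rule scores can be pulled back out of a message for comparison. The following is a rough Python sketch, assuming the header value has the shape produced by the template above, with `tests=` holding comma-separated `NAME=score` pairs followed by `autolearn=`. The sample string, including its version number, is made up for illustration from the scores in this thread, and `parse_tests` is a hypothetical helper name.

```python
import re

# Hypothetical header value, folded across lines the way mail headers often are.
SAMPLE_STATUS = (
    "No, score=3.8 required=5.0 tests=BAYES_40=-0.185,HTML_IMAGE_RATIO_02=0.383,\n"
    "\tHTML_MESSAGE=0.001,HTML_MIME_NO_HTML_TAG=0.097,MIME_HTML_ONLY=1.457,\n"
    "\tMIME_HTML_ONLY_MULTI=0.001,RDNS_NONE=0.1,URIBL_BLACK=1.955\n"
    "\tautolearn=no version=3.2.5"
)


def parse_tests(status):
    """Return {rule_name: score} from a Status header value, or {} if none found."""
    match = re.search(r"tests=(.*?)\s+autolearn=", status, re.DOTALL)
    if not match:
        return {}
    # Drop folding whitespace so wrapped rule lists parse the same as unwrapped ones.
    body = re.sub(r"\s+", "", match.group(1))
    scores = {}
    for item in body.split(","):
        name, sep, value = item.partition("=")
        if sep:  # skips entries without a score, e.g. "none"
            scores[name] = float(value)
    return scores


if __name__ == "__main__":
    for rule, score in sorted(parse_tests(SAMPLE_STATUS).items()):
        print(f"{rule:24s} {score:7.3f}")
```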
>> "spamassassin --debug area=rules <test_msg_file" is often helpful.
>> The nature of spam changes over time. 3.2, which is only getting
>> critical bug fixes now, will become steadily less effective the more
>> time passes and the spammers evolve new tricks. It's getting to the
>> point that you should really consider upgrading to the latest 3.3