spamassassin-users April 2010 archive

Re: Interesting use of html comments

From: Kris Deugau <kdeugau_at_nospam>
Date: Wed Apr 21 2010 - 20:44:26 GMT
To: users@spamassassin.apache.org

Giampaolo Tomassoni wrote:
> But, anyway, I see SA 3.3.1 comes with a very good HTMLEval plugin. However,
> it seems to me that it lacks a way to, for example, compare the length of the
> commented-out text with that of the uncommented text and trigger a rule if
> the ratio is above a given threshold.
>
> Do you think such a rule could be somehow useful? Is there anything close to
> it?
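
For illustration only, the ratio check asked about above could be sketched in Python roughly as follows. This is not part of SpamAssassin or HTMLEval; the names comment_ratio and exceeds_threshold, the naive tag stripping, and the threshold of 2.0 are all assumptions made for the sketch.

```python
import re

# Hypothetical helpers for illustration; nothing here is a SpamAssassin API.
COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)
TAG_RE = re.compile(r"<[^>]*>")  # naive tag stripper, good enough for a sketch


def comment_ratio(html: str) -> float:
    """Return commented-out characters divided by uncommented text characters."""
    # Count the characters inside every HTML comment (surrounding whitespace ignored).
    commented = sum(len(m.group(1).strip()) for m in COMMENT_RE.finditer(html))
    # Drop the comments, then the remaining tags, to approximate the visible text.
    without_comments = COMMENT_RE.sub("", html)
    visible = len(TAG_RE.sub("", without_comments).strip())
    if visible == 0:
        # No visible text at all: treat any comment text as an unbounded ratio.
        return float("inf") if commented else 0.0
    return commented / visible


def exceeds_threshold(html: str, threshold: float = 2.0) -> bool:
    """True when the comment-to-text ratio is above the (assumed) threshold."""
    return comment_ratio(html) > threshold
```

A real implementation would more likely live in a Perl plugin and reuse SpamAssassin's own HTML parsing instead of regular expressions.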

I don't know of anything in the stock rules, but I wrote these in response to
missed spam reported by customers:

rawbody LONG_COMMENT m|<!--[^>{};]{200,}-->|
describe LONG_COMMENT HTML comment with 200+ characters of "content"
score LONG_COMMENT 1.25
rawbody DUMB_COMMENT_1 m|<!--\n?\s*\d+\s*\n?-->|
describe DUMB_COMMENT_1 HTML comment with bare chunk of numbers
score DUMB_COMMENT_1 1.25
rawbody DUMB_COMMENT_2 m|<!--\n?\s*(?:-{72}\n){2,}-+\n?\s*-->|
describe DUMB_COMMENT_2 HTML comment consisting of several lines of dashes
score DUMB_COMMENT_2 1.25
rawbody BACK2BACK_COMMENT m|--!><!--[\n\s\w]+--!><!--|
describe BACK2BACK_COMMENT HTML structure that looks a lot like back-to-back empty comments
score BACK2BACK_COMMENT 1.75
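
As a rough way to sanity-check the patterns above outside SpamAssassin, they can be tried against sample strings in Python, whose re module accepts all four expressions unchanged. The sketch only exercises the regular expressions against whole strings; it does not model how SpamAssassin segments the raw body before matching, and matching_rules is a hypothetical helper name.

```python
import re

# The four patterns from the rules above, minus the m|...| delimiters.
RULES = {
    "LONG_COMMENT": r"<!--[^>{};]{200,}-->",
    "DUMB_COMMENT_1": r"<!--\n?\s*\d+\s*\n?-->",
    "DUMB_COMMENT_2": r"<!--\n?\s*(?:-{72}\n){2,}-+\n?\s*-->",
    "BACK2BACK_COMMENT": r"--!><!--[\n\s\w]+--!><!--",
}
COMPILED = {name: re.compile(pattern) for name, pattern in RULES.items()}


def matching_rules(raw_body: str) -> list:
    """Return the names of the rules whose pattern is found in raw_body."""
    return [name for name, rx in COMPILED.items() if rx.search(raw_body)]


if __name__ == "__main__":
    samples = [
        "<!--" + "a" * 200 + "-->",
        "<!-- 12345 -->",
        "<!--\n" + ("-" * 72 + "\n") * 2 + "-----\n-->",
        "--!><!-- abc --!><!--",
        "<p>Hello</p><!-- nav -->",
    ]
    for sample in samples:
        print(repr(sample[:30]), matching_rules(sample))
```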

(Of course, now that I've published them they'll go totally useless... <g>)

-kgd