Fuz2 false positive


Fuz2 false positive

by www.isp2dial.com

Hello,

I am testing DCC, via dccproc in spamassassin.

I was surprised by a hit with the following message, since only a few
people read that list.

Fuz2 looks like a false positive.

I don't want to whitelist every mailing list my users read.  If I had
to do that, I could not justify using DCC.  But if I can disable Fuz2,
I may consider that.  Is it possible?


X-DCC-dcc-servers.net-Metrics: daves 102; Body=3 Fuz1=3 Fuz2=many


>From www-data@...  Fri Apr 11 09:01:51 2008
>Return-Path: <www-data@...>
>Received: from vpsville.ca (mail.vpsville.ca [66.29.75.57])
>        by daves.isp2dial.com (Hard2Crack-0.001) with ESMTP id m3BD1nqw008125
>        for <jak@...>; Fri, 11 Apr 2008 09:01:50 -0400
>Received: from vpsville.ca (unverified [66.29.75.57])
>        by vpsville.ca (SurgeMail 3.8k4) with ESMTP id 157674-1961819
>        for <jak@...>; Fri, 11 Apr 2008 09:01:49 -0400
>To: jak@...
>Subject: New reply to Re: disk space quotas by diliprajan
>From: admin@...
>Errors-To: admin@...
>X-Mailer: FUDforum v2.7.7
>Content-Type: text/plain; charset=ISO-8859-15
>Date: Fri, 11 Apr 2008 09:01:49 -0400
>Message-ID: <1207918909_1307@...>
>X-Virus-Scanned: ClamAV version 0.92.1, clamav-milter version 0.92.1 on daves.isp2dial.com
>X-Virus-Status: Clean
>
>
>To view unread replies go to http://www.vpsville.ca/forum/index.php?t=rview&goto=37#msg_37
>
>If you do not wish to receive further notifications about replies in this topic, please go here: http://www.vpsville.ca/forum/index.php?t=rview&th=22&notify=1&opt=off


_______________________________________________
DCC mailing list      DCC@...
http://www.rhyolite.com/mailman/listinfo/dcc

RE: Fuz2 false positive

by Rose, Bobby

So the message body is just those generic 5 lines.  No wonder it's been
seen so many times.  In the case of Fuz2 and probably even Fuz1, if that
same message body is used all the time with minor changes such as
changing the uri to another page of the same site, then from dcc's point
of view it's the same message hash.
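
As a rough illustration of why small edits can leave a fuzzy checksum unchanged, here is a toy sketch. It is not DCC's actual Fuz1 or Fuz2 algorithm; the normalization rules and the name toy_fuzzy_checksum are assumptions made up for this example.

```python
import hashlib
import re

# Hypothetical normalization rules; DCC's real fuzzy checksums differ.
URL_RE = re.compile(r"https?://\S+")
DIGITS_RE = re.compile(r"\d+")


def toy_fuzzy_checksum(body: str) -> str:
    """Hash a message body after discarding the parts most likely to vary."""
    text = body.lower()
    # Keep only the host of each URL, so different pages on one site collapse.
    text = URL_RE.sub(lambda m: m.group(0).split("/")[2], text)
    # Drop numbers such as message ids and counters.
    text = DIGITS_RE.sub("", text)
    # Ignore whitespace and line wrapping.
    text = "".join(text.split())
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


first = ("To view unread replies go to "
         "http://www.vpsville.ca/forum/index.php?t=rview&goto=37#msg_37")
second = ("To view unread replies go to "
          "http://www.vpsville.ca/forum/index.php?t=rview&goto=52#msg_52")
other = "Your invoice is attached."

same = toy_fuzzy_checksum(first) == toy_fuzzy_checksum(second)
different = toy_fuzzy_checksum(first) != toy_fuzzy_checksum(other)
```

Under these toy rules the two notification bodies hash identically even though they link to different posts, while an unrelated body does not.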
 

-----Original Message-----
From: dcc-admin@... [mailto:dcc-admin@...] On Behalf
Of www.isp2dial.com
Sent: Friday, April 11, 2008 12:21 PM
To: dcc@...
Subject: Fuz2 false positive


Re: Fuz2 false positive

by Vernon Schryver

> From: "www.isp2dial.com" <jak@...>

> I am testing DCC, via dccproc in spamassassin.
>
> I was surprised by a hit with the following message, since only a few
> people read that list.
>
> Fuz2 looks like a false positive.
>
> I don't want to whitelist every mailing list my users read.  If I had
> to do that, I could not justify using DCC.  But if I can disable Fuz2,
> I may consider that.  Is it possible?

> X-DCC-dcc-servers.net-Metrics: daves 102; Body=3 Fuz1=3 Fuz2=many


> >To view unread replies go to
> >http://www.vpsville.ca/forum/index.php?t=rview&goto=37#msg_37
> >
> >If you do not wish to receive further notifications about replies in
> >this topic, please go here:
> >http://www.vpsville.ca/forum/index.php?t=rview&th=22&notify=1&opt=off


If DCC detected that bulk mail message as bulk, then it was not a
false positive.  To distinguish solicited from unsolicited bulk
email, you need either local, per-user whitelists or something like
SpamAssassin's scoring for DCC hits.  I recommend whitelists, because
one user's valuable mailing list is another user's offensive spam.
However, most DCC installations use SpamAssassin.

Some installations use CGI scripts similar to those included in the DCC
source and demonstrated at
https://www.rhyolite.com/DCC-demo-cgi-bin
with user name cgi-demo and password cgi-demo
to let users maintain their own white- and blacklists, turn greylisting
on or off, set DCC and DCC Reputation thresholds, and so on.

In this particular case, the fact that that particular mail message has a
FUZ2 count of "many" implies that a very similar message was received
by a spam trap or reported as spam by a human recipient.  That suggests
that it might be wise to clean all of the mailing lists involving
"www.vpsville.ca" and ensure that they are all configured to confirm
subscription requests with email messages from new subscribers.  To put
it politely, there might be some unwilling subscribers.


Vernon Schryver    vjs@...

Re: Fuz2 false positive

by www.isp2dial.com

On Fri, 11 Apr 2008 16:57:14 GMT, Vernon Schryver
<vjs@...> wrote:

>In this particular case, the fact that that particular mail message has a
>FUZ2 count of "many" implies that a very similar message was received
>by a spam trap or reported as spam by a human recipient.  That suggests
>that it might be wise to clean all of the mailing lists involving
>"www.vpsville.ca" and ensure that they are all configured to confirm
>subscription requests with email messages from new subscribers.  To put
>it politely, there might be some unwilling subscribers.

I don't think so.  I think your fuz2 algorithm is bogus.

Thanks for the prompt reply.


--
Webmail for Dialup Users
http://www.isp2dial.com/freeaccounts.html
 

Re: Fuz2 false positive

by Vernon Schryver

> From: "www.isp2dial.com" <jak@...>

> I don't think so.  I think your fuz2 algorithm is bogus.
>
> Thanks for the prompt reply.

It can be entirely reasonable to decide that DCC is not useful or
appropriate for a site, but that reason is itself bogus.  Of course
there could be bugs in the body checksum code, but in this case it is
practically certain that the other message received somewhere at
08/04/10 20:41:36 UTC and reported with an identical FUZ2 checksum and
a count of MANY was extremely similar to the mail message at issue here.

If one were certain that none of the mailing lists at
http://www.vpsville.ca/forum have unwilling subscribers, including
subscribers who are unclear on a concept or two and reporting mail as
spam that they subscribed to, then one might change the boilerplate of
that not really false positive message so that future messages are not
practically identical to messages from other sites using the same "forum"
software and that do have unwilling subscribers.

Regardless of the details in this case, it would be best to turn off
the DCC client at 69.60.113.26 and demand a refund of the purchase price.


Vernon Schryver    vjs@...

Re: Fuz2 false positive

by Paul R. Ganci

www.isp2dial.com wrote:
> I am testing DCC, via dccproc in spamassassin.
>
> I don't want to whitelist every mailing list my users read.  If I had
> to do that, I could not justify using DCC.  But if I can disable Fuz2,
> I may consider that.  Is it possible?
>  
Have you tried using the spamassassin control (see
http://spamassassin.apache.org/full/3.2.x/doc/Mail_SpamAssassin_Plugin_DCC.html):

dcc_fuz2_max NUMBER
    This option sets how often a message's body/fuz1/fuz2 checksum must
    have been reported to the DCC server before SpamAssassin will
    consider the DCC check as matched.

    As nearly all DCC clients are auto-reporting these checksums, you
    should set this to a relatively high value, e.g. 999999 (this is
    DCC's MANY count).

    The default is 999999 for all these options.

Since DCC's many count is 999999 then setting this to 1000000 (or
higher) should in principle disable the fuz2 check in spamassassin since
spamassassin should never get a count higher.

--
Paul (ganci@...)


Re: Fuz2 false positive

by Vernon Schryver

> From: "Paul R. Ganci" <ganci@...>

> dcc_fuz2_max NUMBER
>     This option sets how often a message's body/fuz1/fuz2 checksum must
>     have been reported to the DCC server before SpamAssassin will
>     consider the DCC check as matched.
>
>     As nearly all DCC clients are auto-reporting these checksums, you
>     should set this to a relatively high value, e.g. 999999 (this is
>     DCC's MANY count).
>
>     The default is 999999 for all these options.
>
> Since DCC's many count is 999999 then setting this to 1000000 (or
> higher) should in principle disable the fuz2 check in spamassassin since
> spamassassin should never get a count higher.

The internal numeric equivalent of the DCC checksum value "MANY" is
*not* 999999.  999999 is merely the number to which SpamAssassin
translates the string "MANY".  The true internal value is almost 17
times larger than 999999.  I'll not say what the value is to forestall
other ill advised translations of "many."  "Many" is simply the largest
possible value of a DCC checksum count.  Think of it as like a mathematical
projective infinity or like IEEE 754 floating point +infinity.
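
The +infinity analogy can be made concrete with a small sketch. This only illustrates the analogy: it is not how DCC stores counts internally, and the helper names parse_count and reaches are hypothetical.

```python
import math


def parse_count(text: str) -> float:
    """Turn a checksum count such as '3' or 'many' into a comparable number."""
    if text.strip().lower() == "many":
        # Assumption for illustration: model "many" as +infinity.
        return math.inf
    return float(int(text))


def reaches(count_text: str, threshold: float) -> bool:
    """True when the reported count is at or above the threshold."""
    return parse_count(count_text) >= threshold


small = reaches("3", 999999)
many_default = reaches("many", 999999)
many_raised = reaches("many", 1000000)
many_huge = reaches("many", 10**12)
```

With "many" modeled this way, raising a finite threshold from 999999 to 1000000, or to any other finite number, never stops "many" from reaching it.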


To turn off the FUZ2 checksum, try teaching SpamAssassin to look for
the string "bulk" in the X-DCC header instead of any particular number.
(SpamAssassin may already look for "bulk" in X-DCC headers; I've forgotten
and don't feel like looking at the SpamAssassin source yet again.)
Then set your desired threshold by one or more of:
  1. causing SpamAssassin to run dccproc with a suitable -cFUZ2,,X
      where X is a number greater than or equal to 0, the string
      "many", or the string "never."  See the dccproc man page.
  2. causing SpamAssassin to run dccproc with -w whiteclnt and putting
      a line like the following in /var/dcc/whiteclnt
        option threshold,FUZ2,X
      See the dcc man page.
  3. using dccifd instead of dccproc and adding the line from #2 to
      /var/dcc/whiteclnt
  4. using dccifd instead of dccproc and setting DCCM_REJECT_AT or
      DCCIFD_REJECT_AT or adding -tFUZ2,,X to DCCIFD_ARGS
      in /var/dcc/dcc_conf.  See the dccifd man page.
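
For anyone generating these settings from a script, the following sketch formats the -cFUZ2,,X argument from option 1 and the whiteclnt line from option 2. The helper names are hypothetical, and the accepted values for X are taken only from the description above; check the dccproc and dcc man pages before relying on them.

```python
def _format_threshold(threshold) -> str:
    """Validate X: a number >= 0, or the strings 'many' or 'never'."""
    if isinstance(threshold, int) and not isinstance(threshold, bool):
        if threshold < 0:
            raise ValueError("numeric threshold must be >= 0")
        return str(threshold)
    if threshold in ("many", "never"):
        return threshold
    raise ValueError("threshold must be an int >= 0, 'many', or 'never'")


def fuz2_threshold_arg(threshold) -> str:
    """Build the dccproc argument from option 1."""
    return f"-cFUZ2,,{_format_threshold(threshold)}"


def whiteclnt_line(threshold) -> str:
    """Build the whiteclnt line from option 2."""
    return f"option threshold,FUZ2,{_format_threshold(threshold)}"


arg = fuz2_threshold_arg("never")
line = whiteclnt_line(5)
```

The sketch only builds strings; it does not run dccproc or write to /var/dcc/whiteclnt.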

Unless your mail system receives fewer than several 1000 mail messages
per day, dccifd is a far better choice than dccproc.  However, you might
need to teach SpamAssassin to look for the dccifd socket in a directory
other than /var/dcc if wherever you got the DCC source has moved it.

I cannot imagine a reason to turn off the FUZ2 checksum as opposed to
simply not using DCC.  If the DCC checksums don't fit your needs, then
it seems at best odd to waste the CPU cycles, network bandwidth, and
wall clock time getting the checksums....well, there are special cases
such as using a private DCC database of checksums of IP addresses or
other stigmata for rate-limiting out-going email.


Vernon Schryver    vjs@...

Re: Fuz2 false positive

by Jeff Mincy

   From: Vernon Schryver <vjs@...>
   Date: Sat, 19 Apr 2008 02:26:53 GMT
   
   > From: "Paul R. Ganci" <ganci@...>
   
   > dcc_fuz2_max NUMBER
   >     This option sets how often a message's body/fuz1/fuz2 checksum must
   >     have been reported to the DCC server before SpamAssassin will
   >     consider the DCC check as matched.
   >
   >     As nearly all DCC clients are auto-reporting these checksums, you
   >     should set this to a relatively high value, e.g. 999999 (this is
   >     DCC's MANY count).
   >
   >     The default is 999999 for all these options.
   >
   > Since DCC's many count is 999999 then setting this to 1000000 (or
   > higher) should in principle disable the fuz2 check in spamassassin since
   > spamassassin should never get a count higher.
   
   The internal numeric equivalent of the DCC checksum value "MANY" is
   *not* 999999.  999999 is merely the number to which SpamAssassin
   translates the string "MANY".  The true internal value is almost 17
   times larger than 999999.  I'll not say what the value is to forestall
   other ill advised translations of "many."  "Many" is simply the largest
   possible value of a DCC checksum count.  Think of it as like a mathematical
   projective infinity or like IEEE 754 floating point +infinity.

   To turn off the FUZ2 checksum, try teaching SpamAssassin to look for
   the string "bulk" in the X-DCC header instead of any particular number.
   (SpamAssassin may already look for "bulk" in X-DCC headers; I've forgotten
   and don't feel like looking at the SpamAssassin source yet again.)
   ...
   Vernon Schryver    vjs@...

SpamAssassin translates body/fuz1/fuz2 values of "many" to 999999 and
then compares the translated body/fuz1/fuz2 values to
dcc_body_max/dcc_fuz1_max/dcc_fuz2_max which default to 999999.  So,
by default the DCC_CHECK test hits if at least one of the
body/fuz1/fuz2 values is "many".

SpamAssassin will short-circuit if there is an X-DCC header with
"bulk".  Otherwise, SpamAssassin uses either dccproc or dccifd to get
the dcc response.  

Re: Fuz2 false positive

by Jeff Mincy

Ooops - hit send too soon.

SpamAssassin translates body/fuz1/fuz2 values of "many" to 999999 and
then compares the translated body/fuz1/fuz2 values to
dcc_body_max/dcc_fuz1_max/dcc_fuz2_max which default to 999999.  So,
by default the DCC_CHECK test hits if at least one of the
body/fuz1/fuz2 values is "many".

SpamAssassin will short-circuit if there is an X-DCC header with
"bulk".  Otherwise, SpamAssassin uses either dccproc or dccifd to get
the dcc response.   If there is an X-DCC header with "bulk" then
the DCC_CHECK hits and the body/fuz1/fuz2 counts are ignored.
When SpamAssassin explicitly calls dccproc or dccifd then the "bulk"
string is ignored.  SpamAssassin should presumably notice the "bulk"
string when calling dccproc or dccifd.
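
The translate-then-compare behavior described above can be sketched roughly as follows. This is a simplified model rather than the plugin's actual Perl code; the function names are hypothetical, and the use of >= for the comparison is an assumption consistent with "many" hitting at the default of 999999.

```python
import re

MANY_AS_NUMBER = 999999  # the value SpamAssassin substitutes for "many"
COUNT_RE = re.compile(r"\b(body|fuz1|fuz2)=(many|\d+)", re.IGNORECASE)


def parse_counts(header: str) -> dict:
    """Extract body/fuz1/fuz2 counts, translating 'many' to 999999."""
    counts = {}
    for name, value in COUNT_RE.findall(header):
        if value.lower() == "many":
            counts[name.lower()] = MANY_AS_NUMBER
        else:
            counts[name.lower()] = int(value)
    return counts


def dcc_check_hits(header: str, body_max: int = 999999,
                   fuz1_max: int = 999999, fuz2_max: int = 999999) -> bool:
    """True if any count reaches its dcc_*_max threshold (assumed >=)."""
    counts = parse_counts(header)
    limits = {"body": body_max, "fuz1": fuz1_max, "fuz2": fuz2_max}
    return any(counts.get(name, 0) >= limit
               for name, limit in limits.items())


header = "X-DCC-dcc-servers.net-Metrics: daves 102; Body=3 Fuz1=3 Fuz2=many"
counts = parse_counts(header)
hit = dcc_check_hits(header)
```

For the header from the original report, only Fuz2 is "many", and under this model that alone is enough for the check to hit at the default thresholds.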

Anyway, you seem to object to SpamAssassin doing s/many/999999/.
The many/999999 thing doesn't cause any problems, does it?

-jeff

Re: Fuz2 false positive

by Vernon Schryver

> From: Jeff Mincy <mincy@...>

> SpamAssassin translates body/fuz1/fuz2 values of "many" to 999999 and
> then compares the translated body/fuz1/fuz2 values to
> dcc_body_max/dcc_fuz1_max/dcc_fuz2_max which default to 999999.  So,
> by default the DCC_CHECK test hits if at least one of the
> body/fuz1/fuz2 values is "many".
>
> SpamAssassin will short-circuit if there is an X-DCC header with
> "bulk".  Otherwise, SpamAssassin uses either dccproc or dccifd to get
> the dcc response.   If there is an X-DCC header with "bulk" then
> the DCC_CHECK hits and the body/fuz1/fuz2 counts are ignored.
> When SpamAssassin explicitly calls dccproc or dccifd then the "bulk"
> string is ignored.  SpamAssassin should presumably notice the "bulk"
> string when calling dccproc or dccifd.
>
> Anyway, you seem to object to SpamAssassin doing s/many/999999/.
> The many/999999 thing doesn't cause any problems, does it?

SpamAssassin's many/999999 thing dates from before dccproc had -c to
set per-checksum thresholds, before dccifd existed, and before
per-checksum thresholds could be put into per-user whiteclnt files.

Setting SpamAssassin's own threshold for FUZ2 as was suggested would
not have the desired effect of ignoring FUZ2 results because the real
value of "many" is larger than 999999.  Some people run (or once ran)
spam traps that reported bad mail with large counts instead of "many".
That could result in a dccifd or dccproc header with a FUZ2 result
larger than 1000000 like "X-DCC...FUZ2=1234567..." that would not be
ignored by setting the SpamAssassin threshold to 1000000.
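
A short sketch of that failure mode, using the same simplified model of SpamAssassin's comparison (translate "many" to 999999, then compare with an assumed >=). The function name fuz2_hits is hypothetical.

```python
MANY_AS_NUMBER = 999999  # SpamAssassin's stand-in for "many"


def fuz2_hits(reported: str, fuz2_max: int) -> bool:
    """Simplified model of the FUZ2 threshold comparison."""
    if reported.lower() == "many":
        count = MANY_AS_NUMBER
    else:
        count = int(reported)
    return count >= fuz2_max


raised = 1000000
many_ignored = not fuz2_hits("many", raised)
trap_still_hits = fuz2_hits("1234567", raised)
```

Raising the threshold to 1000000 hides the translated "many", but a literal count of 1234567 from such a spam trap still reaches it.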

What should be done by someone with the keys to SpamAssassin's DCC
plugin is to
   - have SpamAssassin pass its thresholds as -c args to dccproc
   - use the rejection status it gets from dccifd instead of the
      X-DCC header
   - if SpamAssassin must use the X-DCC header from dccifd, then always
      and only look for the string "bulk" in the header
   - try harder to find the dccifd socket and to use dccifd instead of dccproc
      and check that the SpamAssassin thresholds are in the dcc_conf file.
   - make SpamAssassin always and only look for the string "bulk" in
      the X-DCC header it gets from dccproc, at the tiny sites using
      dccproc

Then someone who wanted to ignore the FUZ2 result but not the BODY
or FUZ1 results could set the FUZ2 threshold to "never" and get
that very unlikely result.
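
The "always and only look for the string bulk" approach from the list above might look like the sketch below. The header layout, with "bulk" appearing as a separate word after the semicolon, is an assumption, as are the function name is_bulk and the sample header containing "bulk"; consult the dccproc and dccifd man pages for the exact format.

```python
def is_bulk(header: str) -> bool:
    """Return True when an X-DCC header carries the word 'bulk'."""
    name, _, value = header.partition(":")
    if not name.strip().lower().startswith("x-dcc"):
        return False
    # Assumed layout: "<client> <server-id>; [bulk] Body=.. Fuz1=.. Fuz2=.."
    _, _, results = value.partition(";")
    return "bulk" in results.lower().split()


reported = "X-DCC-dcc-servers.net-Metrics: daves 102; Body=3 Fuz1=3 Fuz2=many"
assumed_bulk = ("X-DCC-dcc-servers.net-Metrics: daves 102; "
                "bulk Body=3 Fuz1=3 Fuz2=many")

plain = is_bulk(reported)
flagged = is_bulk(assumed_bulk)
```

With this approach the thresholds live in the DCC client configuration, and the caller never has to interpret "many" or any particular number.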

Instead of turning off only one of the DCC results, it would make far
more sense to adjust the score that SpamAssassin gives a DCC hit.
If you think you have FUZ2 false positives, then you surely think you
have FUZ1 and BODY false positives.

Of course, I still think the right answer is not scoring DCC hits
but per-site and per-user whitelists for solicited bulk email and
rejections of unsolicited bulk email.


Vernon Schryver    vjs@...