The Spam Wars

RBLs & C/R (Challenge/Response systems)


SORBS is out of control

SORBS (the RBL) has become overzealous in RBLing (blocking) many legitimate IPs.  

  • Anyone running a C/R system will eventually get an email from a spammer with a return address that is a spamtrap.  The resulting challenge lands in the trap, and SORBS blacklists a legitimate sender (examples).  

  • A vacation message or auto-responding 'email address has changed' mailbox will cause SORBS to blacklist the sender's domain.  

  • This morning (4/27/06), a spot check shows that hotmail.com has been blacklisted by SORBS.  

  • Even the mailing list for the anti-spam tool sqlgrey (for postfix) uses a C/R system (as of 4/27/06).  This means that SORBS will blacklist access to other spam tools!  

  • Also, at some level they must be making an exception for big players like Earthlink who use C/R for their customers, but everyone else is out of luck.

Their algorithm takes no account of the content (by their own claim) and apparently takes little account of the frequency.

They demand $50 to even examine an RBLd IP, but offer no guarantee of not simply blacklisting someone they do not like immediately again, no matter how innocent the triggering messages.  This amounts to hardly more than extortion.

SORBS's response is that registration requests, vacation messages, etc. are spam.  While automated emailers can send unwanted messages to innocent third parties, the net volume of such mail is much lower than the amount of incoming spam deflected, because most (80-90% in my experience) spam from: addresses are completely bogus and no email to them ever reaches a real person.  Thus, while the pain is transferred to someone else, the net pain to the whole internet community is reduced.

Meanwhile, RBLs are weak filters.  They have a high false negative rate (spam let thru) more or less by design because their unit of resolution is only an IP address.  Even were spam and ham never to come from the same IP address, spammers move around their IP addresses, and RBLs are always playing catchup.  Worse still, RBLs have a substantial false positive rate (ham which is blocked) for some of these reasons:

  • Many auto responders are legit (examples above)

  • A hosted domain with an MTA (outgoing mail server) shared amongst multiple hostees can get them all blacklisted because one of them is a spammer (even though the domain IP itself is separate).

  • A newly purchased domain may be one a spammer abandoned.  Presumably, these get cleaned up, eventually.

The irony is that SORBS objects to C/R systems because they are less than perfect, preferring the full assault, but they themselves are far leakier than the system they object to.

Rebuttal to "Challenge-Response Anti-Spam Systems Considered Harmful"

SORBS points to this website by Karsten M. Self for the rationale behind their policy.  If this is SORBS's rationale for RBLs, it is weak.

Mr. Self's points break down as follows:

1) Some C/R systems have flaws 

True, but this is an argument for fixing bad implementations, not for getting rid of C/R systems.

Claim 4 -'Local whitelisting is good, while C/R is bad'.  This just means the C/R system should be aware of the local whitelist.  EWL  is.

Claim 7 - 'Some C/R systems deadlock'.  Yes, this is a bug in really bad implementations, but not real world ones. EWL has an explicit ringing detector.

Claim 10 - 'Some C/R systems respond badly to mailing lists'.  Yes, C/R systems must be conscious of mailing lists and long cc lists.

2) Bad comparison to filter systems

These objections are of the form 'C/R systems have problems', but the same objection is even worse for filtering systems.

Claim 0 - 'Spammers can easily bypass C/R system'.  Not so, the chance of a spammer hitting an email address of a registered friend is quite low.  In running my C/R system this never happened.  Sometimes spammers will register though.

Claim 1, 5 - 'C/R systems have high false positive rates'.  Their false positive rate on registered senders  is 0, better than filters. The C/R false positive rate on unregistered senders is the rate of friends not registering.  In my experience this is very low, but regardless it is neither clearly less nor greater than the filtering false positive rate.  It's likely lower than the RBL false positive rate.  Also, see below on how good C/R systems minimize this failure to register rate.

Claim 3 - 'C/R violates privacy by letting your ISP know who you get email from'.  If my ISP handles my email, it already knows who I send email to and receive from.  And, if instead I run my own server (as I do), then my C/R runs there and my ISP still knows nothing (short of sniffing TCP conversations, which C/R still doesn't change).

3) C/R systems fail to punish the spammers

Claim 2, 11 - Yes, but so do filtering and RBL systems.  This is not an objection to C/R per se.

4) Registration requests are spam too

Claim 2, 6, 9 - They all say the same thing: registration requests are spam, and C/R users should be punished for trying to use them.  This is a valid objection, but not a complete one.  See below.

5) Theoretical problems

Claim 6 - 'C/R systems could be used for a DNS attack'.  Seems like a rare thing and fairly weak attack.

Claim 8 - 'C/R systems could be used as spam confirmers.'   This is true only if people choose to register themselves with a spammer.  If you got a registration request from someone you never sent an email to in the first place, would you register?  I think not.  This claim is unlikely to matter much in practice.

 

In defense of Challenge/Response Systems

C/R systems have high resolution. They cut the false negative (spam let thru) rate to near zero since spammers almost never register.  The false positive rate (friendly email recognized as spam) for friends is also cut to near zero since the only lost email is for friends who refuse to register.  See below for the problem of handling legitimate business email.

The only serious criticism of C/R systems is that innocent 3rd parties are burdened with registration requests.  Most (80-90% in my experience) of the return addresses that spammers use are completely bogus.  Email sent to them is dead at the SMTP conversation level.  Some additional percentage of those which do go thru are headed to /dev/null.  Thus, even though a registration request sent to a third party is unwanted email, C/R systems reduce the net amount of unwanted email ever seen by a person by some large factor (somewhere between 5 to 1 and 10 to 1, going by the 80-90% figure).  This is a net win.  RBLs are making it impossible to extract this win.  That is a real loss.
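The arithmetic behind that ratio can be sketched as follows.  This is only a back-of-envelope model, assuming every spam triggers exactly one registration request and that requests to bogus return addresses never reach a person; the function names are made up for illustration.

```python
def challenges_reaching_people(spam_count, bogus_fraction):
    """Registration requests that can land in a real mailbox.

    Assumes one challenge per spam, and that challenges sent to bogus
    return addresses die during the SMTP conversation.
    """
    if not 0.0 <= bogus_fraction < 1.0:
        raise ValueError("bogus_fraction must be in [0, 1)")
    return spam_count * (1.0 - bogus_fraction)


def reduction_factor(bogus_fraction):
    """Spams deflected per challenge that burdens a third party."""
    if not 0.0 <= bogus_fraction < 1.0:
        raise ValueError("bogus_fraction must be in [0, 1)")
    return 1.0 / (1.0 - bogus_fraction)
```

With 80% bogus addresses the factor comes out at about 5, and with 90% at about 10.  The additional requests that end up in /dev/null would push it higher still, but I have no number for that.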

Also, a registration request is less annoying, at least to me, than a spam.  It's easier to recognize and less offensive just on its own merits.

C/R systems should not be used as the only filter for incoming email.  Rather they should be used as a last resort after other techniques fail.  There is a trade-off here between the false positives of the other filtering systems that run first and keeping the number of registration requests low.  As a practical matter, I set my spamassassin score to 10 (which is relatively high) and challenged all unregistered emails under that. That kept false positives to 0 and registration requests to a sensible fraction.
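A minimal sketch of that ordering, in Python rather than the procmail rules actually used.  The names are hypothetical, and two details are assumptions on my part: that a registered sender is delivered regardless of score, and that unregistered mail at or above the threshold is filed as spam without a challenge.

```python
from enum import Enum


class Action(Enum):
    DELIVER = "deliver"
    CHALLENGE = "challenge"
    FILE_AS_SPAM = "file_as_spam"


SPAM_THRESHOLD = 10.0  # the relatively high spamassassin score mentioned above


def triage(sender, spam_score, whitelist, threshold=SPAM_THRESHOLD):
    """Decide what to do with one incoming message.

    whitelist is a set of lowercase registered addresses.
    """
    sender = sender.strip().lower()
    if sender in whitelist:
        # Assumption: registration overrides the content filter.
        return Action.DELIVER
    if spam_score >= threshold:
        # The content filter is confident, so no challenge is sent.
        return Action.FILE_AS_SPAM
    # Only the mail the filter could not classify gets challenged.
    return Action.CHALLENGE
```

Raising the threshold sends more challenges but risks fewer false positives from the filter; lowering it does the reverse.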

Mr. Self's criticism that C/R  systems are less accurate than filtering is doubly mistaken when they are used in exactly the place where other filtering systems have already failed.  

A good C/R system:

  • Auto registers to: and cc: addresses in outgoing emails

  • Auto registers cc addresses on incoming emails to registered addresses (Earthlink gets this wrong, as of 4/26/06)

  • Allows simple 'reply' registration, instead of needing to go to a web page and enter things.

  • Has email aliases that auto register for a time window, and then reject all new senders to that address.

  • Detects registration request ringing.
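The first point, auto registering the to: and cc: addresses of outgoing mail, is simple to sketch with the Python standard library.  This is not EWL's code (EWL is C++); the function name is made up, and lowercasing the whole address is a simplifying assumption.

```python
from email import message_from_string
from email.utils import getaddresses


def recipients_to_register(raw_message):
    """Return the set of To: and Cc: addresses found in an outgoing message."""
    msg = message_from_string(raw_message)
    fields = msg.get_all("To", []) + msg.get_all("Cc", [])
    # getaddresses() splits comma-separated lists and strips display names.
    return {addr.lower() for _name, addr in getaddresses(fields) if addr}
```

Each address returned would be added to the whitelist before the message is handed to the MTA, so that a reply is never challenged.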

Auto registration

The last two points merit some explanation.  Businesses will often send from email addresses that no one ever reads, and thus they will not respond to registration requests.  To deal with these cases, one approach is to give such a business an email address of yours that auto-registers any sender the first time an email to it is seen.  When that address leaks out, the alias is switched so that it no longer auto-registers or accepts unregistered email.  For instance: 

Jan 15: from: sales@circuitcity.com to: ervandarnell143@kelvinist.com -- sales@circuitcity.com is whitelisted automatically and the email goes thru.

Nov 15: from: moreviagra@aol.com to:  ervandarnell143@kelvinist.com -- email address has leaked, ervandarnell143 is manually changed to no longer be autowhitelist.

Nov 16: from: yetmoreviagra@aol.com to:  ervandarnell143@kelvinist.com -- rejected without comment since this email address no longer auto registers and this from address has never been seen.

Nov 17: from: sales@circuitcity.com to:  ervandarnell143@kelvinist.com -- sales@circuitcity.com is let thru since it is already whitelisted.
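The timeline above can be replayed with a small sketch.  The class and function names are hypothetical, the time window is reduced to a manual switch as in the example, and removing the leaked sender from the whitelist when the alias is closed is my assumption.

```python
class Alias:
    """An address handed to one business; auto-registers until closed."""

    def __init__(self, address, auto_register=True):
        self.address = address
        self.auto_register = auto_register


def handle(sender, alias, whitelist):
    """Return 'deliver' or 'reject' for mail from sender to alias."""
    if sender in whitelist:
        return "deliver"
    if alias.auto_register:
        whitelist.add(sender)  # first contact on an open alias is trusted
        return "deliver"
    return "reject"  # alias closed and sender never seen before


def close_alias(alias, whitelist, leaked_sender=None):
    """Stop auto-registering; optionally forget the sender that exposed the leak."""
    alias.auto_register = False
    if leaked_sender is not None:
        whitelist.discard(leaked_sender)
```

A real implementation would also expire auto_register after the time window on its own, rather than waiting for a spam to reveal the leak.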

Ringing, as a feature

Mr. Self lists ringing as a bug of C/R systems.  Properly implemented, it is a feature.  First, it can be detected without content analysis and is thus reliable.  If Alice sends Bob a registration request and Bob replies with a registration request in turn to Alice, Alice's C/R system simply notes that Bob failed to register when a registration was expected.  Alice's C/R system can now either a) just drop Bob's registration request and ignore the whole matter, b) pass it to Alice herself for consideration, or c) hold it for a few days and see if Bob ever registers.   

The feature part is that if a spammer injects a message to Alice from innocent third party Bob, then the two C/R systems ultimately just drop everything and no human is ever bothered.  If Bob had really sent Alice an email, then Bob's C/R system would already have whitelisted Alice and never have sent her a registration request. 
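Here is a toy model of two C/R systems showing both cases.  It is a sketch, not EWL's logic: the class name is invented, registration replies are assumed to be recognizable by some out-of-band means (for example a tokenized reply address), and the "ringing" result is left for the caller to treat as option a), b), or c) above.

```python
class ChallengeResponder:
    """Per-user C/R state: who is registered and who has been challenged."""

    def __init__(self):
        self.whitelist = set()
        self.pending = set()  # challenged senders who have not registered yet

    def note_outgoing(self, recipient):
        """Auto-register anyone the user writes to."""
        self.whitelist.add(recipient)

    def receive(self, sender, is_registration=False):
        if sender in self.whitelist:
            return "deliver"
        if sender in self.pending:
            if is_registration:
                self.pending.discard(sender)
                self.whitelist.add(sender)
                return "registered"
            # A registration was expected and something else arrived.
            return "ringing"
        self.pending.add(sender)
        return "challenge"
```

In the forged case, Alice challenges Bob, Bob (who never wrote to Alice) challenges back, and Alice's system sees "ringing" without either person being shown anything.  In the genuine case, Bob's note_outgoing call has already whitelisted Alice, so her challenge is delivered to him and he can register.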

SPF + C/R

Though SPF is dead for now, it is a good companion for a C/R system.   The main problem with the C/R system is that innocent third parties get unwanted registration requests.  SPF solves that problem.  An SPF sender is either a valid correspondent or a spammer with their own SPF domain.  Sending a registration request to the latter is irrelevant.  Sending it to the former does not burden a third party.  This should avoid spamtraps as well.
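In code the combination is a one-line gate in front of the challenge.  This sketch assumes the SPF check has already been done elsewhere and handed over as a result string ("pass", "fail", "softfail", "none", and so on); the function name and the "hold" outcome for everything that does not pass are my own choices.

```python
def next_step(sender, spf_result, whitelist):
    """Decide between delivering, challenging, and quietly holding a message."""
    if sender.strip().lower() in whitelist:
        return "deliver"
    if spf_result.strip().lower() == "pass":
        # The sending host is authorized for this domain, so the challenge
        # goes back to whoever really sent the mail, not a forged third party.
        return "challenge"
    # Unverified sender: no challenge, so no third party is bothered.
    return "hold"
```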

One problem with this is that people who publish an email address publicly (e.g. in a discussion forum) and invite replies (e.g. asking for help) should not expect all who try to help to register just to reach them.

EWL

I wrote EWL as my own personal C/R whitelisting system.  It implements all of the features listed above (the source comments in main.cpp list many others).  I ran it for several years, successfully reducing a spam load of thousands per week to about 10.  

As of May 2006, I have surrendered the use of it because SORBS has blacklisted me.  I installed the latest versions of Postfix recipient rules + sqlgrey + spamassassin and tuned them all, leaving me at about 100 spams per week, definitely worse than under C/R.

The problems involved in running it were:

  • Registering legitimate businesses via the auto registration aliases described above was problematic because businesses will often change their from: address many months down the road, without notice, and thus fail to use the new auto-registration address.
  • The SMTP server was clogged with undeliverable registration requests, generating needless traffic.
  • Processing of bounce messages, looking for legitimate ones, becomes tricky.
  • The final .procmailrc setup necessary to handle all of the special cases was not simple.

My Current Email Config

Currently, it is postfix + sqlgrey + spamassassin 3.1 + my own whitelister, EWL, running in a purely local mode (no registration requests are actually sent as I have surrendered to SORBS).  The local whitelisting overrides any false positives from spamassassin.  The registration problem is minimal because any time I email a new person they are registered.  Therefore I need to manually whitelist only those incoming addresses to whom I never reply.  

The whitelist override guarantees that everything in inbox is from a whitelisted sender and any whitelisted sender will be seen no matter the spamassassin score. I check inbox frequently.  spamlite gets checked weekly for possible new correspondents.  That amount of spam which sneaks past sqlgrey and spamassassin is ignored at this time.  The point is that I never tolerate spam clogging my inbox until I am ready to address it, at the cost of new correspondents possibly getting a delayed response.

Current problems:

  • Forwarding services - these defeat greylisting directly (because the forwarder retries) and spamassassin indirectly (by giving it confounding sender information which ruins its RBL tests).  Also, forwarders rewrite the headers, if not the body, which also confounds spamassassin.  A direct check by Eudora (an email client) for [spam] in the subject picks up some cases where the forwarding MTA already detected it as spam.  Backup MX relays cooperate more closely by using very similar checks.
  • Business registration - Emails  from new business addresses require an explicit whitelisting approach.  Currently, I forward an email to myself with a subject of whitelist so as to whitelist the original sender.  It's a bit of overhead, but not too bad.
  • New blacklists - This is now a problem.  Explicitly blacklisting every email that gets past spamassassin is too much work.  Fortunately, there are very few emails in this category.  And, explicit blacklisting is still available for the repetitive senders.

Neither sqlgrey nor spamassassin installed in any straightforward way on Fedora Core 4 (Red Hat) linux.  I include my install notes here.

Sqlgrey 1.6 config on Fedora Core 4 linux

  1. perl -MCPAN -e shell; install Bundle::CPAN // to get perl-DBI
    also need DBD::mysql, Net::Server::Multiplex, & IO::Multiplex modules
  2. service mysqld start; and use the GUI service config applet to start mysqld at every boot
  3.  mysql ; > CREATE DATABASE sqlgrey; > GRANT ALL ON sqlgrey.* TO sqlgrey@localhost;
  4. rpmbuild -ta sqlgrey-1.6.7.tar.bz2
    seems to build into /var/tmp/sqlgrey-1.6.7-build
    but not to install anything, maybe this step is useless?
  5. groupadd sqlgrey; adduser -g sqlgrey sqlgrey
  6. from /var/tmp/sqlgrey-1.6.7-build: make; make install
  7. vi /etc/sqlgrey/sqlgrey.conf, first pass, the following, at least, need to be set: 
    conf_dir = /etc/sqlgrey
    loglevel = 3 # for early testing
    user = sqlgrey
    group = sqlgrey
    db_type = mysql
    pidfile = /var/run/sqlgrey.pid
    prepend = 1
    whitelists_host = sqlgrey.bouton.name
    admin_mail = real person@domain
  8. add 
    *.groups.yahoo.com
    *.yahoogroups.com 
    (rumored not to resend emails) to 
    /etc/sqlgrey/clients_fqdn_whitelist.local 
  9. Backup MX Relays 
    If you do nothing, incoming email from your relays will be greylisted, which is problematic if they have already done greylisting on your behalf.  Add your MX relays to:
    /etc/sqlgrey/clients_ip_whitelist.local 

    note that adding relays to the FQDN list did  not seem to work.
  10. (from the sqlgrey docs) Start by adding check_policy_service after reject_unauth_destination in
    /etc/postfix/main.cf :
    smtpd_recipient_restrictions =
    ...
    reject_unauth_destination
    check_policy_service inet:127.0.0.1:2501
    be careful not to put it after permit_auth_destination, or other
    early permit options.
  11. Add to boot:
    cd /etc/rc5.d # make install already created /etc/rc.d/init.d/sqlgrey
    ln -s ../init.d/sqlgrey S65sqlgrey # I hope 65 is right
  12. download and install: http://www.vanheusden.com/sgwi/
    i) tar -xzf sqlgreywebinterface-0.6.tgz
    ii) vi config.inc.php; change db_pass to ""
    iii) mv sqlgreywebinterface-0.6 /var/www/cgi-bin/sqlgrey -- manual install in web hierarchy
    iv) in conf.d: 
    <Directory "/var/www/cgi-bin/sqlgrey">
    AllowOverride All
    </Directory>
    in /var/www/cgi-bin/sqlgrey/.htaccess:
    AuthUserFile /etc/httpd/passwords
    AuthGroupFile /dev/null
    AuthType Basic
    AuthName "Sqlgrey"
    <Limit GET>
    require user someadmin
    </Limit>
  13. Important usage notes: 
    /etc/sqlgrey/* -- all config info
    /var/log/maillog -- sqlgrey log info, if run in daemon mode
    /usr/sbin/update_sqlgrey_config -- updates global whitelist?
    service sqlgrey restart -- like other services
    /usr/bin/sqlgrey-logstats.pl < /var/log/maillog -- summary of what has happened
    but does not count user whitelist matches anywhere in output
    loglevel seems permanently set at 2, regardless of config
    optin/optout -- apparently only for multiple users on one system, i.e.
    disable greylisting for some recipients, not useful for single server 
  14. Serious weakness: greylisting does not address any relay or forwarding services.  This applies to both implicit MX backup relays and explicit forwarding services/discussion groups.  
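
To see why forwarders and relays slip through, here is a bare-bones greylisting sketch.  It is generic greylisting keyed on the (client IP, sender, recipient) triplet, not sqlgrey's actual algorithm (which adds auto-whitelists and a database); the class name and the 300 second delay are assumptions.

```python
class Greylist:
    """Defer the first delivery attempt of each triplet, accept later retries."""

    def __init__(self, delay_seconds=300):
        self.delay = delay_seconds
        self.first_seen = {}  # triplet -> time of first attempt

    def check(self, client_ip, sender, recipient, now):
        key = (client_ip, sender.lower(), recipient.lower())
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.delay:
            return "accept"
        return "defer"
```

The test is only whether the connecting host retries.  A forwarding service or backup MX is a real MTA and always retries, so any spam it has accepted passes after the delay, and the IP address seen is the forwarder's rather than the spammer's.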

Spamassassin 3.1.1 daemon config on Fedora Core 4 linux

  1. failed: rpmbuild -tb Mail-SpamAssassin-3.1.1.tar.gz with missing packages
  2. failed: yum install spamassassin -- failed to get the latest version
  3. worked: download; tar -zxf Mail-SpamAssassin-3.1.1.tar.gz; 
  4. perl -MCPAN -e shell [as root]
    o conf prerequisites_policy ask
    install IP::Country
    install Mail::SpamAssassin
    install Mail::SPF::Query  # note, this is different than Mail::SPF, and from a different author
    install Razor2 # should have been included by Mail::SpamAssassin already
    install Net::Ident
    install IO::Socket::INET6
    install IO::Socket::SSL
    install Sys::Hostname::Long # needed by Mail::SPF::Query, should have already been included
    install Net::CIDR::Lite # needed by Mail::SPF::Query, should have already been included
    install Net::SSLeay # transitively needed by IO::Socket::SSL
  5. check SPF via
    http://www.akadia.com/services/spf.html
    spamassassin -D < mailfile |& grep -i spf
  6.  from Spamassassin directory: make; make install;
  7. cd spamd; cp redhat-rc-script.sh /etc/init.d/spamassassin ;
    # had an existing /etc/rc5.d/K30Spamassassin to remove, then: 
    ln -s /etc/init.d/spamassassin /etc/rc5.d/S81Spamassassin 
  8. in order to use spamc (already loaded processor) in .procmailrc
    add "allow_user_rules 1" to /etc/mail/spamassassin/local.cf so
    that local user_prefs files will work.  Manual warns about the dangers of this, so it is disabled by default.
  9. some possible entries in ~/.spamassassin/user_prefs
    required_hits 5
    trusted_networks <relay>
    always_add_report 1
    rewrite_header subject [SPAM][local]
    add_header all Status _TESTSSCORES_
    add_header all Status _YESNO_, hits=_HITS_ required=_REQD_ tests=_TESTSSCORES_ autolearn=_AUTOLEARN_ version=_VERSION_
    use_bayes 1
    score HABEAS_SWE 0.0
    score ALL_TRUSTED 0.0
    score RCVD_IN_SORBS_* 0.1
    [ etc.]
  10. edit /etc/mail/spamassassin/local.cf (trusted_networks parameter with any postfix relays)
  11. More on relays:
    If the relay spamassassin processes mail first, it will confound the local spamassassin by hiding the contents in some cases. The full report should be turned off in any relays (in favor of the header report only).