
Spam and Junk Email:
Against the Onslaught

Contents
  • Introduction
  • My Current Approach
  • The Cheap and Effective Solution
  • Multiple Email Addresses
  • Client-Side Filtering and Limits
  • Client Proxy Filtering
  • Internet Service Provider Filtering
  • The limits of filtering: Custom address forgery
  • Spam Scams -- WATCH OUT!
  • Miscellaneous Approaches
  • SpamCop
  • Web Resources for Fighting Spam
  • revised: 22 Sep 2006.


    Introduction

    spam
    Unsolicited, unwanted, mass broadcast of advertisements to newsgroups and email addresses. Often used by con artists, crooks, and pornographers. Steals bandwidth and clogs email boxes. Bears a family resemblance to worms (see The limits of filtering: Custom address forgery).

    If not for the spam filtering services I use, I'd probably receive about 150 spam messages daily. Many of these are offers for spam services; the rest are usually for pornography, various operations on genital organs, or fraudulent financial services. Increasingly, much of the traffic is worm-induced viral packages. With spam filtering I get about a dozen messages daily.

    Spammers take email addresses from software programs that extract addresses from newsgroups and web pages. They also trade in lists obtained from commercial web sites and online services. They generally use fake return addresses, but they may also provide a valid email address for the sole purpose of harvesting active email addresses from incoming email. I frequently get spam from "myself", because a spammer has forged my return address.

    It is pointless and counter-productive to send email to a spammer. Almost all return email addresses in spam are forged. Your message either goes nowhere or it goes to an innocent person whose email address has been stolen. At best your effort is wasted; more often your email address will be harvested or some innocent bystander will get your message.

    Why is spam such a problem on the net, when junk mail isn't half of the problem? (We discard two grocery bags of junk mail a week. It's manageable.) It costs very little for spammers to send messages to millions. Even if only 1 in 100 respond, they make money. There's no natural limit to spam.

    Marketing, already a major function of modern economies, is going to become yet more powerful over the next decade. Any channel that allows marketers to get our attention will be used. In the near term we'll find ways to charge marketers for our time, probably using electronic cash. We will be able to control marketing access by the price we set. We may be able to charge spammers to send us email. For now, spam is one of the bleakest forms of modern marketing.

    This page is a simple list of things that can be done, and my personal recommendations. Ideas have come from several people. There are many excellent anti-spam resources, and I've placed their URLs at the end of this page.

    My Current Approach

    This is what I'm doing as of September 2006

    The Cheap and Effective Solution: Differential filtering based on the managed reputation of an authenticated sending service

    Introduction and Overview

    In April of 1997 I started ISP filtering services.

    Ok, so maybe someone else thought of the same idea earlier. In a similar vein, I will here reveal how we can stop spam without Microsoft Palladium, legislation, changes to everyone's email filter, huge white lists, black lists, creating a universal digital signature infrastructure, new protocols, etc. It can be done for a moderate cost and we can do it within a year or two. [update 7/2003: A similar approach to this is being seriously discussed, but I think they've missed the key idea - we DON'T need to fix everything at once! Users can choose whether to sign up with an authenticated sending service or not. The motivation to sign up is that one's email is no longer filtered.]

    The secret is to divide email into trusted and untrusted messages based on authenticating the sending service (AuSS :-), not the sender [2]. Mail from non-authenticated sending services gets heavily filtered (yes, legitimate messages are lost and mail may be delayed), mail from authenticated (trusted) sending services is not filtered.

    In other words, mail from authenticated sending services is a kind of "First Class" email, mail from untrusted sending services is a sort of "Third Class" email.

    There are two fundamental ideas here that are easy to miss:

    1. This is reputation management, but it's less onerous than reputation management on individuals. The sending service can choose to apply reputation management internally to its subscribers, but it can use the standard infrastructure every company uses with its customers.
    2. This is incremental. There's an incentive for users to switch to sending services that authenticate (First Class mail), and the need to keep customers is an incentive for ISPs to implement sending service authentication -- but there's no mandate.

    Below is some background. I also explain this scheme a bit differently in a blog posting.

    Background

    Filtering today has problems: even the best filtering schemes either pass spam or (more often) block legitimate messages [3]. So today strict filtering is often not recommended, but if we implement this AuSS proposal, strict filtering is a good thing. It filters out spam, and its nasty side-effect of blocking legitimate messages becomes a feature -- it gently, but firmly, pushes users towards an ISP that supports AuSS.

    Ok, so why does authenticating the sending service help? Why can we let email from an authenticated sending service bypass the filters?

    Because if we really know the sending service, we can create rules based on the reputation of the sending service. After all, the sending service authenticates its users (or it wouldn't make any money!). The sending service has the ability to enforce rules against spam (spammers, of course, use their own sending services or use shady ISPs that make their money on the dark side). If we know the reputation of the sending service is good, we can pass its emails without filtering. If we know the sending service is a bad actor, the emails get filtered.

    So email is divided into two categories, the AuSS trusted mail (First Class Mail) that doesn't get filtered, and untrusted messages that get strictly filtered (Third Class mail). People don't like losing email messages due to strict filtering, so they have an incentive to migrate to services that support AuSS.
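
    As a rough illustration of this two-class sorting, here is a minimal Python sketch. The reputation table, the service names, and the function name route are all invented for the example; how reputations would really be collected and shared is not specified here.

```python
# Hypothetical reputation table: sending service name -> True if reputable.
REPUTATION = {
    "good-isp.example": True,
    "shady-isp.example": False,
}


def route(service, authenticated):
    """Return "first-class" (skip the filters) or "third-class" (strict filtering)."""
    # Only a verified sending service with a good reputation bypasses filtering.
    if authenticated and REPUTATION.get(service, False):
        return "first-class"
    # Unknown, unauthenticated, or badly behaved services get strict filtering.
    return "third-class"
```

    Note that an authenticated message from a service with a bad reputation still lands in the strictly filtered class; authentication only tells us whose reputation to look up.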

    So, why can't we do this now? We can't do it now, because so many spammers now forge the address headers and thus disguise the true sending service. So, how can we get around this forgery problem, without creating a lot of technical headaches and requiring a lot of client software upgrades?

    We can use digital signing, as is used for sender authentication [2], but without the burden of a full authentication system and the need to upgrade everyone's software. It requires only changes to the sending service (SMTP provider) and the receiving service (POP or IMAP provider). Here are the steps:

    1. Sending service does a checksum on message contents.
    2. Sending service encrypts the checksum using its private key.
    3. Encrypted checksum is translated to ascii characters and placed in a custom message header field.
    4. Receiving service sees that message has a custom header field. Receiving service uses public key of sending service to decrypt the checksum, and recalculates the checksum from the message contents. (So ISPs have to know the public keys of authenticating services, but that's a relatively trivial key management problem.)
    5. If checksum matches, message is marked as authenticated and it passes without further filtering. [4]
    6. If checksum doesn't match, message is marked as likely fraudulent and either bounced or subjected to severe filtering.
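
    The six steps above can be sketched in a few lines of Python. This is a toy under loud assumptions, not a real implementation: the header names X-AuSS-Service and X-AuSS-Signature are made up, SHA-256 stands in for "a checksum", and the RSA key is built from two small well-known primes so the example runs with the standard library alone. A key this small offers no real security.

```python
import base64
import hashlib

# Toy RSA key pair built from two known Mersenne primes. This is an
# assumption for illustration only and is far too small for real use.
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)

SVC_HEADER = "X-AuSS-Service"      # hypothetical header name
SIG_HEADER = "X-AuSS-Signature"    # hypothetical header name
SIG_BYTES = (N.bit_length() + 7) // 8


def checksum(body, n):
    """Step 1: checksum of the message contents, reduced to fit the toy key."""
    digest = hashlib.sha256(body.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % n


def sign(service, body):
    """Steps 2-3: encrypt the checksum with the private key, encode as ASCII."""
    signature = pow(checksum(body, N), D, N)
    encoded = base64.b64encode(signature.to_bytes(SIG_BYTES, "big"))
    return {SVC_HEADER: service, SIG_HEADER: encoded.decode("ascii")}


def classify(headers, body, public_keys):
    """Steps 4-6: check the custom header against the service's public key."""
    if SIG_HEADER not in headers:
        return "untrusted"          # no signature: ordinary strict filtering
    key = public_keys.get(headers.get(SVC_HEADER))
    if key is None:
        return "untrusted"          # we hold no public key for this service
    n, e = key
    try:
        signature = int.from_bytes(base64.b64decode(headers[SIG_HEADER]), "big")
    except ValueError:
        return "fraudulent"         # header is not valid base64
    # Decrypt the checksum with the public key and compare with our own.
    if signature < n and pow(signature, e, n) == checksum(body, n):
        return "authenticated"
    return "fraudulent"


# The receiving service's table of known public keys (service -> (n, e)).
PUBLIC_KEYS = {"good-isp.example": (N, E)}
```

    Calling sign("good-isp.example", body) produces the two headers, and classify(headers, body, PUBLIC_KEYS) returns "authenticated" only if the body is unchanged and the key matches. Changing even one character of the body makes the recalculated checksum differ.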

    That's it. Some checksum software and public signature manipulation at the ISP level. That would do it. It's added mail processing overhead, but that would be outweighed by the gradual elimination of spam-related overhead. All we need to do is have this scheme adopted by AOL, MSN, Yahoo, Earthlink and one or two others and we're home free. Note it works as readily for web mail as for POP or IMAP clients.

    But what about people who don't have access to an ISP that supports sender authentication, or who want to use their own SMTP service? They could of course take their chances with filtering, but I think there would quickly arise secondary authentication services that they could obtain tokens from, perhaps as a web service. They could also send a very brief message of a stereotyped form that's guaranteed to pass an email filter, requesting to be added to one's white list.

    (Incidentally, if we ever have to pay for bandwidth, there's a fringe benefit: spam will get too expensive to bother with!)

    Multiple Email Addresses (aka Disposable email address)

    (I wrote this section in 2000, but by 2002 there's been a move to very short-lived Disposable email addresses or DEAs. Bret Glass has written a good PC Magazine Review: May 2002. Several services provide these; MailShell provides both filtering and DEAs. Some ISPs and SpamCop also allow one to attach a string to an email address with a "plus" sign, as in jfaughnan+amazon@spamcop.net. Your mail service ignores the added text and (in this example) sends it to jfaughnan@spamcop.net. Your email client can filter on the added text. This can be another way to create semi-disposable addresses.)
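
    A filter that acts on the "plus" text only needs to split the address. A minimal Python sketch, assuming the address contains an "@" and that the tag is everything after the first "+" in the part before it (the function name is invented for the example):

```python
def split_plus_address(address):
    """Return (delivery_address, tag); tag is "" when there is no plus part."""
    local, _, domain = address.rpartition("@")
    base, _, tag = local.partition("+")
    return f"{base}@{domain}", tag
```

    For jfaughnan+amazon@spamcop.net this gives the delivery address jfaughnan@spamcop.net and the tag "amazon", which a filter can use to file the message or to spot who leaked the address.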

    This is the one approach that really seems to work. The secret is to have at least two email addresses. One email address is your "public address" (PA). The other is your private, unlisted, address.

    Ideally your unlisted address is one you intend to keep for life. The best choice, I think, is a college or university alumni email address. Many institutions will even provide these for local persons who donate modest amounts to the alumni organization. For $25 or so a year you can often join an alumni organization and get an alumni email address. This address is likely to last a long time, though probably a Harvard address will be more enduring than a local community college. Other sources are professional organizations, or quality ISPs. The username on the unlisted address should be a random sequence of letters and numbers so that spammers can't find it using dictionary attacks. GRC passwords is a good source of usernames.
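
    If you would rather generate the random username yourself, a few lines of Python will do. This is only a sketch: the length of 12 and the lowercase-plus-digits alphabet are my assumptions, and your mail provider may have its own rules about allowed characters.

```python
import secrets
import string

# Assumed alphabet: lowercase letters and digits only.
ALPHABET = string.ascii_lowercase + string.digits


def random_username(length=12):
    """Return a random username that a dictionary attack is unlikely to guess."""
    # secrets draws from the operating system's cryptographic random source.
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```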

    Your public address can come from anywhere; Gmail and Yahoo are the most popular. To reduce spam to that account, turn off all directory listings, all mailing and "news" options, and maximize all privacy settings.

    Your unlisted, private, email address goes to friends, family, trusted organizations. Never use it when posting to newsgroups or email lists. Your public address is used when registering on web sites, email lists, maybe even newsgroups.

    You can set up your public address to forward to your private email, and then filter those messages to a special area, or you can not forward but check the public account intermittently.

    Every few years, as your spam count builds, you cancel the public account and get a new one.

    Client-Side Filtering

    Filtering at the client side is less effective than filtering messages before they are received on your mail server. You still have to pick up the spam, though with IMAP mail I think you might be able to delete messages on your server before transferring them to your client machine. Still, we are at least spared having to wade through the garbage.

    There are two types of filtering: positive and negative. I use both. Positive filtering is extremely powerful (see below). However, there are limits to filtering.

    Negative filters

    These are used to detect spam, and then to delete it or move it to a folder for later deletion.

    Eudora Pro, Netscape 4.x, Pegasus, Claris Emailer, and many other email programs allow for message filtering. Typically messages are filtered into a spam mailbox, where they can be later deleted en masse.

    Here are some of the negative filters I know of:

    Positive filters

    Positive filtering is simple and powerful -- you filter out the email addresses you want to keep, and send everything else into a folder for mass deletion. This works surprisingly well in Eudora Pro; in this email program you can create an address entry consisting of a list of addresses, then apply a filter to email when the sender address matches anything in the list.

    The residue from my filters goes into an 'unsorted' folder. It's unusual to see anything worthwhile there; if I do then I add the address to my positive filter list. The rest gets deleted. This is very close to the future Ultimate Solution, but it is occasionally defeated by spammers who forge addresses that belong to my correspondents. As of May 2002 this is not so common, but I suspect it will become more common.
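
    Outside of Eudora, the same positive filter can be sketched in Python. The addresses in KEEP and the folder names are placeholders; parseaddr comes from the standard library's email.utils module.

```python
from email.utils import parseaddr

# Hypothetical list of correspondents whose mail should be kept.
KEEP = {"friend@example.org", "family@example.net"}


def sort_message(from_header, keep=KEEP):
    """Return "inbox" for known senders and "unsorted" for everyone else."""
    # parseaddr splits 'Name <user@host>' into (name, address).
    _, address = parseaddr(from_header)
    return "inbox" if address.lower() in keep else "unsorted"
```

    Like any filter on the From header, this trusts whatever address the message claims, so it is defeated by exactly the forgery described above.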

    Client Proxy Filtering

    This software sits on a machine within your home LAN, or on a single machine. It fetches email from your POP servers and can hold that email for use by your email program; in other words it acts as a "proxy" for your email software. The main advantage is that this software can be highly optimized for filtering out spam, and if you have a home LAN it can serve multiple machines and multiple accounts. See however limits to filtering. I'm still exploring this; I'd like a low cost solution that would run as a "service" on a Windows 2000 workstation (so it is active regardless of who is logged in).
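
    The fetching half of such a proxy can be sketched in Python. This is an outline under assumptions, not a product: conn is expected to behave like a logged-in poplib.POP3 connection from the standard library, looks_like_spam is whatever test you supply, and serving the kept mail back to your email program is left out entirely.

```python
import email


def fetch_and_sort(conn, looks_like_spam):
    """Fetch every message from a POP3-style connection and split it in two.

    `conn` is expected to behave like poplib.POP3: stat() returns
    (message_count, mailbox_size) and retr(i) returns (response, lines, octets).
    """
    kept, junk = [], []
    count, _size = conn.stat()
    for number in range(1, count + 1):        # POP3 numbers messages from 1
        _resp, lines, _octets = conn.retr(number)
        message = email.message_from_bytes(b"\r\n".join(lines))
        (junk if looks_like_spam(message) else kept).append(message)
    return kept, junk
```

    With a real account you would create the connection with poplib.POP3_SSL (the host name and credentials being your own), log in with user() and pass_(), then hand it to fetch_and_sort.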

    Internet Service Provider Filtering

    OK, time for me to try to grab credit. In April of 1997 I wrote to MindSpring and EarthLink, innovative and well-regarded ISPs, and suggested they offer spam filtering as a value-added service to subscribers, like AOL already did. Only MindSpring responded, saying it sounded like a good idea. MindSpring now offers spam filtering. I don't know how good a job they've done, but it was reason enough for me to switch to MindSpring.

    This makes sense. An ISP, with feedback from customers, can maintain a list of the many and ever-changing IP addresses used by spammers. (An LDAP directory service of spammers, maintained by a consortium of ISPs, would be most helpful for this effort.) Of course spammers can also send email from non-spammer servers and disguise the origins, so there are limits to IP filtering. The next step is to provide optional content filtering. Personally, I'd accept the small risk of blocking a legitimate email message based on a content filter. (The blocked email would be returned to sender of course.)
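
    The IP list check itself is simple. A minimal Python sketch using the standard ipaddress module; the blocked ranges shown are reserved documentation addresses standing in for a real, customer-fed list:

```python
import ipaddress

# Placeholder block list; a real one would be maintained from customer reports.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.17/32"),
]


def is_blocked(sender_ip, blocked=BLOCKED_NETWORKS):
    """Return True if the connecting server's IP falls in any blocked range."""
    address = ipaddress.ip_address(sender_ip)
    return any(address in network for network in blocked)
```

    This only sees the address of the server that connected, so spam relayed through an innocent server passes the check.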

    The limits of filtering: Custom address forgery
    (or, how spam and viruses/worms converge)

    All of the approaches outlined above are useful, but as of early 2001 address forgery has become more sophisticated. I own some domains, and I routinely get spam where the sender's address is from someone in my domain, including myself! Of course they (or I) never sent it; it's simply that the spammers now do custom address forgery. For example, when sending email to somebody@somewhen.how they forge the return address as somebody@somewhen.how. So now one must filter out one's own email address!

    The next step in spam evolution is to start using valid addresses from other persons within the domain. In this way spam and internet worms/viruses are showing convergent evolution, both using the same technical and social engineering techniques to bypass filters and security systems and to then be activated by a human being (spam seeks to be read, worms to be activated).

    Another problem with filtering is that client side filtering works with headers that are part of the message. It turns out that the real address information that routes mail is not a part of the message content; it's part of the SMTP envelope (the sender and recipient addresses the sending server supplies during delivery), which mail servers use and then typically discard. Special purpose spam software can create message headers that bear little resemblance to the addresses actually used to route the message.
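
    To make the distinction concrete, here is a small Python sketch. The From header is part of the message and can say anything; the envelope sender (the address given in the SMTP MAIL FROM command) travels separately. In this example the envelope sender is just a variable, because a mail client normally never sees it unless the receiving server records it in a header such as Return-Path. The addresses and the function name are invented.

```python
from email.message import EmailMessage
from email.utils import parseaddr


def header_matches_envelope(message, envelope_sender):
    """Compare the From header with the address actually used for delivery."""
    _, header_sender = parseaddr(str(message["From"]))
    return header_sender.lower() == envelope_sender.lower()


# A forged message: the header claims to come from the recipient.
msg = EmailMessage()
msg["From"] = "somebody@somewhen.how"
msg["To"] = "somebody@somewhen.how"
msg.set_content("Buy now!")

# What the sending software actually told the mail server (hypothetical).
envelope_sender = "bulk@spammer.example"
```

    A check like this has to run on the receiving server, which is one more argument for filtering there rather than on the client.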

    You can think of spam and worms/viruses as being quite similar; it's just that spam preys on humans and worms on computers. See also The Ultimate Solution.

    Spam Scams -- WATCH OUT!

    Watch out for these tricky scams:

    Miscellaneous Approaches

    SpamCop

    Recommended. SpamCop provides a free automated spam analysis and reply generation service. It also, for about $30/year, will provide extremely robust spam filtering. If you really hate spam, then sign up for SpamCop. If you post on newsgroups, you should use a spamcop.net address.

    Web Resources for Fighting Spam

    Coalition Against Unsolicited Commercial Email
    Congressional anti-spam movement.
    David E. Sorkin - U.S.F. Law Review Spam Article: 2001
    An authoritative review of legal and technical issues.
    Gmail, spam and identity management
    The unexpected consequences of Gmail's defective spam filtering
    Mail Abuse Prevention System LLC
    A very serious not-for-profit anti-spam organization
    PC Magazine Review: May 2002
    Bret Glass wrote this, and he's wise.
    Ric Ford's Macintouch Anti-Spam Page
    A superb well maintained summary from a well known Macintosh columnist. Includes links to his columns on the topic.
    Slashdot Discussion, Nov 2000
    A fairly technical discussion that covers techniques largely unavailable to the public.
    SpamCop
    An automated anti-spam service.
     

    History

    Footnotes

    [1] This sounds "simple", but validating a digital signature has quite a bit of overhead. For each message one must access the sender's public key and apply it to the digital signature to validate their address.
    [2] Authenticating the mail sender (or the sending machine) also eliminates spam. Among many other things, Microsoft Palladium does this, but this solution requires a LOT of software upgrades and a complex infrastructure. It's more than we need.
    [3] I should know! I've had a half-dozen messages I sent from my work address to my home address filtered out. Believe me, they were not spam.
    [4] Optionally one could filter and just assign metadata attributes for use by client side software, but that's overkill.

    Metadata - Keywords

    Since Google does not use indexing information stored in meta tags, I've reproduced some of the meta tags here to facilitate indexing.

    <meta name="author" content="John G. Faughnan">
    <meta name="description" content="Another personal page on spam and how to manage it.">
    <meta name="keywords" content="spam, junk mail, junk email, junkmail, advertisement, net ads, newsgroup, marketing, CyberPromotions, english,us-en,jfaughnan,jgfaughnan,.en">


    Author: John G. Faughnan.  The views and opinions expressed in this page are strictly those of the page author. Pages are updated on an irregular schedule; suggestions/fixes are welcome but they may take weeks to years to be incorporated. Anyone may freely link to anything on this site and print any page; no permission is needed for citing, linking,  printing, or distributing printed copies.