Application Spam Filtering

As a last resort, you can set up your own spam blocker with your email application’s built-in filter capability. Be forewarned, however, that this approach requires initial setup and regular maintenance to remain effective enough to catch most of the spam sent your way. At the same time, it is an option under your control and provides a countermeasure if you have no other choices.

When reading spam email to obtain keywords for application-level filters, remember to keep your Internet connection turned off or your firewall locked so that images in the email don’t load and signal to the spammers that your address is active (fortunately, some email applications, such as Thunderbird, have this protection built in). The following guidelines describe how to set up application-level spam filters:

  • Mailbox. Create a special mailbox called “Junk-filter”, or something similar, into which you will direct the filtered mail instead of deleting it. Then, every once in a while, scan the junk mailbox to gauge the success of your filters and make sure no legitimate email is being trapped. You can also use this archive to help tune your filters as described under Selection below.
  • Guards. Most email programs let you create a set of “guard filters” that recognize known addresses, greatly reducing the potential of filtering legitimate email by mistake. If your application has this capability, you should set these up first, and add a new guard every time you add a new address to your address book.

    At the top of the filter list, so they fire before all the others, create a set of guard filters to recognize the email addresses of your friends, family, and current mailing lists. The corresponding action should be “skip the rest” or its equivalent. Any email from a usual correspondent will then be kicked out of the filter engine as soon as it is recognized and will not be examined by the rest of the spam filters, thereby greatly reducing false positives (legitimate email filtered by mistake).

  • Filters. Now you are ready to create a set of filters that recognize common spam keywords and transfer matching email into the Junk-filter mailbox. Start with an initial set of filters based on the spam you have been getting most recently, constructed to trap the most common spam keywords and phrases. Some examples are listed below:
    • Subject contains “mortgage” and
      Subject contains “loan”
    • Subject contains “medication” and
      Subject contains “free shipping”
    • Body contains “remove me” and
      Body contains “click here”
  • Maintenance. After you set up your initial set of filters, monitor the spam you still receive and add a couple of rules each time you check your mail, targeting the most common examples still making it through the filter engine. Over time, the amount of spam that gets through will shrink, and what remains will be of such an unusual nature (for example, weird subject lines) that it will be easily recognized as spam by the human eye and easily deleted.
  • Selection. There are two basic goals when drafting a spam filter: make the rule broad enough to be effective at catching spam, and make the rule specific enough to avoid trapping legitimate email by mistake. The following guidelines help in creating good spam filter rules:
    • Tuning. You can occasionally go through your trash, which consists largely of spam you had to delete manually, sort the mailbox by subject, and look for common patterns and keywords. You can then create a few new filters to catch similar email, fine-tuning your engine to catch more of the spam that has been escaping it.
    • Efficiency. If you wonder whether a rule is worth creating, you can use the filter mailbox as an archive to test the rule. Search the spam mailbox for the filter condition you are considering. If none (or very little) of the spam you’ve received so far matches the condition, the rule is probably ineffective and not worthwhile.
    • Safety. Always use guard rules as described above to minimize the chance of trapping legitimate email. However, you still want to remain accessible to the world and new correspondents. If you wonder whether a rule is too broad and might catch legitimate email by mistake, search your existing mailboxes for the filter condition. If one or more legitimate emails you’ve already received match the condition, the rule is probably too broad and needs to be made more spam-specific by adding keywords or conditions.
    • Fields. With most email applications, filters can target the sender, subject, message body, and other fields. However, because the sender and most other header fields can be faked, the subject and message body are the best targets for spam filters.

      The subject has both efficiency and safety advantages. Filters that act on the subject are more effective because the subject is plain text and easy to match, whereas HTML content in the body is easier to disguise. Rules that target the subject are also less likely to catch legitimate email by mistake than those that target the body, because there is less chance of a sender using a spam-like phrase in a few words of subject than in many paragraphs of body.
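
The filter order described above (guards first, then keyword filters) and the Efficiency and Safety checks can be sketched in a few lines of Python. This is only an illustration of the logic, not how any particular email application implements its filters. The names Message, Rule, classify, and evaluate, along with the example address, are made up for this sketch; each rule fires only when all of its keywords appear in the chosen field, mirroring the "and" conditions in the examples above.

```python
from dataclasses import dataclass


@dataclass
class Message:
    sender: str
    subject: str
    body: str = ""


@dataclass
class Rule:
    target: str      # which field to search: "subject" or "body"
    keywords: tuple  # ALL keywords must appear (case-insensitive)

    def matches(self, msg):
        text = getattr(msg, self.target).lower()
        return all(k.lower() in text for k in self.keywords)


# Hypothetical guard list: addresses from your address book (lowercase).
GUARDS = {"friend@example.com"}

# The example filters from the Filters guideline.
RULES = [
    Rule("subject", ("mortgage", "loan")),
    Rule("subject", ("medication", "free shipping")),
    Rule("body", ("remove me", "click here")),
]


def classify(msg, guards=GUARDS, rules=RULES):
    """Return the mailbox a message should land in."""
    # Guards fire first: known senders "skip the rest" of the filters.
    if msg.sender.lower() in guards:
        return "Inbox"
    # Otherwise the first matching spam rule moves the message aside.
    if any(rule.matches(msg) for rule in rules):
        return "Junk-filter"
    return "Inbox"


def evaluate(rule, junk_archive, legit_archive):
    """Count rule hits in the junk archive (efficiency) and in
    legitimate mail (safety). Returns (junk_hits, legit_hits)."""
    junk_hits = sum(1 for m in junk_archive if rule.matches(m))
    legit_hits = sum(1 for m in legit_archive if rule.matches(m))
    return junk_hits, legit_hits
```

A candidate rule is worth keeping when evaluate reports many hits in the Junk-filter archive and none in your legitimate mailboxes. If the second number is greater than zero, add another keyword to the rule and test it again.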