Contents
Abstract
Scope
Abbreviations
1. Introduction
2. Overview
  2.1 Email Technologies
    2.1.1 Simple Mail Transfer Protocol And Extensions
    2.1.2 Post Office Protocol And Internet Message Access Protocol
    2.1.3 Multipurpose Internet Mail Extensions, Pretty Good Privacy, And Secure Multipurpose Internet Mail Extensions
    2.1.4 Web-Based Technologies
  2.2 Categorizing Spam
    2.2.1 Email Spam
    2.2.2 Spam For Instant Messaging
    2.2.3 Spam Over Internet Telephony
  2.3 Spam Sources
    2.3.1 Open Relays
    2.3.2 Disposable Accounts
    2.3.3 Proxies
    2.3.4 Compromised Hosts
3. Anti-Spam Technologies
  3.1 Message Filtering
    3.1.1 Content Filters
    3.1.2 Hashing Filters
    3.1.3 Statistical Filters
  3.2 Address Lists
    3.2.1 Domain Name System-Based Systems
    3.2.2 Dynamic Users Lists
  3.3 Client Server Authentication
    3.3.1 Simple Mail Transfer Protocol Authentication
    3.3.2 Post Office Protocol Before Simple Mail Transfer Protocol
    3.3.3 Transport-Layer Security
  3.4 Packet Filtering And Inspection
    3.4.1 Simple Mail Transfer Protocol Egress Filter
    3.4.2 Firewalls
    3.4.3 Traffic Monitoring And Rate Limiting
4. Emerging Technologies
  4.1 Domain Authentication
    4.1.1 Sender Policy Framework
    4.1.2 Sender Identification
    4.1.3 Domain Keys
    4.1.4 Identified Internet Mail
  4.2 Internet Protocol Version 6
  4.3 Presence
Discussion
Conclusion
Abstract
This paper details the issues relating to the distribution and prevention of unsolicited electronic messaging, or spam. The document provides an overview of existing and emerging technologies used to combat spam. The methods used to distribute spam and evade anti-spam technologies are also discussed. The goal of this paper is to help explain the technical methods involved, in order to improve understanding of the issues at stake.
Scope
This paper covers the topics surrounding email-based spam and the technologies used to prevent it. The technologies explained in this report are meant to capture the current state of the art, and should not be considered exhaustive. This document will be revised in the future to reflect the changing landscape of anti-spam technologies and, as such, should be considered a living document.
Abbreviations
CAN-SPAM  Controlling the assault of non-solicited pornography and marketing
DNS       Domain name system
DNSBL     Domain name system block list
IIM       Identified Internet mail
IMAP      Internet message access protocol
MAPS      Mail abuse prevention system
MARID     Mail transfer agent authorization records in DNS
MASS      Message authentication signature standards
MIME      Multipurpose Internet mail extensions
MTA       Mail transfer agent (server)
MUA       Mail user agent (client)
MX        Mail exchanger
POP       Post office protocol
RBL       Real-time black hole list
SMTP      Simple mail transfer protocol
SPAM      Self-promotional advertising message
SPIM      Spam for instant messaging
SPIT      Spam for Internet telephony
SRS       Sender rewriting scheme
TCP       Transmission control protocol
TLS       Transport layer security
UBE       Unsolicited bulk email
UCE       Unsolicited commercial email
VoIP      Voice over Internet protocol
1. Introduction
Electronic messaging has been one of the key factors in the growth of the Internet. Users' ability to send messages to recipients on the other side of the world at nearly no cost has been very disruptive for other message-delivery methods such as fax and letter mail. The same low cost of delivery has enabled senders of unsolicited messages to use the same media. Some of these unsolicited messages have been classified as spam by users. In the past, spam was simply considered a nuisance by many users. However, in recent years, the volume of this type of unsolicited message has increased. Often, the message content is deceptive, fraudulent or offensive, and the source cannot easily be identified. The current state of electronic messaging has caused concern for many and has led to the development of anti-spam solutions. Solutions have come from a range of areas, including technological, judicial and political. This paper will explain the various technical solutions used to combat spam in electronic mail and related messaging technologies.
2. Overview
Email transmission remains relatively unchanged from the original model developed in the early 1980s. Figure 1 depicts a generic message exchange between a sender and receiver. There are variations to this model that may change the message flow, but the basic exchange remains the same. Variations might include mail relays, mail gateways or proxies, some sender-authentication techniques, web-based email, etc.
1. A client or mail user agent (MUA) first assembles a message to be sent. The MUA then establishes a simple mail transfer protocol (SMTP) connection with its mail transfer agent (MTA). The MUA uses SMTP commands to identify the sender and recipient, and transmits the message to the MTA. In this case, the MTA is a mail server hosted by the sender's Internet service provider (ISP).
2. Once the message has been received by the sender's MTA, the recipient's MTA must be located. Using the domain portion of the recipient's email address, a query to the domain name system (DNS) is issued requesting the mail exchanger (MX) record for the recipient's domain. The query will return a listing of the recipient's mail servers.
3. Once the address is known, a subsequent SMTP session is established and the message is transmitted to the recipient's MTA. Once the message is received, it will be stored for retrieval.
4. The recipient's MUA uses the post office protocol (POP) or Internet message access protocol (IMAP) to contact the server and retrieve any stored mail.
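The addressing steps of this exchange can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the example addresses and the `MX_ANSWER` list are hypothetical, and the DNS answer is hard-coded rather than queried, so nothing is sent over the network.

```python
from typing import List, Tuple


def recipient_domain(address: str) -> str:
    """Return the domain portion of an email address (the text after the last '@')."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not a valid address: {address!r}")
    return domain.lower()


def order_mail_exchangers(mx_records: List[Tuple[int, str]]) -> List[str]:
    """Order MX records so that the lowest preference value is tried first."""
    return [host for _, host in sorted(mx_records)]


def smtp_envelope(sender: str, recipients: List[str]) -> List[str]:
    """List the SMTP commands that identify the sender and the recipients."""
    commands = [f"MAIL FROM:<{sender}>"]
    commands += [f"RCPT TO:<{rcpt}>" for rcpt in recipients]
    commands.append("DATA")  # the message content follows this command
    return commands


# Hypothetical MX answer for example.org, as (preference, host) pairs.
MX_ANSWER = [(20, "backup.example.org"), (10, "mail.example.org")]

if __name__ == "__main__":
    print(recipient_domain("bob@example.org"))
    print(order_mail_exchangers(MX_ANSWER))
    print(smtp_envelope("alice@example.com", ["bob@example.org"]))
```

Note that nothing in the envelope commands proves that the sender address is genuine, which is the weakness the later sections on domain authentication address.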
1. J. Klensin, RFC 2821, Simple Mail Transfer Protocol, April 2001. 2. J. Klensin, N. Freed, M. Rose, E. Stefferud and D. Crocker, RFC 1869, SMTP Service Extensions, November 1995.
2.1.3 Multipurpose Internet Mail Extensions, Pretty Good Privacy, and Secure Multipurpose Internet Mail Extensions
Originally, plain text (7-bit ASCII) was the only content format specified for SMTP messages. To extend this format, multipurpose Internet mail extensions (MIME) were developed to carry pictures, data and multimedia content in email. MIME provided a way to send many types of content, but did not address the need for confidentiality in email. Confidentiality and authenticity have been addressed through two methods: pretty good privacy (PGP) extensions and secure/MIME (S/MIME). Both standards ensure that the message content cannot be viewed by anyone but the intended recipient and cannot be altered without detection. To protect a message using either technology, both the sender's and the recipient's clients must support the same method, either PGP or S/MIME. The mail servers do not need to support it, since they only transfer the message based on the headers, which are not encrypted.
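As an illustration of how MIME carries non-text content, the following sketch builds a two-part message with Python's standard `email` package. The helper name `build_message`, the addresses and the attachment are assumptions made for the example; the sketch does not cover PGP or S/MIME protection.

```python
from email.message import EmailMessage


def build_message(sender: str, recipient: str, subject: str,
                  text: str, attachment: bytes, filename: str) -> EmailMessage:
    """Build a MIME message with a plain-text body and one binary attachment."""
    msg = EmailMessage()
    # Headers stay readable by every mail server that handles the message.
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(text)  # becomes the text/plain part
    # Adding an attachment converts the message to multipart/mixed.
    msg.add_attachment(attachment, maintype="application",
                       subtype="octet-stream", filename=filename)
    return msg


if __name__ == "__main__":
    message = build_message("alice@example.com", "bob@example.org",
                            "Report", "See attached.", b"\x00\x01\x02", "data.bin")
    for part in message.iter_parts():
        print(part.get_content_type(), part.get_filename())
```

The binary part is base64-encoded so that it survives transport over a channel designed for 7-bit text.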
permitting unknown clients to relay mail is considered poor practice. The open relays that are publicly available have been widely abused by anonymous senders of spam. Many network operators have forbidden open relays on their network, and have methods to detect them. Once an open relay is located, it is usually listed so that other mail servers can be aware of it. The CAN-SPAM Act of 2003 includes penalties if a sender transmits spam through an open relay.3 However, these penalties are only effective if the sender is not anonymous. To transmit spam via an open relay, a sender must first identify one. Once a relay has been identified, the sender can use it to relay messages to targeted recipients. Senders often use dial-up accounts or connect via a proxy to conceal their identity. Once a connection is made, the message, including its list of recipients, is sent to the MTA, and the client can disconnect. The relay will then proceed to send the message to each recipient, without any further involvement from the client. The message headers will record each additional relay the message travels through, as well as the source address of the sender. This information can be used to notify the sender's service provider of the abuse, and usually results in termination of the source account. Automation software can be used to manage lists of open relays, lists of recipients and the success of message delivery. These tools are commercially available, and have enabled the distribution of unsolicited bulk email.
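The difference between an open relay and a restricted one comes down to a policy decision made before a message is accepted. A minimal sketch of such a decision follows; the names `LOCAL_DOMAINS`, `TRUSTED_NETWORKS` and `may_relay`, and the addresses used, are hypothetical and are not taken from any particular mail server.

```python
import ipaddress

# Hypothetical configuration: domains this server delivers to locally,
# and client networks that are allowed to relay to other domains.
LOCAL_DOMAINS = {"example.net"}
TRUSTED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def may_relay(client_ip: str, recipient: str) -> bool:
    """Decide whether to accept a message from client_ip addressed to recipient."""
    domain = recipient.rpartition("@")[2].lower()
    if domain in LOCAL_DOMAINS:
        return True  # final delivery to a local mailbox, not relaying
    address = ipaddress.ip_address(client_ip)
    # Relaying to foreign domains is allowed only for known client networks.
    return any(address in network for network in TRUSTED_NETWORKS)


if __name__ == "__main__":
    print(may_relay("192.0.2.10", "someone@elsewhere.example"))
    print(may_relay("203.0.113.5", "someone@elsewhere.example"))
```

An open relay behaves as if the final check always succeeded, accepting mail for any destination from any client.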
displaying a human-readable image of text that the user must decipher to confirm the validity of the account user.
2.3.3 Proxies
An SMTP proxy can give spammers anonymity, which is essential in avoiding prosecution. A proxy acts as a broker or intermediary between a client and the client's desired resource. In the case of email spam, a sender transmits a message to the proxy and the proxy then transmits the same message to the recipient. The recipient of the message knows only the address of the proxy, not the sender. Proxies have many legitimate uses, such as between corporate and public networks, and are typically used to temporarily store network resources or inspect the network traffic that passes through them. Similar to open relays, there are also open proxies, which fulfill requests on behalf of any client that connects to them. Open proxies can result from poor software configuration or, more likely, from software installed by a malicious user. Open proxies can be installed by malicious users exploiting software vulnerabilities (this method is further examined in Section 2.3.4). One of the obvious uses for open proxies is in sending spam while concealing the sender's identity. In order to identify a spammer behind a proxy, access to the proxy is often required. Proxies can also be chained together, adding another degree of difficulty for those trying to trace the source of the spam. Open-proxy lists are maintained, and network operators can use them to block incoming spam from these sources.4
4. http://opm.blitzed.org
Spam Task Force Network and Technology Working Group
A more advanced way of compromising a host is to exploit a software vulnerability that exists in either an operating system or application software. Once a vulnerability is found, a specific exploit needs to be developed. The exploit then needs a method of delivery. Delivery methods can be simple or complex, and can involve custom network packets, infected email attachments, infected peer-to-peer file-sharing downloads, web-based applications, or any of many other methods. To protect hosts from these exploits, software must be maintained (e.g. through application of vendor-issued patches, antivirus definition updates, secure configuration, etc.) and network connections must be protected (e.g. with a firewall). Some Trojan software provides only a control channel that can be used as a conduit to install other malicious programs in the future without the user's consent. In order for someone to control a group of compromised hosts, a communication channel needs to be established between the malicious user and the hosts. Often, these channels can be closed with a firewall and antivirus software can be used to remove the malicious software. An example of a virus that carried an open proxy as payload is SoBig, which increased email traffic in early 2003.
3. Anti-Spam Technologies
There are many solutions that can be used to combat the various types of email-based spam. Filtering messages based on their properties, blocking message senders, ensuring senders are authentic, and authorizing clients are all methods used to combat spam.
Bayesian filters produce a low percentage of false positives and do not require their rules to be updated by an administrator.5 The filter adapts by monitoring what the user classifies as spam, and adjusts likelihood ratios accordingly. Methods spammers have used to bypass Bayesian filters include inserting random low-likelihood elements in their messages. Insertion of these elements lowers the overall ratio so the message may not be filtered.
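The adaptive behaviour described above, and the evasion technique, can both be seen in a small naive Bayes sketch. The class name, the whitespace tokenizer, the add-one smoothing and the training messages are all assumptions chosen for brevity; this is not the filter from the cited report, and production filters are considerably more elaborate.

```python
import math
from collections import Counter


class NaiveBayesFilter:
    """Toy word-based filter that learns from messages the user labels."""

    def __init__(self) -> None:
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text: str, label: str) -> None:
        """Record one message that the user classified as 'spam' or 'ham'."""
        self.message_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def spam_log_odds(self, text: str) -> float:
        """Sum the log likelihood ratios of the words; above zero leans spam."""
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        slots = len(vocabulary) + 1  # one extra slot for words never seen
        totals = {label: sum(counts.values())
                  for label, counts in self.word_counts.items()}
        # Prior ratio of spam to legitimate messages, with add-one smoothing.
        score = math.log((self.message_counts["spam"] + 1)
                         / (self.message_counts["ham"] + 1))
        for word in text.lower().split():
            p_spam = (self.word_counts["spam"][word] + 1) / (totals["spam"] + slots)
            p_ham = (self.word_counts["ham"][word] + 1) / (totals["ham"] + slots)
            score += math.log(p_spam / p_ham)
        return score

    def is_spam(self, text: str, threshold: float = 0.0) -> bool:
        return self.spam_log_odds(text) > threshold


def trained_example() -> NaiveBayesFilter:
    """Return a filter trained on four hypothetical messages."""
    f = NaiveBayesFilter()
    f.train("cheap pills buy now", "spam")
    f.train("buy cheap watches now", "spam")
    f.train("meeting agenda for monday", "ham")
    f.train("lunch on monday with the team", "ham")
    return f


if __name__ == "__main__":
    f = trained_example()
    print(f.is_spam("buy cheap pills"))
    print(f.spam_log_odds("buy cheap pills"))
    # Padding the same message with innocuous words lowers its score.
    print(f.spam_log_odds("buy cheap pills monday meeting agenda team lunch"))
```

The last two lines show why padding works: every word that appears mostly in legitimate mail contributes a negative term to the sum.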
5. Kai Wei, A Naive Bayes Spam Filter, fall 2003 (www.eecs.berkeley.edu/~kwei/courses/cs281a/cs281a.pdf). 6. www.mail-abuse.com
7. J. Myers, RFC 2554 SMTP Service Extension for Authentication, March 1999.
SMTP authentication can prevent spammers from gaining unauthorized use of mail servers. However, if a spammer is able to compromise a user account, they can then send messages without further restrictions. The original specifications allowed for the use of weak passwords, so it is essential to ensure that a strong password algorithm is used.8
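To illustrate why the strength of the authentication mechanism matters, the sketch below encodes and decodes credentials in the style of the simple PLAIN mechanism. The choice of PLAIN is an assumption made for illustration (the text above does not name a mechanism), and the function names are hypothetical. The point is that base64 is an encoding, not encryption, so anyone who observes the exchange can recover the password unless the session is otherwise protected.

```python
import base64
from typing import Tuple


def auth_plain_response(username: str, password: str) -> str:
    """Encode credentials as a PLAIN-style response: NUL, user, NUL, password."""
    raw = f"\0{username}\0{password}".encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def decode_auth_plain(response: str) -> Tuple[str, str]:
    """Recover the credentials from an observed response; no key is required."""
    _, username, password = base64.b64decode(response).decode("utf-8").split("\0")
    return username, password


if __name__ == "__main__":
    observed = auth_plain_response("alice", "s3cret")
    print(observed)
    print(decode_auth_plain(observed))
```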
8. Ibid.
3.4.2 Firewalls
Firewalls can prevent the unauthorized sending of mail by infected hosts, and can help protect insecure hosts from becoming infected. A firewall can allow or deny inbound or outbound connection attempts by or to a host. For instance, most network worms spread by sending connection attempts to random hosts. If an incoming connection attempt is blocked by a firewall, the worm will be unable to infect the host. Firewalls can also be used to allow or deny outgoing connections (i.e. egress filtering) and can be applied not just to SMTP, as described in the previous section, but to any service. If a host is compromised and is being used to transmit spam, a host-based firewall can also alert the user to the activity.
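An egress rule for SMTP can be expressed as a simple decision on the source address and destination port of an outgoing connection. The sketch below assumes a network with one designated mail server; `MAIL_SERVERS`, `allow_outbound` and the addresses are hypothetical, and a real firewall would express the same rule in its own configuration language.

```python
import ipaddress

SMTP_PORT = 25
# Hypothetical: the only host on this network permitted to send mail directly.
MAIL_SERVERS = {ipaddress.ip_address("192.0.2.25")}


def allow_outbound(source_ip: str, dest_port: int) -> bool:
    """Allow outbound SMTP only from designated mail servers."""
    if dest_port != SMTP_PORT:
        return True  # this rule only concerns SMTP; other traffic is not judged here
    return ipaddress.ip_address(source_ip) in MAIL_SERVERS


if __name__ == "__main__":
    print(allow_outbound("192.0.2.25", 25))   # the mail server itself
    print(allow_outbound("192.0.2.77", 25))   # e.g. a compromised desktop
    print(allow_outbound("192.0.2.77", 443))  # unrelated traffic
```

With such a rule in place, a compromised desktop can no longer deliver spam directly to remote MTAs and must go through the operator's mail server, where other controls apply.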
4. Emerging Technologies
4.1 Domain Authentication
Domain-authentication technologies are used to ensure that a sender's domain is not forged or spoofed. Because of the openness of SMTP, a sender is able to forge another sender's identity. This protocol weakness is frequently exploited by spammers. To avoid prosecution by legal authorities or termination of service by an ISP, spammers must remain anonymous. Protocol enhancements are, therefore, necessary to ensure the authenticity of a sender's address. The following section discusses these enhancements. The primary stakeholders behind the technologies are AOL for the sender policy framework (SPF), Microsoft for Sender ID, Yahoo! for Domain Keys, and Cisco Systems for Identified Internet Mail (IIM).
9. www.imc.org/ietf-mxcomp/mail-archive/msg05054.html
One known problem with SPF arises when an MTA forwards mail on behalf of a recipient. The forwarding MTA passes the original sender's domain through without modification, but its own address is not listed in that domain's SPF record, so verification fails at the final destination. The sender rewriting scheme (SRS) has been one way of solving the forwarding issue for SPF; the other has been Sender ID. A general concern is that spammers may still be able to register SPF records for domains that can be used to send authenticated spam email. In this case, the spam may reach the recipient if alternative methods are not in place, but the message's source remains known. If the source of a message is known, methods such as blacklisting, registrar notification and legal prosecution can be used. Disposable domains can let spammers send authenticated messages and then simply register a new domain once the abuse has been detected. This problem has not yet been solved, but a partial solution may be domain accreditation.
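The forwarding problem can be reproduced with a deliberately simplified SPF evaluator. The sketch below handles only the `ip4` and `all` mechanisms and silently skips everything else, so it is a teaching aid under those assumptions rather than a conforming implementation; the function name, the record and the addresses are hypothetical.

```python
import ipaddress

RESULTS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def spf_check(record: str, client_ip: str) -> str:
    """Evaluate a small subset of SPF (ip4 and all) against a connecting address."""
    terms = record.split()
    if not terms or terms[0] != "v=spf1":
        return "none"
    address = ipaddress.ip_address(client_ip)
    for term in terms[1:]:
        qualifier = "+"  # a mechanism without a qualifier means pass
        if term[0] in RESULTS:
            qualifier, term = term[0], term[1:]
        if term.startswith("ip4:"):
            matched = address in ipaddress.ip_network(term[4:], strict=False)
        elif term == "all":
            matched = True
        else:
            continue  # mechanisms outside this sketch are skipped
        if matched:
            return RESULTS[qualifier]
    return "neutral"


if __name__ == "__main__":
    record = "v=spf1 ip4:192.0.2.0/24 -all"  # hypothetical record for the sender's domain
    print(spf_check(record, "192.0.2.10"))    # the domain's own MTA
    print(spf_check(record, "198.51.100.7"))  # a forwarding MTA elsewhere
```

The second call represents a forwarder that relays the message with the original sender's domain intact: its address is not covered by the record, so the check does not pass even though the message is legitimate.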
10. www.imc.org/ietf-mxcomp/mail-archive/msg04673.html
The Domain Keys method uses an asymmetric key algorithm with public and private keys. The sending domain requires a Domain Keys-enabled MTA, which uses a private key to sign each outgoing message. Upon receiving a signed message, the receiving MTA looks up the sender's domain to find the public key for that domain. The public key is then used to verify that the signature of the sender is valid. If the signature proves to be valid, the recipient can be sure of the sender's domain. As with other domain-authentication protocols, only the MTAs have to support the technology.
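The sign-and-verify flow can be sketched with textbook RSA arithmetic. Everything below is an assumption made for illustration: the tiny key built from two well-known Mersenne primes is far too weak for real use, there is no padding or header canonicalization, the `PUBLISHED_KEYS` dictionary merely stands in for the lookup of the domain's public key, and none of it reflects the actual Domain Keys wire format.

```python
import hashlib
from typing import Dict, Tuple

# Toy key pair (insecure, for illustration only). Requires Python 3.8+ for pow(x, -1, m).
P, Q = 2**61 - 1, 2**89 - 1          # two known Mersenne primes
N = P * Q                            # public modulus
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent, kept by the sending MTA

# Stand-in for the published record where the receiving MTA finds the public key.
PUBLISHED_KEYS: Dict[str, Tuple[int, int]] = {"example.com": (E, N)}


def digest(message: bytes, modulus: int) -> int:
    """Hash the message and reduce it so it fits under the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % modulus


def sign(message: bytes) -> int:
    """Sending MTA: sign the message digest with the private key."""
    return pow(digest(message, N), D, N)


def verify(domain: str, message: bytes, signature: int) -> bool:
    """Receiving MTA: fetch the domain's public key and check the signature."""
    key = PUBLISHED_KEYS.get(domain)
    if key is None:
        return False  # the domain publishes no key
    e, n = key
    return pow(signature, e, n) == digest(message, n)


if __name__ == "__main__":
    mail = b"From: alice@example.com\r\n\r\nHello"
    signature = sign(mail)
    print(verify("example.com", mail, signature))
    print(verify("example.com", mail + b" (altered)", signature))
```

Only the holder of the private exponent can produce a signature that verifies against the published key, which is what ties the message to the sending domain.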
4.3 Presence
The concept of presence is being used to provide location-aware services for IM and VoIP applications. Information on a user's state and geographical location can be accessed by these applications. This information must be dealt with carefully, however. Some wireless spam already exploits this information by using location-specific advertising, for example in airports. If the information is not properly secured, both IM and VoIP technologies could be vulnerable to abuse by spammers.
Discussion
With the volume of spam increasing, a more coordinated approach is needed to combat it. Several network operator industry associations have developed recommendations and best practices to combat spam. One of the first to put forth recommendations was the Anti-Spam Technical Alliance, which is supported by large-network operators and service providers.11 Their recommendations address known issues in limiting the conventional sources of spam, such as open relays, proxies and compromised hosts. The recommendations are seen as largely beneficial and, if implemented, increase costs for the senders of spam. It has yet to be seen if a single sender-authentication method can be globally adopted and, if so, whether it can decrease the volume of spam that is on the network today. As mentioned in Section 4.1.2, the IETF's MARID working group was not able to come to a consensus on a single, global method. As a result, discussions continue about various sender-authentication methods. Other emerging technologies, such as cryptographic signature methods like IIM, may still prove to be a better solution. However, the most widely adopted and currently available method of sender authentication continues to be the classic SPF, SPF version 1. Spam will only stop once the methods for transmitting spam through given media are no longer cost-effective. The costs of sending spam using SMTP will increase with the technical measures discussed here. In the future, spam will migrate to other, more cost-effective technologies, such as IM or VoIP.
Conclusion
The current drive to fight spam has increased the costs of sending email-based spam, and has produced methods that can be applied to other media. Anti-spam technologies must continually be developed and implemented in a coordinated way in order for them to be effective. Emerging technologies have helped remove anonymity from spam, and can also be applied to media other than email. These technologies, when properly used, should also benefit other anti-spam measures, such as legal action. Even with the use of advanced anti-spam technologies, an insecure host can easily be used to sidestep many of these measures. This issue must be addressed through education and the increased awareness of common users. To lessen the burden on users, technological solutions must be as transparent as possible to them.
11. Anti-Spam Technical Alliance, Anti-Spam Technical Alliance Technology and Policy Proposal, June 2004 (http://docs.yahoo.com/docs/pr/pdf/asta_soi.pdf).
The descriptions of the anti-spam technologies discussed in this paper are intended to provide an overview of current technological methods. New issues that arise will be included in subsequent revisions of this living document.
References
Anti-Spam Technical Alliance, Anti-Spam Technical Alliance Technology and Policy Proposal, June 2004 (http://docs.yahoo.com/docs/pr/pdf/asta_soi.pdf).
J. Klensin, RFC 2821, Simple Mail Transfer Protocol, April 2001.
J. Klensin, N. Freed, M. Rose, E. Stefferud and D. Crocker, RFC 1869, SMTP Service Extensions, November 1995.
J. Lyon, Purported Responsible Address in E-Mail Messages, August 2004 (http://draft-ietf-marid-pra-00.txt).
J. Myers, RFC 2554, SMTP Service Extension for Authentication, March 1999.
P. Resnick, RFC 2822, Internet Message Format, April 2001.
United States Federal Trade Commission, Public Law 108-187, Controlling the Assault of Non-Solicited Pornography and Marketing Act of 2003 (CAN-SPAM Act of 2003), 2003.
Kai Wei, A Naive Bayes Spam Filter, fall 2003 (www.eecs.berkeley.edu/~kwei/courses/cs281a/cs281a.pdf).