This presentation uses some slides from the lecture slides of Associate Prof.
Tran Quang Anh from FIT - HANU
Anti-spam
Group No 2C12
Contents
1.Background knowledge
2.Spam
3.Anti spam techniques
4.An introduction to Gmail anti-spam
5.Q&A
1. Background knowledge
1.1 Email format: 2 components
• Header
• Body
Separated by a blank line.
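A minimal sketch of the header/body split, using Python's standard `email` package. The sample message and the address alice@example.com are made up for illustration; only the recipient address comes from these slides.

```python
from email import message_from_string

# Hypothetical raw message: header lines, one blank line, then the body.
RAW_MESSAGE = (
    "From: alice@example.com\n"
    "To: manhnv@hanu.edu.vn\n"
    "Subject: Hello\n"
    "\n"
    "This is the body.\n"
)

def split_message(raw):
    """Return (headers, body) for a simple, single-part message."""
    msg = message_from_string(raw)
    headers = dict(msg.items())   # header name -> header value
    body = msg.get_payload()      # everything after the blank line
    return headers, body

headers, body = split_message(RAW_MESSAGE)
print(headers["Subject"])
print(body)
```

Everything before the first blank line is parsed as header fields; everything after it is treated as the body.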
1. Background knowledge
Header fields are grouped into PRIMARY FIELDS, SECONDARY FIELDS, and MIME FIELDS:
1. From
2. To
3. Subject
4. Date
5. Message-ID
6. Bcc (Blind Carbon Copy)
7. Cc (Carbon Copy)
8. Content-Type
9. Importance
10. In-Reply-To
11. Precedence
12. Received
13. Return-Path
14. Sender
15. X-Originating-IP
16. MIME format
17. Content encoding
18. Content type
19. Content-Disposition
1. Background knowledge
1.2 Email sending steps
If the Gmail server wants to send an email to
manhnv@hanu.edu.vn, it will:
Step 1: Look up the MX record of hanu.edu.vn to find
the receiving mail server and its IP address
Step 2: Connect to port 25 on that IP
address
Step 3: Follow the SMTP protocol
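The three steps can be sketched without touching the network. The MX host names, the sender address, and the helper names below are hypothetical. A real sender would obtain the MX records from DNS and send the commands over a TCP connection to port 25.

```python
SMTP_PORT = 25  # Step 2: the port the sending server connects to

def pick_mail_server(mx_records):
    """Step 1: choose the MX host with the lowest preference value."""
    preference, host = min(mx_records, key=lambda record: record[0])
    return host

def smtp_dialogue(helo_name, sender, recipient, body):
    """Step 3: the basic SMTP commands, in the order the client sends them."""
    return [
        f"HELO {helo_name}",
        f"MAIL FROM:<{sender}>",
        f"RCPT TO:<{recipient}>",
        "DATA",
        body,
        ".",      # a line holding a single dot ends the message data
        "QUIT",
    ]

# Hypothetical MX records for hanu.edu.vn as (preference, host) pairs.
mx_records = [(20, "mx2.hanu.edu.vn"), (10, "mx1.hanu.edu.vn")]
server = pick_mail_server(mx_records)
commands = smtp_dialogue("gmail.com", "someone@gmail.com",
                         "manhnv@hanu.edu.vn", "Subject: Hi\r\n\r\nHello")
print(server, SMTP_PORT)
print("\n".join(commands))
```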
2. Email Spam
2.1 What is email spam?
UBE (Unsolicited Bulk Email)
The same content sent in a large number of emails
Purposes: Advertisement,
phishing, spreading malware, etc.
2. Email Spam
2.2 Why does email spam exist?
o Technical considerations
• The sender is anonymous
• Internet access (email, ADSL) is prevalent
o Economic considerations
• Low cost of sending an email
• Demand for advertising
2. Email Spam
2.3 Problems caused by
email spam:
o Denial of service (full mailbox,
legitimate emails deleted by mistake)
2. Email Spam
2.3 Problems caused by email
spam:
o Virus
2. Email Spam
2.3 Problems caused by email
spam:
o Phishing
3. Anti-spam
3.1 Anti-spam framework:
3. Anti-spam
3.2 Anti-spam techniques
o Content-based method
o Header-based method
o Protocol-based method
o Sender authentication
o Social network
3. Anti-spam
Content-based method
o Analyze the frequency of top keywords in the email (SpamAssassin)
o Effective algorithm: Bayesian filtering
o Example: giá (price), cơ hội (opportunity), siêu (super), miễn phí (free) (Vietnamese keywords); free, like,
subscribe, Facebook, hot deal, sale off (English keywords)
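A minimal sketch of Bayesian keyword filtering, assuming a tiny hand-made training set and add-one smoothing. It illustrates the idea only and is not how SpamAssassin is implemented. The training messages and function names are invented for this example.

```python
import math
from collections import Counter

def train(messages):
    """Count word occurrences per class. messages: list of (text, is_spam)."""
    counts = {True: Counter(), False: Counter()}
    totals = {True: 0, False: 0}
    for text, is_spam in messages:
        counts[is_spam].update(text.lower().split())
        totals[is_spam] += 1
    return counts, totals

def spam_probability(text, counts, totals):
    """Naive Bayes estimate of P(spam | text), with add-one smoothing."""
    vocab = set(counts[True]) | set(counts[False])
    spam_words = sum(counts[True].values())
    ham_words = sum(counts[False].values())
    # Start from the prior odds (assumes both classes appear in training).
    log_odds = math.log(totals[True] / totals[False])
    for word in text.lower().split():
        if word not in vocab:
            continue  # words never seen in training carry no evidence
        p_spam = (counts[True][word] + 1) / (spam_words + len(vocab))
        p_ham = (counts[False][word] + 1) / (ham_words + len(vocab))
        log_odds += math.log(p_spam / p_ham)
    return 1 / (1 + math.exp(-log_odds))

# Hypothetical training data built from the example keywords.
TRAINING = [
    ("free hot deal subscribe now", True),
    ("sale off free like", True),
    ("meeting notes for the lecture", False),
    ("lecture slides attached", False),
]
counts, totals = train(TRAINING)
print(spam_probability("free hot deal", counts, totals))
print(spam_probability("lecture notes", counts, totals))
```

A message is typically flagged as spam when the estimated probability exceeds a chosen threshold such as 0.5.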
3. Anti-spam
Header-based method
o Examines the headers of email messages to detect spam
o Approaches:
• Whitelist: the email addresses of legitimate senders are stored in a database
• Blacklist: schemes that collect the IP addresses of all known spammers
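The two approaches can be sketched as simple set lookups. The addresses and the IP below are placeholders. A real deployment would query a database or a shared blacklist service instead of in-memory sets.

```python
# Hypothetical lists, for illustration only.
WHITELIST = {"colleague@example.com"}   # known legitimate sender addresses
BLACKLIST = {"203.0.113.7"}             # IP addresses of known spammers

def classify_by_header(from_address, sender_ip):
    """Return 'accept', 'reject', or 'unknown' based on header data alone."""
    if from_address.lower() in WHITELIST:
        return "accept"
    if sender_ip in BLACKLIST:
        return "reject"
    return "unknown"   # leave the decision to other techniques

print(classify_by_header("Colleague@example.com", "198.51.100.4"))
print(classify_by_header("stranger@example.net", "203.0.113.7"))
```

In this sketch the whitelist is checked first, so a whitelisted address is accepted even if its IP is blacklisted. That ordering is a design choice, not something the slides prescribe.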
3. Anti-spam
Source: http://www.mcafee.com/threat-intelligence/ip/spam-senders.aspx
3. Anti-spam
Protocol-based method
3. Anti-spam
Sender authentication
o Spammers can fake their identity (they can claim to be anyone).
o Sender authentication (SA) addresses this problem.
o How does SA work?
1. SA adds a "marker" to the DNS server, which announces the designated email
servers for a specific domain.
2. A receiving server verifies whether a received email message actually came from one of these email
servers.
o Examples: Sender Policy Framework (AOL, HANU), SenderID (Microsoft),
DomainKeys (Yahoo)
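A simplified sketch of the two steps for SPF, assuming the DNS "marker" is a TXT record that uses only `ip4:` mechanisms. Real SPF records also support other mechanisms (such as `include`, `a`, and `mx`) that this sketch ignores. The record and the IP addresses are examples, not any real domain's configuration.

```python
import ipaddress

def designated_networks(spf_record):
    """Step 1: read the designated servers from the published record."""
    networks = []
    for term in spf_record.split():
        if term.startswith("ip4:"):
            networks.append(ipaddress.ip_network(term[4:], strict=False))
    return networks

def came_from_designated_server(spf_record, sender_ip):
    """Step 2: check whether the connecting IP is one of those servers."""
    ip = ipaddress.ip_address(sender_ip)
    return any(ip in network for network in designated_networks(spf_record))

# Hypothetical record, as a domain owner might publish it in DNS.
EXAMPLE_RECORD = "v=spf1 ip4:192.0.2.0/24 ip4:198.51.100.25 -all"

print(came_from_designated_server(EXAMPLE_RECORD, "192.0.2.55"))
print(came_from_designated_server(EXAMPLE_RECORD, "203.0.113.9"))
```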
3. Anti-spam
Social network
o PageRank (Google)
o Graph theory:
• Consider an email network in which the nodes
are users and the links are email
transaction activities
• Coefficient: low (do not exchange email
frequently), high
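The slide does not say which coefficient is meant. The sketch below assumes a local clustering coefficient (how often a user's contacts also email each other), which is only one possible reading. User names and transactions are invented.

```python
from collections import defaultdict
from itertools import combinations

def build_graph(transactions):
    """Nodes are users; an undirected link means two users exchanged email."""
    graph = defaultdict(set)
    for sender, recipient in transactions:
        graph[sender].add(recipient)
        graph[recipient].add(sender)
    return graph

def clustering_coefficient(graph, user):
    """Fraction of a user's contact pairs that are linked to each other."""
    neighbours = graph.get(user, set())
    if len(neighbours) < 2:
        return 0.0
    pairs = list(combinations(neighbours, 2))
    linked = sum(1 for a, b in pairs if b in graph[a])
    return linked / len(pairs)

# Hypothetical transactions: three friends who all email each other,
# and one bulk sender writing to strangers who never email each other.
transactions = [
    ("an", "binh"), ("binh", "chi"), ("chi", "an"),
    ("bulk_sender", "u1"), ("bulk_sender", "u2"), ("bulk_sender", "u3"),
]
graph = build_graph(transactions)
print(clustering_coefficient(graph, "an"))
print(clustering_coefficient(graph, "bulk_sender"))
```

Under this assumption, a low value suggests a sender whose recipients have no relationship with each other.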
4. Gmail anti-spam
4.1 Gmail anti-spam technique
o Gmail uses multiple techniques:
• SPF (Sender Policy Framework)
• DomainKeys
• DKIM (DomainKeys Identified Mail)
4. Gmail anti-spam
4.2 Gmail header format
o How to read a header? (Demonstration with web browser)
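Outside the browser, a header can also be read with a short script. This is a generic sketch using Python's standard `email` package on an invented message, not Gmail's actual header layout. It lists the IP addresses found in the Received fields, oldest hop first.

```python
import re
from email import message_from_string

# Hypothetical message: each server that handles it adds a Received
# field at the top, so the newest hop comes first.
RAW_MESSAGE = (
    "Received: from mx.example.net (mx.example.net [192.0.2.10])\n"
    "\tby mail.example.org; Mon, 1 Jan 2024 10:00:02 +0000\n"
    "Received: from sender-pc (unknown [203.0.113.7])\n"
    "\tby mx.example.net; Mon, 1 Jan 2024 10:00:00 +0000\n"
    "From: someone@example.com\n"
    "Subject: Hot deal\n"
    "\n"
    "Body text.\n"
)

IP_IN_BRACKETS = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")

def received_hops(raw):
    """Return the bracketed IPs from Received fields, oldest hop first."""
    msg = message_from_string(raw)
    hops = []
    for value in reversed(msg.get_all("Received", [])):
        match = IP_IN_BRACKETS.search(value)
        if match:
            hops.append(match.group(1))
    return hops

hops = received_hops(RAW_MESSAGE)
print(hops)
```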
Spam and Anti Spam Techniques
