E-Mail - Technical Overview


Published in: Technology

  1. 1. Email Technology Overview <ul><li>Ajit Dhumale </li></ul>
  2. 2. Contents <ul><li>How Email Works? </li></ul><ul><li>Email Protocols </li></ul><ul><ul><li>SMTP </li></ul></ul><ul><ul><li>POP3 </li></ul></ul><ul><ul><li>IMAP </li></ul></ul><ul><li>Email Standards </li></ul><ul><ul><li>RFC822 </li></ul></ul><ul><ul><li>MIME </li></ul></ul><ul><li>Email Server Products </li></ul><ul><ul><li>Cyrus, Sendmail, Exchange </li></ul></ul><ul><li>Resources for more information </li></ul>
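  3. 3. How Email Works? <ul><li>Components: outbound MTA (with queue), inbound MTA, mailstore (mailboxes), POP/IMAP </li></ul><ul><li>Send mail via MTA (using SMTP) </li></ul><ul><li>Route mails to their destination (the diagram also marks a path for internal email) </li></ul><ul><li>Fetch mails from mailstore (using POP/IMAP) </li></ul>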
  4. 4. Email Protocols: SMTP (RFC 821) <ul><li>Simple Mail Transfer Protocol </li></ul><ul><ul><li>Users use SMTP to send out mails </li></ul></ul><ul><ul><li>MTAs use SMTP to route mails to their destinations </li></ul></ul><ul><li>MTA: Mail Transfer Agent. It accepts mails from the users' MUAs (Mail User Agents, i.e. email clients) and then routes them to the appropriate destination. SMTP is the protocol most commonly implemented by MTAs </li></ul>
  5. 5. Typical tasks done by SMTP Server <ul><li>In a nutshell: once a mail is accepted, make the ‘best possible’ effort to deliver it to the appropriate destination </li></ul><ul><li>Accept email from MUA and validate the fields </li></ul><ul><li>Recipient/Sender resolution, expansion, translation </li></ul><ul><li>Envelope splitting and email routing (DNS-MX Record lookup etc.) </li></ul><ul><ul><li>External </li></ul></ul><ul><ul><li>Internal </li></ul></ul><ul><li>Queuing </li></ul><ul><li>Bouncing </li></ul><ul><li>Interact with systems like anti-spam, anti-virus, filtering, policy enforcement etc. </li></ul>
  6. 6. SMTP Protocol <ul><li>R: 220 smtp.pspl.co.in ESMTP Sendmail 8.12.9/8.12.9 </li></ul><ul><li>S: HELO mymachine </li></ul><ul><li>R: 250 smtp.pspl.co.in Hello [], pleased to meet you </li></ul><ul><li>S: MAIL FROM:<ajitd@persistent.co.in> </li></ul><ul><li>R: 250 OK </li></ul><ul><li>S: RCPT TO:<gtom@mirapoint.com> </li></ul><ul><li>R: 250 OK </li></ul><ul><li>S: RCPT TO:<ajitd@hotmail.com> </li></ul><ul><li>R: 250 OK </li></ul><ul><li>S: DATA </li></ul><ul><li>R: 354 Start mail input; end with <CRLF>.<CRLF> </li></ul><ul><li>S: Blah blah blah... </li></ul><ul><li>S: ...etc. etc. etc. </li></ul><ul><li>S: <CRLF>.<CRLF> </li></ul><ul><li>R: 250 OK </li></ul><ul><li>S: QUIT </li></ul><ul><li>R: 221 2.0.0 smtp.pspl.co.in closing connection </li></ul><ul><li>Figure (Model for SMTP Use): the user and a mail store sit behind the Sender-SMTP, which exchanges SMTP commands/replies and mail with the Receiver-SMTP, which is backed by its own mail store </li></ul><ul><li>Corner case: to avoid an endless loop of bounced mails, a bounce message is sent with an empty reverse path, MAIL FROM:<> </li></ul>
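The sender side of the session above can be sketched as a small function that produces the "S:" lines for a given envelope. This is only an illustration: the function name `smtp_client_lines` and the example addresses are hypothetical, server replies are not handled, and the body lines are assumed to be dot-stuffed already.

```python
def smtp_client_lines(helo_name, sender, recipients, body_lines):
    """Return the sender ('S:') lines of a minimal SMTP session.

    Hypothetical helper for illustration; a real client must also read
    and check the receiver's reply after each command.
    """
    lines = [f"HELO {helo_name}", f"MAIL FROM:<{sender}>"]
    # One RCPT TO per envelope recipient.
    lines += [f"RCPT TO:<{rcpt}>" for rcpt in recipients]
    lines.append("DATA")
    lines += list(body_lines)   # assumed to be dot-stuffed already
    lines.append(".")           # a lone dot ends the mail data
    lines.append("QUIT")
    return lines


if __name__ == "__main__":
    for line in smtp_client_lines(
        "mymachine", "alice@example.com", ["bob@example.org"], ["Hello Bob"]
    ):
        print("S:", line)
```

Passing an empty string as `sender` yields `MAIL FROM:<>`, the empty reverse path used for bounce messages.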
  7. 7. SMTP Protocol Replies <ul><li>1yz: Positive Preliminary reply. </li></ul><ul><li>2yz: Positive Completion reply. e.g. 221 <domain> Service closing transmission channel </li></ul><ul><li>3yz: Positive Intermediate reply. e.g. 354 Start mail input; end with <CRLF>.<CRLF> </li></ul><ul><li>4yz: Transient Negative Completion reply. e.g. 452 Requested action not taken: insufficient system storage </li></ul><ul><li>5yz: Permanent Negative Completion reply. e.g. 500 Syntax error, command unrecognized </li></ul>
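A client usually only needs the first digit of a reply to decide what to do next. The sketch below classifies a reply line by that digit; the names `REPLY_CLASSES`, `classify_reply` and `should_retry` are made up for this example.

```python
REPLY_CLASSES = {
    "1": "positive preliminary",
    "2": "positive completion",
    "3": "positive intermediate",
    "4": "transient negative completion",
    "5": "permanent negative completion",
}


def classify_reply(line):
    """Return (code, class name) for an SMTP reply line such as '250 OK'."""
    code = line[:3]
    if len(code) != 3 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"not an SMTP reply: {line!r}")
    if code[0] not in REPLY_CLASSES:
        raise ValueError(f"unknown reply class: {line!r}")
    return int(code), REPLY_CLASSES[code[0]]


def should_retry(line):
    """Only 4yz replies are transient failures, so only those are retried."""
    return classify_reply(line)[1].startswith("transient")
```

With this split, a queueing MTA would keep a message in the queue on a 4yz reply and bounce it on a 5yz reply.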
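  8. 8. Envelope splitting <ul><li>Envelope (as received): </li></ul><ul><ul><li>Rcpt to: gtom@mirapoint.com </li></ul></ul><ul><ul><li>Rcpt to: [email_address] </li></ul></ul><ul><ul><li>Rcpt to: [email_address] </li></ul></ul><ul><ul><li>Rcpt to: a [email_address] (internal recipient) </li></ul></ul><ul><li>Envelope (to mx1.mirapoint.com): </li></ul><ul><ul><li>Rcpt to: gtom@mirapoint.com </li></ul></ul><ul><ul><li>Rcpt to: [email_address] </li></ul></ul><ul><li>Envelope (to mx04.hotmail.com): </li></ul><ul><ul><li>Rcpt to: [email_address] </li></ul></ul>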
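A minimal sketch of envelope splitting follows. It groups recipients by domain as a stand-in for the MX host, which is a simplifying assumption: a real MTA would look up the MX records and group by the resulting mail exchanger. The function name, the `local_domains` parameter and the addresses are hypothetical.

```python
from collections import defaultdict


def split_envelope(recipients, local_domains):
    """Split one envelope into local recipients and per-domain remote envelopes.

    Assumption: the recipient domain stands in for the MX host.
    """
    local = []
    remote = defaultdict(list)
    for rcpt in recipients:
        domain = rcpt.rsplit("@", 1)[1].lower()
        if domain in local_domains:
            local.append(rcpt)           # delivered to the local mailstore
        else:
            remote[domain].append(rcpt)  # one outgoing envelope per domain
    return local, dict(remote)


local, remote = split_envelope(
    ["a@example.com", "b@example.com", "c@example.org", "d@local.example"],
    local_domains={"local.example"},
)
```

Each entry in `remote` corresponds to one outbound SMTP transaction carrying all recipients for that destination.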
  9. 9. Dot stuffing <ul><li>In responses to the TOP/RETR commands of POP and in the DATA command of SMTP, the end of the email data is indicated by CRLF.CRLF, i.e. a single ‘.’ on a line by itself </li></ul><ul><ul><li>What if the mail contains a line with ‘.’ as its first character? It must be escaped with an additional ‘.’, i.e. the leading ‘.’ is replaced with ‘..’. This is called dot stuffing. </li></ul></ul><ul><ul><li>The side sending the data does the dot stuffing and the side receiving it does the de-stuffing: an SMTP client stuffs the DATA it sends, and a POP client de-stuffs the TOP/RETR responses it receives </li></ul></ul>
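Dot stuffing and de-stuffing fit in a few lines. The sketch below works on lines that have already been split on CRLF; the function names are chosen for this example only.

```python
def dot_stuff(lines):
    """Escape every line that starts with '.' by prepending another '.'."""
    return ["." + line if line.startswith(".") else line for line in lines]


def dot_unstuff(lines):
    """Undo dot stuffing, stopping at the lone '.' terminator line."""
    out = []
    for line in lines:
        if line == ".":
            break  # end of data; the terminator is not part of the mail
        out.append(line[1:] if line.startswith(".") else line)
    return out


body = ["Hello", ".", "..hidden", "bye"]
on_the_wire = dot_stuff(body) + ["."]  # stuffed body plus terminator
```

Because every body line that starts with a dot gains a second one, the lone `.` terminator can never be confused with message content.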
  10. 10. SMTP Security Issue <ul><li>Open relay problem </li></ul><ul><ul><li>Relay host file </li></ul></ul><ul><li>Solutions for roaming users </li></ul><ul><ul><li>POP before SMTP </li></ul></ul><ul><ul><li>SMTP Authentication </li></ul></ul><ul><li>Mail flows shown in the diagram, between the internal network and the outside network: inside to inside, inside to outside, outside to inside, outside to outside </li></ul><ul><li>POP before SMTP, using a POP/IMAP server, an SMTP server and a relay host db: </li></ul><ul><ul><li>1. User authenticates with POP/IMAP server </li></ul></ul><ul><ul><li>2. IP of the client machine is added to relay db </li></ul></ul><ul><ul><li>3. User connects to SMTP server </li></ul></ul><ul><ul><li>4. SMTP Server checks for IP in relay host db and allows relaying </li></ul></ul><ul><ul><li>5. Entries from db are expired periodically </li></ul></ul>
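The POP-before-SMTP steps can be sketched with an in-memory relay host db. Everything here is an assumption made for illustration: the class name `RelayHostDb`, the 30-minute default lifetime and the injectable clock are not taken from any particular server product.

```python
import time


class RelayHostDb:
    """In-memory sketch of a relay host db with expiring entries."""

    def __init__(self, ttl_seconds=1800, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock       # injectable so that expiry can be tested
        self._expires_at = {}     # client IP -> expiry timestamp

    def record_login(self, ip):
        """Step 2: called by the POP/IMAP server after a successful login."""
        self._expires_at[ip] = self._clock() + self._ttl

    def may_relay(self, ip):
        """Step 4: called by the SMTP server before it accepts relaying."""
        expiry = self._expires_at.get(ip)
        return expiry is not None and self._clock() < expiry

    def expire(self):
        """Step 5: periodic cleanup of stale entries."""
        now = self._clock()
        self._expires_at = {
            ip: t for ip, t in self._expires_at.items() if t > now
        }
```

The entry is keyed by IP address only, so anyone who shares that address during the lifetime window can relay as well. That weakness is one reason SMTP Authentication is listed as the alternative.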
  11. 11. SMTP Extensions
  12. 12. SMTP server implementation consideration <ul><li>Must make ‘best effort’ to deliver accepted email </li></ul><ul><li>High throughput </li></ul><ul><li>Efficient Queuing of mail </li></ul><ul><li>Graceful handling of overload </li></ul><ul><li>Connection Caching, MX record caching etc. </li></ul><ul><li>Security: SMTP servers are exposed to the world and must protect against DoS attacks, exploits and open relay </li></ul>
  13. 13. LMTP (RFC 2033) <ul><li>Local Mail Transfer Protocol: </li></ul><ul><ul><li>The Local Mail Transfer Protocol (LMTP) is a derivative of SMTP, the Simple Mail Transfer Protocol. It is designed as an alternative to normal SMTP for situations where the receiving side does not have a mail queue, such as a Mail Delivery Agent (MDA) that understands SMTP conversations </li></ul></ul><ul><ul><li>The main difference from SMTP is that after the final ".", the server returns one reply for each previously successful RCPT command in the mail transaction, in the order that the RCPT commands were issued. Even if there were multiple successful RCPT commands giving the same forward-path, there must be one reply for each successful RCPT command. </li></ul></ul>
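The per-recipient reply rule can be sketched as below. The function name, the `deliver` callback and the reply texts are hypothetical; only the "one reply per successful RCPT, in RCPT order" behaviour comes from the slide.

```python
def lmtp_final_replies(accepted_rcpts, deliver):
    """Build the replies an LMTP server sends after the final '.'.

    accepted_rcpts: recipients whose RCPT command succeeded, in order.
    deliver: hypothetical callback returning True if delivery succeeded.
    """
    replies = []
    for rcpt in accepted_rcpts:  # duplicates each get their own reply
        if deliver(rcpt):
            replies.append("250 OK")
        else:
            replies.append(f"452 <{rcpt}> delivery deferred")
    return replies


replies = lmtp_final_replies(
    ["a@example.com", "b@example.com", "a@example.com"],
    deliver=lambda rcpt: rcpt != "b@example.com",
)
```

Since the receiving side has no queue of its own, these per-recipient results tell the sending MTA exactly which recipients it still has to retry.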
  14. 14. Email Protocol: POP3 (RFC 1939) <ul><li>It’s more like a letter box: New mails meant for you get deposited into your letter box. Periodically you check if there is new mail available and if yes then retrieve it from the letter box. </li></ul>
  15. 15. POP3 States & Commands <ul><li>AUTHORIZATION state: QUIT </li></ul><ul><li>TRANSACTION state: STAT, LIST [msg], RETR msg, DELE msg, NOOP, RSET </li></ul><ul><li>UPDATE state: QUIT </li></ul><ul><li>Optional POP3 commands: TOP msg n, UIDL [msg], USER, PASS, APOP </li></ul>
  16. 16. POP3 server responses <ul><li>Single-line response: </li></ul><ul><ul><li>+OK (success), followed by descriptive text, for example, "+OK message deleted" for the delete operation. Responses are mapped to appropriate callbacks. </li></ul></ul><ul><ul><li>-ERR (failure), followed by descriptive text, for example, "-ERR no such message" for the delete operation. Mapped to error callback. </li></ul></ul><ul><li>Multi-line response, first line (like a single-line response): </li></ul><ul><ul><li>+OK (success), followed by descriptive text, for example, "+OK message follows" for the retrieve operation. Responses are mapped to appropriate callbacks. </li></ul></ul><ul><ul><li>-ERR (failure), followed by descriptive text, for example, "-ERR no such message" for the retrieve operation. Mapped to error callback. </li></ul></ul><ul><li>Multi-line response, remaining lines: </li></ul><ul><ul><li>Subsequent lines: more information about the condition. </li></ul></ul><ul><ul><li>Final line: . (dot) and <CRLF>. (Not considered part of the response.) </li></ul></ul><ul><ul><li>Note: if an error occurs on a multi-line response, a single line is returned. </li></ul></ul><ul><li>e.g. a multi-line response: </li></ul><ul><ul><li>C: UIDL </li></ul></ul><ul><ul><li>S: +OK </li></ul></ul><ul><ul><li>S: 1 whqtswO00WBw418f9t5JxYwZ </li></ul></ul><ul><ul><li>S: 2 QhdPYR:00WBw1Ph7x7 </li></ul></ul><ul><ul><li>S: . </li></ul></ul><ul><li>e.g. single-line responses: </li></ul><ul><ul><li>C: STAT </li></ul></ul><ul><ul><li>S: +OK 2 320 </li></ul></ul><ul><ul><li>C: LIST 3 </li></ul></ul><ul><ul><li>S: -ERR no such message, only 2 messages in maildrop </li></ul></ul>
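A client-side parser for these responses could look like the sketch below. It takes lines already split on CRLF; the function name and the `multiline` flag are assumptions for this example, and the caller is expected to know from the command it sent whether a multi-line response is due.

```python
def parse_pop3_response(lines, multiline=False):
    """Parse a POP3 response into (ok, text, data_lines)."""
    status = lines[0]
    if status.startswith("+OK"):
        ok, text = True, status[3:].strip()
    elif status.startswith("-ERR"):
        ok, text = False, status[4:].strip()
    else:
        raise ValueError(f"not a POP3 status line: {status!r}")

    data = []
    # An error on a multi-line command is a single line, so only read
    # further lines after +OK.
    if ok and multiline:
        for line in lines[1:]:
            if line == ".":
                break  # terminator, not part of the response
            # Undo dot stuffing on lines that begin with a dot.
            data.append(line[1:] if line.startswith(".") else line)
    return ok, text, data
```

Feeding it the UIDL example above gives `ok=True` and the two UID lines as `data`; the failing `LIST 3` gives `ok=False` and the error text.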
  17. 17. Typical POP Session
  18. 18. POP implementation considerations <ul><li>Quick response to frequent commands </li></ul><ul><ul><li>LIST, UIDL, STAT </li></ul></ul><ul><li>Mailbox locking during POP session </li></ul><ul><ul><li>Responses to LIST and UIDL never change during a POP session </li></ul></ul><ul><ul><li>DELE only marks a message for deletion; QUIT actually deletes the marked messages </li></ul></ul><ul><li>POP3 protocol is not friendly for connection pooling </li></ul><ul><ul><li>You have to disconnect to change the user </li></ul></ul>
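The mark-then-delete behaviour can be sketched with a small in-memory maildrop. The class and method names are hypothetical, and locking against concurrent sessions is left out; the point is that DELE and RSET only touch a set of marks, and nothing is removed until QUIT.

```python
class Maildrop:
    """Session-scoped snapshot of a mailbox, as seen by one POP3 session."""

    def __init__(self, messages):
        # Message numbers are fixed for the whole session (1-based).
        self._messages = dict(enumerate(messages, start=1))
        self._deleted = set()

    def stat(self):
        """Return (count, total size) of messages not marked as deleted."""
        live = [m for n, m in self._messages.items() if n not in self._deleted]
        return len(live), sum(len(m) for m in live)

    def dele(self, n):
        if n not in self._messages or n in self._deleted:
            return "-ERR no such message"
        self._deleted.add(n)  # only a mark; nothing is removed yet
        return "+OK message deleted"

    def rset(self):
        self._deleted.clear()  # unmark everything
        return "+OK"

    def quit(self):
        """UPDATE state: return the messages that survive the session."""
        return [m for n, m in self._messages.items() if n not in self._deleted]
```

Keeping the numbering fixed in `_messages` is what lets LIST and UIDL answers stay stable even after several DELE commands.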
  19. 19. Email Protocol: IMAP <ul><li>Internet Message Access Protocol (IMAP) is an Internet protocol that allows a client to manipulate electronic mail messages that are stored on a mail server. It allows clients to manipulate a remote message folder (called a mailbox) in the same way they would manipulate local mailboxes. It supports mail operations that let users create, delete, and rename mailboxes; check for new messages; permanently remove messages; set and clear flags; and search. </li></ul>
  20. 20. IMAP Commands <ul><li>Mailbox Level </li></ul><ul><ul><li>SELECT, EXAMINE, STATUS, CREATE, DELETE, RENAME, EXPUNGE </li></ul></ul><ul><li>Mail Listing </li></ul><ul><ul><li>LIST, SEARCH </li></ul></ul><ul><li>Retrieval </li></ul><ul><ul><li>FETCH </li></ul></ul><ul><li>Modify flags </li></ul><ul><ul><li>STORE </li></ul></ul>
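  21. 21. POP3 Vs IMAP <ul><li>Typical use </li></ul><ul><ul><li>POP3: Access mails from a single machine </li></ul></ul><ul><ul><li>IMAP: Access mails from multiple machines </li></ul></ul><ul><li>Folders </li></ul><ul><ul><li>POP3: No </li></ul></ul><ul><ul><li>IMAP: Yes </li></ul></ul><ul><li>Append, flags </li></ul><ul><ul><li>POP3: No </li></ul></ul><ul><ul><li>IMAP: Yes </li></ul></ul><ul><li>Protocol granularity </li></ul><ul><ul><li>POP3: Limited; retrieve the entire mail or its headers, no search </li></ul></ul><ul><ul><li>IMAP: Retrieve any sub-part of the mail, search </li></ul></ul><ul><li>Complexity </li></ul><ul><ul><li>POP3: Less complex, which makes it easy to implement a POP client and server. POP clients are available on a wide variety of devices, including most compact devices </li></ul></ul><ul><ul><li>IMAP: The protocol is complex, which makes client and server implementation a demanding task. IMAP clients are not as widely available, especially on compact devices </li></ul></ul>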
  22. 22. IMAP implementation consideration <ul><li>Quick response for frequently used commands </li></ul><ul><ul><li>SELECT, EXAMINE commands </li></ul></ul><ul><ul><li>Caching for quick header retrieval </li></ul></ul><ul><li>IMAP protocol is not friendly for connection pooling </li></ul><ul><ul><li>You have to disconnect to change the user </li></ul></ul><ul><li>Use of IDLE command </li></ul><ul><ul><li>Alternative to polling but still adds to server load due to open connections </li></ul></ul>
  23. 23. IMAP Extensions
  24. 24. Email Standards <ul><li>RFC 822/2822 (Internet Message Format) </li></ul><ul><ul><ul><li>Defines overall message format: syntax and semantics </li></ul></ul></ul><ul><ul><ul><li>Defines syntax and semantics for standard fields like To, Cc, Date etc. </li></ul></ul></ul><ul><li>MIME </li></ul><ul><ul><li>Nested message component structure </li></ul></ul><ul><ul><li>Content-Type indicates the type of data a MIME part holds </li></ul></ul><ul><li>Overall message syntax: </li></ul><ul><li>message = *field *(CRLF *text) </li></ul><ul><li>field = field-name ":" [field-body] CRLF </li></ul><ul><li>field-name = 1*<any CHAR, excluding CTLs, SPACE, and ":"> </li></ul><ul><li>field-body = *text [CRLF LWSP-char field-body] </li></ul><ul><li>e.g. </li></ul><ul><li>From: John Doe <jdoe@machine.example> </li></ul><ul><li>To: Mary Smith < [email_address] > </li></ul><ul><li>Subject: Saying Hello </li></ul><ul><li>Date: Fri, 21 Nov 1997 09:55:06 -0600 </li></ul><ul><li>Message-ID: <1234@local.machine.example> </li></ul><ul><li>This is a message just to say hello. So, "Hello". </li></ul><ul><ul><ul><li>Header folding: a field body can be split across lines, with each continuation line starting with whitespace. The field </li></ul></ul></ul><ul><ul><ul><li>Subject: This is a test </li></ul></ul></ul><ul><ul><ul><li>can be folded as </li></ul></ul></ul><ul><ul><ul><li>Subject: This </li></ul></ul></ul><ul><ul><ul><li>is a test (continuation line, which begins with a space or tab) </li></ul></ul></ul><ul><li>Sample multipart MIME message: </li></ul><ul><ul><li>From: "Ajit Dhumale" <ajitd@persistent.co.in> </li></ul></ul><ul><ul><li>To: <ajitd@persistent.co.in> </li></ul></ul><ul><ul><li>Subject: Sample Mail </li></ul></ul><ul><ul><li>Date: Thu, 28 Oct 2004 10:26:38 +0530 </li></ul></ul><ul><ul><li>MIME-Version: 1.0 </li></ul></ul><ul><ul><li>Content-Type: multipart/alternative; boundary="----=_NextPart_000_0011_01C4BCD8.9BE9B8C0" </li></ul></ul><ul><ul><li>This is a multi-part message in MIME format. </li></ul></ul><ul><ul><li>------=_NextPart_000_0011_01C4BCD8.9BE9B8C0 </li></ul></ul><ul><ul><li>Content-Type: text/plain; charset="iso-8859-1" </li></ul></ul><ul><ul><li>Content-Transfer-Encoding: quoted-printable </li></ul></ul><ul><ul><li>Sample MIME mail. </li></ul></ul><ul><ul><li>------=_NextPart_000_0011_01C4BCD8.9BE9B8C0 </li></ul></ul><ul><ul><li>Content-Type: text/html; charset="iso-8859-1" </li></ul></ul><ul><ul><li>Content-Transfer-Encoding: quoted-printable </li></ul></ul><ul><ul><li><!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> </li></ul></ul><ul><ul><li><HTML><HEAD> </li></ul></ul><ul><ul><li><META http-equiv=3DContent-Type content=3D"text/html; = </li></ul></ul><ul><ul><li>charset=3Diso-8859-1"> </li></ul></ul><ul><ul><li><META content=3D"MSHTML 6.00.2800.1106" name=3DGENERATOR> </li></ul></ul><ul><ul><li><STYLE></STYLE> </li></ul></ul><ul><ul><li></HEAD> </li></ul></ul><ul><ul><li><BODY bgColor=3D#ffffff> </li></ul></ul><ul><ul><li><DIV><FONT face=3DArial size=3D2>Sample <STRONG>MIME</STRONG>=20 </li></ul></ul><ul><ul><li>mail.</FONT></DIV></BODY></HTML> </li></ul></ul><ul><ul><li>------=_NextPart_000_0011_01C4BCD8.9BE9B8C0-- </li></ul></ul>
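Header unfolding and MIME part traversal can be tried out with Python's standard `email` package. The message below is a shortened stand-in for the slide's sample: the addresses and the boundary `b1` are made up, and the blank line that separates headers from body is written out explicitly.

```python
from email import message_from_string, policy

RAW_MESSAGE = "\r\n".join([
    "From: sender@example.com",
    "To: rcpt@example.com",
    "Subject: This",
    " is a test",                      # folded header continuation
    "MIME-Version: 1.0",
    'Content-Type: multipart/alternative; boundary="b1"',
    "",                                # blank line ends the headers
    "This is a multi-part message in MIME format.",
    "--b1",
    'Content-Type: text/plain; charset="iso-8859-1"',
    "Content-Transfer-Encoding: quoted-printable",
    "",
    "Sample MIME mail.",
    "--b1",
    'Content-Type: text/html; charset="iso-8859-1"',
    "Content-Transfer-Encoding: quoted-printable",
    "",
    "<DIV>Sample <STRONG>MIME</STRONG>=20",
    "mail.</DIV>",
    "--b1--",
    "",
])

msg = message_from_string(RAW_MESSAGE, policy=policy.default)

# With the default policy, the folded Subject is returned unfolded.
subject = str(msg["Subject"])

# Walk the nested structure and decode each leaf part.
parts = [
    (part.get_content_type(), part.get_content())
    for part in msg.walk()
    if not part.is_multipart()
]

if __name__ == "__main__":
    print(subject)
    for content_type, text in parts:
        print(content_type, repr(text))
```

`get_content()` undoes the quoted-printable transfer encoding and decodes the bytes with the part's declared charset, so `=20` comes back as a plain space.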
  25. 25. Email Server Products <ul><li>IMAP </li></ul><ul><ul><li>Cyrus </li></ul></ul><ul><ul><li>University of Washington </li></ul></ul><ul><ul><li>Courier </li></ul></ul><ul><li>SMTP </li></ul><ul><ul><li>Sendmail </li></ul></ul><ul><ul><li>Qmail </li></ul></ul><ul><li>Exchange/Lotus Notes </li></ul><ul><ul><li>Email (IMAP, POP3, SMTP) + collaboration </li></ul></ul>
  26. 26. Cyrus Internals <ul><li>Mailbox storage format </li></ul><ul><li>Each mailbox is represented by a directory in the filesystem. Within the directory, each message is stored in its own file in RFC 822 format. The filenames of the message files are the sequentially-assigned UIDs, with a period appended. Lines are terminated by CRLF. In addition to the message files there are three files containing metadata, indexes and cached data. </li></ul><ul><li>A typical mailbox directory looks like… </li></ul><ul><li>cyrus.header </li></ul><ul><li>cyrus.index </li></ul><ul><li>cyrus.cache </li></ul><ul><li>1. </li></ul><ul><li>2. </li></ul><ul><li>3. </li></ul><ul><li>cyrus.header : Contains variable-length information about the mailbox itself. </li></ul><ul><li>cyrus.index : Contains a header and a sequence of fixed-length records, one record per message in the mailbox. </li></ul><ul><ul><li>The header contains: generation-no, format, minor-version, start-offset, record-size, exists, last-date, last-uid, quota-used, pop3-last, uidvalidity </li></ul></ul><ul><ul><li>Each per-message record contains: </li></ul></ul><ul><li>uid, internaldate, sentdate, rfc822.size, header-size, body-offset, cache-offset (offset in cyrus.cache), last-updated, system-flags, user-flags </li></ul><ul><li>cyrus.cache : Contains a header and a sequence of variable-length records, one record per message in the mailbox. The header contains a "generation number" corresponding to the one in cyrus.index. Each record contains: </li></ul><ul><ul><li>IMAP "envelope", in format suitable for use in FETCH reply </li></ul></ul><ul><ul><li>IMAP "bodystructure", in format suitable for use in FETCH reply </li></ul></ul><ul><ul><li>IMAP "body", in format suitable for use in FETCH reply </li></ul></ul><ul><ul><li>size, file offsets, charsets, and encodings of the various MIME body sections </li></ul></ul><ul><ul><li>From, To, Cc, and BCC strings for searching </li></ul></ul>
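The directory layout described above is easy to inspect from a script. The sketch below only looks at file names and does not parse the binary cyrus.index or cyrus.cache formats. The function name `list_mailbox` is hypothetical, and the demo builds a throwaway directory instead of touching a real mail spool (a filesystem that allows file names ending in a period is assumed).

```python
import os
import tempfile

META_FILES = ("cyrus.header", "cyrus.index", "cyrus.cache")


def list_mailbox(path):
    """Return (metadata file names, message UIDs) for one mailbox directory."""
    meta, uids = [], []
    for name in os.listdir(path):
        if name in META_FILES:
            meta.append(name)
        elif name.endswith(".") and name[:-1].isdigit():
            uids.append(int(name[:-1]))  # message files are named "<uid>."
    return sorted(meta), sorted(uids)


# Demo on a temporary directory that mimics the layout from the slide.
with tempfile.TemporaryDirectory() as mailbox_dir:
    for name in META_FILES + ("1.", "2.", "10."):
        open(os.path.join(mailbox_dir, name), "w").close()
    meta_files, message_uids = list_mailbox(mailbox_dir)
```

Sorting the UIDs numerically matters: a plain string sort would put `10.` before `2.`.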
  27. 27. Resources for more information <ul><li>A ton of information is available at … </li></ul><ul><li>http://athabasca.intranet.pspl.co.in/tw2/bin/view/Techexpertise/MessagingHome </li></ul>