Email as a datasource for applications

Our email contains years of important personal information: key contacts, versions of documents, discussions around important projects or deals. It's a datasource that is too often ignored by developers, and the brave ones who don't ignore it are in for a bumpy ride dealing with the tedious details of arcane protocols.

The presentation will be about the potential use cases for email data, the various ways to access it, the common pitfalls, and the different tools targeted at this.

Published in: Technology

Transcript

  • 1. Email as a datasource for apps. Bruno Morency, bruno@context.io, @brunomorency
  • 2. What this presentation will be about
      • Overview of the technologies that make email
      • How your apps can fit in that picture
      • An intro to IMAP and message bodies with common pitfalls
      • Overview of Context.IO
  • 3. “The reports of my death were greatly exaggerated” - Email
  • 4. 2.9 billion in 2010
  • 5. 3.8 billion by 2014; 180B messages/day; 340M tweets/day
  • 6. Email is a communications system: group collaboration, task management, document collaboration, customer support, app notification, project management, client relationship, applicant tracking, photo sharing, bug tracking, Nigerian extortion
  • 7. Overview of protocols and standards, or “which acronym does what”
  • 8. SMTP: Protocol for transmission of emails across the internet
  • 9.
      • Message transport, nothing to do with content
      • Defines the envelope (sender and recipients)
      • Does not define the message headers
      • Chain from client to recipient’s server
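The envelope/headers split on this slide can be sketched with Python's standard library. This is only an illustration under stated assumptions: the addresses are made up, `envelope_for` is a hypothetical helper, and nothing is actually sent.

```python
from email.message import EmailMessage
from email.utils import getaddresses


def envelope_for(msg: EmailMessage, bounce_address: str):
    """Derive an SMTP envelope (MAIL FROM, RCPT TO list) from a message.

    The envelope is what SMTP servers route on; the headers are simply
    part of the content being carried.
    """
    # Every To/Cc/Bcc address becomes an envelope recipient.
    fields = msg.get_all("To", []) + msg.get_all("Cc", []) + msg.get_all("Bcc", [])
    recipients = [addr for _name, addr in getaddresses(fields) if addr]
    # Bcc recipients stay in the envelope but are removed from the headers.
    del msg["Bcc"]
    return bounce_address, recipients


# Hypothetical message; example.com/example.org are placeholder domains.
msg = EmailMessage()
msg["From"] = "App <noreply@example.com>"
msg["To"] = "alice@example.org"
msg["Bcc"] = "audit@example.com"
msg["Subject"] = "Envelope vs. headers"
msg.set_content("Hello!")

mail_from, rcpt_to = envelope_for(msg, "bounces@example.com")
```

Note that the envelope sender here differs from the `From` header, and one recipient no longer appears in any header. With a live connection, `smtplib.SMTP.sendmail(mail_from, rcpt_to, msg.as_bytes())` would hand this envelope to the server.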
  • 10. DKIM, SPF: Standards for sender signatures and for preventing sender spoofing
  • 11.
      • Complement spam filters
      • Opens the message and checks headers to decide if it will deliver it to the inbox
      • As a receiver, it’s one more way to block spam.
      • As a sender, it’s a tool you must master to avoid ending up in the spam folder
      • Email deliverability is an industry by itself
  • 12. IMAP: Protocol to allow a client to access and manipulate emails on a receiving server.
  • 13.
      • All messages and their folder organization are on the server
      • Clients poll to learn about new messages that arrive or actions made through other clients
      • While it doesn’t send messages, clients usually store sent messages through it
  • 14. POP: Protocol to allow a client to retrieve emails from a receiving server.
  • 15.
      • The server only serves as a temporary buffer for received messages
      • Classification and message state is purely a client-side concept
      • Many clients can access the same account but can’t coordinate anything
  • 16. RFC-822, MIME, Multipart: Standards defining headers and the actual body of the message
  • 17. Where does your app fit in there?
  • 18. Typical things apps want to do with email
      1. Send emails to users
      2. Receive emails from users
      3. Access emails users send and receive.
  • 19. Email is a communications system: group collaboration, task management, document collaboration, customer support, app notification, project management, client relationship, applicant tracking, photo sharing, bug tracking, Nigerian extortion
  • 20. Introduction to IMAP
  • 21. Me: “App Developer, meet IMAP. IMAP, meet App Developer.”
      IMAP: “I don’t give a sh*t about you, App Developer. Go away!”
  • 22. 1. Connect to the IMAP server and authenticate
      > openssl s_client -crlf -connect imap.gmail.com:993
      [ a few lines of SSL and server info ]
      * OK Gimap ready for requests from 123.14.12.20 zw8i38638oab.180
      a001 LOGIN username password
      * CAPABILITY IMAP4rev1 UNSELECT IDLE NAMESPACE QUOTA ID XLIST CHILDREN X-GM-EXT-1 UIDPLUS COMPRESS=DEFLATE
      a001 OK username authenticated (Success)
  • 23. 3. LIST mailboxes
      a002 LIST "" "*"
      * LIST (\HasChildren) "/" "Drive"
      * LIST (\Noselect \HasChildren) "/" "Drive/Dev"
      * LIST (\HasNoChildren) "/" "Drive/Dev/A"
      * LIST (\HasNoChildren) "/" "Drive/Dev/B"
      * LIST (\HasNoChildren) "/" "INBOX"
      * LIST (\HasNoChildren) "/" "Archive"
      * LIST (\HasNoChildren) "/" "Sent Mail"
      * LIST (\HasNoChildren) "/" "Drafts"
      * LIST (\HasNoChildren) "/" "Spam"
      * LIST (\HasChildren) "/" "My folder"
      * LIST (\HasNoChildren) "/" "My folder/label A"
      * LIST (\HasNoChildren) "/" "My folder/label B"
      a002 OK Success
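A minimal sketch of turning LIST response lines like the ones on this slide into structured data. It assumes the simple form shown here (quoted ASCII mailbox names, no literals, no escaped quotes); `parse_list_line` is a hypothetical helper, not part of any library.

```python
import re

# (flags) "delimiter" "mailbox name", as in the LIST responses on the slide.
LIST_LINE = re.compile(r'\((?P<flags>[^)]*)\) (?P<delim>NIL|"[^"]*") (?P<name>.+)')


def parse_list_line(line: bytes):
    """Split one LIST response payload into (flags, delimiter, name)."""
    match = LIST_LINE.fullmatch(line.decode("ascii"))
    if match is None:
        raise ValueError(f"unrecognized LIST line: {line!r}")
    flags = match.group("flags").split()
    delim = None if match.group("delim") == "NIL" else match.group("delim").strip('"')
    name = match.group("name").strip('"')
    return flags, delim, name


def selectable(lines):
    """Names of mailboxes that can be SELECTed (no \\Noselect flag)."""
    parsed = [parse_list_line(line) for line in lines]
    return [name for flags, _delim, name in parsed if "\\Noselect" not in flags]
```

The payloads are written as `bytes` because that is what Python's `imaplib` returns from its `list()` call.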
  • 24. 4. SELECT a mailbox
      a003 SELECT "Drive/Dev"
      * FLAGS (\Answered \Flagged \Draft \Deleted \Seen)
      * OK [PERMANENTFLAGS (\Deleted \Seen \*)] Limited
      * OK [UIDVALIDITY 614213447] UIDs valid
      * OK [UIDNEXT 1042] Predicted next UID
      * 84 EXISTS
      * 3 RECENT
      a003 OK [READ-WRITE] Drive/Dev selected. (Success)
  • 25. 5. FETCH messages
      a013 FETCH 80:81 (FLAGS BODY[HEADER.FIELDS (DATE FROM SUBJECT)])
      * 80 FETCH (FLAGS (\Seen) BODY[HEADER.FIELDS (DATE FROM SUBJECT)] {101}
      Date: Mon, 26 Jul 2012 14:05:16 -0400
      From: Dominik Gehl <dominik@gmail.com>
      Subject: test
      )
      * 81 FETCH (FLAGS (\Seen) BODY[HEADER.FIELDS (DATE FROM SUBJECT)] {115}
      From: Dominik Gehl <dominik@context.io>
      Subject: Payment required error
      Date: Tue, 27 Mar 2012 09:28:01 -0400
      )
      a013 OK Success
  • 26. 6. FLAG a message as read
      a015 STORE 81 +FLAGS (\Seen)
      * 81 FETCH (FLAGS ())
      a015 OK Success
  • 27. 7. CLOSE the mailbox and LOGOUT the account
      a023 CLOSE
      a023 OK Returned to authenticated state. (Success)
      a024 LOGOUT
      * BYE LOGOUT Requested
      a024 OK LOGOUT completed. (Success)
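The same session can be sketched with Python's standard `imaplib`. This is an illustration, not the speaker's code: `fetch_subjects` and `run_session` are hypothetical names, and the host, credentials, and message range are placeholders. Unlike the raw session above, it opens the mailbox read-only and uses `BODY.PEEK` so that looking at a message does not set `\Seen`.

```python
import imaplib
from email import message_from_bytes


def fetch_subjects(conn, mailbox: str, message_set: str):
    """SELECT a mailbox and FETCH a few header fields from some messages."""
    typ, data = conn.select(mailbox, readonly=True)
    if typ != "OK":
        raise RuntimeError(f"cannot select {mailbox!r}: {data!r}")
    typ, data = conn.fetch(
        message_set, "(FLAGS BODY.PEEK[HEADER.FIELDS (DATE FROM SUBJECT)])"
    )
    if typ != "OK":
        raise RuntimeError(f"FETCH failed: {data!r}")
    subjects = []
    for item in data:
        # Literal payloads arrive as (response line, payload) tuples; the
        # closing b")" of each response is a plain bytes item and is skipped.
        if isinstance(item, tuple):
            headers = message_from_bytes(item[1])
            subjects.append(headers["Subject"])
    conn.close()  # back to the authenticated state
    return subjects


def run_session(host: str, username: str, password: str,
                mailbox: str = "INBOX", message_set: str = "1:5"):
    """Connect over TLS (port 993 by default), authenticate, fetch, log out."""
    conn = imaplib.IMAP4_SSL(host)
    try:
        conn.login(username, password)
        return fetch_subjects(conn, mailbox, message_set)
    finally:
        conn.logout()
```

Something like `run_session("imap.gmail.com", "username", "password")` would mirror slides 22 to 27, assuming the account allows password logins.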
  • 28. That didn’t seem so bad!
  • 29. Pitfall #1: Identifying messages
      • There is no persistent primary key you can rely on to retrieve a message
      • Message Sequence Number
      • Unique Identifier
  • 30. Sequence Number
      • Ascending and contiguous sequence. If the mailbox says 11 exist, you can fetch messages with seq. nb. 1 to 11
      • They can (and will) be reassigned during a session.
  • 31. Unique Identifier (aka UID)
      • 32-bit value uniquely identifying a message within a mailbox.
      • Ascending but not necessarily incremental nor contiguous.
      • If you move a message to another mailbox, it will get a new UID in that new mailbox
      • Changes if the mailbox UIDVALIDITY changes
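One way to live with this, sketched below, is to treat the triple (mailbox, UIDVALIDITY, UID) as the cache key and to discard cached UIDs whenever UIDVALIDITY changes. The `MessageKey` type and both helper functions are hypothetical names used only for illustration.

```python
import re
from typing import NamedTuple


class MessageKey(NamedTuple):
    """A UID only means something together with its mailbox and UIDVALIDITY."""
    mailbox: str
    uidvalidity: int
    uid: int


_UIDVALIDITY = re.compile(rb"\[UIDVALIDITY (\d+)\]")


def parse_uidvalidity(response_line: bytes) -> int:
    """Extract the value from a line such as '* OK [UIDVALIDITY 614213447] UIDs valid'."""
    match = _UIDVALIDITY.search(response_line)
    if match is None:
        raise ValueError("no UIDVALIDITY response code in line")
    return int(match.group(1))


def key_still_valid(key: MessageKey, current_uidvalidity: int) -> bool:
    """Cached UIDs must be thrown away once the mailbox's UIDVALIDITY changes."""
    return key.uidvalidity == current_uidvalidity
```

A key built from the SELECT response on slide 24 would be `MessageKey("Drive/Dev", 614213447, 1041)`, where the UID 1041 is an arbitrary example value.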
  • 32. Pitfall #2: Special-use folders (or lack thereof)
      • Only the INBOX mailbox has a special meaning.
      • Everything else has the meaning the client wants it to have (which may not be in English)
      • Gmail has XLIST which adds mailbox attributes (Inbox, Sent, Starred, ...)
  • 33. Pitfall #3: No data until you select a mailbox
      • Anything that searches or fetches messages is done within the context of a mailbox
      • Can’t get account-wide list of messages
  • 34. Pitfall #4: Threads
      • It's an extension that isn't widely available and even then, restricted to a single mailbox
      • Gmail's X-GM-THRID (thread ID) to the rescue
  • 35. Pitfall #5: Attachment?
      • You need to get and parse the body structure
      • As far as IMAP is concerned, an attachment is the same thing as any other MIME part
  • 36. Pitfall #6: Deleting messages
      • Setting the Deleted flag marks the message for deletion but it’s still there
      • EXPUNGE will remove all messages with Deleted flag from the currently selected mailbox
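The two-step delete can be sketched against an `imaplib`-style connection that already has a mailbox selected. `delete_message` is a hypothetical helper; `uid()` and `expunge()` are the standard `imaplib.IMAP4` methods.

```python
def delete_message(conn, uid: str, expunge: bool = False) -> None:
    """Mark one message as \\Deleted, optionally followed by EXPUNGE."""
    # Step 1: set the flag. The message is still in the mailbox after this.
    typ, data = conn.uid("STORE", uid, "+FLAGS", "(\\Deleted)")
    if typ != "OK":
        raise RuntimeError(f"STORE failed: {data!r}")
    # Step 2 (optional): EXPUNGE removes *every* message flagged \Deleted
    # in the selected mailbox, not only the one flagged above.
    if expunge:
        typ, data = conn.expunge()
        if typ != "OK":
            raise RuntimeError(f"EXPUNGE failed: {data!r}")
```

Keeping `expunge` off by default is a deliberate choice in this sketch, since other clients may have flagged messages the caller does not know about.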
  • 37. Pitfall #7: Keeping up with deleted messages
      • Purging client side message list is a PITA.
      • Server won't tell you which messages were deleted, you just have to figure out some have been and find which ones were.
      • It's the same if you want to keep track of Seen flag.
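A simple way to figure out which messages disappeared is to diff the UIDs the client has cached against the UIDs the server currently reports. The sketch below assumes the server list comes from something like `conn.uid("SEARCH", None, "ALL")` in `imaplib`, which returns a single space-separated `bytes` value; both function names are hypothetical.

```python
def parse_uid_search(data) -> set[int]:
    """Turn an imaplib SEARCH result such as [b'1 5 9'] into a set of UIDs."""
    return {int(token) for token in data[0].split()}


def diff_uids(cached: set[int], server: set[int]):
    """Return (deleted, added) UIDs, each as a sorted list."""
    deleted = sorted(cached - server)  # known locally, gone on the server
    added = sorted(server - cached)    # on the server, not yet seen locally
    return deleted, added
```

This only holds while UIDVALIDITY is unchanged. The same diffing idea applies to flags such as Seen, by comparing cached flags with freshly fetched ones.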
  • 38. The joys of parsing email messages. Yé! I fetched a message! Now what do I do?
  • 39. A simple message
      Delivered-To: sysadmin@context.io
      Return-Path: <2012050639be@bounces.amazon.com>
      Received: by 10.229.135.136 with SMTP id n8mr410292qct.135.1336583200550;
              Wed, 09 May 2012 10:06:40 -0700 (PDT)
      Received: from smtp-out.amazon.com (smtp-out.amazon.com. [72.21.212.39])
              by mx.google.com with ESMTP id b2si1383913qcd.195.2012.05.06.40;
              Wed, 09 May 2012 10:06:40 -0700 (PDT)
      Date: Wed, 9 May 2012 17:06:39 +0000 (UTC)
      From: Amazon EC2 Notification <no-reply-aws@amazon.com>
      To: "Sys Admin" <sys@context.io>
      Cc: "Alerts" <alerts@context.io>
      Message-ID: <urn.correios.msg.2012050639be@1336583199032.us-1.amazon.com>
      Subject: Notice: Amazon EC2 Instance scheduled for retirement
      MIME-Version: 1.0
      Content-Type: text/plain; charset=UTF-8
      Content-Transfer-Encoding: 7bit

      Hello, ...
  • 40. A message with an attachment
      MIME-Version: 1.0
      Content-Type: multipart/mixed; boundary=_MYBOUNDARY_

      --_MYBOUNDARY_
      Content-Type: text/plain

      This is the body of the message.
      --_MYBOUNDARY_
      Content-Type: image/jpeg; name="IMG_713.jpg"
      Content-Disposition: attachment; filename="IMG_713.jpg"; size=6379099;
      Content-Transfer-Encoding: base64

      /9j/4AAQSkZJRgABAgAAZABkAAD/7AARRHVja3kAAQAEAAAAZA+4AJkFkb2JlAGTAAAAAAQMAAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMD8IAEQgAegG1AwERAIRAQMRAfEASMAAQACAwEBAQEBAAAAAAAAAAAHCAUGCQQDAgEKAQEAAgIDAQAAAAAAAAAAAAAABgcFCAEDBAIQAAEEAgEBBgQGAQUAAAAAAAUCAwQGAQcAEjBQERMUFRBgFhcgQHAhNAhBMSIjJDURAAIC==
      --_MYBOUNDARY_--
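A message shaped like this one can be walked part by part with Python's standard `email` package. The sketch below uses a shortened stand-in for the slide's message (the base64 payload is cut down to a few bytes), and `list_attachments` is a hypothetical helper.

```python
from email import message_from_string
from email.policy import default

# Shortened stand-in for the message on the slide.
RAW = (
    "MIME-Version: 1.0\n"
    "Content-Type: multipart/mixed; boundary=_MYBOUNDARY_\n"
    "\n"
    "--_MYBOUNDARY_\n"
    "Content-Type: text/plain\n"
    "\n"
    "This is the body of the message.\n"
    "--_MYBOUNDARY_\n"
    'Content-Type: image/jpeg; name="IMG_713.jpg"\n'
    'Content-Disposition: attachment; filename="IMG_713.jpg"\n'
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "/9j/4AAQSkZJRg==\n"
    "--_MYBOUNDARY_--\n"
)


def list_attachments(raw: str):
    """Return (filename, content type, decoded size) for each attachment part."""
    msg = message_from_string(raw, policy=default)
    found = []
    for part in msg.walk():
        # An attachment is just a MIME part whose disposition says so.
        if part.get_content_disposition() == "attachment":
            payload = part.get_payload(decode=True)  # undoes the base64 encoding
            found.append((part.get_filename(), part.get_content_type(), len(payload)))
    return found


attachments = list_attachments(RAW)
```

Filtering on `Content-Disposition` is only one possible policy; parts marked `inline`, or carrying no disposition at all, may still be files a user would call attachments.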
  • 41. A message with alternative parts
      MIME-Version: 1.0
      Content-Type: multipart/alternative; boundary=_MYBOUNDARY_

      --_MYBOUNDARY_
      Content-Type: text/plain; charset="us-ascii"
      Content-Transfer-Encoding: quoted-printable

      Hello! Here’s a message with *rich* text
      --_MYBOUNDARY_
      Content-Type: text/html; charset="us-ascii"
      Content-Transfer-Encoding: quoted-printable

      <html><body>Hello! Here’s a message with <b>rich</b> text</body></html>
      --_MYBOUNDARY_--
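Choosing between the alternative parts can be sketched with `EmailMessage.get_body` from Python's standard library. The message is rebuilt programmatically here rather than parsed from the slide's text, and `best_body` is a hypothetical helper.

```python
from email.message import EmailMessage


def best_body(msg: EmailMessage, prefer_html: bool = True):
    """Return (content type, text) of the preferred body part, or (None, None)."""
    order = ("html", "plain") if prefer_html else ("plain", "html")
    part = msg.get_body(preferencelist=order)
    if part is None:
        return None, None
    return part.get_content_type(), part.get_content()


# Build a multipart/alternative message similar to the one on the slide.
msg = EmailMessage()
msg.set_content("Hello! Here's a message with *rich* text")
msg.add_alternative(
    "<html><body>Hello! Here's a message with <b>rich</b> text</body></html>",
    subtype="html",
)

html_type, html_text = best_body(msg, prefer_html=True)
plain_type, plain_text = best_body(msg, prefer_html=False)
```

Which alternative counts as "best" is an application decision: indexing or search might prefer the plain part, while display usually prefers HTML.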
  • 42. Pitfall #1: Message-ID is optional
      • Great to track messages but spec says it's optional.
      ... and it’s not always there.
  • 43. Pitfall #2: In-Reply-To, References
      • Refers to Message-ID of other emails
      • Very useful to rebuild threads
      ... until an Outlook user jumps in and replaces it with their own Thread-Topic and Thread-Index headers
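When those headers are present, threads can be rebuilt by linking every message to the Message-IDs it refers to. The sketch below does this with a small union-find. It assumes each message has already been reduced to a dict with hypothetical keys `message_id`, `in_reply_to`, and `references`, and that a Message-ID is always present, which the previous slide warns is not guaranteed.

```python
from collections import defaultdict


def group_threads(messages):
    """Group Message-IDs into threads using In-Reply-To and References links."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for m in messages:
        find(m["message_id"])
        linked = list(m.get("references", []))
        if m.get("in_reply_to"):
            linked.append(m["in_reply_to"])
        for ref in linked:
            # A referenced ID joins the thread even if that message was never fetched.
            union(ref, m["message_id"])

    threads = defaultdict(list)
    for m in messages:
        threads[find(m["message_id"])].append(m["message_id"])
    return sorted(threads.values())


example = [
    {"message_id": "<a>"},
    {"message_id": "<b>", "in_reply_to": "<a>"},
    {"message_id": "<c>", "references": ["<a>", "<b>"]},
    {"message_id": "<d>"},
]
threads = group_threads(example)
```

A reply that drops both headers, as in the Outlook case on the slide, would end up in its own thread with this approach.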
  • 44. Pitfall #3: Attachments are what you decide them to be
      • Content-Disposition tells you attachment or inline. Should signature image be considered as a file attachment?
      • TNEF attachments
  • 45. webhooks, threads, contacts, messages, files
  • 46. Demo of Context.IO console