ElasticInbox

Cassandra as an Email Storage system from Cassandra London

Published in: Technology

  • Compaction everywhere, Outlook example
  • OpenStack Swift has similarities with Cassandra design.
  • SuperColumns implementation is planned to be replaced in Cassandra 1.2
  • Designed for ad/page impression count at Digg/Twitter
  • ElasticInbox

    1. 1. Cassandra as an Email Store Rustam Aliyev • 20 Feb 2012
    2. 2. Emails sent worldwide: 4,500,000/sec (Email Statistics Report 2009-2013, The Radicati Group)
    4. 4. Email storage problem
        • [Diagram: MTA, LDA]
        • Filesystem + RDBMS ≠ Scalability + Availability
    5. 5. ElasticInbox: 1000 ft view
        • [Diagram: MTA, ElasticInbox nodes (load-balancing, share-nothing), Blob Store for original messages (OpenStack, AWS S3, others), Metadata Store for message metadata (Cassandra cluster)]
    6. 6. Why Cassandra?
        • Horizontal scalability
        • High availability, no SPOF, and automatic replication
        • Flexible schema
        • Counters
        • Email storage does more writes than reads: spam, sent mails, notifications, mailing lists, unread emails, ...
    7. 7. Why not Cassandra for BLOBs?
        • Thrift does not support streaming
            • Value has to fit into memory
            • Default max Thrift frame size is 5MB
        • Possible solution: split large files into 1MB chunks
            • Less than 2% of emails are >1MB (in our case)
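The chunking workaround mentioned on this slide can be sketched as follows. This is a minimal illustration, not ElasticInbox code: the `chunk:NNNNNN` column naming and both helper functions are assumptions made for the example, and a plain dict stands in for a Cassandra row.

```python
from typing import Dict, Iterator, Tuple

CHUNK_SIZE = 1024 * 1024  # 1MB, the chunk size suggested on the slide


def split_into_chunks(blob: bytes, chunk_size: int = CHUNK_SIZE) -> Iterator[Tuple[str, bytes]]:
    """Yield (column_name, chunk) pairs for one blob (hypothetical naming scheme)."""
    for index, offset in enumerate(range(0, len(blob), chunk_size)):
        # Zero-padded names keep the chunks in order under string comparison.
        yield f"chunk:{index:06d}", blob[offset:offset + chunk_size]


def join_chunks(columns: Dict[str, bytes]) -> bytes:
    """Reassemble a blob from its chunk columns."""
    return b"".join(columns[name] for name in sorted(columns))


# A message slightly larger than 5MB becomes five full chunks plus a small tail.
message = bytes(5 * 1024 * 1024 + 123)
columns = dict(split_into_chunks(message))
print(len(columns))  # 6
```

Each chunk stays well under the frame size quoted above, at the cost of one extra column per megabyte and a reassembly step on read.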
    11. 11. Why not Cassandra for BLOBs?
        • Wasted RAM / JVM heap
            • 200 × 5MB messages R/W = 1GB RAM
        • Wasted disk space
            • When RF=3, disk space = 6 × data
            • 1TB data → 6TB storage required!
        • Wasted CPU
            • More CPU used during compactions
        • Leveled Compaction Strategy?
            • New (1.0+), less wasted storage but more I/O.
    12. 12. BLOB Stores for BLOBs
        • BLOB stores are designed for storing BLOBs
        • Can store an unlimited number of objects in a single container.
        • AWS S3, OpenStack Object Store, and 15 others supported (thanks @jclouds!).
        • 40%-50% more space efficient than BLOBs in Cassandra (w/ RF=3; 1TB → 3.5TB, rather than 6TB).
        • Cons: much slower than Cassandra (no memtable).
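The RAM and disk figures quoted on the BLOB slides can be checked with a few lines of arithmetic. The slides give the 6× factor for RF=3 without breaking it down; the 2× compaction headroom below is an assumed interpretation, as is the use of decimal units.

```python
MB = 1
GB = 1000 * MB
TB = 1000 * GB

# RAM: 200 in-flight reads/writes of 5MB values, each held fully in memory.
ram_needed = 200 * 5 * MB

REPLICATION_FACTOR = 3
# Assumption: the slide's "6 x data" is RF=3 times 2x free space for compaction.
COMPACTION_HEADROOM = 2

cassandra_disk = 1 * TB * REPLICATION_FACTOR * COMPACTION_HEADROOM
blob_store_disk = 3.5 * TB  # figure quoted on the slide for 1TB of data
savings = 1 - blob_store_disk / cassandra_disk

print(f"RAM for in-flight blobs: {ram_needed / GB:.1f} GB")       # 1.0 GB
print(f"Cassandra disk for 1TB:  {cassandra_disk / TB:.1f} TB")   # 6.0 TB
print(f"Blob store saving:       {savings:.1%}")                  # 41.7%
```

The resulting saving of roughly 42% sits inside the 40%-50% range stated on the slide.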
    13. 13. Polyglot Persistence
        • Martin Fowler: “any decent sized enterprise will have a variety of different data storage technologies for different kinds of data” (Martin Fowler, 16 Nov 2011)
        • Don't take the example in the diagram too seriously.
    24. 24. Data Model
        • NoSQL data model is driven by data access patterns:
            • Email is immutable
            • Mostly, very recent messages are accessed and updated
        • But sometimes, access patterns are driven by the NoSQL data model:
            • Synergy between programming model and data model
            • Some Gmail features driven by BigTable limitations?
                • Labels instead of folders
                • No custom sorting, only by time
            • Other examples: “More” pagination
    25. 25. Data Model ‒ Column Families
        • 4 Column Families: MessageMetadata, IndexLabels, Accounts, Counters
        • Account ID: String (user@domain.tld)
        • Message ID: TimeUUID
        • Label ID: Integer
    26. 26. Data Model ‒ Accounts
        • Column Family
        • Reserved Labels: 0 = All Mails, 1 = Inbox, 2 = Drafts, ...

            "Accounts" {
              "user@elasticinbox.com" {
                "label:0" : "all",
                "label:1" : "inbox",
                "label:2" : "drafts",
                "label:230": "Custom Label",
                ...
              }
            }
    27. 27. Data Model ‒ IndexLabels
        • Column Family
        • Composite Key: Account + Label ID
        • Messages ordered by time

            "IndexLabels" {
              "user@elasticinbox.com:0" {  # All Mails
                "550e8400-e29b-41d4-a716-446655440000" : null,
                "892e8300-e29b-41d4-a716-446655440000" : null,
                "a0232400-e29b-41d4-a716-446655440000" : null,
                ...
              }
              "user@elasticinbox.com:1" {  # Inbox
                "550e8400-e29b-41d4-a716-446655440000" : null,
                "892e8300-e29b-41d4-a716-446655440000" : null,
                "a0232400-e29b-41d4-a716-446655440000" : null,
                ...
              }
            }
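The composite row key and the time ordering of message IDs on this slide can be sketched in plain Python. This is an illustration under stated assumptions: a dict stands in for the IndexLabels column family, and `time_uuid`, `row_key`, `add_to_label`, and `newest` are hypothetical helpers, not ElasticInbox or Cassandra client functions.

```python
import uuid
from typing import Dict, List

# Offset between the UUID epoch (1582-10-15) and the Unix epoch, in 100ns units.
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def time_uuid(unix_seconds: int, node: int = 0x0123456789AB) -> uuid.UUID:
    """Build a version 1 (time-based) UUID for a given Unix timestamp."""
    ts = unix_seconds * 10_000_000 + UUID_EPOCH_OFFSET
    fields = (ts & 0xFFFFFFFF, (ts >> 32) & 0xFFFF, (ts >> 48) & 0x0FFF, 0, 0, node)
    return uuid.UUID(fields=fields, version=1)  # version/variant bits are set here


def row_key(account: str, label_id: int) -> str:
    """Composite row key: account plus label ID, as on the slide."""
    return f"{account}:{label_id}"


index: Dict[str, List[uuid.UUID]] = {}  # in-memory stand-in for IndexLabels


def add_to_label(account: str, label_id: int, message_id: uuid.UUID) -> None:
    index.setdefault(row_key(account, label_id), []).append(message_id)


def newest(account: str, label_id: int, count: int) -> List[uuid.UUID]:
    ids = index.get(row_key(account, label_id), [])
    # Sort on the embedded timestamp. Python's default UUID ordering compares
    # the raw 128-bit value, which starts with time_low and is not chronological.
    return sorted(ids, key=lambda u: u.time, reverse=True)[:count]


base = 1329696000  # 20 Feb 2012 00:00:00 UTC
for seconds in (60, 0, 120):  # inserted out of order on purpose
    add_to_label("user@elasticinbox.com", 1, time_uuid(base + seconds))
print([u.time for u in newest("user@elasticinbox.com", 1, 2)])
```

In Cassandra the ordering comes from the column comparator rather than an explicit sort; the sort here only mimics that behaviour.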
    28. 28. Data Model ‒ MessageMetadata
        • SuperColumn Family
        • Stores message metadata and pre-parsed contents
            • Message headers, body and attachment info
        • TimeUUID as unique Message ID, ordered by time
    29. 29. Data Model ‒ MessageMetadata

            "MessageMetadata" {
              "user@elasticinbox.com" {
                "550e8400-e29b-41d4-a716-446655440000" {
                  "from" : "[[Test,test@elasticinbox.com]]",
                  "to" : "[[Me,user@elasticinbox.com],[…]]",
                  "subject" : "Hello world!",
                  "date" : "12 March 2011 01:12:00",
                  "uri" : "blob://aws-s3/550e8400-e29b-41d4-a716-446655440000",
                  "l:1" : null,  # Label ID
                  "m:1" : null,  # Marker ID
                  "html" : "<html><body>This is message body</body></html>",
                  "parts" : "{2.1: {filename: image.png, ...}}",
                  ...
                }
                "892e8300-e29b-41d4-a716-446655440000" {
                  ...
                }
                ...
              }
            }
    30. 30. Data Model ‒ MessageMetadata
        • Query: List 30 newest messages with label “Inbox”

            ids[] = SliceQuery("IndexLabels", "user@dom.tld:1", 30)
            msg[] = MultigetQuery("MessageMetadata", "user@dom.tld", ids[])

            Row Key         SuperColumn                             SubColumns
            user@dom.tld    550e8400-e29b-41d4-a716-446655440000    "from" : "...", "to" : "...", "subject" : "...", "html" : "..."
                            892e8300-e29b-41d4-a716-446655440000    - // -
                            a0232400-e29b-41d4-a716-446655440000    - // -
            some@dom2.tld   e5586600-f81d-11df-8cc2-080027267700    - // -
                            e5595060-f81d-11df-bc91-080027267700    - // -
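The two-step lookup on this slide (slice the label index, then multiget the metadata) can be mimicked with in-memory dictionaries. This is only a sketch: `slice_query`, `multiget_query`, and `list_newest` are hypothetical stand-ins for Cassandra client calls, and the short `id-00N` strings replace real TimeUUIDs for readability.

```python
from typing import Dict, List

# In-memory stand-ins for the two column families (assumption: real code
# would query Cassandra instead of reading Python dicts).
index_labels: Dict[str, List[str]] = {
    # Row key "account:label"; columns kept in time order, oldest first.
    "user@dom.tld:1": ["id-001", "id-002", "id-003", "id-004"],
}
message_metadata: Dict[str, Dict[str, Dict[str, str]]] = {
    "user@dom.tld": {
        "id-001": {"subject": "First"},
        "id-002": {"subject": "Second"},
        "id-003": {"subject": "Third"},
        "id-004": {"subject": "Fourth"},
    },
}


def slice_query(row_key: str, count: int) -> List[str]:
    """Return up to `count` newest message IDs from one IndexLabels row."""
    columns = index_labels.get(row_key, [])
    return list(reversed(columns[-count:])) if count > 0 else []


def multiget_query(account: str, ids: List[str]) -> List[Dict[str, str]]:
    """Fetch the metadata super columns for the given message IDs."""
    row = message_metadata.get(account, {})
    return [row[message_id] for message_id in ids if message_id in row]


def list_newest(account: str, label_id: int, count: int = 30) -> List[Dict[str, str]]:
    ids = slice_query(f"{account}:{label_id}", count)
    return multiget_query(account, ids)


print([m["subject"] for m in list_newest("user@dom.tld", 1, count=2)])
# ['Fourth', 'Third']
```

Because all of an account's metadata lives in one row, the second step touches a single row key regardless of how many messages are fetched.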
    32. 32. Data Model ‒ Counters
        • SuperColumn Family
        • All of an account's counters are on the same node
        • Non-atomic counters: it's easy to miscount

            "Counters" {
              "user@elasticinbox.com" {
                "l:0" {
                  "total_bytes" : 18239090,
                  "total_msg" : 394,
                  "new_msg" : 12
                }
                "l:1" {
                  "total_msg" : 144,
                  "new_msg" : 10
                }
                ...
              }
            }
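One way such a miscount can arise is a retried increment. The slide does not say which failure mode the author had in mind, so the scenario below is an assumed illustration: a toy store applies an increment but loses the acknowledgement, and the client, unable to tell whether the write landed, retries it. `FlakyCounterStore` and `increment_with_retry` are hypothetical names.

```python
from typing import Dict


class FlakyCounterStore:
    """Toy counter store: the first increment is applied but reported as timed out."""

    def __init__(self) -> None:
        self.counters: Dict[str, int] = {}
        self._fail_next = True

    def increment(self, key: str, delta: int = 1) -> None:
        self.counters[key] = self.counters.get(key, 0) + delta
        if self._fail_next:
            self._fail_next = False
            raise TimeoutError("write applied, but the acknowledgement was lost")


def increment_with_retry(store: FlakyCounterStore, key: str, attempts: int = 2) -> None:
    for _ in range(attempts):
        try:
            store.increment(key)
            return
        except TimeoutError:
            continue  # the client cannot tell whether the increment landed


store = FlakyCounterStore()
key = "user@elasticinbox.com/l:1/new_msg"
increment_with_retry(store, key)  # one message delivered
print(store.counters[key])  # 2: the counter has drifted from the real count
```

Increments are not idempotent, so a retry adds again rather than overwriting. A periodic recount from the label index is one possible way to repair such drift.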
    33. 33. ElasticInbox in Production
        • In production since Nov 2011
        • ~200K accounts, 30M+ messages
        • 4 node cluster, RF=3, Cassandra 0.8.x
        • Each 1TB of raw mails = 70GB in Cassandra
            • Metadata + LZF compressed email text/html body
    34. 34. ElasticInbox in Production
        • Cassandra load: 40 requests per second per node
        • Cassandra latency: 10ms read average, 0.02ms write
        • Write to Read ratio:

            CF Name            W:R Ratio
            MessageMetadata    3:1
            IndexLabels        2:1
            Accounts           1:50
            Counters           2:3
    35. 35. Future work
        • Performance improvements (may involve minor schema changes)
        • Full-text search (preferably on top of Cassandra)
        • POP3 and IMAP
        • Built-in filtering rules
        • Message threads / conversations
    36. 36. Questions?
        • www.elasticinbox.com
        • github.com/elasticinbox
        • @elasticinbox @rstml
