ElasticInbox

Cassandra as an Email Storage system from Cassandra London

Slide notes

  • Compaction everywhere, Outlook example
  • OpenStack Swift has similarities with Cassandra design.
  • SuperColumns implementation planned to be replaced in Cassandra 1.2
  • Designed for ad/page impression count at Digg/Twitter

Transcript

  • 1. Cassandra as an Email Store Rustam Aliyev • 20 Feb 2012
  • 2. Emails sent worldwide: 4,500,000/sec
      Source: Email Statistics Report 2009-2013, The Radicati Group.
  • 3. Email storage problem
      Diagram: MTA, LDA
  • 4. Email storage problem
      Diagram: MTA, LDA
      Filesystem + RDBMS ≠ Scalability + Availability
  • 5. ElasticInbox 1000 ft view
      • MTA … elasticinbox nodes (load-balancing, share-nothing)
      • Original Message: Blob Store (OpenStack, AWS S3, others)
      • Message Metadata: Metadata Store (Cassandra … Cluster)
  • 6. Why Cassandra?
      • Horizontal scalability
      • High availability, no SPOF, and automatic replication
      • Flexible schema
      • Counters
      • Email storage does more writes than reads: spam, sent mails, notifications, mailing lists, unread emails, ...
  • 7. Why not Cassandra for BLOBs?
      • Thrift does not support streaming
          • Value has to fit into memory
          • Default max Thrift frame size is 5MB
      • Possible solution: split large files into 1MB chunks
          • Less than 2% of emails >1MB (in our case)
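The chunking idea on this slide can be sketched in a few lines. This is only an illustration of splitting a value into 1MB pieces and reassembling it; the `chunk:N` column names and both helper functions are hypothetical and are not taken from ElasticInbox, which stores message bodies in a blob store instead.

```python
CHUNK_SIZE = 1024 * 1024  # 1MB, the chunk size suggested on the slide


def split_into_chunks(blob: bytes, chunk_size: int = CHUNK_SIZE) -> dict:
    """Map hypothetical column names ("chunk:0", "chunk:1", ...) to slices of the blob."""
    return {
        f"chunk:{offset // chunk_size}": blob[offset:offset + chunk_size]
        for offset in range(0, len(blob), chunk_size)
    }


def join_chunks(columns: dict) -> bytes:
    """Reassemble the blob by sorting the columns on their numeric suffix."""
    ordered = sorted(columns, key=lambda name: int(name.split(":")[1]))
    return b"".join(columns[name] for name in ordered)


# A message slightly larger than 2MB ends up in three columns.
message = b"x" * (2 * CHUNK_SIZE + 123)
columns = split_into_chunks(message)
restored = join_chunks(columns)
```

Each column value stays at or below the chunk size, so no single value has to exceed the frame limit, at the cost of extra bookkeeping on every read and write.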
  • 11. Why not Cassandra for BLOBs?
      • Wasted RAM / JVM heap
          • 200 x 5MB messages R/W = 1GB RAM
      • Wasted disk space
          • When RF=3, disk space = 6 × data
          • 1TB data → 6TB storage required!
      • Wasted CPU
          • More CPU used during compactions
      • Leveled Compaction Strategy?
          • New (1.0+), less wasted storage but more I/O.
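The figures on this slide can be reproduced with simple arithmetic. The sketch below assumes that the 6× factor is the replication factor (3) multiplied by roughly 2× free-space headroom for compaction; the slide states only the total, so the `compaction_headroom` parameter is an assumption, and both function names are made up for illustration.

```python
def heap_for_inflight_gb(messages: int, size_mb: float) -> float:
    """Heap needed if every in-flight message is held fully in memory."""
    return messages * size_mb / 1024


def disk_required_tb(data_tb: float, replication_factor: int,
                     compaction_headroom: float = 2.0) -> float:
    """Raw disk needed; compaction_headroom=2.0 is an assumed factor, not from the slide."""
    return data_tb * replication_factor * compaction_headroom


heap_gb = heap_for_inflight_gb(200, 5)   # 200 concurrent 5MB reads/writes
disk_tb = disk_required_tb(1.0, 3)       # 1TB of data at RF=3
```

With these inputs the heap estimate is just under 1GB and the disk estimate is 6TB, matching the slide.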
  • 12. BLOB Stores for BLOBs
      • BLOB stores are designed for storing BLOBs
      • Can store an unlimited number of objects in a single container.
      • AWS S3, OpenStack Object Store, and 15 others supported (thanks @jclouds!).
      • 40%-50% more space efficient than BLOBs in Cassandra (w/ RF=3; 1TB → 3.5TB, rather than 6TB).
      • Cons: much slower than Cassandra (no memtable).
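As a quick check of the space-efficiency claim, the saving implied by the two totals on the slide (3.5TB versus 6TB per 1TB of data) can be computed directly. The function name is hypothetical; the inputs are the slide's own numbers.

```python
def space_saving(blob_store_tb: float, cassandra_tb: float) -> float:
    """Fraction of raw storage saved by using the blob store instead of Cassandra."""
    return 1 - blob_store_tb / cassandra_tb


saving = space_saving(3.5, 6.0)
print(f"{saving:.0%} less raw storage")
```

The result falls at the lower end of the quoted 40%-50% range.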
  • 13. Polyglot Persistence
      • Martin Fowler: “any decent sized enterprise will have a variety of different data storage technologies for different kinds of data” (Martin Fowler, 16 Nov 2011)
      • Don't take the example in the diagram too seriously.
  • 24. Data Model
      • NoSQL data model is driven by data access patterns:
          • Email is immutable
          • Mostly, very recent messages are accessed and updated
      • But sometimes, access patterns are driven by NoSQL data model:
          • Synergy between programming model and data model
          • Some Gmail features driven by BigTable limitations?
              • Labels instead of folders
              • No custom sorting, only by time
          • Other examples: “More” pagination
  • 25. Data Model ‒ Column Families
      • 4 Column Families: MessageMetadata, IndexLabels, Accounts, Counters
      • Account ID: String (user@domain.tld)
      • Message ID: TimeUUID
      • Label ID: Integer
  • 26. Data Model ‒ Accounts
      • Column Family
      • Reserved Labels: 0 = All Mails, 1 = Inbox, 2 = Drafts, ...

        "Accounts" {
          "user@elasticinbox.com" {
            "label:0"   : "all",
            "label:1"   : "inbox",
            "label:2"   : "drafts",
            "label:230" : "Custom Label",
            ...
          }
        }
  • 27. Data Model ‒ IndexLabels
      • Column Family
      • Composite Key: Account + Label ID
      • Messages ordered by time

        "IndexLabels" {
          "user@elasticinbox.com:0" {    # All Mails
            "550e8400-e29b-41d4-a716-446655440000" : null,
            "892e8300-e29b-41d4-a716-446655440000" : null,
            "a0232400-e29b-41d4-a716-446655440000" : null,
            ...
          }
          "user@elasticinbox.com:1" {    # Inbox
            "550e8400-e29b-41d4-a716-446655440000" : null,
            "892e8300-e29b-41d4-a716-446655440000" : null,
            "a0232400-e29b-41d4-a716-446655440000" : null,
            ...
          }
        }
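The two ideas on this slide, a composite row key and time-ordered message IDs, can be mimicked in memory. This is a minimal sketch, not ElasticInbox code: `time_uuid`, `add_to_label`, `newest_first` and the fixed node value are hypothetical helpers, and a plain dict stands in for the column family. The sort uses the timestamp embedded in each version 1 UUID, because the textual form of a UUID starts with the low bits of the timestamp and therefore does not sort chronologically.

```python
import uuid


def time_uuid(timestamp_100ns: int) -> uuid.UUID:
    """Build a version 1 UUID from an explicit timestamp (deterministic, for the demo)."""
    time_low = timestamp_100ns & 0xFFFFFFFF
    time_mid = (timestamp_100ns >> 32) & 0xFFFF
    time_hi_version = ((timestamp_100ns >> 48) & 0x0FFF) | 0x1000  # version 1
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             0x80, 0x00, 0x123456789ABC))


def index_row_key(account: str, label_id: int) -> str:
    """Composite key: account plus label ID, as on the slide."""
    return f"{account}:{label_id}"


index_labels = {}  # row key -> {message UUID: None}


def add_to_label(account: str, label_id: int, message_id: uuid.UUID) -> None:
    index_labels.setdefault(index_row_key(account, label_id), {})[message_id] = None


def newest_first(account: str, label_id: int) -> list:
    row = index_labels.get(index_row_key(account, label_id), {})
    return sorted(row, key=lambda message_id: message_id.time, reverse=True)


for ts in (300, 100, 200):
    add_to_label("user@elasticinbox.com", 1, time_uuid(ts))

ordered_times = [m.time for m in newest_first("user@elasticinbox.com", 1)]
```

In a real cluster the ordering would come from the column comparator rather than a client-side sort; the sort here only stands in for that behaviour.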
  • 28. Data Model ‒ MessageMetadata
      • SuperColumn Family
      • Stores message metadata and pre-parsed contents
          • Message headers, body and attachment info
      • TimeUUID as unique Message ID, ordered by time
  • 29. Data Model ‒ MessageMetadata

        "MessageMetadata" {
          "user@elasticinbox.com" {
            "550e8400-e29b-41d4-a716-446655440000" {
              "from"    : "[[Test,test@elasticinbox.com]]",
              "to"      : "[[Me,user@elasticinbox.com],[…]]",
              "subject" : "Hello world!",
              "date"    : "12 March 2011 01:12:00",
              "uri"     : "blob://aws-s3/550e8400-e29b-41d4-a716-446655440000",
              "l:1"     : null,    # Label ID
              "m:1"     : null,    # Marker ID
              "html"    : "<html><body>This is message body</body></html>",
              "parts"   : "{2.1: {filename: image.png, ...}}",
              ...
            }
            "892e8300-e29b-41d4-a716-446655440000" { ... }
            ...
          }
        }
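The `uri` column is what links the metadata row to the message body held in the blob store. A minimal sketch of reading such a URI follows; it assumes the host part names a blob store profile and the path is the object name, which is an interpretation of the example value rather than something the slides spell out. `parse_blob_uri` is a hypothetical helper.

```python
from urllib.parse import urlparse


def parse_blob_uri(uri: str) -> tuple:
    """Split a blob:// URI into (store profile, object name); the meaning of each part is assumed."""
    parsed = urlparse(uri)
    if parsed.scheme != "blob":
        raise ValueError(f"not a blob URI: {uri}")
    return parsed.netloc, parsed.path.lstrip("/")


profile, object_name = parse_blob_uri(
    "blob://aws-s3/550e8400-e29b-41d4-a716-446655440000"
)
```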
  • 30. Data Model ‒ MessageMetadata
      Query: List 30 newest messages with label "Inbox"

        ids[] = SliceQuery("IndexLabels", "user@dom.tld:1", 30)
        msg[] = MultigetQuery("MessageMetadata", "user@dom.tld", ids[])

        Row Key         SuperColumn                             SubColumns
        user@dom.tld    550e8400-e29b-41d4-a716-446655440000    "from" : "...", "to" : "...", "subject" : "...", "html" : "..."
                        892e8300-e29b-41d4-a716-446655440000    - // -
                        a0232400-e29b-41d4-a716-446655440000    - // -
        some@dom2.tld   e5586600-f81d-11df-8cc2-080027267700    - // -
                        e5595060-f81d-11df-bc91-080027267700    - // -
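The two-step lookup on this slide (slice the label index, then fetch metadata for the returned IDs) can be imitated with plain dictionaries. This is a sketch under stated assumptions: `slice_query` and `multiget_query` are in-memory stand-ins named after the slide's pseudocode, not calls from any Cassandra client library, and the index row is assumed to be stored newest first.

```python
def slice_query(index: dict, row_key: str, count: int) -> list:
    """Return up to `count` message IDs from one index row, in stored order."""
    return index.get(row_key, [])[:count]


def multiget_query(metadata: dict, row_key: str, ids: list) -> list:
    """Fetch the metadata entries for the given IDs from one account row."""
    row = metadata.get(row_key, {})
    return [row[message_id] for message_id in ids if message_id in row]


def list_newest(index: dict, metadata: dict, account: str,
                label_id: int, count: int = 30) -> list:
    ids = slice_query(index, f"{account}:{label_id}", count)
    return multiget_query(metadata, account, ids)


# Toy data: 40 messages in the Inbox (label 1), newest first in the index.
account = "user@dom.tld"
message_ids = [f"m{n:02d}" for n in range(40, 0, -1)]
index_labels = {f"{account}:1": message_ids}
message_metadata = {
    account: {mid: {"subject": f"Message {int(mid[1:])}"} for mid in message_ids}
}

inbox = list_newest(index_labels, message_metadata, account, 1)
```

Both steps touch a single row each, which is the property the schema is built around: one row per account and label for the index, one row per account for the metadata.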
  • 32. Data Model ‒ Counters
      • SuperColumn Family
      • All of an account's counters are on the same node
      • Non-atomic counters: it's easy to miscount

        "Counters" {
          "user@elasticinbox.com" {
            "l:0" {
              "total_bytes" : 18239090,
              "total_msg"   : 394,
              "new_msg"     : 12
            }
            "l:1" {
              "total_msg" : 144,
              "new_msg"   : 10
            }
            ...
          }
        }
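The slide does not say how miscounts arise. One commonly cited cause with distributed counters is that an increment is not idempotent: if the write is applied but the acknowledgement is lost, a blind retry adds the delta a second time. The simulation below illustrates only that scenario; `CounterStore`, its `ack_lost` flag and `add_with_retry` are hypothetical and do not model Cassandra internals.

```python
class CounterStore:
    """Toy counter store whose increments can 'time out' after being applied."""

    def __init__(self):
        self.values = {}

    def increment(self, key: str, delta: int, ack_lost: bool = False) -> None:
        self.values[key] = self.values.get(key, 0) + delta
        if ack_lost:
            # The write happened, but the client never hears about it.
            raise TimeoutError("write applied, acknowledgement lost")


def add_with_retry(store: CounterStore, key: str, delta: int,
                   lose_first_ack: bool) -> None:
    try:
        store.increment(key, delta, ack_lost=lose_first_ack)
    except TimeoutError:
        store.increment(key, delta)  # blind retry applies the delta again


store = CounterStore()
add_with_retry(store, "user@elasticinbox.com/l:1/new_msg", 1, lose_first_ack=True)
overcounted = store.values["user@elasticinbox.com/l:1/new_msg"]  # 2, not 1
```

One message arrived, yet the counter reads 2. A counter that drifts like this can only be corrected by recounting from an authoritative source such as the label index.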
  • 33. ElasticInbox in Production
      • In production since Nov 2011
      • ~200K accounts, 30M+ messages
      • 4 node cluster, RF=3, Cassandra 0.8.x
      • Each 1TB of raw mails = 70GB in Cassandra
          • Metadata + LZF compressed email text/html body
  • 34. ElasticInbox in Production
      • Cassandra load: 40 requests per second per node
      • Cassandra latency: 10ms read average, 0.02ms write
      • Write to Read ratio:

        CF Name            W:R Ratio
        MessageMetadata    3:1
        IndexLabels        2:1
        Accounts           1:50
        Counters           2:3
  • 35. Future work
      • Performance improvements (may involve minor schema changes)
      • Full-text search (preferably on top of Cassandra)
      • POP3 and IMAP
      • Built-in filtering rules
      • Message threads / conversations
  • 36. Questions?
      • www.elasticinbox.com
      • github.com/elasticinbox
      • @elasticinbox
      • @rstml