
OSCON 2015 - OpenIO: Enabling the petabyte-sized mailboxes



Talk we gave at OSCON 2015 in Portland, OR, during the Open Messaging Day, on how to scale a mailbox management platform with Cyrus and the OpenIO object storage solution.


  1. Enabling the petabyte-sized mailboxes
  2. @LaurentDenel
  3. Co-founder
  4. 7 fellow co-founders • 8 years in the field • 50 million users • 2015 launch
  5. 2006: idea & first concept • 2007: design & dev starts • 2009: first massive production • 2012: the project is open sourced • 2014: 10+ PB managed • 2015: OpenIO fork
  6. 3 avenue Antoine Pinay, Parc d'Activité des 4 Vents, 59510 Hem, FRANCE • 180 Sansom Street, Fl 4, San Francisco, CA 94104, USA • @OpenIO • github.com/open-io
  7. Summary: • Cyrus scalability issues (the storage infrastructure challenge) • OpenIO technology (open source object storage) • Cyrus IMAP 3.0 (OpenIO for a scale-out Cyrus)
  8. Cyrus scalability issues: share-nothing clusters • Migrate mailboxes to new clusters as local storage becomes full • High storage costs with SAS drives • Ends up with unused CPU/RAM resources on Cyrus servers

  9. Be smart with storage (chart: email distribution by size, based on actual data; axes: email size vs. number of emails; small emails fall in an IOPS zone served by SSD drives, larger emails in a capacity zone served by SATA drives)
  10. Over the years (chart: storage vs. CPU/RAM usage). Problem: the more storage you need, the more servers you waste.
  11. Solution: an abstraction between Cyrus servers and storage (diagram: Cyrus servers, OpenIO, storage).
  12. Object storage
  13. Object storage magic: the simplest and best way to go from 1 hard drive to 1,000s
  14. How to keep track of trillions of objects?
  15. Existing technologies: • single name node: good for a few large files (large web indexes, NoSQL tables, …), bad for numerous small objects (emails, …) • distributed hash tables / consistent hashing: good for trillions of objects, bad because part of the data must be rebalanced when adding capacity (see the consistent-hashing sketch after the slide list)
  16. OpenIO is different. Trillions of objects? Let’s do the math: 1,000,000,000,000 = 100,000,000 mailboxes × 10,000 emails per mailbox. Do not track objects, track containers!
  17. OpenIO technology highlights
  18. Grid & Conscience • directory design with indirections
  19. “All problems in computer science can be solved by another level of indirection.” (David Wheeler)
  20. Indirections, a pragmatic design: containers store the locations of objects, not the objects themselves (diagram: containers pointing to data chunks; see the lookup sketch after the slide list).
  21. Grid & Conscience. Grid of nodes: • massively distributed • each node takes part in directory & storage services • no SPOF, resilient to node failures. Conscience, on-the-fly best matchmaking: • collects metrics from each node • computes node scores • real-time asynchronous process • advanced load balancing: selects the best nodes at a given moment and routes requests to them (see the scoring sketch after the slide list).
  22. #NeverRebalance
  23. Real-life use case: deploy as you grow (chart: PB deployed vs. used over time; ISP with 30+ million mailboxes).
  24. Cyrus 3.0 + OpenIO
  25. Cyrus 3.0 archive pool: an SSD spool and a SATA archive behind Cyrus (see the imapd.conf sketch after the slide list).
  26. Cyrus 3.0 + OpenIO, in steps: #1 scale out archive storage (available) • #2 bring Conscience to Cyrus Murder, to provide the best Cyrus IMAP node when a mailbox is created or migrated (Sept 2015) • #3 scale-out backend for calendars & contacts, to provide tiny per-user databases spread across the grid of nodes (end of 2015).
  27. Grid for Apps: scale-out back-ends for your apps
  28. Object storage & grid for apps
  29. Get started
  30. Get the Vagrant box: atlas.hashicorp.com/openio/boxes/sds-cyrus/ then $ vagrant up (full commands after the slide list)
  31. Cheers! github.com/open-io
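
Slide 15's rebalancing point is worth a concrete illustration. Below is a minimal consistent-hashing sketch in Python; the Ring class and node names are invented for illustration and are not OpenIO code. Adding a fourth node remaps roughly a quarter of the keys, which at trillion-object scale means moving petabytes:

    import hashlib
    from bisect import bisect

    def h(key):
        # Stable 64-bit integer hash of a string key.
        return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

    class Ring:
        def __init__(self, nodes, vnodes=64):
            # Place each node at `vnodes` pseudo-random points on the ring.
            self.points = sorted((h(f"{n}#{i}"), n)
                                 for n in nodes for i in range(vnodes))
            self.hashes = [p for p, _ in self.points]

        def place(self, key):
            # A key belongs to the first node clockwise from its hash.
            i = bisect(self.hashes, h(key)) % len(self.points)
            return self.points[i][1]

    before = Ring(["node1", "node2", "node3"])
    after = Ring(["node1", "node2", "node3", "node4"])  # capacity added
    keys = [f"email-{i}" for i in range(10000)]
    moved = sum(before.place(k) != after.place(k) for k in keys)
    print(f"{moved / len(keys):.0%} of existing objects must move")  # roughly 25%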
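
Slide 20's two-level design fits in a few lines. This is a simplified sketch with plain Python dicts standing in for OpenIO's directory and container services; every name and chunk URL here is illustrative:

    # Level 1: the directory tracks ~10^8 containers (mailboxes),
    # not 10^12 individual objects.
    directory = {
        "alice@example.com": "container-svc-17",  # service hosting her container
    }

    # Level 2: each container tracks chunk locations for its own objects.
    containers = {
        "container-svc-17": {
            "msg-0001": ["chunk://node-03/abc", "chunk://node-11/def"],  # replicas
        },
    }

    def locate(mailbox, email_id):
        # Two indirections: mailbox -> container service -> chunk locations.
        service = directory[mailbox]
        return containers[service][email_id]

    print(locate("alice@example.com", "msg-0001"))

This is also what makes slide 22's #NeverRebalance possible: the directory only maps containers to services, so new capacity can absorb new data without forcing existing chunks to move.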
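
Slide 21's Conscience loop can be sketched the same way: collect metrics, reduce them to a score, and send each request to the best-scored nodes right now. The scoring formula below is a toy for illustration, not OpenIO's actual one:

    from dataclasses import dataclass

    @dataclass
    class Node:
        name: str
        cpu_idle: float    # 0..1, fraction of idle CPU
        io_idle: float     # 0..1, fraction of idle disk I/O
        free_space: float  # 0..1, fraction of free capacity

    def score(n):
        # Toy rule: a node is only as good as its scarcest resource.
        return 100 * min(n.cpu_idle, n.io_idle, n.free_space)

    def best_nodes(nodes, k=3):
        # Pick the k best-scored nodes right now, e.g. to host k chunk replicas.
        return sorted(nodes, key=score, reverse=True)[:k]

    nodes = [Node("n1", 0.9, 0.4, 0.8),
             Node("n2", 0.7, 0.7, 0.6),
             Node("n3", 0.2, 0.9, 0.9)]
    print([n.name for n in best_nodes(nodes, k=2)])  # ['n2', 'n1']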
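
Slide 25 maps onto the archive partitions introduced in Cyrus 3.0. A hedged imapd.conf excerpt; the paths are illustrative, so check the Cyrus 3.0 documentation for the authoritative option names and defaults:

    # /etc/imapd.conf (excerpt): SSD-backed spool, SATA-backed archive
    partition-default: /ssd/cyrus/spool
    archivepartition-default: /sata/cyrus/archive
    archive_enabled: 1
    archive_days: 7   # messages older than a week migrate to the archive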
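
Slide 30's one-liner expands to the standard Vagrant workflow; the box name openio/sds-cyrus is read straight off the Atlas URL:

    $ vagrant init openio/sds-cyrus   # writes a Vagrantfile pointing at the box
    $ vagrant up                      # fetches the box and boots the VM
    $ vagrant ssh                     # log in and try Cyrus + OpenIO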
