Ceph Day Santa Clara: Ceph at DreamHost

Dallas Kashuba, Co-Founder of DreamHost, walks through why they went with Ceph at the Santa Clara Ceph Day.

Transcript

  • 1. Ceph at DreamHost: A Storage Journey
  • 2. About Me • One of the original four of DreamHost • Still active daily at DreamHost • Have spent a lot of time working on the Ops side.
  • 3. DreamHost • Hosting company founded in 1997 • Sage’s other company • Shared hosting, virtual servers, dedicated servers, cloud storage, cloud computing • 375k customers, 1.3MM websites
  • 4. Storage Journey: A long strange trip
  • 5. His name was Destro
  • 6. ... and then there were more.
  • 7. The First NetApp
  • 8. Remote Failover
  • 9. Remote Failover
  • 10. Meanwhile...
  • 11. ... and still more.
  • 12. Lots of NetApps • Peak of around 125 individual NetApps • Smallish capacity on each (8TB) • Internal software continuously moving data between NetApps • Lots of time spent managing nearly full filers
  • 13. Ideal
  • 14. Reality
  • 15. Hosting Landscape • Included storage had grown from 50MB to gigabytes, then terabytes. • Prices stayed the same. • Eventually went to unlimited storage. • Usage per customer skyrocketed.
  • 16. Failed Experiments
  • 17. Failed Experiments • ATAoE and XFS-based systems • Performance & Stability issues • 2006 era gear
  • 18. Failed Experiments • High capacity • Nice features • Expensive • 85% full and it failed
  • 19. Some Success • First on Sun hardware then Supermicro • Great stability • Not enough IO for front-line network storage
  • 20. Back to Basics
  • 21. Local RAID: The Good • SATA drives had grown in capacity and were very cheap • 4-6TB per hosting server • Less dependence on congested network • Smaller failure domains
  • 22. Local RAID: The Bad • No more quota, too slow to scan filesystem • No more fast failovers • Multiple hour filesystem check with ext3 • More failure domains
  • 23. Local RAID: The Ugly • Complete RAID loss more common than anticipated • Multiple days to fully restore from backup
  • 24. Storage Today: Light at the end of the tunnel
  • 25. Hybrid Mix: Best Tool For The Job • We learned something from every step of the way • No one size fits all when it comes to storage • Use whatever is best for the job • Be ready to change
  • 26. A Bit of Everything: Best Tool For The Job • Clustered NetApps and NFS for email • Local RAID in hosting servers • ZFS and OpenSolaris backup servers • Ceph for DreamObjects and DreamCompute
  • 27. DreamObjects • Object Storage, S3/Swift compatible • 2+ Petabytes raw storage • 3x replication, 900+ OSDs • RGW behind HAProxy • Row, rack, node and disk fault tolerant (a client sketch follows the transcript)
  • 28. DreamCompute • OpenStack-based Public Cloud • 3+ Petabytes raw storage • All storage is on Ceph RBD • Boot and Attachable Volumes • Nicira SDN + Ceph, Live Migration (an RBD sketch follows the transcript)
  • 29. DreamCompute architecture (diagram). Cockpit Pod running the control plane: HA load balancer, MySQL / PostgreSQL, Horizon, Glance, Keystone, Nova, Quantum, Cinder, Nicira NVP, Glance store (Ceph), OS mirrors (apt), Ceph monitors, Opscode Chef, Logstash + Graphite, and networking gear. Multiple Compute Pods, each with 8x hypervisor nodes (192 GB RAM, 64 AMD cores), 14x storage nodes (12x 3TB disks), and networking gear. Per pod: 512 cores, 1.5TB of RAM, 504TB raw storage, 168TB redundant storage. Networking: ODM switches w/ Linux, 10Gbps everywhere, IPv6 from the ground up, spine and leaf topology, 120 Gbps between pods (!). Facing the Internets: "Thar be dragons here!"
  • 30. CephFS & The Future: Storage Panacea? • The return of failovers • No more backup servers • No more major disk-related outages • Fault tolerant low cost hosting
  • 31. Thanks! @dallas dallas@dreamhost.com
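
Slide 27 describes DreamObjects as S3/Swift compatible, fronted by RGW behind HAProxy. As a rough illustration of what that compatibility means for a client, here is a minimal sketch using Python's boto library; the endpoint hostname, credentials, and bucket name are placeholders for this example, not DreamHost's actual values.

    # Minimal sketch: any S3 client can talk to an S3-compatible RGW endpoint.
    # Endpoint, keys, and bucket name below are placeholders.
    import boto
    import boto.s3.connection

    conn = boto.connect_s3(
        aws_access_key_id='ACCESS_KEY',
        aws_secret_access_key='SECRET_KEY',
        host='objects.example.com',  # the RGW endpoint behind HAProxy
        calling_format=boto.s3.connection.OrdinaryCallingFormat(),
    )

    bucket = conn.create_bucket('demo-bucket')
    key = bucket.new_key('hello.txt')
    key.set_contents_from_string('Hello from an S3-compatible object store')

Because RGW speaks the S3 and Swift protocols, existing client tooling works unchanged, while the cluster handles the 3x replication and the row, rack, node and disk fault tolerance mentioned on the slide.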
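
Slide 28 notes that all DreamCompute guest storage sits on Ceph RBD as boot and attachable volumes. The sketch below, using Ceph's Python bindings, shows the kind of RBD image that Cinder/Nova manage underneath; the config path, pool name, image name, and size are assumptions for illustration only.

    # Minimal sketch: create an RBD image of the kind Cinder/Nova attach to
    # guests. Config path, pool, image name, and size are assumptions.
    import rados
    import rbd

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx('volumes')  # pool assumed to hold volumes
        try:
            rbd.RBD().create(ioctx, 'example-volume', 10 * 1024 ** 3)  # 10 GiB
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()

Since the image lives in the Ceph cluster rather than on a hypervisor's local disk, a guest can move between hypervisors without copying its volume, which is what makes the slide's "Live Migration" point practical.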