SlideShare a Scribd company logo
1 of 31
Ceph at DreamHost
A Storage Journey
About Me
• One of the original four of DreamHost
• Still active daily at DreamHost
• Have spent a lot of time working on the
Ops side.
• Hosting company founded in 1997
• Sage’s other company
• shared hosting, virtual
servers, dedicated servers, cloud
storage, cloud computing
• 375k customers, 1.3MM websites
Storage Journey
A long strange trip
His name was Destro
... and then there
were more.
The First NetApp
Remote Failover
Remote Failover
Meanwhile...
... and still more.
Lots of NetApps
• Peak of around 125 individual NetApps
• Smallish capacity on each (8TB)
• Internal software continuously moving
data between NetApps
• Lots of time spent managing nearly full
filers
Ideal
Reality
Hosting Landscape
• Included storage had grown from 50MB
to gigabytes, then terabytes.
• Prices stayed the same.
• Eventually went to unlimited Storage
• Usage per customer skyrocketed.
Failed Experiments
Failed Experiments
• ATAoE and XFS-based
systems
• Performance &
Stability issues
• 2006 era gear
Failed Experiments
• High capacity
• Nice features
• Expensive
• 85% full and it
failed
Some Success
• First on Sun hardware
then Supermicro
• Great stability
• Not enough IO for
front-line network
storage
Back to Basics
Local RAID
• SATA drives had grown in capacity and
were very cheap
• 4-6TB per hosting server
• Less dependence on congested
network
• Smaller failure domains
The Good
Local RAID
• No more quota, too slow to scan
filesystem
• No more fast failovers
• Multiple hour filesystem check with ext3
• More failure domains
The Bad
Local RAID
• Complete RAID loss more common
than anticipated
• Multiple days to fully restore from
backup
The Ugly
Storage Today
Light at the end of the tunnel
Hybrid Mix
• We learned something from every step
of the way
• No one size fits all when it comes to
storage
• Use whatever is best for the job
• Be ready to change
Best Tool For The Job
A Bit of Everything
• Clustered NetApps and NFS for email
• Local RAID in hosting servers
• ZFS and OpenSolaris backup servers
• Ceph for DreamObjects and
DreamCompute
Best Tool For The Job
• Object Storage, S3/Swift compatible
• 2+ Petabytes raw storage
• 3x replication, 900+ OSDs
• RGW behind HAProxy
• Row, rack, node and disk fault tolerant
• OpenStack-based Public Cloud
• 3+ Petabytes raw storage
• All storage is on Ceph RBD
• Boot and Attachable Volumes
• Nicira SDN + Ceph, Live Migration
HA Load Balancer
MySQL / PostgreSQL
Horizon
Cockpit Pod
Glance
Keystone
Nova
Quantum
Cinder
Nicira NVP
Glance Store (Ceph)
OSMirrors (apt)
Ceph Monitors
Opscode Chef
Logstash + Graphite
Networking Gear
8x - Hypervisor Node
192 GB RAM
64AMD cores
14x - Storage Node
12x - 3TB disks
Networking Gear
Compute Pod
8x - Hypervisor Node
192 GB RAM
64AMD cores
14x - Storage Nodes
12x - 3TB disks
Networking Gear
Compute Pod
8x - Hypervisor Node
192 GB RAM
64AMD cores
14x - Storage Nodes
12x - 3TB disks
Networking Gear
Compute Pod
Pods
• 512 cores
• 1.5TB of RAM
• 504TB raw storage
• 168TB redundant storage
N etworking
• ODM switches w/ Linux
• 10Gbps everywhere
• IPv6 from the ground up
• Spine and leaf topology
• 120 Gbps between pods (!)
The Internets
Thar be dragonshere!
Nicira NVP Nicira NVP NiciraNVP
CephFS & The Future
• The return of Failovers
• No more backup servers
• No more major disk-related outages
• Fault tolerant low cost hosting
Storage Panacea?
Thanks!
@dallas
dallas@dreamhost.com

More Related Content

What's hot

Why learn jenkins via nomad_ci (nomad/consul/docker/jenkins) 
Why learn jenkins via nomad_ci (nomad/consul/docker/jenkins) Why learn jenkins via nomad_ci (nomad/consul/docker/jenkins) 
Why learn jenkins via nomad_ci (nomad/consul/docker/jenkins) Dave Pitts
 
London HUG 14/4 - Deploying and Discovering at Scale with Consul and Nomad
London HUG 14/4 - Deploying and Discovering at Scale with Consul and NomadLondon HUG 14/4 - Deploying and Discovering at Scale with Consul and Nomad
London HUG 14/4 - Deploying and Discovering at Scale with Consul and NomadLondon HashiCorp User Group
 
Ceph Object Storage at Spreadshirt (July 2015, Ceph Berlin Meetup)
Ceph Object Storage at Spreadshirt (July 2015, Ceph Berlin Meetup)Ceph Object Storage at Spreadshirt (July 2015, Ceph Berlin Meetup)
Ceph Object Storage at Spreadshirt (July 2015, Ceph Berlin Meetup)Jens Hadlich
 
Global deduplication for Ceph - Myoungwon Oh
Global deduplication for Ceph - Myoungwon OhGlobal deduplication for Ceph - Myoungwon Oh
Global deduplication for Ceph - Myoungwon OhCeph Community
 
Ceph Object Storage at Spreadshirt
Ceph Object Storage at SpreadshirtCeph Object Storage at Spreadshirt
Ceph Object Storage at SpreadshirtJens Hadlich
 
Scaling MongoDB on Amazon Web Services (DAT209) | AWS re:Invent 2013
Scaling MongoDB on Amazon Web Services (DAT209) | AWS re:Invent 2013Scaling MongoDB on Amazon Web Services (DAT209) | AWS re:Invent 2013
Scaling MongoDB on Amazon Web Services (DAT209) | AWS re:Invent 2013Amazon Web Services
 
Webinar - DreamObjects/Ceph Case Study
Webinar - DreamObjects/Ceph Case StudyWebinar - DreamObjects/Ceph Case Study
Webinar - DreamObjects/Ceph Case StudyCeph Community
 
Spreadshirt Platform - An Architectural Overview
Spreadshirt Platform - An Architectural OverviewSpreadshirt Platform - An Architectural Overview
Spreadshirt Platform - An Architectural OverviewJens Hadlich
 
Day 2 General Session Presentations RedisConf
Day 2 General Session Presentations RedisConfDay 2 General Session Presentations RedisConf
Day 2 General Session Presentations RedisConfRedis Labs
 
SVC / Storwize: cost effective storage planning (BVQ use case)
SVC / Storwize: cost effective storage planning (BVQ use case)SVC / Storwize: cost effective storage planning (BVQ use case)
SVC / Storwize: cost effective storage planning (BVQ use case)Michael Pirker
 
Data Scotland 2019: You can run SQL Server on AWS
Data Scotland 2019: You can run SQL Server on AWSData Scotland 2019: You can run SQL Server on AWS
Data Scotland 2019: You can run SQL Server on AWSJohn McCormack
 
San Francisco HashiCorp User Group at GitHub
San Francisco HashiCorp User Group at GitHubSan Francisco HashiCorp User Group at GitHub
San Francisco HashiCorp User Group at GitHubJon Benson
 
Ceph and cloud stack apr 2014
Ceph and cloud stack   apr 2014Ceph and cloud stack   apr 2014
Ceph and cloud stack apr 2014Ian Colle
 
Scylla Summit 2018: Rebuilding the Ceph Distributed Storage Solution with Sea...
Scylla Summit 2018: Rebuilding the Ceph Distributed Storage Solution with Sea...Scylla Summit 2018: Rebuilding the Ceph Distributed Storage Solution with Sea...
Scylla Summit 2018: Rebuilding the Ceph Distributed Storage Solution with Sea...ScyllaDB
 
Cloud Costing Services
Cloud Costing Services Cloud Costing Services
Cloud Costing Services InnoTech
 
Standing Up Your First Cluster
Standing Up Your First ClusterStanding Up Your First Cluster
Standing Up Your First ClusterDataStax Academy
 
GRU: Taming a Herd of Wild Servers - Oz Katz, Similarweb - DevOpsDays Tel Avi...
GRU: Taming a Herd of Wild Servers - Oz Katz, Similarweb - DevOpsDays Tel Avi...GRU: Taming a Herd of Wild Servers - Oz Katz, Similarweb - DevOpsDays Tel Avi...
GRU: Taming a Herd of Wild Servers - Oz Katz, Similarweb - DevOpsDays Tel Avi...DevOpsDays Tel Aviv
 
MongoDB at community engine
MongoDB at community engineMongoDB at community engine
MongoDB at community enginemathraq
 
MongoDB and Amazon Web Services: Storage Options for MongoDB Deployments
MongoDB and Amazon Web Services: Storage Options for MongoDB DeploymentsMongoDB and Amazon Web Services: Storage Options for MongoDB Deployments
MongoDB and Amazon Web Services: Storage Options for MongoDB DeploymentsMongoDB
 

What's hot (20)

Why learn jenkins via nomad_ci (nomad/consul/docker/jenkins) 
Why learn jenkins via nomad_ci (nomad/consul/docker/jenkins) Why learn jenkins via nomad_ci (nomad/consul/docker/jenkins) 
Why learn jenkins via nomad_ci (nomad/consul/docker/jenkins) 
 
London HUG 14/4 - Deploying and Discovering at Scale with Consul and Nomad
London HUG 14/4 - Deploying and Discovering at Scale with Consul and NomadLondon HUG 14/4 - Deploying and Discovering at Scale with Consul and Nomad
London HUG 14/4 - Deploying and Discovering at Scale with Consul and Nomad
 
Ceph Object Storage at Spreadshirt (July 2015, Ceph Berlin Meetup)
Ceph Object Storage at Spreadshirt (July 2015, Ceph Berlin Meetup)Ceph Object Storage at Spreadshirt (July 2015, Ceph Berlin Meetup)
Ceph Object Storage at Spreadshirt (July 2015, Ceph Berlin Meetup)
 
Drupal performance
Drupal performanceDrupal performance
Drupal performance
 
Global deduplication for Ceph - Myoungwon Oh
Global deduplication for Ceph - Myoungwon OhGlobal deduplication for Ceph - Myoungwon Oh
Global deduplication for Ceph - Myoungwon Oh
 
Ceph Object Storage at Spreadshirt
Ceph Object Storage at SpreadshirtCeph Object Storage at Spreadshirt
Ceph Object Storage at Spreadshirt
 
Scaling MongoDB on Amazon Web Services (DAT209) | AWS re:Invent 2013
Scaling MongoDB on Amazon Web Services (DAT209) | AWS re:Invent 2013Scaling MongoDB on Amazon Web Services (DAT209) | AWS re:Invent 2013
Scaling MongoDB on Amazon Web Services (DAT209) | AWS re:Invent 2013
 
Webinar - DreamObjects/Ceph Case Study
Webinar - DreamObjects/Ceph Case StudyWebinar - DreamObjects/Ceph Case Study
Webinar - DreamObjects/Ceph Case Study
 
Spreadshirt Platform - An Architectural Overview
Spreadshirt Platform - An Architectural OverviewSpreadshirt Platform - An Architectural Overview
Spreadshirt Platform - An Architectural Overview
 
Day 2 General Session Presentations RedisConf
Day 2 General Session Presentations RedisConfDay 2 General Session Presentations RedisConf
Day 2 General Session Presentations RedisConf
 
SVC / Storwize: cost effective storage planning (BVQ use case)
SVC / Storwize: cost effective storage planning (BVQ use case)SVC / Storwize: cost effective storage planning (BVQ use case)
SVC / Storwize: cost effective storage planning (BVQ use case)
 
Data Scotland 2019: You can run SQL Server on AWS
Data Scotland 2019: You can run SQL Server on AWSData Scotland 2019: You can run SQL Server on AWS
Data Scotland 2019: You can run SQL Server on AWS
 
San Francisco HashiCorp User Group at GitHub
San Francisco HashiCorp User Group at GitHubSan Francisco HashiCorp User Group at GitHub
San Francisco HashiCorp User Group at GitHub
 
Ceph and cloud stack apr 2014
Ceph and cloud stack   apr 2014Ceph and cloud stack   apr 2014
Ceph and cloud stack apr 2014
 
Scylla Summit 2018: Rebuilding the Ceph Distributed Storage Solution with Sea...
Scylla Summit 2018: Rebuilding the Ceph Distributed Storage Solution with Sea...Scylla Summit 2018: Rebuilding the Ceph Distributed Storage Solution with Sea...
Scylla Summit 2018: Rebuilding the Ceph Distributed Storage Solution with Sea...
 
Cloud Costing Services
Cloud Costing Services Cloud Costing Services
Cloud Costing Services
 
Standing Up Your First Cluster
Standing Up Your First ClusterStanding Up Your First Cluster
Standing Up Your First Cluster
 
GRU: Taming a Herd of Wild Servers - Oz Katz, Similarweb - DevOpsDays Tel Avi...
GRU: Taming a Herd of Wild Servers - Oz Katz, Similarweb - DevOpsDays Tel Avi...GRU: Taming a Herd of Wild Servers - Oz Katz, Similarweb - DevOpsDays Tel Avi...
GRU: Taming a Herd of Wild Servers - Oz Katz, Similarweb - DevOpsDays Tel Avi...
 
MongoDB at community engine
MongoDB at community engineMongoDB at community engine
MongoDB at community engine
 
MongoDB and Amazon Web Services: Storage Options for MongoDB Deployments
MongoDB and Amazon Web Services: Storage Options for MongoDB DeploymentsMongoDB and Amazon Web Services: Storage Options for MongoDB Deployments
MongoDB and Amazon Web Services: Storage Options for MongoDB Deployments
 

Viewers also liked

Ceph Day London 2014 - Deploying ceph in the wild
Ceph Day London 2014 - Deploying ceph in the wildCeph Day London 2014 - Deploying ceph in the wild
Ceph Day London 2014 - Deploying ceph in the wildCeph Community
 
Ceph Day London 2014 - Ceph Over High-Performance Networks
Ceph Day London 2014 - Ceph Over High-Performance Networks Ceph Day London 2014 - Ceph Over High-Performance Networks
Ceph Day London 2014 - Ceph Over High-Performance Networks Ceph Community
 
Ceph Day London 2014 - Best Practices for Ceph-powered Implementations of Sto...
Ceph Day London 2014 - Best Practices for Ceph-powered Implementations of Sto...Ceph Day London 2014 - Best Practices for Ceph-powered Implementations of Sto...
Ceph Day London 2014 - Best Practices for Ceph-powered Implementations of Sto...Ceph Community
 
Ceph Day New York 2014: Ceph over High Performance Networks
Ceph Day New York 2014: Ceph over High Performance NetworksCeph Day New York 2014: Ceph over High Performance Networks
Ceph Day New York 2014: Ceph over High Performance NetworksCeph Community
 
Ceph Day Nov 2012 - Sage Weil
Ceph Day Nov 2012 - Sage WeilCeph Day Nov 2012 - Sage Weil
Ceph Day Nov 2012 - Sage WeilCeph Community
 
London Ceph Day: Ceph Performance and Optimization
London Ceph Day: Ceph Performance and Optimization London Ceph Day: Ceph Performance and Optimization
London Ceph Day: Ceph Performance and Optimization Ceph Community
 
London Ceph Day: Erasure Coding: Purpose and Progress
London Ceph Day: Erasure Coding: Purpose and Progress London Ceph Day: Erasure Coding: Purpose and Progress
London Ceph Day: Erasure Coding: Purpose and Progress Ceph Community
 
London Ceph Day: Ceph at CERN
London Ceph Day: Ceph at CERNLondon Ceph Day: Ceph at CERN
London Ceph Day: Ceph at CERNCeph Community
 
Ceph as storage for CloudStack
Ceph as storage for CloudStack Ceph as storage for CloudStack
Ceph as storage for CloudStack Ceph Community
 
London Ceph Day: Deploying Ceph and OpenStack with Juju
London Ceph Day: Deploying Ceph and OpenStack with JujuLondon Ceph Day: Deploying Ceph and OpenStack with Juju
London Ceph Day: Deploying Ceph and OpenStack with JujuCeph Community
 
London Ceph Day: Unified Cloud Storage with Synnefo + Ceph + Ganeti
London Ceph Day: Unified Cloud Storage with Synnefo + Ceph + GanetiLondon Ceph Day: Unified Cloud Storage with Synnefo + Ceph + Ganeti
London Ceph Day: Unified Cloud Storage with Synnefo + Ceph + GanetiCeph Community
 
London Ceph Day Keynote: Building Tomorrow's Ceph
London Ceph Day Keynote: Building Tomorrow's Ceph London Ceph Day Keynote: Building Tomorrow's Ceph
London Ceph Day Keynote: Building Tomorrow's Ceph Ceph Community
 
Using Ceph in a Private Cloud - Ceph Day Frankfurt
Using Ceph in a Private Cloud - Ceph Day Frankfurt Using Ceph in a Private Cloud - Ceph Day Frankfurt
Using Ceph in a Private Cloud - Ceph Day Frankfurt Ceph Community
 
Ceph at the Digital Repository of Ireland - Ceph Day Frankfurt
Ceph at the Digital Repository of Ireland - Ceph Day Frankfurt Ceph at the Digital Repository of Ireland - Ceph Day Frankfurt
Ceph at the Digital Repository of Ireland - Ceph Day Frankfurt Ceph Community
 
Best Practices with Ceph as Distributed, Intelligent, Unified Cloud Storage -...
Best Practices with Ceph as Distributed, Intelligent, Unified Cloud Storage -...Best Practices with Ceph as Distributed, Intelligent, Unified Cloud Storage -...
Best Practices with Ceph as Distributed, Intelligent, Unified Cloud Storage -...Ceph Community
 
Webinar - Advance Ceph Features
Webinar - Advance Ceph FeaturesWebinar - Advance Ceph Features
Webinar - Advance Ceph FeaturesCeph Community
 
Wicked Easy Ceph Block Storage & OpenStack Deployment with Crowbar
Wicked Easy Ceph Block Storage & OpenStack Deployment with CrowbarWicked Easy Ceph Block Storage & OpenStack Deployment with Crowbar
Wicked Easy Ceph Block Storage & OpenStack Deployment with CrowbarCeph Community
 
Plano de mídia - 1/9/2015
Plano de mídia - 1/9/2015Plano de mídia - 1/9/2015
Plano de mídia - 1/9/2015Renato Cruz
 

Viewers also liked (20)

Ceph Day London 2014 - Deploying ceph in the wild
Ceph Day London 2014 - Deploying ceph in the wildCeph Day London 2014 - Deploying ceph in the wild
Ceph Day London 2014 - Deploying ceph in the wild
 
Ceph Day London 2014 - Ceph Over High-Performance Networks
Ceph Day London 2014 - Ceph Over High-Performance Networks Ceph Day London 2014 - Ceph Over High-Performance Networks
Ceph Day London 2014 - Ceph Over High-Performance Networks
 
Ceph Day London 2014 - Best Practices for Ceph-powered Implementations of Sto...
Ceph Day London 2014 - Best Practices for Ceph-powered Implementations of Sto...Ceph Day London 2014 - Best Practices for Ceph-powered Implementations of Sto...
Ceph Day London 2014 - Best Practices for Ceph-powered Implementations of Sto...
 
Ceph Day New York 2014: Ceph over High Performance Networks
Ceph Day New York 2014: Ceph over High Performance NetworksCeph Day New York 2014: Ceph over High Performance Networks
Ceph Day New York 2014: Ceph over High Performance Networks
 
Ceph Day Nov 2012 - Sage Weil
Ceph Day Nov 2012 - Sage WeilCeph Day Nov 2012 - Sage Weil
Ceph Day Nov 2012 - Sage Weil
 
Strata - 03/31/2012
Strata - 03/31/2012Strata - 03/31/2012
Strata - 03/31/2012
 
London Ceph Day: Ceph Performance and Optimization
London Ceph Day: Ceph Performance and Optimization London Ceph Day: Ceph Performance and Optimization
London Ceph Day: Ceph Performance and Optimization
 
London Ceph Day: Erasure Coding: Purpose and Progress
London Ceph Day: Erasure Coding: Purpose and Progress London Ceph Day: Erasure Coding: Purpose and Progress
London Ceph Day: Erasure Coding: Purpose and Progress
 
London Ceph Day: Ceph at CERN
London Ceph Day: Ceph at CERNLondon Ceph Day: Ceph at CERN
London Ceph Day: Ceph at CERN
 
Ceph as storage for CloudStack
Ceph as storage for CloudStack Ceph as storage for CloudStack
Ceph as storage for CloudStack
 
London Ceph Day: Deploying Ceph and OpenStack with Juju
London Ceph Day: Deploying Ceph and OpenStack with JujuLondon Ceph Day: Deploying Ceph and OpenStack with Juju
London Ceph Day: Deploying Ceph and OpenStack with Juju
 
London Ceph Day: Unified Cloud Storage with Synnefo + Ceph + Ganeti
London Ceph Day: Unified Cloud Storage with Synnefo + Ceph + GanetiLondon Ceph Day: Unified Cloud Storage with Synnefo + Ceph + Ganeti
London Ceph Day: Unified Cloud Storage with Synnefo + Ceph + Ganeti
 
London Ceph Day Keynote: Building Tomorrow's Ceph
London Ceph Day Keynote: Building Tomorrow's Ceph London Ceph Day Keynote: Building Tomorrow's Ceph
London Ceph Day Keynote: Building Tomorrow's Ceph
 
Using Ceph in a Private Cloud - Ceph Day Frankfurt
Using Ceph in a Private Cloud - Ceph Day Frankfurt Using Ceph in a Private Cloud - Ceph Day Frankfurt
Using Ceph in a Private Cloud - Ceph Day Frankfurt
 
Ceph at the Digital Repository of Ireland - Ceph Day Frankfurt
Ceph at the Digital Repository of Ireland - Ceph Day Frankfurt Ceph at the Digital Repository of Ireland - Ceph Day Frankfurt
Ceph at the Digital Repository of Ireland - Ceph Day Frankfurt
 
Best Practices with Ceph as Distributed, Intelligent, Unified Cloud Storage -...
Best Practices with Ceph as Distributed, Intelligent, Unified Cloud Storage -...Best Practices with Ceph as Distributed, Intelligent, Unified Cloud Storage -...
Best Practices with Ceph as Distributed, Intelligent, Unified Cloud Storage -...
 
Webinar - Advance Ceph Features
Webinar - Advance Ceph FeaturesWebinar - Advance Ceph Features
Webinar - Advance Ceph Features
 
Wicked Easy Ceph Block Storage & OpenStack Deployment with Crowbar
Wicked Easy Ceph Block Storage & OpenStack Deployment with CrowbarWicked Easy Ceph Block Storage & OpenStack Deployment with Crowbar
Wicked Easy Ceph Block Storage & OpenStack Deployment with Crowbar
 
Plano de mídia - 1/9/2015
Plano de mídia - 1/9/2015Plano de mídia - 1/9/2015
Plano de mídia - 1/9/2015
 
62 0422 la restauración del arbol novia
62 0422 la restauración del arbol novia62 0422 la restauración del arbol novia
62 0422 la restauración del arbol novia
 

Similar to Ceph Day Santa Clara: Ceph at DreamHost

Getting started with Riak in the Cloud
Getting started with Riak in the CloudGetting started with Riak in the Cloud
Getting started with Riak in the CloudInes Sombra
 
DrupalCampLA 2014 - Drupal backend performance and scalability
DrupalCampLA 2014 - Drupal backend performance and scalabilityDrupalCampLA 2014 - Drupal backend performance and scalability
DrupalCampLA 2014 - Drupal backend performance and scalabilitycherryhillco
 
HIgh Performance Redis- Tague Griffith, GoPro
HIgh Performance Redis- Tague Griffith, GoProHIgh Performance Redis- Tague Griffith, GoPro
HIgh Performance Redis- Tague Griffith, GoProRedis Labs
 
OpenStack Cinder, Implementation Today and New Trends for Tomorrow
OpenStack Cinder, Implementation Today and New Trends for TomorrowOpenStack Cinder, Implementation Today and New Trends for Tomorrow
OpenStack Cinder, Implementation Today and New Trends for TomorrowEd Balduf
 
High Performance Drupal
High Performance DrupalHigh Performance Drupal
High Performance DrupalChapter Three
 
Writing Scalable Software in Java
Writing Scalable Software in JavaWriting Scalable Software in Java
Writing Scalable Software in JavaRuben Badaró
 
Best practices for highly available and large scale SolrCloud
Best practices for highly available and large scale SolrCloudBest practices for highly available and large scale SolrCloud
Best practices for highly available and large scale SolrCloudAnshum Gupta
 
In-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great TasteIn-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great TasteDataWorks Summit
 
Key-Value-Stores -- The Key to Scaling?
Key-Value-Stores -- The Key to Scaling?Key-Value-Stores -- The Key to Scaling?
Key-Value-Stores -- The Key to Scaling?Tim Lossen
 
High Scalability Toronto: Meetup #2
High Scalability Toronto: Meetup #2High Scalability Toronto: Meetup #2
High Scalability Toronto: Meetup #2ScribbleLive
 
AWS Cloud experience concepts tips and tricks
AWS Cloud experience concepts tips and tricksAWS Cloud experience concepts tips and tricks
AWS Cloud experience concepts tips and tricksDirk Harms-Merbitz
 
Troubleshooting Hadoop: Distributed Debugging
Troubleshooting Hadoop: Distributed DebuggingTroubleshooting Hadoop: Distributed Debugging
Troubleshooting Hadoop: Distributed DebuggingGreat Wide Open
 
End of RAID as we know it with Ceph Replication
End of RAID as we know it with Ceph ReplicationEnd of RAID as we know it with Ceph Replication
End of RAID as we know it with Ceph ReplicationCeph Community
 
Diagnosing Problems in Production - Cassandra
Diagnosing Problems in Production - CassandraDiagnosing Problems in Production - Cassandra
Diagnosing Problems in Production - CassandraJon Haddad
 
Alluxio - Scalable Filesystem Metadata Services
Alluxio - Scalable Filesystem Metadata ServicesAlluxio - Scalable Filesystem Metadata Services
Alluxio - Scalable Filesystem Metadata ServicesAlluxio, Inc.
 
MySQL in the Hosted Cloud
MySQL in the Hosted CloudMySQL in the Hosted Cloud
MySQL in the Hosted CloudColin Charles
 
V mware2012 20121221_final
V mware2012 20121221_finalV mware2012 20121221_final
V mware2012 20121221_finalWeb2Present
 
Life After Sharding: Monitoring and Management of a Complex Data Cloud
Life After Sharding: Monitoring and Management of a Complex Data CloudLife After Sharding: Monitoring and Management of a Complex Data Cloud
Life After Sharding: Monitoring and Management of a Complex Data CloudOSCON Byrum
 
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...Ceph Community
 

Similar to Ceph Day Santa Clara: Ceph at DreamHost (20)

Getting started with Riak in the Cloud
Getting started with Riak in the CloudGetting started with Riak in the Cloud
Getting started with Riak in the Cloud
 
DrupalCampLA 2014 - Drupal backend performance and scalability
DrupalCampLA 2014 - Drupal backend performance and scalabilityDrupalCampLA 2014 - Drupal backend performance and scalability
DrupalCampLA 2014 - Drupal backend performance and scalability
 
HIgh Performance Redis- Tague Griffith, GoPro
HIgh Performance Redis- Tague Griffith, GoProHIgh Performance Redis- Tague Griffith, GoPro
HIgh Performance Redis- Tague Griffith, GoPro
 
OpenStack Cinder, Implementation Today and New Trends for Tomorrow
OpenStack Cinder, Implementation Today and New Trends for TomorrowOpenStack Cinder, Implementation Today and New Trends for Tomorrow
OpenStack Cinder, Implementation Today and New Trends for Tomorrow
 
High Performance Drupal
High Performance DrupalHigh Performance Drupal
High Performance Drupal
 
Writing Scalable Software in Java
Writing Scalable Software in JavaWriting Scalable Software in Java
Writing Scalable Software in Java
 
Best practices for highly available and large scale SolrCloud
Best practices for highly available and large scale SolrCloudBest practices for highly available and large scale SolrCloud
Best practices for highly available and large scale SolrCloud
 
In-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great TasteIn-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great Taste
 
Key-Value-Stores -- The Key to Scaling?
Key-Value-Stores -- The Key to Scaling?Key-Value-Stores -- The Key to Scaling?
Key-Value-Stores -- The Key to Scaling?
 
High Scalability Toronto: Meetup #2
High Scalability Toronto: Meetup #2High Scalability Toronto: Meetup #2
High Scalability Toronto: Meetup #2
 
AWS Cloud experience concepts tips and tricks
AWS Cloud experience concepts tips and tricksAWS Cloud experience concepts tips and tricks
AWS Cloud experience concepts tips and tricks
 
Troubleshooting Hadoop: Distributed Debugging
Troubleshooting Hadoop: Distributed DebuggingTroubleshooting Hadoop: Distributed Debugging
Troubleshooting Hadoop: Distributed Debugging
 
End of RAID as we know it with Ceph Replication
End of RAID as we know it with Ceph ReplicationEnd of RAID as we know it with Ceph Replication
End of RAID as we know it with Ceph Replication
 
Diagnosing Problems in Production - Cassandra
Diagnosing Problems in Production - CassandraDiagnosing Problems in Production - Cassandra
Diagnosing Problems in Production - Cassandra
 
Alluxio - Scalable Filesystem Metadata Services
Alluxio - Scalable Filesystem Metadata ServicesAlluxio - Scalable Filesystem Metadata Services
Alluxio - Scalable Filesystem Metadata Services
 
MySQL in the Hosted Cloud
MySQL in the Hosted CloudMySQL in the Hosted Cloud
MySQL in the Hosted Cloud
 
V mware2012 20121221_final
V mware2012 20121221_finalV mware2012 20121221_final
V mware2012 20121221_final
 
Life After Sharding: Monitoring and Management of a Complex Data Cloud
Life After Sharding: Monitoring and Management of a Complex Data CloudLife After Sharding: Monitoring and Management of a Complex Data Cloud
Life After Sharding: Monitoring and Management of a Complex Data Cloud
 
Redis
RedisRedis
Redis
 
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
 

Recently uploaded

Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfjimielynbastida
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentationphoebematthew05
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 

Recently uploaded (20)

Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdf
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentation
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 

Ceph Day Santa Clara: Ceph at DreamHost

  • 1. Ceph at DreamHost A Storage Journey
  • 2. About Me • One of the original four of DreamHost • Still active daily at DreamHost • Have spent a lot of time working on the Ops side.
  • 3. • Hosting company founded in 1997 • Sage’s other company • shared hosting, virtual servers, dedicated servers, cloud storage, cloud computing • 375k customers, 1.3MM websites
  • 4. Storage Journey A long strange trip
  • 5. His name was Destro
  • 6. ... and then there were more.
  • 11. ... and still more.
  • 12. Lots of NetApps • Peak of around 125 individual NetApps • Smallish capacity on each (8TB) • Internal software continuously moving data between NetApps • Lots of time spent managing nearly full filers
  • 13. Ideal
  • 15. Hosting Landscape • Included storage had grown from 50MB to gigabytes, then terabytes. • Prices stayed the same. • Eventually went to unlimited Storage • Usage per customer skyrocketed.
  • 17. Failed Experiments • ATAoE and XFS-based systems • Performance & Stability issues • 2006 era gear
  • 18. Failed Experiments • High capacity • Nice features • Expensive • 85% full and it failed
  • 19. Some Success • First on Sun hardware then Supermicro • Great stability • Not enough IO for front-line network storage
  • 21. Local RAID • SATA drives had grown in capacity and were very cheap • 4-6TB per hosting server • Less dependence on congested network • Smaller failure domains The Good
  • 22. Local RAID • No more quota, too slow to scan filesystem • No more fast failovers • Multiple hour filesystem check with ext3 • More failure domains The Bad
  • 23. Local RAID • Complete RAID loss more common than anticipated • Multiple days to fully restore from backup The Ugly
  • 24. Storage Today Light at the end of the tunnel
  • 25. Hybrid Mix • We learned something from every step of the way • No one size fits all when it comes to storage • Use whatever is best for the job • Be ready to change Best Tool For The Job
  • 26. A Bit of Everything • Clustered NetApps and NFS for email • Local RAID in hosting servers • ZFS and OpenSolaris backup servers • Ceph for DreamObjects and DreamCompute Best Tool For The Job
  • 27. • Object Storage, S3/Swift compatible • 2+ Petabytes raw storage • 3x replication, 900+ OSDs • RGW behind HAProxy • Row, rack, node and disk fault tolerant
  • 28. • OpenStack-based Public Cloud • 3+ Petabytes raw storage • All storage is on Ceph RBD • Boot and Attachable Volumes • Nicira SDN + Ceph, Live Migration
  • 29. HA Load Balancer MySQL / PostgreSQL Horizon Cockpit Pod Glance Keystone Nova Quantum Cinder Nicira NVP Glance Store (Ceph) OSMirrors (apt) Ceph Monitors Opscode Chef Logstash + Graphite Networking Gear 8x - Hypervisor Node 192 GB RAM 64AMD cores 14x - Storage Node 12x - 3TB disks Networking Gear Compute Pod 8x - Hypervisor Node 192 GB RAM 64AMD cores 14x - Storage Nodes 12x - 3TB disks Networking Gear Compute Pod 8x - Hypervisor Node 192 GB RAM 64AMD cores 14x - Storage Nodes 12x - 3TB disks Networking Gear Compute Pod Pods • 512 cores • 1.5TB of RAM • 504TB raw storage • 168TB redundant storage N etworking • ODM switches w/ Linux • 10Gbps everywhere • IPv6 from the ground up • Spine and leaf topology • 120 Gbps between pods (!) The Internets Thar be dragonshere! Nicira NVP Nicira NVP NiciraNVP
  • 30. CephFS & The Future • The return of Failovers • No more backup servers • No more major disk-related outages • Fault tolerant low cost hosting Storage Panacea?