Ceph performance
CephDays Frankfurt 2014
Whoami
💥 Sébastien Han
💥 French Cloud Engineer working for eNovance
💥 Daily job focused on Ceph and OpenStack
💥 Blogger
Personal blog: http://www.sebastien-han.fr/blog/
Company blog: http://techs.enovance.com/
Last Cephdays presentation
How does Ceph perform?
42*
*The Hitchhiker's Guide to the Galaxy
The Good
Ceph IO pattern
CRUSH: deterministic object placement
As soon as a client writes into Ceph, the placement is computed on the client side:
the client itself determines which OSD the object belongs to
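To make "computed, not looked up" concrete, here is a minimal sketch of deterministic placement. It is not the CRUSH algorithm (CRUSH also handles placement groups, weights and the failure-domain hierarchy); it swaps in plain rendezvous hashing as a stand-in. The function name, the OSD names and the object name are all hypothetical.

```python
import hashlib

def placement(object_name, osds, replicas=3):
    """Pick `replicas` distinct OSDs for an object using only a hash.

    Simplified stand-in for CRUSH: every client that knows the OSD list
    computes the same answer, so no central lookup table is needed.
    """
    def score(osd):
        digest = hashlib.sha256(f"{object_name}:{osd}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    # Highest score first; the first entry plays the role of the primary OSD.
    return sorted(osds, key=score, reverse=True)[:replicas]

osds = [f"osd.{i}" for i in range(12)]        # hypothetical cluster
targets = placement("my-object-0001", osds)   # hypothetical object name
print(targets)
```

Calling `placement` twice with the same arguments always returns the same OSDs, which is the property the slide relies on.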
Aggregation: cluster level
As soon as you write into Ceph, all the objects get spread evenly across the entire
cluster, that is, across machines and disks.
Aggregation: OSD level
As soon as an IO goes into an OSD, no matter what the original pattern was,
it becomes sequential.
The Bad
Ceph IO pattern
Journaling
As soon as an IO goes into an OSD, it gets written twice.
Journal and OSD data on the same disk
Journal penalty on the disk
Since we write twice, if the journal is stored on the same disk as
the OSD data this will result in the following:
Device:            wMB/s
sdb1 - journal     50.11
sdb2 - osd_data    40.25
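The arithmetic behind the penalty can be sketched as follows. This is a rough model, assuming every client write costs exactly two disk writes when the journal shares the disk, and ignoring seeks and filesystem metadata. The function name is hypothetical; the two throughput figures are the ones shown above.

```python
def client_write_bandwidth(disk_write_mbps, journal_on_same_disk=True):
    """Estimate the client-visible write bandwidth of one OSD disk.

    Assumption: with a colocated journal each client byte is written twice
    (journal + data), so the disk's write bandwidth is shared by both.
    """
    writes_per_client_write = 2 if journal_on_same_disk else 1
    return disk_write_mbps / writes_per_client_write

# Total write traffic observed on the disk: journal + OSD data partitions.
observed_total = 50.11 + 40.25
print(f"disk total:  {observed_total:.2f} MB/s")
print(f"client view: {client_write_bandwidth(observed_total):.2f} MB/s")
```

Under this model the client sees roughly half of what the disk is actually writing, which is why a separate journal device is usually worth considering.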
Filesystem fragmentation
• Objects are stored as files on the OSD filesystem
• Several IO patterns with different block sizes increase filesystem
fragmentation
• Possible root cause: image sparseness
• A one-year-old cluster ends up with the following (see the allocsize mount option for
XFS):
$ sudo xfs_db -c frag -r /dev/sdd
actual 196334, ideal 122582, fragmentation factor 37.56%
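The fragmentation factor reported above can be reproduced from the two extent counts. A small sketch, assuming the factor is (actual - ideal) / actual expressed as a percentage, which matches the numbers in the output; the function name is hypothetical.

```python
def xfs_fragmentation_factor(actual_extents, ideal_extents):
    """Share of extents that exist only because files are fragmented, in percent."""
    return (actual_extents - ideal_extents) * 100.0 / actual_extents

factor = xfs_fragmentation_factor(196334, 122582)
print(f"fragmentation factor {factor:.2f}%")
```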
No parallelized reads
• Ceph will always serve the read request from the primary OSD
• Room for an up-to-N-times speedup, where N is the replica count
Blueprint from Sage for the Giant release
Scrubbing impact
• Consistent object check at the PG level
• Compares replica versions against each other (fsck for objects)
• Light scrubbing (daily) checks the object size and attributes.
• Deep scrubbing (weekly) reads the data and uses checksums to ensure
data integrity.
• Corruption exists – ECC memory, disk bit errors (1 in 10^15 bits for enterprise disks,
~113 TiB read)
• No pain No gain
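The "~113TB" figure can be checked with a few lines of arithmetic. This sketch assumes the 10^15 on the slide is the usual enterprise-disk rating of one unrecoverable bit error per 10^15 bits read; the variable names are illustrative.

```python
BITS_PER_ERROR = 10**15              # assumed rating: 1 error per 10^15 bits read

bytes_per_error = BITS_PER_ERROR / 8
tb_per_error = bytes_per_error / 10**12    # decimal terabytes
tib_per_error = bytes_per_error / 2**40    # binary tebibytes

print(f"{tb_per_error:.0f} TB or {tib_per_error:.2f} TiB read per expected bit error")
```

So the slide's number is in binary units: about 125 TB, or roughly 113 TiB, read per expected error, which a busy cluster can reach fairly quickly.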
The Ugly
Ceph IO pattern
IOs to the OSD disk
One IO into Ceph leads to 2 writes, well… the second write is the worst!
The problem
• Several objects map to the same physical disks
• Sequential streams get mixed all together
• Result: The disk seeks like hell
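A toy model shows why mixing streams hurts. The sketch below treats disk offsets as plain integers and counts head travel as the sum of distances between consecutive writes. The three streams and their offsets are made up for illustration, and real disks and schedulers are far more complex.

```python
def total_seek_distance(offsets):
    """Sum of head movements between consecutive writes (toy model)."""
    return sum(abs(b - a) for a, b in zip(offsets, offsets[1:]))

def interleave(streams):
    """Round-robin mix of equally long streams, as the disk would see them."""
    return [offset for group in zip(*streams) for offset in group]

# Three sequential streams living in different regions of the same disk.
stream_a = list(range(0, 8))
stream_b = list(range(1000, 1008))
stream_c = list(range(2000, 2008))

one_after_another = stream_a + stream_b + stream_c
mixed = interleave([stream_a, stream_b, stream_c])

print("streams served one by one:", total_seek_distance(one_after_another))
print("streams mixed together:   ", total_seek_distance(mixed))
```

Each stream is perfectly sequential on its own, yet once interleaved the head travels more than ten times as far in this example.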
Even worse with erasure coding?
This is just an assumption!
• Since erasure coding splits data into chunks of chunks, this phenomenon
could possibly be amplified
CLUSTER
How to build it?
How to start?
Things that you must consider:
•Use case
• IO profile: Bandwidth? IOPS? Mixed?
• How many IOPS or Bandwidth per client do I want to deliver?
• Do I use Ceph in standalone or is it combined with a software solution?
•Amount of data (usable not RAW)
• Replica count
• Do I have a data growth plan?
•Leftover
• How much data am I willing to lose if a node fails? (%)
• Am I ready to be annoyed by the scrubbing process?
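The capacity questions above can be turned into a first-pass estimate. This is a back-of-the-envelope sketch, not a sizing tool: the function name, the growth factor and the maximum fill ratio are assumptions introduced here for illustration.

```python
def raw_capacity_needed(usable_tb, replica_count, growth_factor=1.0, max_fill_ratio=0.8):
    """Raw capacity (TB) to buy for a given amount of usable data.

    usable_tb      -- data you want to store today (usable, not raw)
    replica_count  -- number of copies Ceph keeps of each object
    growth_factor  -- assumed multiplier for planned data growth
    max_fill_ratio -- assumed ceiling on how full you let the cluster get
    """
    return usable_tb * growth_factor * replica_count / max_fill_ratio

# Example: 100 TB usable, 3 replicas, 50% planned growth, cluster kept under 80% full.
needed = raw_capacity_needed(100, replica_count=3, growth_factor=1.5)
print(f"{needed:.1f} TB raw")
```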
Things that you must not do
• Don't put a RAID underneath your OSD
• Ceph already manages the replication
• A degraded RAID hurts performance
• Reduces usable space on the cluster
• Don't build high density nodes with a tiny cluster
• Failure consideration and data to re-balance
• Potential full cluster
• Don't run Ceph on your hypervisors (unless you're broke)
• Well maybe…
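The "high density nodes with a tiny cluster" warning can be illustrated with a simple estimate of how full the surviving nodes become after one node fails. The sketch assumes equally sized nodes, evenly spread data, and a cluster that re-replicates everything onto the survivors; the function name is hypothetical.

```python
def fill_after_node_failure(node_count, fill_ratio):
    """Fill ratio of the remaining nodes once one node's data is re-balanced."""
    if node_count < 2:
        raise ValueError("need at least two nodes to survive a node failure")
    return fill_ratio * node_count / (node_count - 1)

for nodes in (3, 20):
    after = fill_after_node_failure(nodes, fill_ratio=0.70)
    status = "FULL" if after >= 1.0 else "ok"
    print(f"{nodes:2d} nodes at 70% -> {after:.0%} after losing one node ({status})")
```

With three dense nodes at 70%, losing one leaves nowhere to put the data, while a twenty-node cluster absorbs the same failure with only a few points of extra fill.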
Firefly: Interesting things going on
Object store multi-backend
• ObjectStore is born
• Aims to support several backends:
• levelDB (default)
• RocksDB
• Fusionio NVMKV
• Seagate Kinetic
• Yours!
Why is it so good?
• No more journal! Yay!
• Object backends have built-in atomic functions
Firefly leveldb
• Relatively new
• Needs to be tested with your workload first
• Tends to be more efficient with small objects
Many thanks!
Questions?
Contact: sebastien@enovance.com
Twitter: @sebastien_han
IRC: leseb

Communications Mining Series - Zero to Hero - Session 3
 
Mastering OnlyFans Clone App Development: Key Strategies for Success
Mastering OnlyFans Clone App Development: Key Strategies for SuccessMastering OnlyFans Clone App Development: Key Strategies for Success
Mastering OnlyFans Clone App Development: Key Strategies for Success
 
Step-By-Step Process to Develop a Mobile App From Scratch
Step-By-Step Process to Develop a Mobile App From ScratchStep-By-Step Process to Develop a Mobile App From Scratch
Step-By-Step Process to Develop a Mobile App From Scratch
 

Ceph Performance and Optimization - Ceph Day Frankfurt

  • 2. Whoami 💥 Sébastien Han 💥 French Cloud Engineer working for eNovance 💥 Daily job focused on Ceph and OpenStack 💥 Blogger Personal blog: http://www.sebastien-han.fr/blog/ Company blog: http://techs.enovance.com/ Last Cephdays presentation
  • 3. How does Ceph perform? 42* *The Hitchhiker's Guide to the Galaxy
  • 5. CRUSH: deterministic object placement As soon as a client writes into Ceph, the placement is computed and the client itself determines which OSD the object belongs to
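The placement idea on this slide can be sketched in a few lines. This is not the CRUSH algorithm itself. It is a simplified stand-in (highest-random-weight hashing) that only illustrates the property described here: any client can compute the target OSDs from the object name alone, without a central lookup. The `place` function, the OSD names and the replica count are hypothetical.

```python
import hashlib

def place(object_name, osds, replicas=3):
    """Return the OSDs for an object, computed purely from its name.

    Simplified stand-in for CRUSH (rendezvous hashing), not the real algorithm.
    """
    def score(osd):
        digest = hashlib.sha256(f"{object_name}:{osd}".encode()).hexdigest()
        return int(digest, 16)
    # Highest scores win; the first entry plays the role of the primary OSD.
    return sorted(osds, key=score, reverse=True)[:replicas]

osds = [f"osd.{i}" for i in range(6)]   # hypothetical six-OSD cluster
first = place("rbd_data.1234", osds)    # hypothetical object name
second = place("rbd_data.1234", osds)   # same inputs, so same answer
print(first)
```

Two clients running the same function over the same OSD list reach the same answer, which is what makes the placement deterministic.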
  • 6. Aggregation: cluster level As soon as you write into Ceph, all the objects get evenly spread across the entire cluster, meaning across machines and disks.
  • 7. Aggregation: OSD level As soon as an IO goes into an OSD, whatever the original pattern was, it becomes sequential.
  • 8. The Bad Ceph IO pattern
  • 9. Journaling As soon as an IO goes into an OSD, it gets written twice.
  • 10. Journal and OSD data on the same disk: journal penalty on the disk. Since we write twice, storing the journal on the same disk as the OSD data results in the following write throughput (wMB/s): sdb1 (journal) 50.11, sdb2 (osd_data) 40.25
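A rough reading of the two figures on this slide, as a minimal sketch. The only inputs are the two wMB/s values from the slide. The variable names are illustrative, and treating the journal partition's rate as pure overhead is a first-order assumption that ignores seeks and caching.

```python
journal_mbps = 50.11   # sdb1, journal partition (figure from the slide)
data_mbps = 40.25      # sdb2, OSD data partition (figure from the slide)

# Both partitions live on the same spindle, so the disk absorbs the sum.
total_disk_mbps = journal_mbps + data_mbps

# Share of the disk's write bandwidth consumed by the journal copy.
journal_share = journal_mbps / total_disk_mbps

print(f"disk writes {total_disk_mbps:.2f} MB/s, "
      f"{journal_share:.0%} of it is journal traffic")
```

Roughly half of what the disk writes is the journal copy, which is the penalty this slide points at.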
  • 11. Filesystem fragmentation • Objects are stored as files on the OSD filesystem • Several IO patterns with different block sizes increase filesystem fragmentation • Possible root cause: image sparseness • A one-year-old cluster ends up with (see the allocsize option for XFS): $ sudo xfs_db -c frag -r /dev/sdd actual 196334, ideal 122582, fragmentation factor 37.56%
  • 12. No parallelized reads • Ceph always serves a read request from the primary OSD • Room for an N-times speedup, where N is the replica count • Blueprint from Sage for the Giant release
  • 13. Scrubbing impact • Consistency check of objects at the PG level • Compares replica versions against each other (fsck for objects) • Light scrubbing (daily) checks the object size and attributes. • Deep scrubbing (weekly) reads the data and uses checksums to ensure data integrity. • Corruption exists: ECC memory, disk read errors (1 in 10^15 bits for enterprise disks, ~113TB) • No pain No gain
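The ~113TB figure on this slide can be checked with a short calculation, assuming the 10^15 refers to the usual enterprise disk rating of one unrecoverable read error per 10^15 bits read. The variable names are illustrative.

```python
bits_per_error = 10**15                  # rating quoted on the slide
bytes_per_error = bits_per_error / 8     # bits to bytes
tib_per_error = bytes_per_error / 2**40  # bytes to binary terabytes (TiB)

print(f"one expected read error every {tib_per_error:.1f} TiB read")
```

On a cluster that reads this much data regularly, silent errors are a matter of time, which is the argument for paying the cost of deep scrubbing.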
  • 14. The Ugly Ceph IO pattern
  • 15. IOs to the OSD disk One IO into Ceph leads to 2 writes, well… the second write is the worst!
  • 16. The problem • Several objects map to the same physical disks • Sequential streams get mixed all together • Result: The disk seeks like hell
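The effect described on this slide can be illustrated with a toy model: several sequential streams are cheap when served one after another, but expensive when their requests are interleaved on the same disk. The block offsets, the stream count and the seek-distance metric are all hypothetical. This is not a model of a real disk scheduler.

```python
def total_seek_distance(offsets):
    """Sum of head movements between consecutive requests, in blocks.

    After reading block a the head sits at a + 1, so a purely
    sequential run costs nothing.
    """
    return sum(abs(b - (a + 1)) for a, b in zip(offsets, offsets[1:]))

# Four sequential streams of 100 blocks, far apart on the platter.
streams = [list(range(start, start + 100))
           for start in (0, 10_000, 20_000, 30_000)]

# Streams served one after another: only three long jumps.
one_by_one = [block for stream in streams for block in stream]

# Streams mixed together round-robin: a long jump on every request.
interleaved = [stream[k] for k in range(100) for stream in streams]

print(total_seek_distance(one_by_one), total_seek_distance(interleaved))
```

Each stream is perfectly sequential on its own. It is only the mixing that makes the head travel.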
  • 17. Even worse with erasure coding? This is just an assumption! • Since erasure coding does chunks of chunks, we can possibly have this phenomenon amplified
  • 19. How to start? Things that you must consider: • Use case • IO profile: Bandwidth? IOPS? Mixed? • How many IOPS or how much bandwidth per client do I want to deliver? • Do I use Ceph standalone or is it combined with a software solution? • Amount of data (usable, not RAW) • Replica count • Do I have a data growth plan? • Leftover • How much data am I willing to lose if a node fails? (%) • Am I ready to be annoyed by the scrubbing process?
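The "usable, not RAW" point can be turned into a back-of-the-envelope sizing helper. This is a minimal sketch for replicated pools only. The `raw_capacity_tb` function and the `max_fill_ratio` headroom parameter are assumptions introduced for illustration, not figures from the talk.

```python
def raw_capacity_tb(usable_tb, replica_count, max_fill_ratio=0.8):
    """Raw disk capacity needed to store `usable_tb` of data.

    max_fill_ratio is a hypothetical safety margin: the fraction of
    raw space you are willing to fill before adding hardware.
    """
    if not 0 < max_fill_ratio <= 1:
        raise ValueError("max_fill_ratio must be in (0, 1]")
    return usable_tb * replica_count / max_fill_ratio

# Example: 100 TB of usable data, 3 replicas, fill the cluster to 80% at most.
needed = raw_capacity_tb(100, replica_count=3)
print(f"{needed:.0f} TB raw")
```

Running the same function against the projected data size gives a first answer to the data growth question.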
  • 20. Things that you must not do • Don't put a RAID underneath your OSD • Ceph already manages the replication • A degraded RAID hurts performance • Reduces usable space on the cluster • Don't build high density nodes with a tiny cluster • Failure consideration and data to re-balance • Potential full cluster • Don't run Ceph on your hypervisors (unless you're broke) • Well maybe…
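The "high density nodes with a tiny cluster" warning can be made concrete with a small calculation. The sketch below assumes identical nodes and evenly spread data. The function name and the 70% starting fill level are hypothetical.

```python
def fill_after_node_failure(node_count, fill_ratio):
    """Fill ratio of the surviving nodes once one node's data is re-replicated.

    Assumes identical nodes and evenly spread data.
    """
    if node_count < 2:
        raise ValueError("need at least two nodes to recover from a failure")
    # The same amount of data now has to fit on one node less.
    return fill_ratio * node_count / (node_count - 1)

for nodes in (3, 10):
    after = fill_after_node_failure(nodes, fill_ratio=0.70)
    print(f"{nodes} nodes at 70% -> {after:.0%} after losing one node")
```

With only three nodes the survivors would have to hold more data than they have room for, which is the potential full cluster mentioned above. With ten nodes the same failure is absorbed with room to spare.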
  • 22. Object store multi-backend • ObjectStore is born • Aims to support several backends: • LevelDB (default) • RocksDB • Fusion-io NVMKV • Seagate Kinetic • Yours!
  • 23. Why is it so good? • No more journal! Yay! • Object backends have built-in atomic functions
  • 24. Firefly LevelDB • Relatively new • Needs to be tested with your workload first • Tends to be more efficient with small objects