Unlocking The Performance Secrets of Ceph Object Storage
and a Teaser for Shared Data Lake Solution
Karan Singh
Sr. Storage Architect
Storage Solution Architectures Team
WHO DO WE HELP?
COMMUNITY CUSTOMERS
SOLUTIONS
ARCHITECTS
HOW DO WE HELP?
ARCHITECTURE PERFORMANCE
OPTIMIZATIONS
OBJECT STORAGE
WHAT IS OBJECT STORAGE GOOD FOR?
DIGITAL MEDIA | DATA WAREHOUSE | ARCHIVE | BACKUP
MULTI-PETABYTE!
OBJECT STORAGE PERFORMANCE & SIZING?
● Executed ~1500 unique tests
● Evaluated Small, Medium & Large object sizes
● Stored millions of objects (~130 Million to be precise)
  ■ 400 hours of testing for one of the scenarios
● Observed performance improvements as high as ~500%
● Testing was conducted in a QCT-sponsored lab
LAB PARTNER
PERFORMANCE & SIZING GUIDE
Benchmarking Environment
&
Methodology
Hardware Configurations Tested
STANDARD DENSITY SERVERS
● 6x OSD Nodes
○ 12x HDD (7.2K rpm, 6TB SATA)
○ 1x Intel P3700 800G NVMe
○ 128G Memory
○ 1x Intel Xeon E5-2660 v3
○ 1x 40GbE
● 4x RGW Nodes
○ 1x 40GbE
○ 96G Memory
○ 1x Intel Xeon E5-2670 v3
● 8x Client Nodes
○ 1x 10GbE
○ 96G Memory
○ 1x Intel Xeon E5-2670
● 1x Ceph MON (*)
○ 1x 10GbE
○ 96G Memory
○ 1x Intel Xeon E5-2670
HIGH DENSITY SERVERS
● 6x OSD Nodes
○ 35x HDD (7.2K rpm, 6TB SATA)
○ 2x Intel P3700 800G NVMe
○ 128G Memory
○ 2x Intel Xeon E5-2660 v3
○ 1x 40GbE
● 4x RGW Nodes
○ 1x 40GbE
○ 96G Memory
○ 1x Intel Xeon E5-2670 v3
● 8x Client Nodes
○ 1x 10GbE
○ 96G Memory
○ 1x Intel Xeon E5-2670
● 1x Ceph MON(*)
○ 1x 10GbE
○ 96G Memory
○ 1x Intel Xeon E5-2670
LAB PARTNER
(*) For production use, deploy at least 3 MONs
Lab Setup
Front View Rear View
LAB PARTNER
Benchmarking Methodology & Tools
CEPH: Red Hat Ceph Storage 2.0 (Jewel)
OS: Red Hat Enterprise Linux 7.2
TOOLS:
● Ceph Benchmarking Tool (CBT)
● FIO 2.11-12
● iPerf3
● COSBench 0.4.2.c3
● Intel Cache Acceleration Software (CAS) 03.01.01
Payload Selection
● 64K - Small object (thumbnail images, small files, etc.)
● 1M - Medium object (images, text files, etc.)
● 32M - Large object (analytics engines, HD images, log files, backups, etc.)
● 64M - Large object (analytics engines, videos, etc.)
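The tooling above (CBT, COSBench, FIO) drives these payload sizes at scale. As a minimal illustration of the same idea, the sketch below times a single PUT and GET per payload size against an RGW S3 endpoint with boto3. The endpoint URL, credentials, and bucket name are placeholders, and this is not the harness that produced the published results.

```python
# A minimal sketch (not the CBT/COSBench harness used for the published results)
# that times one PUT and one GET per payload size against an RGW S3 endpoint.
# The endpoint URL, credentials, and bucket name are placeholders.
import os
import time

import boto3

SIZES = {"64K": 64 * 1024, "1M": 1 << 20, "32M": 32 << 20, "64M": 64 << 20}

s3 = boto3.client(
    "s3",
    endpoint_url="http://rgw.example.com:8080",      # hypothetical RGW endpoint
    aws_access_key_id="ACCESS_KEY_PLACEHOLDER",
    aws_secret_access_key="SECRET_KEY_PLACEHOLDER",
)
bucket = "perf-smoke-test"
s3.create_bucket(Bucket=bucket)                       # assumes the bucket does not exist yet

for label, size in SIZES.items():
    payload = os.urandom(size)                        # random, incompressible payload

    start = time.perf_counter()
    s3.put_object(Bucket=bucket, Key=f"obj-{label}", Body=payload)
    put_s = time.perf_counter() - start

    start = time.perf_counter()
    s3.get_object(Bucket=bucket, Key=f"obj-{label}")["Body"].read()
    get_s = time.perf_counter() - start

    print(f"{label}: PUT {size / put_s / 2**20:.1f} MB/s, GET {size / get_s / 2**20:.1f} MB/s")
```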
Optimizing for
Small Object Operations (OPS)
Small Object (Read Ops)
Higher Read ops could have been achieved by adding more RGW hosts, until reaching disk saturation.
Bucket Index on flash media was optimal for small object workload
● Client read ops scaled linearly while increasing RGW hosts (OSDs on standard density servers)
● Performance limited by number of RGW hosts available in our test environment
● Best observed perf. was 8900 Read OPS with bucket index on Flash Media on standard density servers
Read Operations are simple!
… Let's Talk About Write …
An RGW Object Consists of:
● Object Index
● Object Data: HEAD | TAIL | TAIL | TAIL | TAIL
RGW Write Operation
Client sends write req. to the RADOS GATEWAY (RGW)
● Object index → rgw.buckets.index pool
● Object data (HEAD | TAIL | TAIL | TAIL | TAIL) → rgw.buckets.data pool
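One rough way to observe this HEAD + TAIL layout on a test cluster is to diff the RADOS object listing of the data pool around a single upload. A minimal sketch follows; the pool name is the Jewel-era default and is an assumption, so check your zone's actual pool names (e.g. with `radosgw-admin zone get`) first.

```python
# Rough way to observe the HEAD + TAIL layout on a test cluster: diff the RADOS
# object listing of the RGW data pool around a single S3 upload.
# The pool name below is the Jewel-era default and is an assumption.
import subprocess

DATA_POOL = "default.rgw.buckets.data"   # assumed pool name

def list_rados_objects(pool):
    out = subprocess.run(["rados", "-p", pool, "ls"],
                         check=True, capture_output=True, text=True)
    return set(out.stdout.splitlines())

before = list_rados_objects(DATA_POOL)
input("Upload one object larger than 4MB through RGW/S3 now, then press Enter...")
after = list_rados_objects(DATA_POOL)

new_objects = sorted(after - before)
print(f"{len(new_objects)} new RADOS objects (one head plus the tail/stripe objects):")
for name in new_objects:
    print(" ", name)
```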
Can we Improve This?
… Yes We Can!
Put the Bucket Index on Flash Media
Client sends write req. to the RADOS GATEWAY (RGW)
● Object index → rgw.buckets.index pool, placed on Flash
● Object data (HEAD | TAIL | TAIL | TAIL | TAIL) → rgw.buckets.data pool
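One way to do this placement is to point the index pool at a flash-only CRUSH rule, as in the sketch below. The commands assume a Luminous-or-newer cluster with device classes (on Jewel, the release benchmarked here, the same effect needs a hand-edited CRUSH map), and the pool and rule names are examples only.

```python
# Hedged sketch of placing the bucket index pool on flash by pointing it at a
# flash-only CRUSH rule. Assumes a Luminous-or-newer cluster with device classes;
# pool and rule names are examples, not fixed values.
import subprocess

INDEX_POOL = "default.rgw.buckets.index"   # assumed zone pool name
FLASH_RULE = "rgw-index-on-flash"          # a name we pick for the new rule

def ceph(*args):
    """Run a ceph CLI command, echoing it first."""
    print("+ ceph", " ".join(args))
    subprocess.run(["ceph", *args], check=True)

# 1. Create a replicated CRUSH rule restricted to the 'ssd' (or 'nvme') device class.
ceph("osd", "crush", "rule", "create-replicated", FLASH_RULE, "default", "host", "ssd")

# 2. Repoint the index pool at that rule; its PGs rebalance onto the flash OSDs.
ceph("osd", "pool", "set", INDEX_POOL, "crush_rule", FLASH_RULE)
```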
Small Object (Write Ops)
● Client write ops scaled sub-linearly while increasing RGW hosts
● Performance limited by disk saturation on Ceph OSD hosts
● Best observed Write OPS was 5000 on High Density Servers with bucket index on flash media
Higher write ops could have been achieved by adding more OSD hosts
Small Object (Write Ops) Saturating HDDs
Latency Improvements
SERVER CONFIGURATION | TOTAL OSDs IN CLUSTER | OBJECT SIZE | BUCKET INDEX CONFIGURATION | AVERAGE READ LATENCY (ms) | AVERAGE WRITE LATENCY (ms)
Standard Density     | 72                    | 64K         | HDD Index                  | 142                       | 318
Standard Density     | 72                    | 64K         | NVMe Index                 | 57                        | 229
High Density         | 210                   | 64K         | HDD Index                  | 51                        | 176
High Density         | 210                   | 64K         | NVMe Index                 | 54                        | 104
● Configuring the bucket index on flash helped reduce:
  ○ P99 read latency by ~149% (standard density servers)
  ○ P99 write latency by ~69% (high density servers)
Optimizing for
Large Object Throughput (MBps)
● Client read throughput scaled near-linearly while increasing RGW hosts
● Performance limited by number of RGW hosts available in our test environment
● Best observed read throughput was ~4GB/s with 32M object size on standard density servers
Large Object (Read MBps)
Higher read throughput could have been achieved by adding more RGW hosts
Read Operations are simple!
… Let's Talk About Write …
RGW Write Operation
Client sends write req. to the RADOS GATEWAY (RGW)
● RGW Object Stripes: 4MB | 4MB | 4MB | 4MB
● RGW Object Chunks (512K default): C | C | C | C | C | C | C | C
● EC (K + M) Chunks: K | K | K | K | M | M
● EC Chunks Written to OSD/Disk: OSD | OSD | OSD | OSD | OSD | OSD
Can we Improve This?
… Yes We Can!
RGW Write Operation (Before Tuning / Where to Tune)
The write path is the same as above; the stage to tune is the RGW object chunking, where each 4MB stripe is split into 512K chunks (default) before the EC (K + M) split and the writes to the OSDs.
RGW Write Operation (After Tuning)
Client sends write req. to the RADOS GATEWAY (RGW)
● RGW Object Stripes: 4MB | 4MB | 4MB | 4MB
● RGW Object Chunks (4M tuned): C (one chunk per stripe; reduced RGW chunks)
● EC (K + M) Chunks: K | K | K | K | M | M
● EC Chunks Written to OSD/Disk: OSD | OSD | OSD | OSD | OSD | OSD
DEFAULT vs. TUNED
Tuning rgw_max_chunk_size to 4M helped improve write throughput
~40% higher throughput after tuning
Improved Large Object Write (MBps)
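A back-of-the-envelope sketch of why this knob matters: at the 512K default each 4MB stripe becomes eight RADOS chunk writes, while at 4M it becomes one, so a 32M object drops from 64 chunks to 8 before the EC split. The figures below are illustrative arithmetic only, not measured results; 4M expressed in bytes (4194304) is the value the deck recommends for rgw_max_chunk_size.

```python
# Illustrative arithmetic only (not measured data): how many RGW chunk writes a
# 32M object needs per the pipeline above, at the 512K default chunk size versus
# the tuned 4M value (rgw_max_chunk_size = 4194304 bytes).
MiB = 1 << 20
OBJECT_SIZE = 32 * MiB      # large-object payload from the tests
STRIPE_SIZE = 4 * MiB       # RGW object stripe size shown in the deck

def rgw_chunks(object_size, chunk_size):
    """Total RGW chunks written for one object at a given rgw_max_chunk_size."""
    stripes = -(-object_size // STRIPE_SIZE)                       # ceil division
    chunks_per_stripe = -(-min(STRIPE_SIZE, object_size) // chunk_size)
    return stripes * chunks_per_stripe

for label, chunk in [("default 512K", 512 * 1024), ("tuned 4M", 4 * MiB)]:
    print(f"{label}: {rgw_chunks(OBJECT_SIZE, chunk)} chunks for a 32M object")
# default 512K: 64 chunks for a 32M object
# tuned 4M: 8 chunks for a 32M object
```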
● Client write throughput scaled near-linearly while increasing RGW hosts
● Best observed (untuned) write throughput was 2.1 GB/s with 32M object size on high density servers
Large Object Write (MBps)
Where is the Bottleneck?
Large Object Write Bottleneck: DISK SATURATION
● Performance limited by disk saturation on Ceph OSD hosts
Large Object Write (MBps)
Higher write throughput could have been achieved by adding more OSD hosts
Disk Saturation
Designing for
Higher Object Count (100M+ Objects)
Designing for Higher Object Count
100M+ OBJECTS
● Tested Configurations
○ Default OSD Filestore settings
○ Tuned OSD Filestore settings
○ Default OSD Filestore settings + Intel CAS (Metadata Caching)
○ Tuned OSD Filestore settings + Intel CAS (Metadata Caching)
● Test Details
○ High Density Servers
○ 64K object size
○ 50 hours per test run (200 hours of testing in total)
○ Cluster filled up to ~130 Million Objects
Best Config: Default OSD Filestore settings + Intel CAS (Metadata caching)
Designing for Higher Object Count (Read)
Designing for Higher Object Count (Read)
COMPARING DIFFERENT CONFIGURATIONS
Best Config: Default OSD Filestore settings + Intel CAS (Metadata caching)
Designing for Higher Object Count (Write)
Designing for Higher Object Count (Write)
COMPARING DIFFERENT CONFIGURATIONS
● Steady write latency with Intel CAS, as the cluster grew from 0 to 100M+ objects
● ~100% lower than with the default Ceph configuration
Higher Object Count Test (Latency)
Secrets to
Object Storage Performance
(Key takeaways)
#1 Architectural Considerations for an Object Storage Cluster
● Object size
● Object count
● Server density
● Client : RGW : OSD ratio
● Bucket index placement scheme
● Caching
● Data protection scheme
● Price/Performance
#2 Recommendations for Small Object Workloads
● 12 Bay OSD Hosts
● 10GbE RGW Hosts
● Bucket Indices on Flash Media
#3 Recommendations for Large Object Workloads
● 12 Bay OSD Hosts, 10GbE - Performance
● 35 Bay OSD Hosts, 40GbE - Price / Performance
● 10GbE RGW Hosts
● Tune rgw_max_chunk_size to 4M
● Bucket Indices on Flash Media
Caution: Do not increase rgw_max_chunk_size beyond 4M; doing so causes OSD slow requests and OSD flapping issues.
#4 Recommendations for High Object Count Workloads
● 35 Bay OSD Hosts, 40GbE (Price/Performance)
● Bucket Indices on Flash Media
● Intel Cache Acceleration Software (CAS)
  ○ Metadata Caching only
#5 Recommendations for RGW Sizing
For Small Object (64K) 100% Write:
● 1 RGW for every 50 OSDs (HDD)
For Large Object (32M) 100% Write:
● 1 RGW for every 100 OSDs (HDD)
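A trivial helper that applies these rules of thumb to a given OSD count. The 1:50 and 1:100 ratios are the deck's; rounding up and the minimum of one RGW are assumptions made for the sketch.

```python
# Tiny helper applying the RGW:OSD sizing rules of thumb above to an OSD count.
# The 1:50 and 1:100 ratios come from the deck; rounding up and the minimum of
# one RGW are assumptions for this sketch.
import math

OSDS_PER_RGW = {"small_64K_write": 50, "large_32M_write": 100}

def rgw_count(osd_count, workload):
    return max(1, math.ceil(osd_count / OSDS_PER_RGW[workload]))

for osds in (72, 210, 700):
    print(f"{osds} OSDs -> "
          f"{rgw_count(osds, 'small_64K_write')} RGW (small objects), "
          f"{rgw_count(osds, 'large_32M_write')} RGW (large objects)")
```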
#6 RGW: 10GbE or 40GbE? Dedicated or Colocated?
● Takeaway: Dedicated RGW nodes, each with 1x 10GbE connectivity
Shared Data Lake Solution
Using
Ceph Object Storage
(Teaser)
Discontinuity in Big Data Infrastructure - Why?
Before: hadoop
Now: hadoop, spark, hive, tez, presto, impala, kafka, hbase, etc.
● Congestion causing delays in responses from analytics applications
● Multiple teams competing for the same big data resources
Discontinuity Presents Choice
Get a bigger cluster
● Lacks isolation - still have noisy neighbors
● Lacks elasticity - rigid cluster size
● Can't scale compute/storage costs separately
Get more clusters
● Cost of duplicating big datasets
● Lacks on-demand provisioning
● Can't scale compute/storage costs separately
On-demand Compute and Storage pools
● Isolation of high-priority workloads
● Shared big datasets
● On-demand provisioning
● Compute/storage costs scale separately
Data Analytics Emerging Patterns
Multiple analytic clusters, provisioned on-demand, sourcing from a common object store
[Diagram: Analytic clusters 1-3, each with its own compute and HDFS tmp, alongside Hadoop cluster 1 with compute + storage workers, all backed by a common persistent store]
Red Hat Elastic Infrastructure for Analytics
[Diagram: An Elastic Compute Resource Pool running Kafka compute instances (Ingest/ETL), Hive/MapReduce compute instances (Gold SLA), Spark SQL compute (Bronze SLA), Presto compute instances (Platinum SLA, interactive query), and Spark compute instances (Silver SLA, batch query & joins), all reading from the Shared Data Lake on Ceph over S3A]
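To make the S3A connection concrete, the sketch below shows how a Spark job in one of these compute pools might be pointed at the shared Ceph data lake. The endpoint, credentials, bucket/prefix, and TPC-DS table are placeholders; the fs.s3a.* keys are standard Hadoop S3A options.

```python
# Hedged sketch of how a Spark job in an elastic compute pool might read the
# shared Ceph data lake over S3A. Endpoint, credentials, and the bucket/prefix
# are placeholders; the fs.s3a.* keys are standard Hadoop S3A options.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("shared-data-lake-demo")
    .config("spark.hadoop.fs.s3a.endpoint", "http://rgw.example.com:8080")  # placeholder RGW endpoint
    .config("spark.hadoop.fs.s3a.access.key", "ACCESS_KEY_PLACEHOLDER")
    .config("spark.hadoop.fs.s3a.secret.key", "SECRET_KEY_PLACEHOLDER")
    .config("spark.hadoop.fs.s3a.path.style.access", "true")  # RGW is usually addressed path-style
    .getOrCreate()
)

# Read a shared dataset straight from the object store and run a simple aggregation.
df = spark.read.parquet("s3a://datalake/tpcds/store_sales/")   # hypothetical bucket/prefix
df.groupBy("ss_store_sk").count().show(10)
```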
Analytics vendors focus on analytics. Red Hat focuses on infrastructure.
● Analytics: analytics vendors provide the analytics software
● Infrastructure: Red Hat provides the infrastructure software
  ○ OpenStack or OpenShift provisioned compute pool
  ○ Shared Data Lake on Ceph Object Storage
Together: Red Hat Elastic Infrastructure for Analytics
Red Hat Solution Development Lab (at QCT)
● Ingest: TPC-DS & logsynth data gen
● ETL / Transformation: Hive or Spark, ORC or Parquet (1/10/100 TB)
● Interactive query: n concurrent Impala compute, n concurrent Spark SQL, 54 deep TPC-DS queries
● Batch query: Spark SQL compute, n concurrent Spark SQL, enrichment dataset joins over 100 days of logs
● Ceph RBD block volume pool (intermediate shuffle files, hadoop.tmp, yarn.log, etc.)
● Ceph S3A object bucket pool (source data, transformed data)
● HA Proxy tier, RGW tier (12x), 20x Ceph hosts (700 OSDs)
Shared Data Lake on Ceph Object Storage
Reference Architecture
Coming This Fall ...
THANK YOU
plus.google.com/+RedHat
linkedin.com/company/red-hat
youtube.com/user/RedHatVideos
facebook.com/redhatinc
twitter.com/RedHatNews
Get the Ceph Object Storage P&S Guide: http://bit.ly/object-ra