Accelerating Cassandra Workloads on Ceph with All-Flash PCIe SSDs
Reddy Chagam – Principal Engineer, Storage Architect
Stephen L Blinick – Senior Cloud Storage Performance Engineer
Acknowledgments: Warren Wang, Anton Thaker (WalMart)
Orlando Moreno, Vishal Verma (Intel)
Intel technologies’ features and benefits depend on system configuration and may require
enabled hardware, software or service activation. Performance varies depending on system
configuration. No computer system can be absolutely secure. Check with your system
manufacturer or retailer or learn more at http://intel.com.
Software and workloads used in performance tests may have been optimized for performance
only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are
measured using specific computer systems, components, software, operations and functions.
Any change to any of those factors may cause the results to vary. You should consult other
information and performance tests to assist you in fully evaluating your contemplated purchases,
including the performance of that product when combined with other products.
§ Configurations: Ceph v0.94.3 Hammer Release, CentOS 7.1, 3.10-229 Kernel, linked with JEMalloc 3.6, CBT used for testing and data acquisition. OSD System Config: Intel Xeon E5-2699 v3 2x @ 2.30 GHz, 72 cores w/ HT, 96GB, Cache 46080KB, 128GB DDR4. Each system with 4x P3700 800GB NVMe, partitioned into 4 OSDs each, 16 OSDs total per node. FIO Client Systems: Intel Xeon E5-2699 v3 2x @ 2.30 GHz, 72 cores w/ HT, 96GB, Cache 46080KB, 128GB DDR4. Single 10GbE network for client & replication data transfer. FIO 2.2.8 with LibRBD engine. Tests run by Intel DCG Storage Group in an Intel lab. Ceph configuration and CBT YAML file provided in backup slides.
§ For more information go to http://www.intel.com/performance.
Intel, Intel Inside and the Intel logo are trademarks of Intel Corporation in the United States and
other countries. *Other names and brands may be claimed as the property of others.
© 2015 Intel Corporation.
Agenda
• The transition to flash and the impact of NVMe
• NVMe technology with Ceph
• Cassandra & Ceph – a case for storage convergence
• The all-NVMe high-density Ceph Cluster
• Raw performance measurements and observations
• Examining performance of a Cassandra DB-like workload
Evolution of Non-Volatile Memory Storage Devices
[Figure: storage device evolution compared on 4K read latency, throughput, IOPS, and endurance — HDDs (~ms latency, ~sub-100 MB/s), SATA/SAS SSDs (100s of µs latency, ~100s of MB/s, 10s of K IOPS, <10 drive writes/day), and PCIe NVMe SSDs (10s of µs latency, GB/s, 100s of K IOPS, >10 drive writes/day), with 3D XPoint™ DIMMs and 3D XPoint NVM SSDs as the next step. PCI Express® (PCIe), NVM Express™ (NVMe).]
NVM plays a key role in delivering performance for latency-sensitive workloads
Ceph Workloads
[Figure: Ceph workloads plotted by storage performance (IOPS, throughput) versus storage capacity (PB), spanning block and object use cases — boot volumes, remote disks, VDI, test & dev, app storage, databases, big data, HPC, CDN, enterprise dropbox, mobile content depot, cloud DVR, and backup/archive — with an "NVM focus" region highlighting the performance-sensitive end of the spectrum.]
Ceph – NVM Usages
[Diagram: NVM usage points across the Ceph I/O path. A virtual-machine client (guest application → Qemu/Virtio → RBD driver → RADOS) and a baremetal client (application → kernel RBD → RADOS) speak the RADOS protocol over 10GbE to a RADOS node, where each OSD uses a journal, Filestore, and a file system backed by NVM.]
NVM usage models:
• Client-side caching with write-through (see the ceph.conf sketch below)
• Journaling
• Read cache
• OSD data
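As a concrete illustration of the client-side caching usage above, here is a minimal ceph.conf sketch for the client (hypervisor) side, assuming librbd caching with write-through semantics; the cache sizes are illustrative values, not settings used in these tests.

[client]
# Enable the librbd in-memory cache on the client/hypervisor side
rbd cache = true
# Stay in write-through mode until the guest issues its first flush
rbd cache writethrough until flush = true
# Illustrative sizing only -- not the values used in this study
rbd cache size = 67108864
# A max-dirty of 0 keeps the cache effectively write-through
rbd cache max dirty = 0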
Cassandra – What and Why?
[Diagram: Cassandra ring — a client reads and writes partitions (p1 … p20) that are distributed around a ring of nodes.]
• Cassandra is a column-oriented NoSQL DB with a CQL interface
 Each row has a unique key, which is used for partitioning (see the CQL sketch below)
 No relations
 A row can have multiple columns – rows need not have the same number of columns
• Open source, distributed, decentralized, highly available, linearly scalable, multi-DC, …
• Used for analytics, real-time insights, fraud detection, IoT/sensor data, messaging, etc.
Use cases: http://www.planetcassandra.org/apachecassandra-use-cases/
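To make the partition-key point concrete, here is a minimal, hypothetical CQL sketch; the keyspace, table, and column names are illustrative and not taken from the tested workload. The partition key (sensor_id) is hashed to a ring position, which determines the replica nodes that own the row.

-- Hypothetical example: sensor_id is the partition key, reading_ts clusters rows within a partition
CREATE KEYSPACE IF NOT EXISTS demo
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

CREATE TABLE demo.sensor_readings (
  sensor_id  uuid,
  reading_ts timestamp,
  value      double,
  PRIMARY KEY (sensor_id, reading_ts)
);

-- Rows sharing a sensor_id land on the same replicas; different sensors spread across the ring
INSERT INTO demo.sensor_readings (sensor_id, reading_ts, value)
VALUES (uuid(), toTimestamp(now()), 21.5);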
• Ceph is a popular open source unified storage platform
• Many large-scale Ceph deployments in production
• End customers prefer converged infrastructure that supports multiple workloads (e.g. analytics) to achieve CapEx and OpEx savings
• Several customers are asking for the Cassandra workload on Ceph
Ceph and Cassandra Integration
[Diagram: three virtual machines, each running Cassandra as a guest application over Qemu/Virtio → RBD → RADOS on a hypervisor, connected through an IP fabric to a Ceph storage cluster of SSD-backed OSD nodes and monitors (MON).]
Deployment Considerations
• Bootable Ceph volumes (OS & Cassandra data)
• Cassandra RBD data volumes (see the provisioning sketch below)
• Data protection (Cassandra or Ceph)
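A minimal sketch of provisioning the volumes listed above with the rbd CLI, assuming a dedicated 2-replica pool; the pool name, image names, and sizes are illustrative, and the images would be attached to each guest through Qemu/Virtio (e.g. via libvirt) as in the diagram.

# Dedicated 2-replica pool for Cassandra volumes (PG count is an example)
ceph osd pool create cassandra 4096 4096
ceph osd pool set cassandra size 2

# One boot volume and one data volume per Cassandra VM (sizes in MB, values are examples)
rbd create cassandra/cass01-boot --size 40960
rbd create cassandra/cass01-data --size 409600

# Inside the guest the data volume appears as a virtio disk (e.g. /dev/vdb)
# and is formatted and mounted at the Cassandra data directory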
Hardware Environment Overview
• OSD System Config: Intel Xeon E5-2699 v3 2x @ 2.30 GHz, 72 cores w/ HT, 96GB, Cache 46080KB, 128GB DDR4
• Each system with 4x P3700 800GB NVMe, partitioned into 4 OSDs each, 16 OSDs total per node
• FIO Client Systems: Intel Xeon E5-2699 v3 2x @ 2.30 GHz, 72 cores w/ HT, 96GB, Cache 46080KB, 128GB DDR4
• Ceph v0.94.3 Hammer Release, CentOS 7.1, 3.10-229 Kernel, linked with JEMalloc 3.6
• CBT used for testing and data acquisition (an approximate fio job sketch follows this slide)
• Single 10GbE network for client & replication data transfer, Replication factor 2
[Diagram: six FIO RBD clients plus a CBT/Zabbix monitoring node, hosted on FatTwin chassis (4x dual-socket Xeon E5 v3 each), connect over the Ceph network (192.168.142.0/24, 10Gbps) to a Ceph storage cluster of five SuperMicro 1028U OSD nodes. Each OSD node has dual Intel Xeon E5 v3 18-core CPUs and four Intel P3700 NVMe PCIe flash drives partitioned into CephOSD1 … CephOSD16; the NVMe drives are front-mounted and easily serviceable.]
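The raw results that follow were collected through CBT's librbdfio benchmark; below is a hedged, standalone fio job that approximates one of those runs (4K random read at a single queue depth) using fio's rbd engine. The pool and image names are assumptions, and the image must already exist.

# Approximate standalone equivalent of one CBT librbdfio run (fio 2.2.8, rbd engine)
[global]
ioengine=rbd
clientname=admin
# pool and image names are assumptions; the image must be pre-created
pool=rbd
rbdname=cbt-librbdfio-0
rw=randread
bs=4k
iodepth=32
norandommap
time_based=1
ramp_time=600
runtime=300
log_avg_msec=250

[4k-randread]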
Multi-partitioning flash devices
• High-performance NVMe devices are capable of high parallelism at low latency
• DC P3700 800GB raw performance: 460K read IOPS & 90K write IOPS at QD=128
• By using multiple OSD partitions, Ceph performance scales linearly
• Reduces lock contention within a single OSD process
• Lower latency at all queue depths, with the biggest impact on random reads
• Introduces the concept of multiple OSDs on the same physical device (see the partitioning sketch below)
• Conceptually similar CRUSH map data placement rules as managing disks in an enclosure
• High resiliency of "Data Center" class NVMe devices
• At least 10 drive writes per day
• Power-loss protection, full data path protection, device-level telemetry
[Diagram: a single NVMe device (NVMe1) partitioned into four OSDs, CephOSD1 … CephOSD4.]
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Any difference in system hardware or
software design or configuration may affect actual performance. See configuration slides in backup for details on software configuration and test benchmark
parameters.
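A hedged sketch of splitting one NVMe device into four OSD data partitions with Hammer-era tooling; the device name, partition sizes, and use of ceph-disk are assumptions for illustration. The stock CRUSH rule (chooseleaf by host) already keeps both replicas off any single node, and therefore off any single device.

# Four equal GPT partitions on one P3700 (device name is an example)
parted -s /dev/nvme0n1 mklabel gpt
parted -s /dev/nvme0n1 mkpart osd1 xfs 0% 25%
parted -s /dev/nvme0n1 mkpart osd2 xfs 25% 50%
parted -s /dev/nvme0n1 mkpart osd3 xfs 50% 75%
parted -s /dev/nvme0n1 mkpart osd4 xfs 75% 100%

# One OSD per partition (ceph-disk was the deployment tool of the Hammer era)
for p in 1 2 3 4; do
    ceph-disk prepare --fs-type xfs /dev/nvme0n1p${p}   # journal co-located by default
done
# ceph-disk activate (or udev) then brings up one ceph-osd daemon per partition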
Partitioning multiple OSDs per NVMe
• Multiple OSDs per NVMe result in higher performance, lower latency, and better CPU utilization
[Chart: Latency vs IOPS, 4K random read, comparing 1, 2, and 4 OSDs per NVMe device — 5 nodes, 20/40/80 OSDs, Intel DC P3700, Xeon E5 2699v3 dual socket / 128GB RAM / 10GbE, Ceph 0.94.3 w/ JEMalloc. Average latency (ms) on the y-axis, IOPS (up to ~1.2M) on the x-axis.]
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Any difference in system hardware or
software design or configuration may affect actual performance. See configuration slides in backup for details on software configuration and test benchmark
parameters.
[Chart: single-node CPU utilization comparison, 4K random reads @ QD32, for 4/8/16 OSDs (1, 2, and 4 OSDs per NVMe) — Intel DC P3700, Xeon E5 2699v3 dual socket / 128GB RAM / 10GbE, Ceph 0.94.3 w/ JEMalloc. %CPU utilization shown for single, double, and quad OSD configurations.]
4K Random Read & Write Performance Summary
First Ceph cluster to break 1 Million 4K random IOPS
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Any difference in system hardware or
software design or configuration may affect actual performance. See configuration slides in backup for details on software configuration and test benchmark
parameters.
Workload Pattern                                    Max IOPS
4K 100% Random Reads (2TB Dataset)                  1.35 Million
4K 100% Random Reads (4.8TB Dataset)                1.15 Million
4K 100% Random Writes (4.8TB Dataset)               200K
4K 70%/30% Read/Write OLTP Mix (4.8TB Dataset)      452K
4K Random Read & Write Performance and Latency
First Ceph cluster to break 1 Million 4K random IOPS, ~1ms response time
• 1M 100% 4K random read IOPS @ ~1.1ms
• 1.35M 4K random read IOPS w/ 2TB hot data
• 400K 70/30% (OLTP) 4K random IOPS @ ~3ms
• 171K 100% 4K random write IOPS @ 6ms
[Chart: IO-depth scaling, latency vs IOPS for 100% 4K random read, 100% 4K random write, 70/30% 4K random OLTP mix, and 100% 4K random read with a 2TB dataset — 5 nodes, 60 OSDs, Xeon E5 2699v3 dual socket / 128GB RAM / 10GbE, Ceph 0.94.3 w/ JEMalloc. Average latency (ms) on the y-axis, IOPS (up to ~1.4M) on the x-axis.]
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Any difference in system hardware or software design or configuration may affect actual performance. See configuration slides in backup for details on software configuration and test benchmark parameters.
Sequential performance (512KB)
• With 10GbE per node, both writes and reads reach line rate, bottlenecked by the single network interface on each OSD node. For reads, 5,888 MB/s across five nodes is roughly 1.18 GB/s per node; for writes, 3,214 MB/s of client traffic plus roughly the same again in replica traffic (replication factor 2) shares the same link, about 1.29 GB/s per node — both near 10GbE line rate.
• Higher throughput would be possible through bonding or 40GbE connectivity.

512KB Sequential Performance Bandwidth — 5 nodes, 80 OSDs, DC P3700, Xeon E5 2699v3 dual socket / 128GB RAM / 10GbE, Ceph 0.94.3 w/ JEMalloc
100% Write        3,214 MB/s
100% Read         5,888 MB/s
70/30% R/W Mix    5,631 MB/s
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Any difference in system hardware or
software design or configuration may affect actual performance. See configuration slides in backup for details on software configuration and test benchmark
parameters.
Cassandra-like workload
242K IOPS at < 2ms latency
• Based on a typical customer Cassandra workload profile
• 50% reads and 50% writes, predominantly 8K reads and 12K writes, FIO queue depth = 8 (see the fio sketch below)

IO-size breakdown
Reads:  78% 8K, 19% 5K, 3% 7K
Writes: 92% 12K, 5% 33K, with the remaining ~3% across larger sizes (115K, 50K, 80K)
[Chart: Cassandra-like workload, 50/50 read/write mix — 5 nodes, 80 OSDs, Xeon E5 2699v3 dual socket / 128GB RAM / 10GbE, Ceph 0.94.3 w/ JEMalloc. IOPS (up to ~300K) with average latency (ms) on the secondary axis.]
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Any difference in system hardware or
software design or configuration may affect actual performance. See configuration slides in backup for details on software configuration and test benchmark
parameters.
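A hedged fio sketch of how a 50/50 mix with this kind of size distribution could be expressed against an RBD image using fio's bssplit option; the split below only approximates the breakdown above, and the pool and image names are assumptions rather than the parameters used in the study.

# Approximation of the Cassandra-like 50/50 mix (fio rbd engine)
[global]
ioengine=rbd
clientname=admin
# pool and image names are assumptions
pool=rbd
rbdname=cassandra-like-test
rw=randrw
rwmixread=50
iodepth=8
# read sizes/percentages, then write sizes/percentages (comma separates the two)
bssplit=8k/78:5k/19:7k/3,12k/92:33k/5:115k/1:50k/1:80k/1
time_based=1
runtime=300

[cassandra-like]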
Summary & Conclusions
• Flash technology, including NVMe, enables new performance capabilities in small footprints
• Ceph and Cassandra provide a compelling case for feature-rich converged storage that can support latency-sensitive analytics workloads
• Using the latest standard high-volume servers and Ceph, you can now build an open, high-density, scalable, high-performance cluster that can handle a low-latency mixed workload
• Ceph performance improvements over recent releases are significant, and today over 1 Million random IOPS is achievable in 5U with ~1ms latency
• Next steps:
• Address small-block write performance, limited by the Filestore backend
• Improve long-tail latency for transactional workloads
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Any difference in system hardware or
software design or configuration may affect actual performance. See configuration slides in backup for details on software configuration and test benchmark
parameters.
Thank you!
Configuration Detail – ceph.conf
Section Perf. Tuning Parameter Default Tuned
[global]
Authentication
auth_client_required cephx none
auth_cluster_required cephx none
auth_service_required cephx none
Debug logging
debug_lockdep 0/1 0/0
debug_context 0/1 0/0
debug_crush 1/1 0/0
debug_buffer 0/1 0/0
debug_timer 0/1 0/0
debug_filer 0/1 0/0
debug_objecter 0/1 0/0
debug_rados 0/5 0/0
debug_rbd 0/5 0/0
debug_ms 0/5 0/0
debug_monc 0/5 0/0
debug_tp 0/5 0/0
debug_auth 1/5 0/0
debug_finisher 1/5 0/0
debug_heartbeatmap 1/5 0/0
debug_perfcounter 1/5 0/0
debug_rgw 1/5 0/0
debug_asok 1/5 0/0
debug_throttle 1/1 0/0
Configuration Detail – ceph.conf (continued)
Section Perf. Tuning Parameter Default Tuned
[global]
CBT specific
mon_pg_warn_max_object_skew 10 10000
mon_pg_warn_min_per_osd 0 0
mon_pg_warn_max_per_osd 32768 32768
osd_pg_bits 8 8
osd_pgp_bits 8 8
RBD cache rbd_cache true true
Other
mon_compact_on_trim true false
log_to_syslog false false
log_file /var/log/ceph/$name.log /var/log/ceph/$name.log
perf true true
mutex_perf_counter false true
throttler_perf_counter true false
[mon] CBT specific
mon_data /var/lib/ceph/mon/ceph-0 /home/bmpa/tmp_cbt/ceph/mon.$id
mon_max_pool_pg_num 65536 166496
mon_osd_max_split_count 32 10000
[osd]
Filestore parameters
filestore_wbthrottle_enable true false
filestore_queue_max_bytes 104857600 1048576000
filestore_queue_committing_max_bytes 104857600 1048576000
filestore_queue_max_ops 50 5000
filestore_queue_committing_max_ops 500 5000
filestore_max_sync_interval 5 10
filestore_fd_cache_size 128 64
filestore_fd_cache_shards 16 32
filestore_op_threads 2 6
Mount parameters
osd_mount_options_xfs rw,noatime,inode64,logbsize=256k,delaylog
osd_mkfs_options_xfs -f -i size=2048
Journal parameters
journal_max_write_entries 100 1000
journal_queue_max_ops 300 3000
journal_max_write_bytes 10485760 1048576000
journal_queue_max_bytes 33554432 1048576000
Op tracker osd_enable_op_tracker true false
OSD client
osd_client_message_size_cap 524288000 0
osd_client_message_cap 100 0
Objecter
objecter_inflight_ops 1024 102400
objecter_inflight_op_bytes 104857600 1048576000
Throttles ms_dispatch_throttle_bytes 104857600 1048576000
OSD number of threads
osd_op_threads 2 32
osd_op_num_shards 5 5
osd_op_num_threads_per_shard 2 2
Configuration Detail - CBT YAML File
cluster:
  user: "bmpa"
  head: "ft01"
  clients: ["ft01", "ft02", "ft03", "ft04", "ft05", "ft06"]
  osds: ["hswNode01", "hswNode02", "hswNode03", "hswNode04", "hswNode05"]
  mons:
    ft02:
      a: "192.168.142.202:6789"
  osds_per_node: 8
  fs: xfs
  mkfs_opts: '-f -i size=2048 -n size=64k'
  mount_opts: '-o inode64,noatime,logbsize=256k'
  conf_file: '/home/bmpa/cbt/ceph_nvme_2partition_5node_hsw.conf'
  use_existing: False
  rebuild_every_test: False
  clusterid: "ceph"
  iterations: 1
  tmp_dir: "/home/bmpa/tmp_cbt"
  pool_profiles:
    2rep:
      pg_size: 4096
      pgp_size: 4096
      replication: 2
Configuration Detail - CBT YAML File (Continued)
benchmarks:
  librbdfio:
    time: 300
    ramp: 600
    vol_size: 81920
    mode: ['randrw']
    rwmixread: [0, 70, 100]
    op_size: [4096]
    procs_per_volume: [1]
    volumes_per_client: [10]
    use_existing_volumes: False
    iodepth: [4, 8, 16, 32, 64, 96, 128]
    osd_ra: [128]
    norandommap: True
    cmd_path: '/usr/bin/fio'
    pool_profile: '2rep'
    log_avg_msec: 250
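For context, a run with this YAML would typically be launched from the CBT checkout roughly as follows; the archive directory is an arbitrary example and the exact flags may differ between CBT versions.

# From the Ceph Benchmarking Tool (CBT) checkout; paths are illustrative
./cbt.py --archive=/home/bmpa/cbt_archive \
         --conf=/home/bmpa/cbt/ceph_nvme_2partition_5node_hsw.conf \
         ./ceph_nvme_2partition_5node_hsw.yaml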
Storage Node Diagram
Two CPU Sockets: Socket 0 and Socket 1
 Socket 0
• 2 NVMes
• Intel X540-AT2 (10Gbps)
• 64GB: 8x 8GB 2133 DIMMs
 Socket 1
• 2 NVMes
• 64GB: 8x 8GB 2133 DIMMs
Explore additional optimizations using cgroups and IRQ affinity (see the sketch below)
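As a starting point for the cgroups/IRQ-affinity exploration noted above, the sketch below pins an OSD process and an NVMe device's interrupts to the socket that owns the device; the device name, IRQ discovery, and CPU ranges are illustrative assumptions for this dual-socket layout.

# Pin an OSD that serves a socket-0 NVMe to socket-0 cores and memory
# (assumes cores 0-17,36-53 belong to socket 0 on this system)
numactl --cpunodebind=0 --membind=0 ceph-osd -i 0

# Steer that NVMe device's interrupts to the same socket
# (IRQ list is discovered from /proc/interrupts; device name is an example)
for irq in $(grep nvme0 /proc/interrupts | awk -F: '{print $1}'); do
    echo 0-17,36-53 > /proc/irq/${irq}/smp_affinity_list
done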
High Performance Ceph Node Hardware Building Blocks
• Generally available server designs built for high density and high performance
• High-density 1U standard high-volume server
• Dual-socket 3rd generation Xeon E5 (2699 v3)
• 10 front-removable 2.5" form-factor drive slots, SFF-8639 connector
• Multiple 10Gb network ports, additional slots for 40Gb networking
• Intel DC P3700 NVMe drives are available in the 2.5" drive form factor
• Allowing easier service in a datacenter environment
Accelerating Cassandra Workloads on Ceph with All-Flash PCIE SSDS

More Related Content

What's hot

不揮発メモリ(NVDIMM)とLinuxの対応動向について(for comsys 2019 ver.)
不揮発メモリ(NVDIMM)とLinuxの対応動向について(for comsys 2019 ver.)不揮発メモリ(NVDIMM)とLinuxの対応動向について(for comsys 2019 ver.)
不揮発メモリ(NVDIMM)とLinuxの対応動向について(for comsys 2019 ver.)
Yasunori Goto
 
Hadoop -NameNode HAの仕組み-
Hadoop -NameNode HAの仕組み-Hadoop -NameNode HAの仕組み-
Hadoop -NameNode HAの仕組み-
Yuki Gonda
 
Introduction to Amazon Redshift
Introduction to Amazon RedshiftIntroduction to Amazon Redshift
Introduction to Amazon Redshift
Amazon Web Services
 
Memory Management in Apache Spark
Memory Management in Apache SparkMemory Management in Apache Spark
Memory Management in Apache Spark
Databricks
 
不揮発メモリ(NVDIMM)とLinuxの対応動向について
不揮発メモリ(NVDIMM)とLinuxの対応動向について不揮発メモリ(NVDIMM)とLinuxの対応動向について
不揮発メモリ(NVDIMM)とLinuxの対応動向について
Yasunori Goto
 
Hyperspace: An Indexing Subsystem for Apache Spark
Hyperspace: An Indexing Subsystem for Apache SparkHyperspace: An Indexing Subsystem for Apache Spark
Hyperspace: An Indexing Subsystem for Apache Spark
Databricks
 
Overview of new features in Apache Ranger
Overview of new features in Apache RangerOverview of new features in Apache Ranger
Overview of new features in Apache Ranger
DataWorks Summit
 
大量のデータ処理や分析に使えるOSS Apache Spark入門(Open Source Conference 2021 Online/Kyoto 発表資料)
大量のデータ処理や分析に使えるOSS Apache Spark入門(Open Source Conference 2021 Online/Kyoto 発表資料)大量のデータ処理や分析に使えるOSS Apache Spark入門(Open Source Conference 2021 Online/Kyoto 発表資料)
大量のデータ処理や分析に使えるOSS Apache Spark入門(Open Source Conference 2021 Online/Kyoto 発表資料)
NTT DATA Technology & Innovation
 
Supermicro’s Universal GPU: Modular, Standards Based and Built for the Future
Supermicro’s Universal GPU: Modular, Standards Based and Built for the FutureSupermicro’s Universal GPU: Modular, Standards Based and Built for the Future
Supermicro’s Universal GPU: Modular, Standards Based and Built for the Future
Rebekah Rodriguez
 
Presto ベースのマネージドサービス Amazon Athena
Presto ベースのマネージドサービス Amazon AthenaPresto ベースのマネージドサービス Amazon Athena
Presto ベースのマネージドサービス Amazon Athena
Amazon Web Services Japan
 
ChatGPTのデータソースにPostgreSQLを使う[詳細版](オープンデベロッパーズカンファレンス2023 発表資料)
ChatGPTのデータソースにPostgreSQLを使う[詳細版](オープンデベロッパーズカンファレンス2023 発表資料)ChatGPTのデータソースにPostgreSQLを使う[詳細版](オープンデベロッパーズカンファレンス2023 発表資料)
ChatGPTのデータソースにPostgreSQLを使う[詳細版](オープンデベロッパーズカンファレンス2023 発表資料)
NTT DATA Technology & Innovation
 
NetApp & Storage fundamentals
NetApp & Storage fundamentalsNetApp & Storage fundamentals
NetApp & Storage fundamentals
Shashidhar Basavaraju
 
使いこなそうGUC
使いこなそうGUC使いこなそうGUC
使いこなそうGUC
Akio Ishida
 
ストリーム処理におけるApache Avroの活用について(NTTデータ テクノロジーカンファレンス 2019 講演資料、2019/09/05)
ストリーム処理におけるApache Avroの活用について(NTTデータ テクノロジーカンファレンス 2019 講演資料、2019/09/05)ストリーム処理におけるApache Avroの活用について(NTTデータ テクノロジーカンファレンス 2019 講演資料、2019/09/05)
ストリーム処理におけるApache Avroの活用について(NTTデータ テクノロジーカンファレンス 2019 講演資料、2019/09/05)
NTT DATA Technology & Innovation
 
分析指向データレイク実現の次の一手 ~Delta Lake、なにそれおいしいの?~(NTTデータ テクノロジーカンファレンス 2020 発表資料)
分析指向データレイク実現の次の一手 ~Delta Lake、なにそれおいしいの?~(NTTデータ テクノロジーカンファレンス 2020 発表資料)分析指向データレイク実現の次の一手 ~Delta Lake、なにそれおいしいの?~(NTTデータ テクノロジーカンファレンス 2020 発表資料)
分析指向データレイク実現の次の一手 ~Delta Lake、なにそれおいしいの?~(NTTデータ テクノロジーカンファレンス 2020 発表資料)
NTT DATA Technology & Innovation
 
これからLDAPを始めるなら 「389-ds」を使ってみよう
これからLDAPを始めるなら 「389-ds」を使ってみようこれからLDAPを始めるなら 「389-ds」を使ってみよう
これからLDAPを始めるなら 「389-ds」を使ってみよう
Nobuyuki Sasaki
 
C12 AlwaysOn 可用性グループとデータベースミラーリングのIO特製の比較 by 多田典史
C12 AlwaysOn 可用性グループとデータベースミラーリングのIO特製の比較 by 多田典史C12 AlwaysOn 可用性グループとデータベースミラーリングのIO特製の比較 by 多田典史
C12 AlwaysOn 可用性グループとデータベースミラーリングのIO特製の比較 by 多田典史Insight Technology, Inc.
 
Using Time Window Compaction Strategy For Time Series Workloads
Using Time Window Compaction Strategy For Time Series WorkloadsUsing Time Window Compaction Strategy For Time Series Workloads
Using Time Window Compaction Strategy For Time Series Workloads
Jeff Jirsa
 
Apache Kafkaって本当に大丈夫?~故障検証のオーバービューと興味深い挙動の紹介~
Apache Kafkaって本当に大丈夫?~故障検証のオーバービューと興味深い挙動の紹介~Apache Kafkaって本当に大丈夫?~故障検証のオーバービューと興味深い挙動の紹介~
Apache Kafkaって本当に大丈夫?~故障検証のオーバービューと興味深い挙動の紹介~
NTT DATA OSS Professional Services
 
Hiveを高速化するLLAP
Hiveを高速化するLLAPHiveを高速化するLLAP
Hiveを高速化するLLAP
Yahoo!デベロッパーネットワーク
 

What's hot (20)

不揮発メモリ(NVDIMM)とLinuxの対応動向について(for comsys 2019 ver.)
不揮発メモリ(NVDIMM)とLinuxの対応動向について(for comsys 2019 ver.)不揮発メモリ(NVDIMM)とLinuxの対応動向について(for comsys 2019 ver.)
不揮発メモリ(NVDIMM)とLinuxの対応動向について(for comsys 2019 ver.)
 
Hadoop -NameNode HAの仕組み-
Hadoop -NameNode HAの仕組み-Hadoop -NameNode HAの仕組み-
Hadoop -NameNode HAの仕組み-
 
Introduction to Amazon Redshift
Introduction to Amazon RedshiftIntroduction to Amazon Redshift
Introduction to Amazon Redshift
 
Memory Management in Apache Spark
Memory Management in Apache SparkMemory Management in Apache Spark
Memory Management in Apache Spark
 
不揮発メモリ(NVDIMM)とLinuxの対応動向について
不揮発メモリ(NVDIMM)とLinuxの対応動向について不揮発メモリ(NVDIMM)とLinuxの対応動向について
不揮発メモリ(NVDIMM)とLinuxの対応動向について
 
Hyperspace: An Indexing Subsystem for Apache Spark
Hyperspace: An Indexing Subsystem for Apache SparkHyperspace: An Indexing Subsystem for Apache Spark
Hyperspace: An Indexing Subsystem for Apache Spark
 
Overview of new features in Apache Ranger
Overview of new features in Apache RangerOverview of new features in Apache Ranger
Overview of new features in Apache Ranger
 
大量のデータ処理や分析に使えるOSS Apache Spark入門(Open Source Conference 2021 Online/Kyoto 発表資料)
大量のデータ処理や分析に使えるOSS Apache Spark入門(Open Source Conference 2021 Online/Kyoto 発表資料)大量のデータ処理や分析に使えるOSS Apache Spark入門(Open Source Conference 2021 Online/Kyoto 発表資料)
大量のデータ処理や分析に使えるOSS Apache Spark入門(Open Source Conference 2021 Online/Kyoto 発表資料)
 
Supermicro’s Universal GPU: Modular, Standards Based and Built for the Future
Supermicro’s Universal GPU: Modular, Standards Based and Built for the FutureSupermicro’s Universal GPU: Modular, Standards Based and Built for the Future
Supermicro’s Universal GPU: Modular, Standards Based and Built for the Future
 
Presto ベースのマネージドサービス Amazon Athena
Presto ベースのマネージドサービス Amazon AthenaPresto ベースのマネージドサービス Amazon Athena
Presto ベースのマネージドサービス Amazon Athena
 
ChatGPTのデータソースにPostgreSQLを使う[詳細版](オープンデベロッパーズカンファレンス2023 発表資料)
ChatGPTのデータソースにPostgreSQLを使う[詳細版](オープンデベロッパーズカンファレンス2023 発表資料)ChatGPTのデータソースにPostgreSQLを使う[詳細版](オープンデベロッパーズカンファレンス2023 発表資料)
ChatGPTのデータソースにPostgreSQLを使う[詳細版](オープンデベロッパーズカンファレンス2023 発表資料)
 
NetApp & Storage fundamentals
NetApp & Storage fundamentalsNetApp & Storage fundamentals
NetApp & Storage fundamentals
 
使いこなそうGUC
使いこなそうGUC使いこなそうGUC
使いこなそうGUC
 
ストリーム処理におけるApache Avroの活用について(NTTデータ テクノロジーカンファレンス 2019 講演資料、2019/09/05)
ストリーム処理におけるApache Avroの活用について(NTTデータ テクノロジーカンファレンス 2019 講演資料、2019/09/05)ストリーム処理におけるApache Avroの活用について(NTTデータ テクノロジーカンファレンス 2019 講演資料、2019/09/05)
ストリーム処理におけるApache Avroの活用について(NTTデータ テクノロジーカンファレンス 2019 講演資料、2019/09/05)
 
分析指向データレイク実現の次の一手 ~Delta Lake、なにそれおいしいの?~(NTTデータ テクノロジーカンファレンス 2020 発表資料)
分析指向データレイク実現の次の一手 ~Delta Lake、なにそれおいしいの?~(NTTデータ テクノロジーカンファレンス 2020 発表資料)分析指向データレイク実現の次の一手 ~Delta Lake、なにそれおいしいの?~(NTTデータ テクノロジーカンファレンス 2020 発表資料)
分析指向データレイク実現の次の一手 ~Delta Lake、なにそれおいしいの?~(NTTデータ テクノロジーカンファレンス 2020 発表資料)
 
これからLDAPを始めるなら 「389-ds」を使ってみよう
これからLDAPを始めるなら 「389-ds」を使ってみようこれからLDAPを始めるなら 「389-ds」を使ってみよう
これからLDAPを始めるなら 「389-ds」を使ってみよう
 
C12 AlwaysOn 可用性グループとデータベースミラーリングのIO特製の比較 by 多田典史
C12 AlwaysOn 可用性グループとデータベースミラーリングのIO特製の比較 by 多田典史C12 AlwaysOn 可用性グループとデータベースミラーリングのIO特製の比較 by 多田典史
C12 AlwaysOn 可用性グループとデータベースミラーリングのIO特製の比較 by 多田典史
 
Using Time Window Compaction Strategy For Time Series Workloads
Using Time Window Compaction Strategy For Time Series WorkloadsUsing Time Window Compaction Strategy For Time Series Workloads
Using Time Window Compaction Strategy For Time Series Workloads
 
Apache Kafkaって本当に大丈夫?~故障検証のオーバービューと興味深い挙動の紹介~
Apache Kafkaって本当に大丈夫?~故障検証のオーバービューと興味深い挙動の紹介~Apache Kafkaって本当に大丈夫?~故障検証のオーバービューと興味深い挙動の紹介~
Apache Kafkaって本当に大丈夫?~故障検証のオーバービューと興味深い挙動の紹介~
 
Hiveを高速化するLLAP
Hiveを高速化するLLAPHiveを高速化するLLAP
Hiveを高速化するLLAP
 

Viewers also liked

Ceph on Intel: Intel Storage Components, Benchmarks, and Contributions
Ceph on Intel: Intel Storage Components, Benchmarks, and ContributionsCeph on Intel: Intel Storage Components, Benchmarks, and Contributions
Ceph on Intel: Intel Storage Components, Benchmarks, and Contributions
Red_Hat_Storage
 
The Little Warehouse That Couldn't Or: How We Learned to Stop Worrying and Mo...
The Little Warehouse That Couldn't Or: How We Learned to Stop Worrying and Mo...The Little Warehouse That Couldn't Or: How We Learned to Stop Worrying and Mo...
The Little Warehouse That Couldn't Or: How We Learned to Stop Worrying and Mo...
Spark Summit
 
Tachyon-2014-11-21-amp-camp5
Tachyon-2014-11-21-amp-camp5Tachyon-2014-11-21-amp-camp5
Tachyon-2014-11-21-amp-camp5
Haoyuan Li
 
Linux Filesystems, RAID, and more
Linux Filesystems, RAID, and moreLinux Filesystems, RAID, and more
Linux Filesystems, RAID, and more
Mark Wong
 
Lessons Learned with Spark at the US Patent & Trademark Office-(Christopher B...
Lessons Learned with Spark at the US Patent & Trademark Office-(Christopher B...Lessons Learned with Spark at the US Patent & Trademark Office-(Christopher B...
Lessons Learned with Spark at the US Patent & Trademark Office-(Christopher B...
Spark Summit
 
The Hot Rod Protocol in Infinispan
The Hot Rod Protocol in InfinispanThe Hot Rod Protocol in Infinispan
The Hot Rod Protocol in Infinispan
Galder Zamarreño
 
Advanced Data Retrieval and Analytics with Apache Spark and Openstack Swift
Advanced Data Retrieval and Analytics with Apache Spark and Openstack SwiftAdvanced Data Retrieval and Analytics with Apache Spark and Openstack Swift
Advanced Data Retrieval and Analytics with Apache Spark and Openstack Swift
Daniel Krook
 
Scaling up genomic analysis with ADAM
Scaling up genomic analysis with ADAMScaling up genomic analysis with ADAM
Scaling up genomic analysis with ADAM
fnothaft
 
ELC-E 2010: The Right Approach to Minimal Boot Times
ELC-E 2010: The Right Approach to Minimal Boot TimesELC-E 2010: The Right Approach to Minimal Boot Times
ELC-E 2010: The Right Approach to Minimal Boot Times
andrewmurraympc
 
Velox: Models in Action
Velox: Models in ActionVelox: Models in Action
Velox: Models in Action
Dan Crankshaw
 
Naïveté vs. Experience
Naïveté vs. ExperienceNaïveté vs. Experience
Naïveté vs. Experience
Mike Fogus
 
SparkR: Enabling Interactive Data Science at Scale
SparkR: Enabling Interactive Data Science at ScaleSparkR: Enabling Interactive Data Science at Scale
SparkR: Enabling Interactive Data Science at Scale
jeykottalam
 
SampleClean: Bringing Data Cleaning into the BDAS Stack
SampleClean: Bringing Data Cleaning into the BDAS StackSampleClean: Bringing Data Cleaning into the BDAS Stack
SampleClean: Bringing Data Cleaning into the BDAS Stack
jeykottalam
 
A Curious Course on Coroutines and Concurrency
A Curious Course on Coroutines and ConcurrencyA Curious Course on Coroutines and Concurrency
A Curious Course on Coroutines and Concurrency
David Beazley (Dabeaz LLC)
 
Lab 5: Interconnecting a Datacenter using Mininet
Lab 5: Interconnecting a Datacenter using MininetLab 5: Interconnecting a Datacenter using Mininet
Lab 5: Interconnecting a Datacenter using Mininet
Zubair Nabi
 
Spark on Mesos-A Deep Dive-(Dean Wampler and Tim Chen, Typesafe and Mesosphere)
Spark on Mesos-A Deep Dive-(Dean Wampler and Tim Chen, Typesafe and Mesosphere)Spark on Mesos-A Deep Dive-(Dean Wampler and Tim Chen, Typesafe and Mesosphere)
Spark on Mesos-A Deep Dive-(Dean Wampler and Tim Chen, Typesafe and Mesosphere)
Spark Summit
 
Best Practices for Virtualizing Apache Hadoop
Best Practices for Virtualizing Apache HadoopBest Practices for Virtualizing Apache Hadoop
Best Practices for Virtualizing Apache Hadoop
Hortonworks
 
Python in Action (Part 2)
Python in Action (Part 2)Python in Action (Part 2)
Python in Action (Part 2)
David Beazley (Dabeaz LLC)
 

Viewers also liked (20)

Ceph on Intel: Intel Storage Components, Benchmarks, and Contributions
Ceph on Intel: Intel Storage Components, Benchmarks, and ContributionsCeph on Intel: Intel Storage Components, Benchmarks, and Contributions
Ceph on Intel: Intel Storage Components, Benchmarks, and Contributions
 
The Little Warehouse That Couldn't Or: How We Learned to Stop Worrying and Mo...
The Little Warehouse That Couldn't Or: How We Learned to Stop Worrying and Mo...The Little Warehouse That Couldn't Or: How We Learned to Stop Worrying and Mo...
The Little Warehouse That Couldn't Or: How We Learned to Stop Worrying and Mo...
 
Open Stack Cheat Sheet V1
Open Stack Cheat Sheet V1Open Stack Cheat Sheet V1
Open Stack Cheat Sheet V1
 
Tachyon-2014-11-21-amp-camp5
Tachyon-2014-11-21-amp-camp5Tachyon-2014-11-21-amp-camp5
Tachyon-2014-11-21-amp-camp5
 
Linux Filesystems, RAID, and more
Linux Filesystems, RAID, and moreLinux Filesystems, RAID, and more
Linux Filesystems, RAID, and more
 
Lessons Learned with Spark at the US Patent & Trademark Office-(Christopher B...
Lessons Learned with Spark at the US Patent & Trademark Office-(Christopher B...Lessons Learned with Spark at the US Patent & Trademark Office-(Christopher B...
Lessons Learned with Spark at the US Patent & Trademark Office-(Christopher B...
 
The Hot Rod Protocol in Infinispan
The Hot Rod Protocol in InfinispanThe Hot Rod Protocol in Infinispan
The Hot Rod Protocol in Infinispan
 
Advanced Data Retrieval and Analytics with Apache Spark and Openstack Swift
Advanced Data Retrieval and Analytics with Apache Spark and Openstack SwiftAdvanced Data Retrieval and Analytics with Apache Spark and Openstack Swift
Advanced Data Retrieval and Analytics with Apache Spark and Openstack Swift
 
Scaling up genomic analysis with ADAM
Scaling up genomic analysis with ADAMScaling up genomic analysis with ADAM
Scaling up genomic analysis with ADAM
 
ELC-E 2010: The Right Approach to Minimal Boot Times
ELC-E 2010: The Right Approach to Minimal Boot TimesELC-E 2010: The Right Approach to Minimal Boot Times
ELC-E 2010: The Right Approach to Minimal Boot Times
 
Velox: Models in Action
Velox: Models in ActionVelox: Models in Action
Velox: Models in Action
 
Naïveté vs. Experience
Naïveté vs. ExperienceNaïveté vs. Experience
Naïveté vs. Experience
 
SparkR: Enabling Interactive Data Science at Scale
SparkR: Enabling Interactive Data Science at ScaleSparkR: Enabling Interactive Data Science at Scale
SparkR: Enabling Interactive Data Science at Scale
 
SampleClean: Bringing Data Cleaning into the BDAS Stack
SampleClean: Bringing Data Cleaning into the BDAS StackSampleClean: Bringing Data Cleaning into the BDAS Stack
SampleClean: Bringing Data Cleaning into the BDAS Stack
 
OpenStack Cheat Sheet V2
OpenStack Cheat Sheet V2OpenStack Cheat Sheet V2
OpenStack Cheat Sheet V2
 
A Curious Course on Coroutines and Concurrency
A Curious Course on Coroutines and ConcurrencyA Curious Course on Coroutines and Concurrency
A Curious Course on Coroutines and Concurrency
 
Lab 5: Interconnecting a Datacenter using Mininet
Lab 5: Interconnecting a Datacenter using MininetLab 5: Interconnecting a Datacenter using Mininet
Lab 5: Interconnecting a Datacenter using Mininet
 
Spark on Mesos-A Deep Dive-(Dean Wampler and Tim Chen, Typesafe and Mesosphere)
Spark on Mesos-A Deep Dive-(Dean Wampler and Tim Chen, Typesafe and Mesosphere)Spark on Mesos-A Deep Dive-(Dean Wampler and Tim Chen, Typesafe and Mesosphere)
Spark on Mesos-A Deep Dive-(Dean Wampler and Tim Chen, Typesafe and Mesosphere)
 
Best Practices for Virtualizing Apache Hadoop
Best Practices for Virtualizing Apache HadoopBest Practices for Virtualizing Apache Hadoop
Best Practices for Virtualizing Apache Hadoop
 
Python in Action (Part 2)
Python in Action (Part 2)Python in Action (Part 2)
Python in Action (Part 2)
 

Similar to Accelerating Cassandra Workloads on Ceph with All-Flash PCIE SSDS

Accelerating Virtual Machine Access with the Storage Performance Development ...
Accelerating Virtual Machine Access with the Storage Performance Development ...Accelerating Virtual Machine Access with the Storage Performance Development ...
Accelerating Virtual Machine Access with the Storage Performance Development ...
Michelle Holley
 
Ceph Day Taipei - Accelerate Ceph via SPDK
Ceph Day Taipei - Accelerate Ceph via SPDK Ceph Day Taipei - Accelerate Ceph via SPDK
Ceph Day Taipei - Accelerate Ceph via SPDK
Ceph Community
 
Ceph Day Beijing - Ceph all-flash array design based on NUMA architecture
Ceph Day Beijing - Ceph all-flash array design based on NUMA architectureCeph Day Beijing - Ceph all-flash array design based on NUMA architecture
Ceph Day Beijing - Ceph all-flash array design based on NUMA architecture
Ceph Community
 
Ceph Day Beijing - Ceph All-Flash Array Design Based on NUMA Architecture
Ceph Day Beijing - Ceph All-Flash Array Design Based on NUMA ArchitectureCeph Day Beijing - Ceph All-Flash Array Design Based on NUMA Architecture
Ceph Day Beijing - Ceph All-Flash Array Design Based on NUMA Architecture
Danielle Womboldt
 
Ceph Day Beijing - SPDK in Ceph
Ceph Day Beijing - SPDK in CephCeph Day Beijing - SPDK in Ceph
Ceph Day Beijing - SPDK in Ceph
Ceph Community
 
Ceph Day Beijing - SPDK for Ceph
Ceph Day Beijing - SPDK for CephCeph Day Beijing - SPDK for Ceph
Ceph Day Beijing - SPDK for Ceph
Danielle Womboldt
 
DAOS - Scale-Out Software-Defined Storage for HPC/Big Data/AI Convergence
DAOS - Scale-Out Software-Defined Storage for HPC/Big Data/AI ConvergenceDAOS - Scale-Out Software-Defined Storage for HPC/Big Data/AI Convergence
DAOS - Scale-Out Software-Defined Storage for HPC/Big Data/AI Convergence
inside-BigData.com
 
Red Hat Storage Day New York - Intel Unlocking Big Data Infrastructure Effici...
Red Hat Storage Day New York - Intel Unlocking Big Data Infrastructure Effici...Red Hat Storage Day New York - Intel Unlocking Big Data Infrastructure Effici...
Red Hat Storage Day New York - Intel Unlocking Big Data Infrastructure Effici...
Red_Hat_Storage
 
Ceph Day San Jose - All-Flahs Ceph on NUMA-Balanced Server
Ceph Day San Jose - All-Flahs Ceph on NUMA-Balanced Server Ceph Day San Jose - All-Flahs Ceph on NUMA-Balanced Server
Ceph Day San Jose - All-Flahs Ceph on NUMA-Balanced Server
Ceph Community
 
Ceph Day Seoul - Delivering Cost Effective, High Performance Ceph cluster
Ceph Day Seoul - Delivering Cost Effective, High Performance Ceph cluster Ceph Day Seoul - Delivering Cost Effective, High Performance Ceph cluster
Ceph Day Seoul - Delivering Cost Effective, High Performance Ceph cluster
Ceph Community
 
Hands-on Lab: How to Unleash Your Storage Performance by Using NVM Express™ B...
Hands-on Lab: How to Unleash Your Storage Performance by Using NVM Express™ B...Hands-on Lab: How to Unleash Your Storage Performance by Using NVM Express™ B...
Hands-on Lab: How to Unleash Your Storage Performance by Using NVM Express™ B...
Odinot Stanislas
 
Impact of Intel Optane Technology on HPC
Impact of Intel Optane Technology on HPCImpact of Intel Optane Technology on HPC
Impact of Intel Optane Technology on HPC
MemVerge
 
Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance
Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance
Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance
Ceph Community
 
Ceph Day Taipei - Delivering cost-effective, high performance, Ceph cluster
Ceph Day Taipei - Delivering cost-effective, high performance, Ceph cluster Ceph Day Taipei - Delivering cost-effective, high performance, Ceph cluster
Ceph Day Taipei - Delivering cost-effective, high performance, Ceph cluster
Ceph Community
 
Ceph Day Tokyo - Delivering cost effective, high performance Ceph cluster
Ceph Day Tokyo - Delivering cost effective, high performance Ceph clusterCeph Day Tokyo - Delivering cost effective, high performance Ceph cluster
Ceph Day Tokyo - Delivering cost effective, high performance Ceph cluster
Ceph Community
 
Ceph Day KL - Delivering cost-effective, high performance Ceph cluster
Ceph Day KL - Delivering cost-effective, high performance Ceph clusterCeph Day KL - Delivering cost-effective, high performance Ceph cluster
Ceph Day KL - Delivering cost-effective, high performance Ceph cluster
Ceph Community
 
Red Hat Storage Day Atlanta - Designing Ceph Clusters Using Intel-Based Hardw...
Red Hat Storage Day Atlanta - Designing Ceph Clusters Using Intel-Based Hardw...Red Hat Storage Day Atlanta - Designing Ceph Clusters Using Intel-Based Hardw...
Red Hat Storage Day Atlanta - Designing Ceph Clusters Using Intel-Based Hardw...
Red_Hat_Storage
 
JetStor portfolio update final_2020-2021
JetStor portfolio update final_2020-2021JetStor portfolio update final_2020-2021
JetStor portfolio update final_2020-2021
Gene Leyzarovich
 
Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community Talk on High-Performance Solid Sate Ceph Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community
 
Optimized HPC/AI cloud with OpenStack acceleration service and composable har...
Optimized HPC/AI cloud with OpenStack acceleration service and composable har...Optimized HPC/AI cloud with OpenStack acceleration service and composable har...
Optimized HPC/AI cloud with OpenStack acceleration service and composable har...
Shuquan Huang
 

Similar to Accelerating Cassandra Workloads on Ceph with All-Flash PCIE SSDS (20)

Accelerating Virtual Machine Access with the Storage Performance Development ...
Accelerating Virtual Machine Access with the Storage Performance Development ...Accelerating Virtual Machine Access with the Storage Performance Development ...
Accelerating Virtual Machine Access with the Storage Performance Development ...
 
Ceph Day Taipei - Accelerate Ceph via SPDK
Ceph Day Taipei - Accelerate Ceph via SPDK Ceph Day Taipei - Accelerate Ceph via SPDK
Ceph Day Taipei - Accelerate Ceph via SPDK
 
Ceph Day Beijing - Ceph all-flash array design based on NUMA architecture
Ceph Day Beijing - Ceph all-flash array design based on NUMA architectureCeph Day Beijing - Ceph all-flash array design based on NUMA architecture
Ceph Day Beijing - Ceph all-flash array design based on NUMA architecture
 
Ceph Day Beijing - Ceph All-Flash Array Design Based on NUMA Architecture
Ceph Day Beijing - Ceph All-Flash Array Design Based on NUMA ArchitectureCeph Day Beijing - Ceph All-Flash Array Design Based on NUMA Architecture
Ceph Day Beijing - Ceph All-Flash Array Design Based on NUMA Architecture
 
Ceph Day Beijing - SPDK in Ceph
Ceph Day Beijing - SPDK in CephCeph Day Beijing - SPDK in Ceph
Ceph Day Beijing - SPDK in Ceph
 
Ceph Day Beijing - SPDK for Ceph
Ceph Day Beijing - SPDK for CephCeph Day Beijing - SPDK for Ceph
Ceph Day Beijing - SPDK for Ceph
 
DAOS - Scale-Out Software-Defined Storage for HPC/Big Data/AI Convergence
DAOS - Scale-Out Software-Defined Storage for HPC/Big Data/AI ConvergenceDAOS - Scale-Out Software-Defined Storage for HPC/Big Data/AI Convergence
DAOS - Scale-Out Software-Defined Storage for HPC/Big Data/AI Convergence
 
Red Hat Storage Day New York - Intel Unlocking Big Data Infrastructure Effici...
Red Hat Storage Day New York - Intel Unlocking Big Data Infrastructure Effici...Red Hat Storage Day New York - Intel Unlocking Big Data Infrastructure Effici...
Red Hat Storage Day New York - Intel Unlocking Big Data Infrastructure Effici...
 
Ceph Day San Jose - All-Flahs Ceph on NUMA-Balanced Server
Ceph Day San Jose - All-Flahs Ceph on NUMA-Balanced Server Ceph Day San Jose - All-Flahs Ceph on NUMA-Balanced Server
Ceph Day San Jose - All-Flahs Ceph on NUMA-Balanced Server
 
Ceph Day Seoul - Delivering Cost Effective, High Performance Ceph cluster
Ceph Day Seoul - Delivering Cost Effective, High Performance Ceph cluster Ceph Day Seoul - Delivering Cost Effective, High Performance Ceph cluster
Ceph Day Seoul - Delivering Cost Effective, High Performance Ceph cluster
 
Hands-on Lab: How to Unleash Your Storage Performance by Using NVM Express™ B...
Hands-on Lab: How to Unleash Your Storage Performance by Using NVM Express™ B...Hands-on Lab: How to Unleash Your Storage Performance by Using NVM Express™ B...
Hands-on Lab: How to Unleash Your Storage Performance by Using NVM Express™ B...
 
Impact of Intel Optane Technology on HPC
Impact of Intel Optane Technology on HPCImpact of Intel Optane Technology on HPC
Impact of Intel Optane Technology on HPC
 
Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance
Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance
Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance
 
Ceph Day Taipei - Delivering cost-effective, high performance, Ceph cluster
Ceph Day Taipei - Delivering cost-effective, high performance, Ceph cluster Ceph Day Taipei - Delivering cost-effective, high performance, Ceph cluster
Ceph Day Taipei - Delivering cost-effective, high performance, Ceph cluster
 
Ceph Day Tokyo - Delivering cost effective, high performance Ceph cluster
Ceph Day Tokyo - Delivering cost effective, high performance Ceph clusterCeph Day Tokyo - Delivering cost effective, high performance Ceph cluster
Ceph Day Tokyo - Delivering cost effective, high performance Ceph cluster
 
Ceph Day KL - Delivering cost-effective, high performance Ceph cluster
Ceph Day KL - Delivering cost-effective, high performance Ceph clusterCeph Day KL - Delivering cost-effective, high performance Ceph cluster
Ceph Day KL - Delivering cost-effective, high performance Ceph cluster
 
Red Hat Storage Day Atlanta - Designing Ceph Clusters Using Intel-Based Hardw...
Red Hat Storage Day Atlanta - Designing Ceph Clusters Using Intel-Based Hardw...Red Hat Storage Day Atlanta - Designing Ceph Clusters Using Intel-Based Hardw...
Red Hat Storage Day Atlanta - Designing Ceph Clusters Using Intel-Based Hardw...
 
JetStor portfolio update final_2020-2021
JetStor portfolio update final_2020-2021JetStor portfolio update final_2020-2021
JetStor portfolio update final_2020-2021
 
Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community Talk on High-Performance Solid Sate Ceph Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community Talk on High-Performance Solid Sate Ceph
 
Optimized HPC/AI cloud with OpenStack acceleration service and composable har...
Optimized HPC/AI cloud with OpenStack acceleration service and composable har...Optimized HPC/AI cloud with OpenStack acceleration service and composable har...
Optimized HPC/AI cloud with OpenStack acceleration service and composable har...
 

Recently uploaded

LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
Product School
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Product School
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Product School
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
RTTS
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Inflectra
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Product School
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
DianaGray10
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
Cheryl Hung
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 

Recently uploaded (20)

LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 

Accelerating Cassandra Workloads on Ceph with All-Flash PCIE SSDS

  • 7. DCG Storage Group 17
Cassandra – What and Why?
[Diagram: Cassandra ring with partitions p1–p20 distributed around the ring and a client connecting to it]
• Cassandra is a column-oriented NoSQL DB with a CQL interface
  – Each row has a unique key, which is used for partitioning
  – No relations
  – A row can have multiple columns – rows need not have the same number of columns
• Open source, distributed, decentralized, highly available, linearly scalable, multi-DC, …
• Used for analytics, real-time insights, fraud detection, IoT/sensor data, messaging, etc. Use cases: http://www.planetcassandra.org/apachecassandra-use-cases/
• Ceph is a popular open source unified storage platform
• Many large-scale Ceph deployments are in production
• End customers prefer converged infrastructure that supports multiple workloads (e.g. analytics) to achieve CapEx and OpEx savings
• Several customers are asking to run Cassandra workloads on Ceph
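For readers unfamiliar with the data model sketched above, here is a minimal, hypothetical illustration of the partition-key idea driven through cqlsh; the keyspace, table, and column names are invented for this sketch and do not appear in the deck.

# Minimal illustration of the partition-key data model described above.
# Keyspace, table, and column names are invented for this sketch.
cqlsh -e "
CREATE KEYSPACE IF NOT EXISTS demo
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
CREATE TABLE IF NOT EXISTS demo.sensor_readings (
  sensor_id    uuid,        -- partition key: determines which ring node owns the row
  reading_time timestamp,   -- clustering column: orders rows within a partition
  value        double,
  PRIMARY KEY (sensor_id, reading_time)
);"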
  • 8. DCG Storage Group 18
Ceph and Cassandra Integration
[Diagram: several guest VMs, each running Application → Cassandra → RBD → RADOS through Qemu/Virtio on its hypervisor, connected over an IP fabric to a Ceph storage cluster of SSD-backed OSD nodes plus monitors (MON)]
Deployment Considerations
• Bootable Ceph volumes (OS & Cassandra data)
• Cassandra RBD data volumes
• Data protection (Cassandra or Ceph)
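A minimal sketch of provisioning one Cassandra data volume as an RBD image; the pool, image name, size, and mount point are illustrative, and in the deployment pictured above the volumes would be attached to the guests through Qemu/virtio rather than the kernel RBD client used here.

# Create a 200 GiB RBD image for one Cassandra node's data directory (names and size illustrative).
rbd create cassandra-data-01 --pool rbd --size 204800   # --size is in MB
# Map it with the kernel RBD client and put an XFS filesystem on it.
rbd map rbd/cassandra-data-01
mkfs.xfs /dev/rbd0
mount -o noatime /dev/rbd0 /var/lib/cassandra/data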
  • 9. DCG Storage Group
Hardware Environment Overview
• OSD System Config: Intel Xeon E5-2699 v3 2x @ 2.30 GHz, 72 cores w/ HT, 96GB, Cache 46080KB, 128GB DDR4
• Each system with 4x P3700 800GB NVMe, partitioned into 4 OSDs each, 16 OSDs total per node
• FIO Client Systems: Intel Xeon E5-2699 v3 2x @ 2.30 GHz, 72 cores w/ HT, 96GB, Cache 46080KB, 128GB DDR4
• Ceph v0.94.3 Hammer Release, CentOS 7.1, 3.10-229 Kernel, linked with JEMalloc 3.6
• CBT used for testing and data acquisition
• Single 10GbE network for client & replication data transfer, replication factor 2
• Ceph network (192.168.142.0/24) – 10Gbps; CBT / Zabbix / Monitoring host
[Diagram: Ceph storage cluster of OSD nodes, each running CephOSD1–CephOSD16 across NVMe1–NVMe4, plus FIO RBD client nodes; chassis callouts: FatTwin (4x dual-socket Xeon E5 v3) and SuperMicro 1028U; Intel Xeon E5 v3 18-core CPUs, Intel P3700 NVMe PCIe flash, easily serviceable NVMe drives]
  • 10. DCG Storage Group
Multi-partitioning flash devices
• High-performance NVMe devices are capable of high parallelism at low latency
  – DC P3700 800GB raw performance: 460K read IOPS & 90K write IOPS at QD=128
• By using multiple OSD partitions, Ceph performance scales linearly
  – Reduces lock contention within a single OSD process
  – Lower latency at all queue depths, with the biggest impact on random reads
• Introduces the concept of multiple OSDs on the same physical device
  – Conceptually similar crushmap data-placement rules as managing disks in an enclosure
• High resiliency of “Data Center” class NVMe devices
  – At least 10 drive writes per day
  – Power-loss protection, full data path protection, device-level telemetry
[Diagram: one NVMe device (NVMe1) hosting four OSDs (CephOSD1–CephOSD4)]
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Any difference in system hardware or software design or configuration may affect actual performance. See configuration slides in backup for details on software configuration and test benchmark parameters.
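A sketch of the partitioning step under an assumed device path (/dev/nvme0n1); the deck does not show the exact provisioning commands, so the OSD-creation step is only noted in a comment.

# Carve one P3700 into four equal GPT partitions, one per OSD (device path assumed).
parted -s /dev/nvme0n1 mklabel gpt
parted -s /dev/nvme0n1 mkpart osd1 0% 25%
parted -s /dev/nvme0n1 mkpart osd2 25% 50%
parted -s /dev/nvme0n1 mkpart osd3 50% 75%
parted -s /dev/nvme0n1 mkpart osd4 75% 100%
# Each partition then becomes an independent Filestore OSD, e.g. via the
# ceph-disk/ceph-deploy tooling of the Hammer era; that flow is not shown in the deck.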
  • 11. DCG Storage Group 21
Partitioning multiple OSDs per NVMe
• Multiple OSDs per NVMe result in higher performance, lower latency, and better CPU utilization
[Chart: Latency vs. IOPS – 4K random read, multiple OSDs per device comparison; 5 nodes, 20/40/80 OSDs, Intel DC P3700, Xeon E5-2699 v3 dual socket / 128GB RAM / 10GbE, Ceph 0.94.3 w/ JEMalloc; series: 1, 2, and 4 OSDs per NVMe]
[Chart: Single-node CPU utilization comparison – 4K random reads @ QD32; 4/8/16 OSDs, Intel DC P3700, Xeon E5-2699 v3 dual socket / 128GB RAM / 10GbE, Ceph 0.94.3 w/ JEMalloc; bars: single, double, and quad OSD per NVMe]
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Any difference in system hardware or software design or configuration may affect actual performance. See configuration slides in backup for details on software configuration and test benchmark parameters.
  • 12. DCG Storage Group 22
4K Random Read & Write Performance Summary
First Ceph cluster to break 1 million 4K random IOPS
Workload Pattern | Max IOPS
4K 100% Random Reads (2TB Dataset) | 1.35 Million
4K 100% Random Reads (4.8TB Dataset) | 1.15 Million
4K 100% Random Writes (4.8TB Dataset) | 200K
4K 70%/30% Read/Write OLTP Mix (4.8TB Dataset) | 452K
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Any difference in system hardware or software design or configuration may affect actual performance. See configuration slides in backup for details on software configuration and test benchmark parameters.
  • 13. DCG Storage Group 23
4K Random Read & Write Performance and Latency
First Ceph cluster to break 1 million 4K random IOPS at ~1ms response time
[Chart: IO-depth scaling – latency vs. IOPS for 4K random read, write, and 70/30 mix; 5 nodes, 60 OSDs, Xeon E5-2699 v3 dual socket / 128GB RAM / 10GbE, Ceph 0.94.3 w/ JEMalloc; series: 100% 4K random read, 100% 4K random write, 70/30% 4K random OLTP, 100% 4K random read – 2TB dataset]
• 1M 100% 4K random read IOPS @ ~1.1ms
• 1.35M 4K random read IOPS w/ 2TB hot data
• 400K 70/30% (OLTP) 4K random IOPS @ ~3ms
• 171K 100% 4K random write IOPS @ 6ms
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Any difference in system hardware or software design or configuration may affect actual performance. See configuration slides in backup for details on software configuration and test benchmark parameters.
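These sweeps were driven by CBT (see the YAML in the backup slides). As a standalone approximation, one point of the 4K random-read curve could be reproduced with fio's rbd engine roughly as follows; the pool, image, and client names are assumptions.

# One point of the queue-depth sweep: 4K random reads against an existing RBD image.
# Requires fio built with the rbd engine; pool/image/client names are assumptions.
fio --name=rbd-4k-randread --ioengine=rbd --clientname=admin --pool=rbd \
    --rbdname=cbt-librbdfio-vol0 --rw=randread --bs=4k --iodepth=32 \
    --numjobs=1 --runtime=300 --ramp_time=600 --time_based --norandommap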
  • 14. DCG Storage Group 24
Sequential performance (512KB)
• With 10GbE per node, both writes and reads achieve line rate, bottlenecked by each OSD node's single network interface. (Five OSD nodes at roughly 1.25 GB/s of 10GbE line rate each give about 6 GB/s aggregate, which matches the read result; writes land near half that because, with replication factor 2, every client write is sent a second time over the same links.)
• Higher throughput would be possible through bonding or 40GbE connectivity.
[Chart: 512K sequential bandwidth – 5 nodes, 80 OSDs, DC P3700, Xeon E5-2699 v3 dual socket / 128GB RAM / 10GbE, Ceph 0.94.3 w/ JEMalloc]
100% Write: 3,214 MB/s
100% Read: 5,888 MB/s
70/30% R/W Mix: 5,631 MB/s
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Any difference in system hardware or software design or configuration may affect actual performance. See configuration slides in backup for details on software configuration and test benchmark parameters.
  • 15. DCG Storage Group 25
Cassandra-like workload
242K IOPS at < 2ms latency
• Based on a typical customer Cassandra workload profile
• 50% reads and 50% writes, predominantly 8K reads and 12K writes, FIO queue depth = 8
[Chart: Cassandra-like workload – 50/50 read/write mix; 5 nodes, 80 OSDs, Xeon E5-2699 v3 dual socket / 128GB RAM / 10GbE, Ceph 0.94.3 w/ JEMalloc; IOPS and latency curves]
[Chart: IO-size breakdown – reads: 78% 8K, 19% 5K, 3% 7K; writes: 92% 12K, 5% 33K, remainder across 115K/50K/80K]
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Any difference in system hardware or software design or configuration may affect actual performance. See configuration slides in backup for details on software configuration and test benchmark parameters.
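A hedged sketch of how this profile could be expressed with fio's rbd engine: the 50/50 mix and queue depth come from the slide, while the block-size split is an assumed reconstruction of the chart (the small write-side slices in particular are guesses), and the pool, image, and client names are illustrative.

# Approximate the Cassandra-like 50/50 profile. The bssplit percentages are an assumed
# reconstruction of the chart above; pool/image/client names are illustrative.
fio --name=cassandra-like --ioengine=rbd --clientname=admin --pool=rbd \
    --rbdname=cbt-librbdfio-vol0 --rw=randrw --rwmixread=50 --iodepth=8 \
    --bssplit=8k/78:5k/19:7k/3,12k/92:33k/5:115k/1:50k/1:80k/1 \
    --runtime=300 --ramp_time=600 --time_based --norandommap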
  • 16. DCG Storage Group 26
Summary & Conclusions
• Flash technology, including NVMe, enables new performance capabilities in small footprints
• Ceph and Cassandra provide a compelling case for feature-rich converged storage that can support latency-sensitive analytics workloads
• Using the latest standard high-volume servers and Ceph, you can now build an open, high-density, scalable, high-performance cluster that can handle a low-latency mixed workload
• Ceph performance improvements over recent releases are significant; today, over 1 million random IOPS is achievable in 5U with ~1ms latency
• Next steps:
  – Address small-block write performance, limited by the Filestore backend
  – Improve long-tail latency for transactional workloads
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Any difference in system hardware or software design or configuration may affect actual performance. See configuration slides in backup for details on software configuration and test benchmark parameters.
  • 18. 28
  • 19. DCG Storage Group 29
Configuration Detail – ceph.conf (parameter: default → tuned)
[global] Authentication:
  auth_client_required: cephx → none
  auth_cluster_required: cephx → none
  auth_service_required: cephx → none
[global] Debug logging:
  debug_lockdep: 0/1 → 0/0
  debug_context: 0/1 → 0/0
  debug_crush: 1/1 → 0/0
  debug_buffer: 0/1 → 0/0
  debug_timer: 0/1 → 0/0
  debug_filer: 0/1 → 0/0
  debug_objector: 0/1 → 0/0
  debug_rados: 0/5 → 0/0
  debug_rbd: 0/5 → 0/0
  debug_ms: 0/5 → 0/0
  debug_monc: 0/5 → 0/0
  debug_tp: 0/5 → 0/0
  debug_auth: 1/5 → 0/0
  debug_finisher: 1/5 → 0/0
  debug_heartbeatmap: 1/5 → 0/0
  debug_perfcounter: 1/5 → 0/0
  debug_rgw: 1/5 → 0/0
  debug_asok: 1/5 → 0/0
  debug_throttle: 1/1 → 0/0
  • 20. DCG Storage Group 30
Configuration Detail – ceph.conf, continued (parameter: default → tuned)
[global] CBT specific:
  mon_pg_warn_max_object_skew: 10 → 10000
  mon_pg_warn_min_per_osd: 0 → 0
  mon_pg_warn_max_per_osd: 32768 → 32768
  osd_pg_bits: 8 → 8
  osd_pgp_bits: 8 → 8
[global] RBD cache:
  rbd_cache: true → true
[global] Other:
  mon_compact_on_trim: true → false
  log_to_syslog: false → false
  log_file: /var/log/ceph/$name.log → /var/log/ceph/$name.log
  perf: true → true
  mutex_perf_counter: false → true
  throttler_perf_counter: true → false
[mon] CBT specific:
  mon_data: /var/lib/ceph/mon/ceph-0 → /home/bmpa/tmp_cbt/ceph/mon.$id
  mon_max_pool_pg_num: 65536 → 166496
  mon_osd_max_split_count: 32 → 10000
[osd] Filestore parameters:
  filestore_wbthrottle_enable: true → false
  filestore_queue_max_bytes: 104857600 → 1048576000
  filestore_queue_committing_max_bytes: 104857600 → 1048576000
  filestore_queue_max_ops: 50 → 5000
  filestore_queue_committing_max_ops: 500 → 5000
  filestore_max_sync_interval: 5 → 10
  filestore_fd_cache_size: 128 → 64
  filestore_fd_cache_shards: 16 → 32
  filestore_op_threads: 2 → 6
[osd] Mount parameters:
  osd_mount_options_xfs: rw,noatime,inode64,logbsize=256k,delaylog
  osd_mkfs_options_xfs: -f -i size=2048
[osd] Journal parameters:
  journal_max_write_entries: 100 → 1000
  journal_queue_max_ops: 300 → 3000
  journal_max_write_bytes: 10485760 → 1048576000
  journal_queue_max_bytes: 33554432 → 1048576000
[osd] Op tracker:
  osd_enable_op_tracker: true → false
[osd] OSD client:
  osd_client_message_size_cap: 524288000 → 0
  osd_client_message_cap: 100 → 0
[osd] Objecter:
  objecter_inflight_ops: 1024 → 102400
  objecter_inflight_op_bytes: 104857600 → 1048576000
[osd] Throttles:
  ms_dispatch_throttle_bytes: 104857600 → 1048576000
[osd] OSD number of threads:
  osd_op_threads: 2 → 32
  osd_op_num_shards: 5 → 5
  osd_op_num_threads_per_shard: 2 → 2
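As a rough illustration of how a few of the tuned values above would land in an actual config file, here is a minimal, assumed sketch that appends an excerpt to ceph.conf; the path and the parameter subset are choices made for this sketch, and in the tests CBT supplied the full configuration via the conf_file referenced in the YAML on the next slides.

# Render a few of the tuned values from the table as a ceph.conf excerpt.
# The path and parameter subset are assumptions; CBT generated the real config.
cat >> /etc/ceph/ceph.conf <<'EOF'
[global]
auth_client_required = none
auth_cluster_required = none
auth_service_required = none
debug_ms = 0/0

[osd]
filestore_op_threads = 6
filestore_max_sync_interval = 10
journal_max_write_entries = 1000
osd_enable_op_tracker = false
osd_op_threads = 32
EOF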
  • 21. DCG Storage Group 31
Configuration Detail – CBT YAML File
cluster:
  user: "bmpa"
  head: "ft01"
  clients: ["ft01", "ft02", "ft03", "ft04", "ft05", "ft06"]
  osds: ["hswNode01", "hswNode02", "hswNode03", "hswNode04", "hswNode05"]
  mons:
    ft02:
      a: "192.168.142.202:6789"
  osds_per_node: 8
  fs: xfs
  mkfs_opts: '-f -i size=2048 -n size=64k'
  mount_opts: '-o inode64,noatime,logbsize=256k'
  conf_file: '/home/bmpa/cbt/ceph_nvme_2partition_5node_hsw.conf'
  use_existing: False
  rebuild_every_test: False
  clusterid: "ceph"
  iterations: 1
  tmp_dir: "/home/bmpa/tmp_cbt"
  pool_profiles:
    2rep:
      pg_size: 4096
      pgp_size: 4096
      replication: 2
  • 22. DCG Storage Group 32
Configuration Detail – CBT YAML File (Continued)
benchmarks:
  librbdfio:
    time: 300
    ramp: 600
    vol_size: 81920
    mode: ['randrw']
    rwmixread: [0, 70, 100]
    op_size: [4096]
    procs_per_volume: [1]
    volumes_per_client: [10]
    use_existing_volumes: False
    iodepth: [4, 8, 16, 32, 64, 96, 128]
    osd_ra: [128]
    norandommap: True
    cmd_path: '/usr/bin/fio'
    pool_profile: '2rep'
    log_avg_msec: 250
  • 23. DCG Storage Group 33
Storage Node Diagram
Two CPU sockets: Socket 0 and Socket 1
• Socket 0
  – 2 NVMes
  – Intel X540-AT2 (10Gbps)
  – 64GB: 8x 8GB 2133 DIMMs
• Socket 1
  – 2 NVMes
  – 64GB: 8x 8GB 2133 DIMMs
Explore additional optimizations using cgroups and IRQ affinity
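The slide only hints at cgroup and IRQ-affinity tuning; the following is a minimal, hypothetical sketch of that kind of NUMA pinning. The core lists, the choice of a single ceph-osd process, and the IRQ number are all illustrative and not taken from the deck.

# Hypothetical NUMA-pinning sketch; values are illustrative, not the tuning used in the tests.
OSD_PID=$(pidof ceph-osd | awk '{print $1}')       # pick one OSD process on this node
taskset -cp 0-17,36-53 "$OSD_PID"                  # pin it to socket-0 cores (with HT siblings)
echo 0-17,36-53 > /proc/irq/45/smp_affinity_list   # steer the 10GbE NIC's IRQ (number assumed) to the same cores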
  • 24. DCG Storage Group
High Performance Ceph Node – Hardware Building Blocks
• Generally available server designs built for high density and high performance
  – High-density 1U standard high-volume server
  – Dual-socket 3rd-generation Xeon E5 (2699 v3)
  – 10 front-removable 2.5” form-factor drive slots with SFF-8639 connectors
  – Multiple 10Gb network ports, additional slots for 40Gb networking
• Intel DC P3700 NVMe drives are available in the 2.5” drive form factor
  – Allowing easier service in a datacenter environment