SlideShare a Scribd company logo
CEPH AND ROCKSDB
SAGE WEIL
HIVEDATA ROCKSDB MEETUP - 2016.02.03
2
OUTLINE
●
Ceph background
●
FileStore - why POSIX failed us
●
BlueStore – a new Ceph OSD backend
●
RocksDB changes
– journal recycling
– BlueRocksEnv
– EnvMirror
– delayed merge?
●
Summary
3
CEPH
● Object, block, and file storage in a single cluster
● All components scale horizontally
● No single point of failure
● Hardware agnostic, commodity hardware
● Self-manage whenever possible
● Open source (LGPL)
● Move beyond legacy approaches
– client/cluster instead of client/server
– avoid ad hoc approaches HA
4
CEPH COMPONENTS
RGW
A web services gateway
for object storage,
compatible with S3 and
Swift
LIBRADOS
A library allowing apps to directly access RADOS (C, C++, Java, Python, Ruby, PHP)
RADOS
A software-based, reliable, autonomous, distributed object store comprised of
self-healing, self-managing, intelligent storage nodes and lightweight monitors
RBD
A reliable, fully-distributed
block device with cloud
platform integration
CEPHFS
A distributed file system
with POSIX semantics and
scale-out metadata
management
OBJECT BLOCK FILE
5
OBJECT STORAGE DAEMONS (OSDS)
FS
DISK
OSD
DISK
OSD
FS
DISK
OSD
FS
DISK
OSD
FS
btrfs
xfs
ext4
M
M
M
6
OBJECT STORAGE DAEMONS (OSDS)
FS
DISK
OSD
DISK
OSD
FS
DISK
OSD
FS
DISK
OSD
FS
btrfs
xfs
ext4
M
M
M
FileStore FileStoreFileStoreFileStore
7
POSIX FAILS: TRANSACTIONS
●
OSD carefully manages consistency of its data
●
All writes are transactions (we need A+D; OSD provides C+I)
●
Most are simple
– write some bytes to object (file)
– update object attribute (file xattr)
– append to update log (leveldb insert)
...but others are arbitrarily large/complex
● Btrfs transaction hooks failed for various reasons
● But write-ahead journals work okay
– write entire serialized transactions to well-optimized FileJournal
– then apply it to the file system
– half our disk throughput
8
POSIX FAILS: ENUMERATION
● Ceph objects are distributed by a 32-bit hash
● Enumeration is in hash order
– scrubbing
– “backfill” (data rebalancing, recovery)
– enumeration via librados client API
● POSIX readdir is not well-ordered
● Need O(1) “split” for a given shard/range
● Build directory tree by hash-value prefix
– split any directory when size > ~100 files
– merge when size < ~20 files
– read entire directory, sort in-memory
...
A/A03224D3_qwer
A/A247233E_zxcv
...
B/8/B823032D_foo
B/8/B8474342_bar
B/9/B924273B_baz
B/A/BA4328D2_asdf
...
9
WE WANT TO AVOID POSIX FILE INTERFACE
● POSIX has the wrong metadata model for us
– rocksdb perfect for managing our namespace
● NewStore = rocksdb + object files
● Layering over POSIX duplicates consistency
overhead
– XFS/ext4 journal writes for fs consistency
– rocksdb wal writes for our metadata
● BlueStore = NewStore over block HDD
OSD
SSD SSD
OSD
HDD
OSD
BlueStore BlueStore BlueStore
RocksDBRocksDB
10
WHY ROCKSDB?
● Ideal key/value interface
– transactions
– ordered enumeration
– fast commits to log/journal
● Common interface
– can always swap in another KV DB if we want
● Abstract storage backend (rocksdb::Env)
● C++ interface
● Strong and active open source community
11
BlueStore
BLUESTORE DESIGN
BlueFS
RocksDB
BlockDeviceBlockDeviceBlockDevice
BlueRocksEnv
data metadata
● rocksdb
– object metadata (onode) in rocksdb
– write-ahead log (small writes/overwrites)
– ceph key/value “omap” data
– allocator metadata (free extent list)
● block device
– object data
● pluggable allocator
● rocksdb shares block device(s)
– BlueRocksEnv is rocksdb::Env
– BlueFS is super-simple C++ “file system”
● 2x faster on HDD, more on SSD
Allocator
ROCKSDB
13
ROCKSDB: JOURNAL RECYCLING
● Problem: 1 small (4 KB) Ceph write → 3-4 disk IOs!
– BlueStore: write 4 KB of user data
– rocksdb: append record to WAL
● write update block at end of log file
● fsync: XFS/ext4/BlueFS journals inode size/alloc update to its journal
● fallocate(2) doesn't help
– data blocks are not pre-zeroed; fsync still has to update alloc metadata
● rocksdb LogReader only understands two modes
– read until end of file (need accurate file size)
– read all valid records, then ignore zeros at end (need zeroed tail)
14
ROCKSDB: JOURNAL RECYCLING (2)
● Put old log files on recycle list (instead of deleting them)
● LogWriter
– overwrite old log data with new log data
– include log number in each record
● LogReader
– stop replaying when we get garbage (bad CRC)
– or when we get a valid CRC but record is from a previous log incarnation
● Now we get one log append → one IO!
● Upstream, but missing a bug fix (PR #881)
15
ROCKSDB: BLUEROCKSENV + BLUEFS
● class BlueRocksEnv : public rocksdb::EnvWrapper
– passes file IO operations to BlueFS
● BlueFS is a super-simple “file system”
– all metadata loaded in RAM on start/mount
– no need to store block free list; calculate it on startup
– coarse allocation unit (1 MB blocks)
– all metadata updates written to a journal
– journal rewritten/compacted when it gets large
● Map “directories” (db/, db.wal/, db.bulk/) to different block devices
– WAL on NVRAM, NVMe, SSD
– level0 and hot SSTs on SSD
– cold SSTs on HDD
● BlueStore periodically balances free space between itself and BlueFS
BlueStore
BlueFS
RocksDB
BlockDeviceBlockDeviceBlockDevice
BlueRocksEnv
data metadata
Allocator
16
ROCKSDB: ENVMIRROR
● include/rocksdb/utilities/env_mirror.h
● class EnvMirror : public rocksdb::EnvWrapper {
EnvMirror(Env* a, Env* b)
● mirrors all writes to both a and b
● sends all reads to both a and b
– verifies the results are identical
● Invaluable when debugging BlueRocksEnv
– validate BlueRocksEnv vs rocksdb's default PosixEnv
17
ROCKSDB: DELAYED LOG MERGE
● We write lots of short-lived records to log
– insert wal_1 = 4 KB
– insert wal_2 = 8 KB
– …
– insert wal_10 = 4 KB
– delete wal_1
– insert wal_11 = 4 KB
● Goal
– prevent short-lived records from ever getting amplified
– keep, say, 2N logs
– merge oldest N to new level0 SST, but also remove keys updated/deleted in
newest N logs
18
SUMMARY
● Ceph is great
● POSIX was poor choice for storing objects
● Our new BlueStore backend is awesome
● RocksDB rocks and was easy to embed
● Log recycling speeds up commits (now upstream)
● Delayed merge will help too (coming soon)
THANK YOU!
Sage Weil
CEPH PRINCIPAL ARCHITECT
sage@redhat.com
@liewegas

More Related Content

What's hot

Ceph BlueStore - новый тип хранилища в Ceph / Максим Воронцов, (Redsys)
Ceph BlueStore - новый тип хранилища в Ceph / Максим Воронцов, (Redsys)Ceph BlueStore - новый тип хранилища в Ceph / Максим Воронцов, (Redsys)
Ceph BlueStore - новый тип хранилища в Ceph / Максим Воронцов, (Redsys)
Ontico
 
Understanding blue store, Ceph's new storage backend - Tim Serong, SUSE
Understanding blue store, Ceph's new storage backend - Tim Serong, SUSEUnderstanding blue store, Ceph's new storage backend - Tim Serong, SUSE
Understanding blue store, Ceph's new storage backend - Tim Serong, SUSE
OpenStack
 
RocksDB compaction
RocksDB compactionRocksDB compaction
RocksDB compaction
MIJIN AN
 
RocksDB detail
RocksDB detailRocksDB detail
RocksDB detail
MIJIN AN
 
The Hive Think Tank: Rocking the Database World with RocksDB
The Hive Think Tank: Rocking the Database World with RocksDBThe Hive Think Tank: Rocking the Database World with RocksDB
The Hive Think Tank: Rocking the Database World with RocksDB
The Hive
 
Hadoop over rgw
Hadoop over rgwHadoop over rgw
Hadoop over rgw
zhouyuan
 
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of FacebookTech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
The Hive
 
Storage Engine Wars at Parse
Storage Engine Wars at ParseStorage Engine Wars at Parse
Storage Engine Wars at Parse
MongoDB
 
BlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for CephBlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for Ceph
Sage Weil
 
M|18 How to use MyRocks with MariaDB Server
M|18 How to use MyRocks with MariaDB ServerM|18 How to use MyRocks with MariaDB Server
M|18 How to use MyRocks with MariaDB Server
MariaDB plc
 
A crash course in CRUSH
A crash course in CRUSHA crash course in CRUSH
A crash course in CRUSH
Sage Weil
 
Distributed Storage and Compute With Ceph's librados (Vault 2015)
Distributed Storage and Compute With Ceph's librados (Vault 2015)Distributed Storage and Compute With Ceph's librados (Vault 2015)
Distributed Storage and Compute With Ceph's librados (Vault 2015)
Sage Weil
 
librados
libradoslibrados
librados
Patrick McGarry
 
QCT Ceph Solution - Design Consideration and Reference Architecture
QCT Ceph Solution - Design Consideration and Reference ArchitectureQCT Ceph Solution - Design Consideration and Reference Architecture
QCT Ceph Solution - Design Consideration and Reference Architecture
Patrick McGarry
 
Unified readonly cache for ceph
Unified readonly cache for cephUnified readonly cache for ceph
Unified readonly cache for ceph
zhouyuan
 
Towards Application Driven Storage
Towards Application Driven StorageTowards Application Driven Storage
Towards Application Driven Storage
Javier González
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
Arnab Mitra
 
Ceph Introduction 2017
Ceph Introduction 2017  Ceph Introduction 2017
Ceph Introduction 2017
Karan Singh
 
Ceph - A distributed storage system
Ceph - A distributed storage systemCeph - A distributed storage system
Ceph - A distributed storage system
Italo Santos
 

What's hot (19)

Ceph BlueStore - новый тип хранилища в Ceph / Максим Воронцов, (Redsys)
Ceph BlueStore - новый тип хранилища в Ceph / Максим Воронцов, (Redsys)Ceph BlueStore - новый тип хранилища в Ceph / Максим Воронцов, (Redsys)
Ceph BlueStore - новый тип хранилища в Ceph / Максим Воронцов, (Redsys)
 
Understanding blue store, Ceph's new storage backend - Tim Serong, SUSE
Understanding blue store, Ceph's new storage backend - Tim Serong, SUSEUnderstanding blue store, Ceph's new storage backend - Tim Serong, SUSE
Understanding blue store, Ceph's new storage backend - Tim Serong, SUSE
 
RocksDB compaction
RocksDB compactionRocksDB compaction
RocksDB compaction
 
RocksDB detail
RocksDB detailRocksDB detail
RocksDB detail
 
The Hive Think Tank: Rocking the Database World with RocksDB
The Hive Think Tank: Rocking the Database World with RocksDBThe Hive Think Tank: Rocking the Database World with RocksDB
The Hive Think Tank: Rocking the Database World with RocksDB
 
Hadoop over rgw
Hadoop over rgwHadoop over rgw
Hadoop over rgw
 
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of FacebookTech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
 
Storage Engine Wars at Parse
Storage Engine Wars at ParseStorage Engine Wars at Parse
Storage Engine Wars at Parse
 
BlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for CephBlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for Ceph
 
M|18 How to use MyRocks with MariaDB Server
M|18 How to use MyRocks with MariaDB ServerM|18 How to use MyRocks with MariaDB Server
M|18 How to use MyRocks with MariaDB Server
 
A crash course in CRUSH
A crash course in CRUSHA crash course in CRUSH
A crash course in CRUSH
 
Distributed Storage and Compute With Ceph's librados (Vault 2015)
Distributed Storage and Compute With Ceph's librados (Vault 2015)Distributed Storage and Compute With Ceph's librados (Vault 2015)
Distributed Storage and Compute With Ceph's librados (Vault 2015)
 
librados
libradoslibrados
librados
 
QCT Ceph Solution - Design Consideration and Reference Architecture
QCT Ceph Solution - Design Consideration and Reference ArchitectureQCT Ceph Solution - Design Consideration and Reference Architecture
QCT Ceph Solution - Design Consideration and Reference Architecture
 
Unified readonly cache for ceph
Unified readonly cache for cephUnified readonly cache for ceph
Unified readonly cache for ceph
 
Towards Application Driven Storage
Towards Application Driven StorageTowards Application Driven Storage
Towards Application Driven Storage
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
Ceph Introduction 2017
Ceph Introduction 2017  Ceph Introduction 2017
Ceph Introduction 2017
 
Ceph - A distributed storage system
Ceph - A distributed storage systemCeph - A distributed storage system
Ceph - A distributed storage system
 

Viewers also liked

Bizitzaren historia
Bizitzaren  historiaBizitzaren  historia
Bizitzaren historia
Joserra Abarretegui
 
Untethered health in a networked society by James Mathews
Untethered health in a networked society by James MathewsUntethered health in a networked society by James Mathews
Untethered health in a networked society by James Mathews
The Hive
 
Advanced Visual Analytics and Real-time Analytics at Platform scale by Brian ...
Advanced Visual Analytics and Real-time Analytics at Platform scale by Brian ...Advanced Visual Analytics and Real-time Analytics at Platform scale by Brian ...
Advanced Visual Analytics and Real-time Analytics at Platform scale by Brian ...
The Hive
 
My magazine edited
My magazine editedMy magazine edited
My magazine edited
sofiamorana1
 
Redefine healthcare with IT by Niranjan Thirumale
Redefine healthcare with IT by Niranjan ThirumaleRedefine healthcare with IT by Niranjan Thirumale
Redefine healthcare with IT by Niranjan Thirumale
The Hive
 
Expt panel hive_data_rp_20130320_final-1
Expt panel hive_data_rp_20130320_final-1Expt panel hive_data_rp_20130320_final-1
Expt panel hive_data_rp_20130320_final-1
The Hive
 
Startup Series: Lean Analytics, Innovation, and Tilting at Windmills
Startup Series: Lean Analytics, Innovation, and Tilting at WindmillsStartup Series: Lean Analytics, Innovation, and Tilting at Windmills
Startup Series: Lean Analytics, Innovation, and Tilting at Windmills
The Hive
 
[Japanese Content] TM Ravi_ Tokyo Presentation_TheHive_Sept 2013
[Japanese Content] TM Ravi_ Tokyo Presentation_TheHive_Sept 2013[Japanese Content] TM Ravi_ Tokyo Presentation_TheHive_Sept 2013
[Japanese Content] TM Ravi_ Tokyo Presentation_TheHive_Sept 2013The Hive
 
Alan Gates, Hortonworks_Hadoop&SQL
Alan Gates, Hortonworks_Hadoop&SQLAlan Gates, Hortonworks_Hadoop&SQL
Alan Gates, Hortonworks_Hadoop&SQL
The Hive
 
Notes from the (greasy) field by Ranjit Nair - Co-founder and CTO, Altizon
Notes from the (greasy) field by Ranjit Nair - Co-founder and CTO, AltizonNotes from the (greasy) field by Ranjit Nair - Co-founder and CTO, Altizon
Notes from the (greasy) field by Ranjit Nair - Co-founder and CTO, Altizon
The Hive
 
The Hive "Data Virtualization" Introduction - Jim Green, CEO of Composite Sof...
The Hive "Data Virtualization" Introduction - Jim Green, CEO of Composite Sof...The Hive "Data Virtualization" Introduction - Jim Green, CEO of Composite Sof...
The Hive "Data Virtualization" Introduction - Jim Green, CEO of Composite Sof...
The Hive
 
Big Data App servor by Lance Riedel, CTO, The Hive for The Hive India event
Big Data App servor by Lance Riedel, CTO, The Hive for The Hive India eventBig Data App servor by Lance Riedel, CTO, The Hive for The Hive India event
Big Data App servor by Lance Riedel, CTO, The Hive for The Hive India event
The Hive
 
[Japanese Content] Lance Riedel_The App Server, The Hive in Tokyo_Aug29
[Japanese Content] Lance Riedel_The App Server, The Hive in Tokyo_Aug29[Japanese Content] Lance Riedel_The App Server, The Hive in Tokyo_Aug29
[Japanese Content] Lance Riedel_The App Server, The Hive in Tokyo_Aug29The Hive
 
Opportunites in Big Data by Sumant Mandal, Founder of The Hive for The Hive I...
Opportunites in Big Data by Sumant Mandal, Founder of The Hive for The Hive I...Opportunites in Big Data by Sumant Mandal, Founder of The Hive for The Hive I...
Opportunites in Big Data by Sumant Mandal, Founder of The Hive for The Hive I...
The Hive
 
Very beautiful
Very beautifulVery beautiful
Very beautifulasmaeazed
 
Susheel Patel, Pivotal_Hadoop&SQL
Susheel Patel, Pivotal_Hadoop&SQLSusheel Patel, Pivotal_Hadoop&SQL
Susheel Patel, Pivotal_Hadoop&SQL
The Hive
 
1.nigam shah stanford_meetup
1.nigam shah stanford_meetup1.nigam shah stanford_meetup
1.nigam shah stanford_meetup
The Hive
 
Search at Linkedin by Sriram Sankar and Kumaresh Pattabiraman
Search at Linkedin by Sriram Sankar and Kumaresh PattabiramanSearch at Linkedin by Sriram Sankar and Kumaresh Pattabiraman
Search at Linkedin by Sriram Sankar and Kumaresh Pattabiraman
The Hive
 

Viewers also liked (20)

Bizitzaren historia
Bizitzaren  historiaBizitzaren  historia
Bizitzaren historia
 
Untethered health in a networked society by James Mathews
Untethered health in a networked society by James MathewsUntethered health in a networked society by James Mathews
Untethered health in a networked society by James Mathews
 
Advanced Visual Analytics and Real-time Analytics at Platform scale by Brian ...
Advanced Visual Analytics and Real-time Analytics at Platform scale by Brian ...Advanced Visual Analytics and Real-time Analytics at Platform scale by Brian ...
Advanced Visual Analytics and Real-time Analytics at Platform scale by Brian ...
 
My magazine edited
My magazine editedMy magazine edited
My magazine edited
 
Redefine healthcare with IT by Niranjan Thirumale
Redefine healthcare with IT by Niranjan ThirumaleRedefine healthcare with IT by Niranjan Thirumale
Redefine healthcare with IT by Niranjan Thirumale
 
Expt panel hive_data_rp_20130320_final-1
Expt panel hive_data_rp_20130320_final-1Expt panel hive_data_rp_20130320_final-1
Expt panel hive_data_rp_20130320_final-1
 
Startup Series: Lean Analytics, Innovation, and Tilting at Windmills
Startup Series: Lean Analytics, Innovation, and Tilting at WindmillsStartup Series: Lean Analytics, Innovation, and Tilting at Windmills
Startup Series: Lean Analytics, Innovation, and Tilting at Windmills
 
[Japanese Content] TM Ravi_ Tokyo Presentation_TheHive_Sept 2013
[Japanese Content] TM Ravi_ Tokyo Presentation_TheHive_Sept 2013[Japanese Content] TM Ravi_ Tokyo Presentation_TheHive_Sept 2013
[Japanese Content] TM Ravi_ Tokyo Presentation_TheHive_Sept 2013
 
Alan Gates, Hortonworks_Hadoop&SQL
Alan Gates, Hortonworks_Hadoop&SQLAlan Gates, Hortonworks_Hadoop&SQL
Alan Gates, Hortonworks_Hadoop&SQL
 
Notes from the (greasy) field by Ranjit Nair - Co-founder and CTO, Altizon
Notes from the (greasy) field by Ranjit Nair - Co-founder and CTO, AltizonNotes from the (greasy) field by Ranjit Nair - Co-founder and CTO, Altizon
Notes from the (greasy) field by Ranjit Nair - Co-founder and CTO, Altizon
 
The Hive "Data Virtualization" Introduction - Jim Green, CEO of Composite Sof...
The Hive "Data Virtualization" Introduction - Jim Green, CEO of Composite Sof...The Hive "Data Virtualization" Introduction - Jim Green, CEO of Composite Sof...
The Hive "Data Virtualization" Introduction - Jim Green, CEO of Composite Sof...
 
Big Data App servor by Lance Riedel, CTO, The Hive for The Hive India event
Big Data App servor by Lance Riedel, CTO, The Hive for The Hive India eventBig Data App servor by Lance Riedel, CTO, The Hive for The Hive India event
Big Data App servor by Lance Riedel, CTO, The Hive for The Hive India event
 
[Japanese Content] Lance Riedel_The App Server, The Hive in Tokyo_Aug29
[Japanese Content] Lance Riedel_The App Server, The Hive in Tokyo_Aug29[Japanese Content] Lance Riedel_The App Server, The Hive in Tokyo_Aug29
[Japanese Content] Lance Riedel_The App Server, The Hive in Tokyo_Aug29
 
Opportunites in Big Data by Sumant Mandal, Founder of The Hive for The Hive I...
Opportunites in Big Data by Sumant Mandal, Founder of The Hive for The Hive I...Opportunites in Big Data by Sumant Mandal, Founder of The Hive for The Hive I...
Opportunites in Big Data by Sumant Mandal, Founder of The Hive for The Hive I...
 
Very beautiful
Very beautifulVery beautiful
Very beautiful
 
Susheel Patel, Pivotal_Hadoop&SQL
Susheel Patel, Pivotal_Hadoop&SQLSusheel Patel, Pivotal_Hadoop&SQL
Susheel Patel, Pivotal_Hadoop&SQL
 
1.nigam shah stanford_meetup
1.nigam shah stanford_meetup1.nigam shah stanford_meetup
1.nigam shah stanford_meetup
 
Search at Linkedin by Sriram Sankar and Kumaresh Pattabiraman
Search at Linkedin by Sriram Sankar and Kumaresh PattabiramanSearch at Linkedin by Sriram Sankar and Kumaresh Pattabiraman
Search at Linkedin by Sriram Sankar and Kumaresh Pattabiraman
 
San martin 2013 2014
San martin 2013 2014San martin 2013 2014
San martin 2013 2014
 
San martin 2013 2014
San martin 2013 2014San martin 2013 2014
San martin 2013 2014
 

Similar to The Hive Think Tank: Ceph + RocksDB by Sage Weil, Red Hat.

What's new in Luminous and Beyond
What's new in Luminous and BeyondWhat's new in Luminous and Beyond
What's new in Luminous and Beyond
Sage Weil
 
BlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for CephBlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for Ceph
Sage Weil
 
Ceph Tech Talk: Bluestore
Ceph Tech Talk: BluestoreCeph Tech Talk: Bluestore
Ceph Tech Talk: Bluestore
Ceph Community
 
What's new in Jewel and Beyond
What's new in Jewel and BeyondWhat's new in Jewel and Beyond
What's new in Jewel and Beyond
Sage Weil
 
Ceph Day New York 2014: Future of CephFS
Ceph Day New York 2014:  Future of CephFS Ceph Day New York 2014:  Future of CephFS
Ceph Day New York 2014: Future of CephFS
Ceph Community
 
CephFS in Jewel: Stable at Last
CephFS in Jewel: Stable at LastCephFS in Jewel: Stable at Last
CephFS in Jewel: Stable at Last
Ceph Community
 
20171101 taco scargo luminous is out, what's in it for you
20171101 taco scargo   luminous is out, what's in it for you20171101 taco scargo   luminous is out, what's in it for you
20171101 taco scargo luminous is out, what's in it for you
Taco Scargo
 
Community Update at OpenStack Summit Boston
Community Update at OpenStack Summit BostonCommunity Update at OpenStack Summit Boston
Community Update at OpenStack Summit Boston
Sage Weil
 
Storage tiering and erasure coding in Ceph (SCaLE13x)
Storage tiering and erasure coding in Ceph (SCaLE13x)Storage tiering and erasure coding in Ceph (SCaLE13x)
Storage tiering and erasure coding in Ceph (SCaLE13x)
Sage Weil
 
Open Source Storage at Scale: Ceph @ GRNET
Open Source Storage at Scale: Ceph @ GRNETOpen Source Storage at Scale: Ceph @ GRNET
Open Source Storage at Scale: Ceph @ GRNET
Nikos Kormpakis
 
INFINISTORE(tm) - Scalable Open Source Storage Arhcitecture
INFINISTORE(tm) - Scalable Open Source Storage ArhcitectureINFINISTORE(tm) - Scalable Open Source Storage Arhcitecture
INFINISTORE(tm) - Scalable Open Source Storage Arhcitecture
Thomas Uhl
 
Scale 10x 01:22:12
Scale 10x 01:22:12Scale 10x 01:22:12
Scale 10x 01:22:12
Ceph Community
 
XenSummit - 08/28/2012
XenSummit - 08/28/2012XenSummit - 08/28/2012
XenSummit - 08/28/2012
Ceph Community
 
Block Storage For VMs With Ceph
Block Storage For VMs With CephBlock Storage For VMs With Ceph
Block Storage For VMs With Ceph
The Linux Foundation
 
Bluestore
BluestoreBluestore
Bluestore
Patrick McGarry
 
Bluestore
BluestoreBluestore
Bluestore
Ceph Community
 
Ippevent : openshift Introduction
Ippevent : openshift IntroductionIppevent : openshift Introduction
Ippevent : openshift Introduction
kanedafromparis
 
Ceph Day Santa Clara: The Future of CephFS + Developing with Librados
Ceph Day Santa Clara: The Future of CephFS + Developing with LibradosCeph Day Santa Clara: The Future of CephFS + Developing with Librados
Ceph Day Santa Clara: The Future of CephFS + Developing with Librados
Ceph Community
 
Strata - 03/31/2012
Strata - 03/31/2012Strata - 03/31/2012
Strata - 03/31/2012
Ceph Community
 
Ceph Block Devices: A Deep Dive
Ceph Block Devices:  A Deep DiveCeph Block Devices:  A Deep Dive
Ceph Block Devices: A Deep Dive
Red_Hat_Storage
 

Similar to The Hive Think Tank: Ceph + RocksDB by Sage Weil, Red Hat. (20)

What's new in Luminous and Beyond
What's new in Luminous and BeyondWhat's new in Luminous and Beyond
What's new in Luminous and Beyond
 
BlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for CephBlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for Ceph
 
Ceph Tech Talk: Bluestore
Ceph Tech Talk: BluestoreCeph Tech Talk: Bluestore
Ceph Tech Talk: Bluestore
 
What's new in Jewel and Beyond
What's new in Jewel and BeyondWhat's new in Jewel and Beyond
What's new in Jewel and Beyond
 
Ceph Day New York 2014: Future of CephFS
Ceph Day New York 2014:  Future of CephFS Ceph Day New York 2014:  Future of CephFS
Ceph Day New York 2014: Future of CephFS
 
CephFS in Jewel: Stable at Last
CephFS in Jewel: Stable at LastCephFS in Jewel: Stable at Last
CephFS in Jewel: Stable at Last
 
20171101 taco scargo luminous is out, what's in it for you
20171101 taco scargo   luminous is out, what's in it for you20171101 taco scargo   luminous is out, what's in it for you
20171101 taco scargo luminous is out, what's in it for you
 
Community Update at OpenStack Summit Boston
Community Update at OpenStack Summit BostonCommunity Update at OpenStack Summit Boston
Community Update at OpenStack Summit Boston
 
Storage tiering and erasure coding in Ceph (SCaLE13x)
Storage tiering and erasure coding in Ceph (SCaLE13x)Storage tiering and erasure coding in Ceph (SCaLE13x)
Storage tiering and erasure coding in Ceph (SCaLE13x)
 
Open Source Storage at Scale: Ceph @ GRNET
Open Source Storage at Scale: Ceph @ GRNETOpen Source Storage at Scale: Ceph @ GRNET
Open Source Storage at Scale: Ceph @ GRNET
 
INFINISTORE(tm) - Scalable Open Source Storage Arhcitecture
INFINISTORE(tm) - Scalable Open Source Storage ArhcitectureINFINISTORE(tm) - Scalable Open Source Storage Arhcitecture
INFINISTORE(tm) - Scalable Open Source Storage Arhcitecture
 
Scale 10x 01:22:12
Scale 10x 01:22:12Scale 10x 01:22:12
Scale 10x 01:22:12
 
XenSummit - 08/28/2012
XenSummit - 08/28/2012XenSummit - 08/28/2012
XenSummit - 08/28/2012
 
Block Storage For VMs With Ceph
Block Storage For VMs With CephBlock Storage For VMs With Ceph
Block Storage For VMs With Ceph
 
Bluestore
BluestoreBluestore
Bluestore
 
Bluestore
BluestoreBluestore
Bluestore
 
Ippevent : openshift Introduction
Ippevent : openshift IntroductionIppevent : openshift Introduction
Ippevent : openshift Introduction
 
Ceph Day Santa Clara: The Future of CephFS + Developing with Librados
Ceph Day Santa Clara: The Future of CephFS + Developing with LibradosCeph Day Santa Clara: The Future of CephFS + Developing with Librados
Ceph Day Santa Clara: The Future of CephFS + Developing with Librados
 
Strata - 03/31/2012
Strata - 03/31/2012Strata - 03/31/2012
Strata - 03/31/2012
 
Ceph Block Devices: A Deep Dive
Ceph Block Devices:  A Deep DiveCeph Block Devices:  A Deep Dive
Ceph Block Devices: A Deep Dive
 

More from The Hive

"Responsible AI", by Charlie Muirhead
"Responsible AI", by Charlie Muirhead"Responsible AI", by Charlie Muirhead
"Responsible AI", by Charlie Muirhead
The Hive
 
Translating a Trillion Points of Data into Therapies, Diagnostics, and New In...
Translating a Trillion Points of Data into Therapies, Diagnostics, and New In...Translating a Trillion Points of Data into Therapies, Diagnostics, and New In...
Translating a Trillion Points of Data into Therapies, Diagnostics, and New In...
The Hive
 
Digital Transformation; Digital Twins for Delivering Business Value in IIoT
Digital Transformation; Digital Twins for Delivering Business Value in IIoTDigital Transformation; Digital Twins for Delivering Business Value in IIoT
Digital Transformation; Digital Twins for Delivering Business Value in IIoT
The Hive
 
Quantum Computing (IBM Q) - Hive Think Tank Event w/ Dr. Bob Sutor - 02.22.18
Quantum Computing (IBM Q) - Hive Think Tank Event w/ Dr. Bob Sutor - 02.22.18Quantum Computing (IBM Q) - Hive Think Tank Event w/ Dr. Bob Sutor - 02.22.18
Quantum Computing (IBM Q) - Hive Think Tank Event w/ Dr. Bob Sutor - 02.22.18
The Hive
 
The Hive Think Tank: Rendezvous Architecture Makes Machine Learning Logistics...
The Hive Think Tank: Rendezvous Architecture Makes Machine Learning Logistics...The Hive Think Tank: Rendezvous Architecture Makes Machine Learning Logistics...
The Hive Think Tank: Rendezvous Architecture Makes Machine Learning Logistics...
The Hive
 
Data Science in the Enterprise
Data Science in the EnterpriseData Science in the Enterprise
Data Science in the Enterprise
The Hive
 
AI in Software for Augmenting Intelligence Across the Enterprise
AI in Software for Augmenting Intelligence Across the EnterpriseAI in Software for Augmenting Intelligence Across the Enterprise
AI in Software for Augmenting Intelligence Across the Enterprise
The Hive
 
“ High Precision Analytics for Healthcare: Promises and Challenges” by Sriram...
“ High Precision Analytics for Healthcare: Promises and Challenges” by Sriram...“ High Precision Analytics for Healthcare: Promises and Challenges” by Sriram...
“ High Precision Analytics for Healthcare: Promises and Challenges” by Sriram...
The Hive
 
"The Future of Manufacturing" by Sujeet Chand, SVP&CTO, Rockwell Automation
"The Future of Manufacturing" by Sujeet Chand, SVP&CTO, Rockwell Automation"The Future of Manufacturing" by Sujeet Chand, SVP&CTO, Rockwell Automation
"The Future of Manufacturing" by Sujeet Chand, SVP&CTO, Rockwell Automation
The Hive
 
Social Impact & Ethics of AI by Steve Omohundro
Social Impact & Ethics of AI by Steve OmohundroSocial Impact & Ethics of AI by Steve Omohundro
Social Impact & Ethics of AI by Steve Omohundro
The Hive
 
The Hive Think Tank: AI in The Enterprise by Venkat Srinivasan
The Hive Think Tank: AI in The Enterprise by Venkat SrinivasanThe Hive Think Tank: AI in The Enterprise by Venkat Srinivasan
The Hive Think Tank: AI in The Enterprise by Venkat Srinivasan
The Hive
 
The Hive Think Tank: Machine Learning Applications in Genomics by Prof. Jian ...
The Hive Think Tank: Machine Learning Applications in Genomics by Prof. Jian ...The Hive Think Tank: Machine Learning Applications in Genomics by Prof. Jian ...
The Hive Think Tank: Machine Learning Applications in Genomics by Prof. Jian ...
The Hive
 
The Hive Think Tank: The Future Of Customer Support - AI Driven Automation
The Hive Think Tank: The Future Of Customer Support - AI Driven AutomationThe Hive Think Tank: The Future Of Customer Support - AI Driven Automation
The Hive Think Tank: The Future Of Customer Support - AI Driven Automation
The Hive
 
The Hive Think Tank: Talk by Mohandas Pai - India at 2030, How Tech Entrepren...
The Hive Think Tank: Talk by Mohandas Pai - India at 2030, How Tech Entrepren...The Hive Think Tank: Talk by Mohandas Pai - India at 2030, How Tech Entrepren...
The Hive Think Tank: Talk by Mohandas Pai - India at 2030, How Tech Entrepren...
The Hive
 
The Hive Think Tank: The Content Trap - Strategist's Guide to Digital Change
The Hive Think Tank: The Content Trap - Strategist's Guide to Digital ChangeThe Hive Think Tank: The Content Trap - Strategist's Guide to Digital Change
The Hive Think Tank: The Content Trap - Strategist's Guide to Digital Change
The Hive
 
Deep Visual Understanding from Deep Learning by Prof. Jitendra Malik
Deep Visual Understanding from Deep Learning by Prof. Jitendra MalikDeep Visual Understanding from Deep Learning by Prof. Jitendra Malik
Deep Visual Understanding from Deep Learning by Prof. Jitendra Malik
The Hive
 
The Hive Think Tank: Heron at Twitter
The Hive Think Tank: Heron at TwitterThe Hive Think Tank: Heron at Twitter
The Hive Think Tank: Heron at Twitter
The Hive
 
The Hive Think Tank: Unpacking AI for Healthcare
The Hive Think Tank: Unpacking AI for Healthcare The Hive Think Tank: Unpacking AI for Healthcare
The Hive Think Tank: Unpacking AI for Healthcare
The Hive
 
The Hive Think Tank: Translating IoT into Innovation at Every Level by Prith ...
The Hive Think Tank: Translating IoT into Innovation at Every Level by Prith ...The Hive Think Tank: Translating IoT into Innovation at Every Level by Prith ...
The Hive Think Tank: Translating IoT into Innovation at Every Level by Prith ...
The Hive
 
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
The Hive
 

More from The Hive (20)

"Responsible AI", by Charlie Muirhead
"Responsible AI", by Charlie Muirhead"Responsible AI", by Charlie Muirhead
"Responsible AI", by Charlie Muirhead
 
Translating a Trillion Points of Data into Therapies, Diagnostics, and New In...
Translating a Trillion Points of Data into Therapies, Diagnostics, and New In...Translating a Trillion Points of Data into Therapies, Diagnostics, and New In...
Translating a Trillion Points of Data into Therapies, Diagnostics, and New In...
 
Digital Transformation; Digital Twins for Delivering Business Value in IIoT
Digital Transformation; Digital Twins for Delivering Business Value in IIoTDigital Transformation; Digital Twins for Delivering Business Value in IIoT
Digital Transformation; Digital Twins for Delivering Business Value in IIoT
 
Quantum Computing (IBM Q) - Hive Think Tank Event w/ Dr. Bob Sutor - 02.22.18
Quantum Computing (IBM Q) - Hive Think Tank Event w/ Dr. Bob Sutor - 02.22.18Quantum Computing (IBM Q) - Hive Think Tank Event w/ Dr. Bob Sutor - 02.22.18
Quantum Computing (IBM Q) - Hive Think Tank Event w/ Dr. Bob Sutor - 02.22.18
 
The Hive Think Tank: Rendezvous Architecture Makes Machine Learning Logistics...
The Hive Think Tank: Rendezvous Architecture Makes Machine Learning Logistics...The Hive Think Tank: Rendezvous Architecture Makes Machine Learning Logistics...
The Hive Think Tank: Rendezvous Architecture Makes Machine Learning Logistics...
 
Data Science in the Enterprise
Data Science in the EnterpriseData Science in the Enterprise
Data Science in the Enterprise
 
AI in Software for Augmenting Intelligence Across the Enterprise
AI in Software for Augmenting Intelligence Across the EnterpriseAI in Software for Augmenting Intelligence Across the Enterprise
AI in Software for Augmenting Intelligence Across the Enterprise
 
“ High Precision Analytics for Healthcare: Promises and Challenges” by Sriram...
“ High Precision Analytics for Healthcare: Promises and Challenges” by Sriram...“ High Precision Analytics for Healthcare: Promises and Challenges” by Sriram...
“ High Precision Analytics for Healthcare: Promises and Challenges” by Sriram...
 
"The Future of Manufacturing" by Sujeet Chand, SVP&CTO, Rockwell Automation
"The Future of Manufacturing" by Sujeet Chand, SVP&CTO, Rockwell Automation"The Future of Manufacturing" by Sujeet Chand, SVP&CTO, Rockwell Automation
"The Future of Manufacturing" by Sujeet Chand, SVP&CTO, Rockwell Automation
 
Social Impact & Ethics of AI by Steve Omohundro
Social Impact & Ethics of AI by Steve OmohundroSocial Impact & Ethics of AI by Steve Omohundro
Social Impact & Ethics of AI by Steve Omohundro
 
The Hive Think Tank: AI in The Enterprise by Venkat Srinivasan
The Hive Think Tank: AI in The Enterprise by Venkat SrinivasanThe Hive Think Tank: AI in The Enterprise by Venkat Srinivasan
The Hive Think Tank: AI in The Enterprise by Venkat Srinivasan
 
The Hive Think Tank: Machine Learning Applications in Genomics by Prof. Jian ...
The Hive Think Tank: Machine Learning Applications in Genomics by Prof. Jian ...The Hive Think Tank: Machine Learning Applications in Genomics by Prof. Jian ...
The Hive Think Tank: Machine Learning Applications in Genomics by Prof. Jian ...
 
The Hive Think Tank: The Future Of Customer Support - AI Driven Automation
The Hive Think Tank: The Future Of Customer Support - AI Driven AutomationThe Hive Think Tank: The Future Of Customer Support - AI Driven Automation
The Hive Think Tank: The Future Of Customer Support - AI Driven Automation
 
The Hive Think Tank: Talk by Mohandas Pai - India at 2030, How Tech Entrepren...
The Hive Think Tank: Talk by Mohandas Pai - India at 2030, How Tech Entrepren...The Hive Think Tank: Talk by Mohandas Pai - India at 2030, How Tech Entrepren...
The Hive Think Tank: Talk by Mohandas Pai - India at 2030, How Tech Entrepren...
 
The Hive Think Tank: The Content Trap - Strategist's Guide to Digital Change
The Hive Think Tank: The Content Trap - Strategist's Guide to Digital ChangeThe Hive Think Tank: The Content Trap - Strategist's Guide to Digital Change
The Hive Think Tank: The Content Trap - Strategist's Guide to Digital Change
 
Deep Visual Understanding from Deep Learning by Prof. Jitendra Malik
Deep Visual Understanding from Deep Learning by Prof. Jitendra MalikDeep Visual Understanding from Deep Learning by Prof. Jitendra Malik
Deep Visual Understanding from Deep Learning by Prof. Jitendra Malik
 
The Hive Think Tank: Heron at Twitter
The Hive Think Tank: Heron at TwitterThe Hive Think Tank: Heron at Twitter
The Hive Think Tank: Heron at Twitter
 
The Hive Think Tank: Unpacking AI for Healthcare
The Hive Think Tank: Unpacking AI for Healthcare The Hive Think Tank: Unpacking AI for Healthcare
The Hive Think Tank: Unpacking AI for Healthcare
 
The Hive Think Tank: Translating IoT into Innovation at Every Level by Prith ...
The Hive Think Tank: Translating IoT into Innovation at Every Level by Prith ...The Hive Think Tank: Translating IoT into Innovation at Every Level by Prith ...
The Hive Think Tank: Translating IoT into Innovation at Every Level by Prith ...
 
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
 

Recently uploaded

“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
Edge AI and Vision Alliance
 
Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)
Jakub Marek
 
A Comprehensive Guide to DeFi Development Services in 2024
A Comprehensive Guide to DeFi Development Services in 2024A Comprehensive Guide to DeFi Development Services in 2024
A Comprehensive Guide to DeFi Development Services in 2024
Intelisync
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
Zilliz
 
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
alexjohnson7307
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
shyamraj55
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
panagenda
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
saastr
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
Zilliz
 
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing InstancesEnergy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Alpen-Adria-Universität
 
Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |
AstuteBusiness
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
Hiroshi SHIBATA
 
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying AheadDigital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Wask
 
June Patch Tuesday
June Patch TuesdayJune Patch Tuesday
June Patch Tuesday
Ivanti
 
AWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptxAWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptx
HarisZaheer8
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
tolgahangng
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
ssuserfac0301
 
WeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation TechniquesWeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation Techniques
Postman
 
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
Alex Pruden
 
SAP S/4 HANA sourcing and procurement to Public cloud
SAP S/4 HANA sourcing and procurement to Public cloudSAP S/4 HANA sourcing and procurement to Public cloud
SAP S/4 HANA sourcing and procurement to Public cloud
maazsz111
 

Recently uploaded (20)

“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
 
Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)
 
A Comprehensive Guide to DeFi Development Services in 2024
A Comprehensive Guide to DeFi Development Services in 2024A Comprehensive Guide to DeFi Development Services in 2024
A Comprehensive Guide to DeFi Development Services in 2024
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
 
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
 
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing InstancesEnergy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
 
Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
 
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying AheadDigital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying Ahead
 
June Patch Tuesday
June Patch TuesdayJune Patch Tuesday
June Patch Tuesday
 
AWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptxAWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptx
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
 
WeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation TechniquesWeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation Techniques
 
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
 
SAP S/4 HANA sourcing and procurement to Public cloud
SAP S/4 HANA sourcing and procurement to Public cloudSAP S/4 HANA sourcing and procurement to Public cloud
SAP S/4 HANA sourcing and procurement to Public cloud
 

The Hive Think Tank: Ceph + RocksDB by Sage Weil, Red Hat.

  • 1. CEPH AND ROCKSDB SAGE WEIL HIVEDATA ROCKSDB MEETUP - 2016.02.03
  • 2. 2 OUTLINE ● Ceph background ● FileStore - why POSIX failed us ● BlueStore – a new Ceph OSD backend ● RocksDB changes – journal recycling – BlueRocksEnv – EnvMirror – delayed merge? ● Summary
  • 3. 3 CEPH ● Object, block, and file storage in a single cluster ● All components scale horizontally ● No single point of failure ● Hardware agnostic, commodity hardware ● Self-manage whenever possible ● Open source (LGPL) ● Move beyond legacy approaches – client/cluster instead of client/server – avoid ad hoc approaches HA
  • 4. 4 CEPH COMPONENTS RGW A web services gateway for object storage, compatible with S3 and Swift LIBRADOS A library allowing apps to directly access RADOS (C, C++, Java, Python, Ruby, PHP) RADOS A software-based, reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes and lightweight monitors RBD A reliable, fully-distributed block device with cloud platform integration CEPHFS A distributed file system with POSIX semantics and scale-out metadata management OBJECT BLOCK FILE
  • 5. 5 OBJECT STORAGE DAEMONS (OSDS) FS DISK OSD DISK OSD FS DISK OSD FS DISK OSD FS btrfs xfs ext4 M M M
  • 6. 6 OBJECT STORAGE DAEMONS (OSDS) FS DISK OSD DISK OSD FS DISK OSD FS DISK OSD FS btrfs xfs ext4 M M M FileStore FileStoreFileStoreFileStore
  • 7. 7 POSIX FAILS: TRANSACTIONS ● OSD carefully manages consistency of its data ● All writes are transactions (we need A+D; OSD provides C+I) ● Most are simple – write some bytes to object (file) – update object attribute (file xattr) – append to update log (leveldb insert) ...but others are arbitrarily large/complex ● Btrfs transaction hooks failed for various reasons ● But write-ahead journals work okay – write entire serialized transactions to well-optimized FileJournal – then apply it to the file system – half our disk throughput
  • 8. 8 POSIX FAILS: ENUMERATION ● Ceph objects are distributed by a 32-bit hash ● Enumeration is in hash order – scrubbing – “backfill” (data rebalancing, recovery) – enumeration via librados client API ● POSIX readdir is not well-ordered ● Need O(1) “split” for a given shard/range ● Build directory tree by hash-value prefix – split any directory when size > ~100 files – merge when size < ~20 files – read entire directory, sort in-memory ... A/A03224D3_qwer A/A247233E_zxcv ... B/8/B823032D_foo B/8/B8474342_bar B/9/B924273B_baz B/A/BA4328D2_asdf ...
  • 9. 9 WE WANT TO AVOID POSIX FILE INTERFACE ● POSIX has the wrong metadata model for us – rocksdb perfect for managing our namespace ● NewStore = rocksdb + object files ● Layering over POSIX duplicates consistency overhead – XFS/ext4 journal writes for fs consistency – rocksdb wal writes for our metadata ● BlueStore = NewStore over block HDD OSD SSD SSD OSD HDD OSD BlueStore BlueStore BlueStore RocksDBRocksDB
  • 10. 10 WHY ROCKSDB? ● Ideal key/value interface – transactions – ordered enumeration – fast commits to log/journal ● Common interface – can always swap in another KV DB if we want ● Abstract storage backend (rocksdb::Env) ● C++ interface ● Strong and active open source community
  • 11. 11 BlueStore BLUESTORE DESIGN BlueFS RocksDB BlockDeviceBlockDeviceBlockDevice BlueRocksEnv data metadata ● rocksdb – object metadata (onode) in rocksdb – write-ahead log (small writes/overwrites) – ceph key/value “omap” data – allocator metadata (free extent list) ● block device – object data ● pluggable allocator ● rocksdb shares block device(s) – BlueRocksEnv is rocksdb::Env – BlueFS is super-simple C++ “file system” ● 2x faster on HDD, more on SSD Allocator
  • 13. 13 ROCKSDB: JOURNAL RECYCLING ● Problem: 1 small (4 KB) Ceph write → 3-4 disk IOs! – BlueStore: write 4 KB of user data – rocksdb: append record to WAL ● write update block at end of log file ● fsync: XFS/ext4/BlueFS journals inode size/alloc update to its journal ● fallocate(2) doesn't help – data blocks are not pre-zeroed; fsync still has to update alloc metadata ● rocksdb LogReader only understands two modes – read until end of file (need accurate file size) – read all valid records, then ignore zeros at end (need zeroed tail)
  • 14. 14 ROCKSDB: JOURNAL RECYCLING (2) ● Put old log files on recycle list (instead of deleting them) ● LogWriter – overwrite old log data with new log data – include log number in each record ● LogReader – stop replaying when we get garbage (bad CRC) – or when we get a valid CRC but record is from a previous log incarnation ● Now we get one log append → one IO! ● Upstream, but missing a bug fix (PR #881)
  • 15. 15 ROCKSDB: BLUEROCKSENV + BLUEFS ● class BlueRocksEnv : public rocksdb::EnvWrapper – passes file IO operations to BlueFS ● BlueFS is a super-simple “file system” – all metadata loaded in RAM on start/mount – no need to store block free list; calculate it on startup – coarse allocation unit (1 MB blocks) – all metadata updates written to a journal – journal rewritten/compacted when it gets large ● Map “directories” (db/, db.wal/, db.bulk/) to different block devices – WAL on NVRAM, NVMe, SSD – level0 and hot SSTs on SSD – cold SSTs on HDD ● BlueStore periodically balances free space between itself and BlueFS BlueStore BlueFS RocksDB BlockDeviceBlockDeviceBlockDevice BlueRocksEnv data metadata Allocator
  • 16. 16 ROCKSDB: ENVMIRROR ● include/rocksdb/utilities/env_mirror.h ● class EnvMirror : public rocksdb::EnvWrapper { EnvMirror(Env* a, Env* b) ● mirrors all writes to both a and b ● sends all reads to both a and b – verifies the results are identical ● Invaluable when debugging BlueRocksEnv – validate BlueRocksEnv vs rocksdb's default PosixEnv
  • 17. 17 ROCKSDB: DELAYED LOG MERGE ● We write lots of short-lived records to log – insert wal_1 = 4 KB – insert wal_2 = 8 KB – … – insert wal_10 = 4 KB – delete wal_1 – insert wal_11 = 4 KB ● Goal – prevent short-lived records from ever getting amplified – keep, say, 2N logs – merge oldest N to new level0 SST, but also remove keys updated/deleted in newest N logs
  • 18. 18 SUMMARY ● Ceph is great ● POSIX was poor choice for storing objects ● Our new BlueStore backend is awesome ● RocksDB rocks and was easy to embed ● Log recycling speeds up commits (now upstream) ● Delayed merge will help too (coming soon)
  • 19. THANK YOU! Sage Weil CEPH PRINCIPAL ARCHITECT sage@redhat.com @liewegas