SlideShare a Scribd company logo
1 of 19
Download to read offline
CEPH AND ROCKSDB
SAGE WEIL
HIVEDATA ROCKSDB MEETUP - 2016.02.03
2
OUTLINE
●
Ceph background
●
FileStore - why POSIX failed us
●
BlueStore – a new Ceph OSD backend
●
RocksDB changes
– journal recycling
– BlueRocksEnv
– EnvMirror
– delayed merge?
●
Summary
3
CEPH
● Object, block, and file storage in a single cluster
● All components scale horizontally
● No single point of failure
● Hardware agnostic, commodity hardware
● Self-manage whenever possible
● Open source (LGPL)
● Move beyond legacy approaches
– client/cluster instead of client/server
– avoid ad hoc approaches HA
4
CEPH COMPONENTS
RGW
A web services gateway
for object storage,
compatible with S3 and
Swift
LIBRADOS
A library allowing apps to directly access RADOS (C, C++, Java, Python, Ruby, PHP)
RADOS
A software-based, reliable, autonomous, distributed object store comprised of
self-healing, self-managing, intelligent storage nodes and lightweight monitors
RBD
A reliable, fully-distributed
block device with cloud
platform integration
CEPHFS
A distributed file system
with POSIX semantics and
scale-out metadata
management
OBJECT BLOCK FILE
5
OBJECT STORAGE DAEMONS (OSDS)
FS
DISK
OSD
DISK
OSD
FS
DISK
OSD
FS
DISK
OSD
FS
btrfs
xfs
ext4
M
M
M
6
OBJECT STORAGE DAEMONS (OSDS)
FS
DISK
OSD
DISK
OSD
FS
DISK
OSD
FS
DISK
OSD
FS
btrfs
xfs
ext4
M
M
M
FileStore FileStoreFileStoreFileStore
7
POSIX FAILS: TRANSACTIONS
●
OSD carefully manages consistency of its data
●
All writes are transactions (we need A+D; OSD provides C+I)
●
Most are simple
– write some bytes to object (file)
– update object attribute (file xattr)
– append to update log (leveldb insert)
...but others are arbitrarily large/complex
● Btrfs transaction hooks failed for various reasons
● But write-ahead journals work okay
– write entire serialized transactions to well-optimized FileJournal
– then apply it to the file system
– half our disk throughput
8
POSIX FAILS: ENUMERATION
● Ceph objects are distributed by a 32-bit hash
● Enumeration is in hash order
– scrubbing
– “backfill” (data rebalancing, recovery)
– enumeration via librados client API
● POSIX readdir is not well-ordered
● Need O(1) “split” for a given shard/range
● Build directory tree by hash-value prefix
– split any directory when size > ~100 files
– merge when size < ~20 files
– read entire directory, sort in-memory
...
A/A03224D3_qwer
A/A247233E_zxcv
...
B/8/B823032D_foo
B/8/B8474342_bar
B/9/B924273B_baz
B/A/BA4328D2_asdf
...
9
WE WANT TO AVOID POSIX FILE INTERFACE
● POSIX has the wrong metadata model for us
– rocksdb perfect for managing our namespace
● NewStore = rocksdb + object files
● Layering over POSIX duplicates consistency
overhead
– XFS/ext4 journal writes for fs consistency
– rocksdb wal writes for our metadata
● BlueStore = NewStore over block HDD
OSD
SSD SSD
OSD
HDD
OSD
BlueStore BlueStore BlueStore
RocksDBRocksDB
10
WHY ROCKSDB?
● Ideal key/value interface
– transactions
– ordered enumeration
– fast commits to log/journal
● Common interface
– can always swap in another KV DB if we want
● Abstract storage backend (rocksdb::Env)
● C++ interface
● Strong and active open source community
11
BlueStore
BLUESTORE DESIGN
BlueFS
RocksDB
BlockDeviceBlockDeviceBlockDevice
BlueRocksEnv
data metadata
● rocksdb
– object metadata (onode) in rocksdb
– write-ahead log (small writes/overwrites)
– ceph key/value “omap” data
– allocator metadata (free extent list)
● block device
– object data
● pluggable allocator
● rocksdb shares block device(s)
– BlueRocksEnv is rocksdb::Env
– BlueFS is super-simple C++ “file system”
● 2x faster on HDD, more on SSD
Allocator
ROCKSDB
13
ROCKSDB: JOURNAL RECYCLING
● Problem: 1 small (4 KB) Ceph write → 3-4 disk IOs!
– BlueStore: write 4 KB of user data
– rocksdb: append record to WAL
● write update block at end of log file
● fsync: XFS/ext4/BlueFS journals inode size/alloc update to its journal
● fallocate(2) doesn't help
– data blocks are not pre-zeroed; fsync still has to update alloc metadata
● rocksdb LogReader only understands two modes
– read until end of file (need accurate file size)
– read all valid records, then ignore zeros at end (need zeroed tail)
14
ROCKSDB: JOURNAL RECYCLING (2)
● Put old log files on recycle list (instead of deleting them)
● LogWriter
– overwrite old log data with new log data
– include log number in each record
● LogReader
– stop replaying when we get garbage (bad CRC)
– or when we get a valid CRC but record is from a previous log incarnation
● Now we get one log append → one IO!
● Upstream, but missing a bug fix (PR #881)
15
ROCKSDB: BLUEROCKSENV + BLUEFS
● class BlueRocksEnv : public rocksdb::EnvWrapper
– passes file IO operations to BlueFS
● BlueFS is a super-simple “file system”
– all metadata loaded in RAM on start/mount
– no need to store block free list; calculate it on startup
– coarse allocation unit (1 MB blocks)
– all metadata updates written to a journal
– journal rewritten/compacted when it gets large
● Map “directories” (db/, db.wal/, db.bulk/) to different block devices
– WAL on NVRAM, NVMe, SSD
– level0 and hot SSTs on SSD
– cold SSTs on HDD
● BlueStore periodically balances free space between itself and BlueFS
BlueStore
BlueFS
RocksDB
BlockDeviceBlockDeviceBlockDevice
BlueRocksEnv
data metadata
Allocator
16
ROCKSDB: ENVMIRROR
● include/rocksdb/utilities/env_mirror.h
● class EnvMirror : public rocksdb::EnvWrapper {
EnvMirror(Env* a, Env* b)
● mirrors all writes to both a and b
● sends all reads to both a and b
– verifies the results are identical
● Invaluable when debugging BlueRocksEnv
– validate BlueRocksEnv vs rocksdb's default PosixEnv
17
ROCKSDB: DELAYED LOG MERGE
● We write lots of short-lived records to log
– insert wal_1 = 4 KB
– insert wal_2 = 8 KB
– …
– insert wal_10 = 4 KB
– delete wal_1
– insert wal_11 = 4 KB
● Goal
– prevent short-lived records from ever getting amplified
– keep, say, 2N logs
– merge oldest N to new level0 SST, but also remove keys updated/deleted in
newest N logs
18
SUMMARY
● Ceph is great
● POSIX was poor choice for storing objects
● Our new BlueStore backend is awesome
● RocksDB rocks and was easy to embed
● Log recycling speeds up commits (now upstream)
● Delayed merge will help too (coming soon)
THANK YOU!
Sage Weil
CEPH PRINCIPAL ARCHITECT
sage@redhat.com
@liewegas

More Related Content

What's hot

Ceph BlueStore - новый тип хранилища в Ceph / Максим Воронцов, (Redsys)
Ceph BlueStore - новый тип хранилища в Ceph / Максим Воронцов, (Redsys)Ceph BlueStore - новый тип хранилища в Ceph / Максим Воронцов, (Redsys)
Ceph BlueStore - новый тип хранилища в Ceph / Максим Воронцов, (Redsys)Ontico
 
Understanding blue store, Ceph's new storage backend - Tim Serong, SUSE
Understanding blue store, Ceph's new storage backend - Tim Serong, SUSEUnderstanding blue store, Ceph's new storage backend - Tim Serong, SUSE
Understanding blue store, Ceph's new storage backend - Tim Serong, SUSEOpenStack
 
RocksDB compaction
RocksDB compactionRocksDB compaction
RocksDB compactionMIJIN AN
 
RocksDB detail
RocksDB detailRocksDB detail
RocksDB detailMIJIN AN
 
The Hive Think Tank: Rocking the Database World with RocksDB
The Hive Think Tank: Rocking the Database World with RocksDBThe Hive Think Tank: Rocking the Database World with RocksDB
The Hive Think Tank: Rocking the Database World with RocksDBThe Hive
 
Hadoop over rgw
Hadoop over rgwHadoop over rgw
Hadoop over rgwzhouyuan
 
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of FacebookTech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of FacebookThe Hive
 
Storage Engine Wars at Parse
Storage Engine Wars at ParseStorage Engine Wars at Parse
Storage Engine Wars at ParseMongoDB
 
BlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for CephBlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for CephSage Weil
 
M|18 How to use MyRocks with MariaDB Server
M|18 How to use MyRocks with MariaDB ServerM|18 How to use MyRocks with MariaDB Server
M|18 How to use MyRocks with MariaDB ServerMariaDB plc
 
A crash course in CRUSH
A crash course in CRUSHA crash course in CRUSH
A crash course in CRUSHSage Weil
 
Distributed Storage and Compute With Ceph's librados (Vault 2015)
Distributed Storage and Compute With Ceph's librados (Vault 2015)Distributed Storage and Compute With Ceph's librados (Vault 2015)
Distributed Storage and Compute With Ceph's librados (Vault 2015)Sage Weil
 
QCT Ceph Solution - Design Consideration and Reference Architecture
QCT Ceph Solution - Design Consideration and Reference ArchitectureQCT Ceph Solution - Design Consideration and Reference Architecture
QCT Ceph Solution - Design Consideration and Reference ArchitecturePatrick McGarry
 
Unified readonly cache for ceph
Unified readonly cache for cephUnified readonly cache for ceph
Unified readonly cache for cephzhouyuan
 
Towards Application Driven Storage
Towards Application Driven StorageTowards Application Driven Storage
Towards Application Driven StorageJavier González
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to RedisArnab Mitra
 
Ceph Introduction 2017
Ceph Introduction 2017  Ceph Introduction 2017
Ceph Introduction 2017 Karan Singh
 
Ceph - A distributed storage system
Ceph - A distributed storage systemCeph - A distributed storage system
Ceph - A distributed storage systemItalo Santos
 

What's hot (19)

Ceph BlueStore - новый тип хранилища в Ceph / Максим Воронцов, (Redsys)
Ceph BlueStore - новый тип хранилища в Ceph / Максим Воронцов, (Redsys)Ceph BlueStore - новый тип хранилища в Ceph / Максим Воронцов, (Redsys)
Ceph BlueStore - новый тип хранилища в Ceph / Максим Воронцов, (Redsys)
 
Understanding blue store, Ceph's new storage backend - Tim Serong, SUSE
Understanding blue store, Ceph's new storage backend - Tim Serong, SUSEUnderstanding blue store, Ceph's new storage backend - Tim Serong, SUSE
Understanding blue store, Ceph's new storage backend - Tim Serong, SUSE
 
RocksDB compaction
RocksDB compactionRocksDB compaction
RocksDB compaction
 
RocksDB detail
RocksDB detailRocksDB detail
RocksDB detail
 
The Hive Think Tank: Rocking the Database World with RocksDB
The Hive Think Tank: Rocking the Database World with RocksDBThe Hive Think Tank: Rocking the Database World with RocksDB
The Hive Think Tank: Rocking the Database World with RocksDB
 
Hadoop over rgw
Hadoop over rgwHadoop over rgw
Hadoop over rgw
 
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of FacebookTech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
 
Storage Engine Wars at Parse
Storage Engine Wars at ParseStorage Engine Wars at Parse
Storage Engine Wars at Parse
 
BlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for CephBlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for Ceph
 
M|18 How to use MyRocks with MariaDB Server
M|18 How to use MyRocks with MariaDB ServerM|18 How to use MyRocks with MariaDB Server
M|18 How to use MyRocks with MariaDB Server
 
A crash course in CRUSH
A crash course in CRUSHA crash course in CRUSH
A crash course in CRUSH
 
Distributed Storage and Compute With Ceph's librados (Vault 2015)
Distributed Storage and Compute With Ceph's librados (Vault 2015)Distributed Storage and Compute With Ceph's librados (Vault 2015)
Distributed Storage and Compute With Ceph's librados (Vault 2015)
 
librados
libradoslibrados
librados
 
QCT Ceph Solution - Design Consideration and Reference Architecture
QCT Ceph Solution - Design Consideration and Reference ArchitectureQCT Ceph Solution - Design Consideration and Reference Architecture
QCT Ceph Solution - Design Consideration and Reference Architecture
 
Unified readonly cache for ceph
Unified readonly cache for cephUnified readonly cache for ceph
Unified readonly cache for ceph
 
Towards Application Driven Storage
Towards Application Driven StorageTowards Application Driven Storage
Towards Application Driven Storage
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
Ceph Introduction 2017
Ceph Introduction 2017  Ceph Introduction 2017
Ceph Introduction 2017
 
Ceph - A distributed storage system
Ceph - A distributed storage systemCeph - A distributed storage system
Ceph - A distributed storage system
 

Viewers also liked

Untethered health in a networked society by James Mathews
Untethered health in a networked society by James MathewsUntethered health in a networked society by James Mathews
Untethered health in a networked society by James MathewsThe Hive
 
Advanced Visual Analytics and Real-time Analytics at Platform scale by Brian ...
Advanced Visual Analytics and Real-time Analytics at Platform scale by Brian ...Advanced Visual Analytics and Real-time Analytics at Platform scale by Brian ...
Advanced Visual Analytics and Real-time Analytics at Platform scale by Brian ...The Hive
 
My magazine edited
My magazine editedMy magazine edited
My magazine editedsofiamorana1
 
Redefine healthcare with IT by Niranjan Thirumale
Redefine healthcare with IT by Niranjan ThirumaleRedefine healthcare with IT by Niranjan Thirumale
Redefine healthcare with IT by Niranjan ThirumaleThe Hive
 
Expt panel hive_data_rp_20130320_final-1
Expt panel hive_data_rp_20130320_final-1Expt panel hive_data_rp_20130320_final-1
Expt panel hive_data_rp_20130320_final-1The Hive
 
Startup Series: Lean Analytics, Innovation, and Tilting at Windmills
Startup Series: Lean Analytics, Innovation, and Tilting at WindmillsStartup Series: Lean Analytics, Innovation, and Tilting at Windmills
Startup Series: Lean Analytics, Innovation, and Tilting at WindmillsThe Hive
 
[Japanese Content] TM Ravi_ Tokyo Presentation_TheHive_Sept 2013
[Japanese Content] TM Ravi_ Tokyo Presentation_TheHive_Sept 2013[Japanese Content] TM Ravi_ Tokyo Presentation_TheHive_Sept 2013
[Japanese Content] TM Ravi_ Tokyo Presentation_TheHive_Sept 2013The Hive
 
Alan Gates, Hortonworks_Hadoop&SQL
Alan Gates, Hortonworks_Hadoop&SQLAlan Gates, Hortonworks_Hadoop&SQL
Alan Gates, Hortonworks_Hadoop&SQLThe Hive
 
Notes from the (greasy) field by Ranjit Nair - Co-founder and CTO, Altizon
Notes from the (greasy) field by Ranjit Nair - Co-founder and CTO, AltizonNotes from the (greasy) field by Ranjit Nair - Co-founder and CTO, Altizon
Notes from the (greasy) field by Ranjit Nair - Co-founder and CTO, AltizonThe Hive
 
The Hive "Data Virtualization" Introduction - Jim Green, CEO of Composite Sof...
The Hive "Data Virtualization" Introduction - Jim Green, CEO of Composite Sof...The Hive "Data Virtualization" Introduction - Jim Green, CEO of Composite Sof...
The Hive "Data Virtualization" Introduction - Jim Green, CEO of Composite Sof...The Hive
 
Big Data App servor by Lance Riedel, CTO, The Hive for The Hive India event
Big Data App servor by Lance Riedel, CTO, The Hive for The Hive India eventBig Data App servor by Lance Riedel, CTO, The Hive for The Hive India event
Big Data App servor by Lance Riedel, CTO, The Hive for The Hive India eventThe Hive
 
[Japanese Content] Lance Riedel_The App Server, The Hive in Tokyo_Aug29
[Japanese Content] Lance Riedel_The App Server, The Hive in Tokyo_Aug29[Japanese Content] Lance Riedel_The App Server, The Hive in Tokyo_Aug29
[Japanese Content] Lance Riedel_The App Server, The Hive in Tokyo_Aug29The Hive
 
Opportunites in Big Data by Sumant Mandal, Founder of The Hive for The Hive I...
Opportunites in Big Data by Sumant Mandal, Founder of The Hive for The Hive I...Opportunites in Big Data by Sumant Mandal, Founder of The Hive for The Hive I...
Opportunites in Big Data by Sumant Mandal, Founder of The Hive for The Hive I...The Hive
 
Very beautiful
Very beautifulVery beautiful
Very beautifulasmaeazed
 
Susheel Patel, Pivotal_Hadoop&SQL
Susheel Patel, Pivotal_Hadoop&SQLSusheel Patel, Pivotal_Hadoop&SQL
Susheel Patel, Pivotal_Hadoop&SQLThe Hive
 
1.nigam shah stanford_meetup
1.nigam shah stanford_meetup1.nigam shah stanford_meetup
1.nigam shah stanford_meetupThe Hive
 
Search at Linkedin by Sriram Sankar and Kumaresh Pattabiraman
Search at Linkedin by Sriram Sankar and Kumaresh PattabiramanSearch at Linkedin by Sriram Sankar and Kumaresh Pattabiraman
Search at Linkedin by Sriram Sankar and Kumaresh PattabiramanThe Hive
 

Viewers also liked (20)

Bizitzaren historia
Bizitzaren  historiaBizitzaren  historia
Bizitzaren historia
 
Untethered health in a networked society by James Mathews
Untethered health in a networked society by James MathewsUntethered health in a networked society by James Mathews
Untethered health in a networked society by James Mathews
 
Advanced Visual Analytics and Real-time Analytics at Platform scale by Brian ...
Advanced Visual Analytics and Real-time Analytics at Platform scale by Brian ...Advanced Visual Analytics and Real-time Analytics at Platform scale by Brian ...
Advanced Visual Analytics and Real-time Analytics at Platform scale by Brian ...
 
My magazine edited
My magazine editedMy magazine edited
My magazine edited
 
Redefine healthcare with IT by Niranjan Thirumale
Redefine healthcare with IT by Niranjan ThirumaleRedefine healthcare with IT by Niranjan Thirumale
Redefine healthcare with IT by Niranjan Thirumale
 
Expt panel hive_data_rp_20130320_final-1
Expt panel hive_data_rp_20130320_final-1Expt panel hive_data_rp_20130320_final-1
Expt panel hive_data_rp_20130320_final-1
 
Startup Series: Lean Analytics, Innovation, and Tilting at Windmills
Startup Series: Lean Analytics, Innovation, and Tilting at WindmillsStartup Series: Lean Analytics, Innovation, and Tilting at Windmills
Startup Series: Lean Analytics, Innovation, and Tilting at Windmills
 
[Japanese Content] TM Ravi_ Tokyo Presentation_TheHive_Sept 2013
[Japanese Content] TM Ravi_ Tokyo Presentation_TheHive_Sept 2013[Japanese Content] TM Ravi_ Tokyo Presentation_TheHive_Sept 2013
[Japanese Content] TM Ravi_ Tokyo Presentation_TheHive_Sept 2013
 
Alan Gates, Hortonworks_Hadoop&SQL
Alan Gates, Hortonworks_Hadoop&SQLAlan Gates, Hortonworks_Hadoop&SQL
Alan Gates, Hortonworks_Hadoop&SQL
 
Notes from the (greasy) field by Ranjit Nair - Co-founder and CTO, Altizon
Notes from the (greasy) field by Ranjit Nair - Co-founder and CTO, AltizonNotes from the (greasy) field by Ranjit Nair - Co-founder and CTO, Altizon
Notes from the (greasy) field by Ranjit Nair - Co-founder and CTO, Altizon
 
The Hive "Data Virtualization" Introduction - Jim Green, CEO of Composite Sof...
The Hive "Data Virtualization" Introduction - Jim Green, CEO of Composite Sof...The Hive "Data Virtualization" Introduction - Jim Green, CEO of Composite Sof...
The Hive "Data Virtualization" Introduction - Jim Green, CEO of Composite Sof...
 
Big Data App servor by Lance Riedel, CTO, The Hive for The Hive India event
Big Data App servor by Lance Riedel, CTO, The Hive for The Hive India eventBig Data App servor by Lance Riedel, CTO, The Hive for The Hive India event
Big Data App servor by Lance Riedel, CTO, The Hive for The Hive India event
 
[Japanese Content] Lance Riedel_The App Server, The Hive in Tokyo_Aug29
[Japanese Content] Lance Riedel_The App Server, The Hive in Tokyo_Aug29[Japanese Content] Lance Riedel_The App Server, The Hive in Tokyo_Aug29
[Japanese Content] Lance Riedel_The App Server, The Hive in Tokyo_Aug29
 
Opportunites in Big Data by Sumant Mandal, Founder of The Hive for The Hive I...
Opportunites in Big Data by Sumant Mandal, Founder of The Hive for The Hive I...Opportunites in Big Data by Sumant Mandal, Founder of The Hive for The Hive I...
Opportunites in Big Data by Sumant Mandal, Founder of The Hive for The Hive I...
 
Very beautiful
Very beautifulVery beautiful
Very beautiful
 
Susheel Patel, Pivotal_Hadoop&SQL
Susheel Patel, Pivotal_Hadoop&SQLSusheel Patel, Pivotal_Hadoop&SQL
Susheel Patel, Pivotal_Hadoop&SQL
 
1.nigam shah stanford_meetup
1.nigam shah stanford_meetup1.nigam shah stanford_meetup
1.nigam shah stanford_meetup
 
Search at Linkedin by Sriram Sankar and Kumaresh Pattabiraman
Search at Linkedin by Sriram Sankar and Kumaresh PattabiramanSearch at Linkedin by Sriram Sankar and Kumaresh Pattabiraman
Search at Linkedin by Sriram Sankar and Kumaresh Pattabiraman
 
San martin 2013 2014
San martin 2013 2014San martin 2013 2014
San martin 2013 2014
 
San martin 2013 2014
San martin 2013 2014San martin 2013 2014
San martin 2013 2014
 

Similar to The Hive Think Tank: Ceph + RocksDB by Sage Weil, Red Hat.

What's new in Luminous and Beyond
What's new in Luminous and BeyondWhat's new in Luminous and Beyond
What's new in Luminous and BeyondSage Weil
 
BlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for CephBlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for CephSage Weil
 
Ceph Tech Talk: Bluestore
Ceph Tech Talk: BluestoreCeph Tech Talk: Bluestore
Ceph Tech Talk: BluestoreCeph Community
 
What's new in Jewel and Beyond
What's new in Jewel and BeyondWhat's new in Jewel and Beyond
What's new in Jewel and BeyondSage Weil
 
Ceph Day New York 2014: Future of CephFS
Ceph Day New York 2014:  Future of CephFS Ceph Day New York 2014:  Future of CephFS
Ceph Day New York 2014: Future of CephFS Ceph Community
 
CephFS in Jewel: Stable at Last
CephFS in Jewel: Stable at LastCephFS in Jewel: Stable at Last
CephFS in Jewel: Stable at LastCeph Community
 
20171101 taco scargo luminous is out, what's in it for you
20171101 taco scargo   luminous is out, what's in it for you20171101 taco scargo   luminous is out, what's in it for you
20171101 taco scargo luminous is out, what's in it for youTaco Scargo
 
Community Update at OpenStack Summit Boston
Community Update at OpenStack Summit BostonCommunity Update at OpenStack Summit Boston
Community Update at OpenStack Summit BostonSage Weil
 
Storage tiering and erasure coding in Ceph (SCaLE13x)
Storage tiering and erasure coding in Ceph (SCaLE13x)Storage tiering and erasure coding in Ceph (SCaLE13x)
Storage tiering and erasure coding in Ceph (SCaLE13x)Sage Weil
 
Open Source Storage at Scale: Ceph @ GRNET
Open Source Storage at Scale: Ceph @ GRNETOpen Source Storage at Scale: Ceph @ GRNET
Open Source Storage at Scale: Ceph @ GRNETNikos Kormpakis
 
INFINISTORE(tm) - Scalable Open Source Storage Arhcitecture
INFINISTORE(tm) - Scalable Open Source Storage ArhcitectureINFINISTORE(tm) - Scalable Open Source Storage Arhcitecture
INFINISTORE(tm) - Scalable Open Source Storage ArhcitectureThomas Uhl
 
Ippevent : openshift Introduction
Ippevent : openshift IntroductionIppevent : openshift Introduction
Ippevent : openshift Introductionkanedafromparis
 
Ceph Day Santa Clara: The Future of CephFS + Developing with Librados
Ceph Day Santa Clara: The Future of CephFS + Developing with LibradosCeph Day Santa Clara: The Future of CephFS + Developing with Librados
Ceph Day Santa Clara: The Future of CephFS + Developing with LibradosCeph Community
 
Ceph Block Devices: A Deep Dive
Ceph Block Devices:  A Deep DiveCeph Block Devices:  A Deep Dive
Ceph Block Devices: A Deep DiveRed_Hat_Storage
 

Similar to The Hive Think Tank: Ceph + RocksDB by Sage Weil, Red Hat. (20)

What's new in Luminous and Beyond
What's new in Luminous and BeyondWhat's new in Luminous and Beyond
What's new in Luminous and Beyond
 
BlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for CephBlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for Ceph
 
Ceph Tech Talk: Bluestore
Ceph Tech Talk: BluestoreCeph Tech Talk: Bluestore
Ceph Tech Talk: Bluestore
 
What's new in Jewel and Beyond
What's new in Jewel and BeyondWhat's new in Jewel and Beyond
What's new in Jewel and Beyond
 
Ceph Day New York 2014: Future of CephFS
Ceph Day New York 2014:  Future of CephFS Ceph Day New York 2014:  Future of CephFS
Ceph Day New York 2014: Future of CephFS
 
CephFS in Jewel: Stable at Last
CephFS in Jewel: Stable at LastCephFS in Jewel: Stable at Last
CephFS in Jewel: Stable at Last
 
20171101 taco scargo luminous is out, what's in it for you
20171101 taco scargo   luminous is out, what's in it for you20171101 taco scargo   luminous is out, what's in it for you
20171101 taco scargo luminous is out, what's in it for you
 
Community Update at OpenStack Summit Boston
Community Update at OpenStack Summit BostonCommunity Update at OpenStack Summit Boston
Community Update at OpenStack Summit Boston
 
Storage tiering and erasure coding in Ceph (SCaLE13x)
Storage tiering and erasure coding in Ceph (SCaLE13x)Storage tiering and erasure coding in Ceph (SCaLE13x)
Storage tiering and erasure coding in Ceph (SCaLE13x)
 
Open Source Storage at Scale: Ceph @ GRNET
Open Source Storage at Scale: Ceph @ GRNETOpen Source Storage at Scale: Ceph @ GRNET
Open Source Storage at Scale: Ceph @ GRNET
 
INFINISTORE(tm) - Scalable Open Source Storage Arhcitecture
INFINISTORE(tm) - Scalable Open Source Storage ArhcitectureINFINISTORE(tm) - Scalable Open Source Storage Arhcitecture
INFINISTORE(tm) - Scalable Open Source Storage Arhcitecture
 
Scale 10x 01:22:12
Scale 10x 01:22:12Scale 10x 01:22:12
Scale 10x 01:22:12
 
XenSummit - 08/28/2012
XenSummit - 08/28/2012XenSummit - 08/28/2012
XenSummit - 08/28/2012
 
Block Storage For VMs With Ceph
Block Storage For VMs With CephBlock Storage For VMs With Ceph
Block Storage For VMs With Ceph
 
Bluestore
BluestoreBluestore
Bluestore
 
Bluestore
BluestoreBluestore
Bluestore
 
Ippevent : openshift Introduction
Ippevent : openshift IntroductionIppevent : openshift Introduction
Ippevent : openshift Introduction
 
Ceph Day Santa Clara: The Future of CephFS + Developing with Librados
Ceph Day Santa Clara: The Future of CephFS + Developing with LibradosCeph Day Santa Clara: The Future of CephFS + Developing with Librados
Ceph Day Santa Clara: The Future of CephFS + Developing with Librados
 
Strata - 03/31/2012
Strata - 03/31/2012Strata - 03/31/2012
Strata - 03/31/2012
 
Ceph Block Devices: A Deep Dive
Ceph Block Devices:  A Deep DiveCeph Block Devices:  A Deep Dive
Ceph Block Devices: A Deep Dive
 

More from The Hive

"Responsible AI", by Charlie Muirhead
"Responsible AI", by Charlie Muirhead"Responsible AI", by Charlie Muirhead
"Responsible AI", by Charlie MuirheadThe Hive
 
Translating a Trillion Points of Data into Therapies, Diagnostics, and New In...
Translating a Trillion Points of Data into Therapies, Diagnostics, and New In...Translating a Trillion Points of Data into Therapies, Diagnostics, and New In...
Translating a Trillion Points of Data into Therapies, Diagnostics, and New In...The Hive
 
Digital Transformation; Digital Twins for Delivering Business Value in IIoT
Digital Transformation; Digital Twins for Delivering Business Value in IIoTDigital Transformation; Digital Twins for Delivering Business Value in IIoT
Digital Transformation; Digital Twins for Delivering Business Value in IIoTThe Hive
 
Quantum Computing (IBM Q) - Hive Think Tank Event w/ Dr. Bob Sutor - 02.22.18
Quantum Computing (IBM Q) - Hive Think Tank Event w/ Dr. Bob Sutor - 02.22.18Quantum Computing (IBM Q) - Hive Think Tank Event w/ Dr. Bob Sutor - 02.22.18
Quantum Computing (IBM Q) - Hive Think Tank Event w/ Dr. Bob Sutor - 02.22.18The Hive
 
The Hive Think Tank: Rendezvous Architecture Makes Machine Learning Logistics...
The Hive Think Tank: Rendezvous Architecture Makes Machine Learning Logistics...The Hive Think Tank: Rendezvous Architecture Makes Machine Learning Logistics...
The Hive Think Tank: Rendezvous Architecture Makes Machine Learning Logistics...The Hive
 
Data Science in the Enterprise
Data Science in the EnterpriseData Science in the Enterprise
Data Science in the EnterpriseThe Hive
 
AI in Software for Augmenting Intelligence Across the Enterprise
AI in Software for Augmenting Intelligence Across the EnterpriseAI in Software for Augmenting Intelligence Across the Enterprise
AI in Software for Augmenting Intelligence Across the EnterpriseThe Hive
 
“ High Precision Analytics for Healthcare: Promises and Challenges” by Sriram...
“ High Precision Analytics for Healthcare: Promises and Challenges” by Sriram...“ High Precision Analytics for Healthcare: Promises and Challenges” by Sriram...
“ High Precision Analytics for Healthcare: Promises and Challenges” by Sriram...The Hive
 
"The Future of Manufacturing" by Sujeet Chand, SVP&CTO, Rockwell Automation
"The Future of Manufacturing" by Sujeet Chand, SVP&CTO, Rockwell Automation"The Future of Manufacturing" by Sujeet Chand, SVP&CTO, Rockwell Automation
"The Future of Manufacturing" by Sujeet Chand, SVP&CTO, Rockwell AutomationThe Hive
 
Social Impact & Ethics of AI by Steve Omohundro
Social Impact & Ethics of AI by Steve OmohundroSocial Impact & Ethics of AI by Steve Omohundro
Social Impact & Ethics of AI by Steve OmohundroThe Hive
 
The Hive Think Tank: AI in The Enterprise by Venkat Srinivasan
The Hive Think Tank: AI in The Enterprise by Venkat SrinivasanThe Hive Think Tank: AI in The Enterprise by Venkat Srinivasan
The Hive Think Tank: AI in The Enterprise by Venkat SrinivasanThe Hive
 
The Hive Think Tank: Machine Learning Applications in Genomics by Prof. Jian ...
The Hive Think Tank: Machine Learning Applications in Genomics by Prof. Jian ...The Hive Think Tank: Machine Learning Applications in Genomics by Prof. Jian ...
The Hive Think Tank: Machine Learning Applications in Genomics by Prof. Jian ...The Hive
 
The Hive Think Tank: The Future Of Customer Support - AI Driven Automation
The Hive Think Tank: The Future Of Customer Support - AI Driven AutomationThe Hive Think Tank: The Future Of Customer Support - AI Driven Automation
The Hive Think Tank: The Future Of Customer Support - AI Driven AutomationThe Hive
 
The Hive Think Tank: Talk by Mohandas Pai - India at 2030, How Tech Entrepren...
The Hive Think Tank: Talk by Mohandas Pai - India at 2030, How Tech Entrepren...The Hive Think Tank: Talk by Mohandas Pai - India at 2030, How Tech Entrepren...
The Hive Think Tank: Talk by Mohandas Pai - India at 2030, How Tech Entrepren...The Hive
 
The Hive Think Tank: The Content Trap - Strategist's Guide to Digital Change
The Hive Think Tank: The Content Trap - Strategist's Guide to Digital ChangeThe Hive Think Tank: The Content Trap - Strategist's Guide to Digital Change
The Hive Think Tank: The Content Trap - Strategist's Guide to Digital ChangeThe Hive
 
Deep Visual Understanding from Deep Learning by Prof. Jitendra Malik
Deep Visual Understanding from Deep Learning by Prof. Jitendra MalikDeep Visual Understanding from Deep Learning by Prof. Jitendra Malik
Deep Visual Understanding from Deep Learning by Prof. Jitendra MalikThe Hive
 
The Hive Think Tank: Heron at Twitter
The Hive Think Tank: Heron at TwitterThe Hive Think Tank: Heron at Twitter
The Hive Think Tank: Heron at TwitterThe Hive
 
The Hive Think Tank: Unpacking AI for Healthcare
The Hive Think Tank: Unpacking AI for Healthcare The Hive Think Tank: Unpacking AI for Healthcare
The Hive Think Tank: Unpacking AI for Healthcare The Hive
 
The Hive Think Tank: Translating IoT into Innovation at Every Level by Prith ...
The Hive Think Tank: Translating IoT into Innovation at Every Level by Prith ...The Hive Think Tank: Translating IoT into Innovation at Every Level by Prith ...
The Hive Think Tank: Translating IoT into Innovation at Every Level by Prith ...The Hive
 
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...The Hive
 

More from The Hive (20)

"Responsible AI", by Charlie Muirhead
"Responsible AI", by Charlie Muirhead"Responsible AI", by Charlie Muirhead
"Responsible AI", by Charlie Muirhead
 
Translating a Trillion Points of Data into Therapies, Diagnostics, and New In...
Translating a Trillion Points of Data into Therapies, Diagnostics, and New In...Translating a Trillion Points of Data into Therapies, Diagnostics, and New In...
Translating a Trillion Points of Data into Therapies, Diagnostics, and New In...
 
Digital Transformation; Digital Twins for Delivering Business Value in IIoT
Digital Transformation; Digital Twins for Delivering Business Value in IIoTDigital Transformation; Digital Twins for Delivering Business Value in IIoT
Digital Transformation; Digital Twins for Delivering Business Value in IIoT
 
Quantum Computing (IBM Q) - Hive Think Tank Event w/ Dr. Bob Sutor - 02.22.18
Quantum Computing (IBM Q) - Hive Think Tank Event w/ Dr. Bob Sutor - 02.22.18Quantum Computing (IBM Q) - Hive Think Tank Event w/ Dr. Bob Sutor - 02.22.18
Quantum Computing (IBM Q) - Hive Think Tank Event w/ Dr. Bob Sutor - 02.22.18
 
The Hive Think Tank: Rendezvous Architecture Makes Machine Learning Logistics...
The Hive Think Tank: Rendezvous Architecture Makes Machine Learning Logistics...The Hive Think Tank: Rendezvous Architecture Makes Machine Learning Logistics...
The Hive Think Tank: Rendezvous Architecture Makes Machine Learning Logistics...
 
Data Science in the Enterprise
Data Science in the EnterpriseData Science in the Enterprise
Data Science in the Enterprise
 
AI in Software for Augmenting Intelligence Across the Enterprise
AI in Software for Augmenting Intelligence Across the EnterpriseAI in Software for Augmenting Intelligence Across the Enterprise
AI in Software for Augmenting Intelligence Across the Enterprise
 
“ High Precision Analytics for Healthcare: Promises and Challenges” by Sriram...
“ High Precision Analytics for Healthcare: Promises and Challenges” by Sriram...“ High Precision Analytics for Healthcare: Promises and Challenges” by Sriram...
“ High Precision Analytics for Healthcare: Promises and Challenges” by Sriram...
 
"The Future of Manufacturing" by Sujeet Chand, SVP&CTO, Rockwell Automation
"The Future of Manufacturing" by Sujeet Chand, SVP&CTO, Rockwell Automation"The Future of Manufacturing" by Sujeet Chand, SVP&CTO, Rockwell Automation
"The Future of Manufacturing" by Sujeet Chand, SVP&CTO, Rockwell Automation
 
Social Impact & Ethics of AI by Steve Omohundro
Social Impact & Ethics of AI by Steve OmohundroSocial Impact & Ethics of AI by Steve Omohundro
Social Impact & Ethics of AI by Steve Omohundro
 
The Hive Think Tank: AI in The Enterprise by Venkat Srinivasan
The Hive Think Tank: AI in The Enterprise by Venkat SrinivasanThe Hive Think Tank: AI in The Enterprise by Venkat Srinivasan
The Hive Think Tank: AI in The Enterprise by Venkat Srinivasan
 
The Hive Think Tank: Machine Learning Applications in Genomics by Prof. Jian ...
The Hive Think Tank: Machine Learning Applications in Genomics by Prof. Jian ...The Hive Think Tank: Machine Learning Applications in Genomics by Prof. Jian ...
The Hive Think Tank: Machine Learning Applications in Genomics by Prof. Jian ...
 
The Hive Think Tank: The Future Of Customer Support - AI Driven Automation
The Hive Think Tank: The Future Of Customer Support - AI Driven AutomationThe Hive Think Tank: The Future Of Customer Support - AI Driven Automation
The Hive Think Tank: The Future Of Customer Support - AI Driven Automation
 
The Hive Think Tank: Talk by Mohandas Pai - India at 2030, How Tech Entrepren...
The Hive Think Tank: Talk by Mohandas Pai - India at 2030, How Tech Entrepren...The Hive Think Tank: Talk by Mohandas Pai - India at 2030, How Tech Entrepren...
The Hive Think Tank: Talk by Mohandas Pai - India at 2030, How Tech Entrepren...
 
The Hive Think Tank: The Content Trap - Strategist's Guide to Digital Change
The Hive Think Tank: The Content Trap - Strategist's Guide to Digital ChangeThe Hive Think Tank: The Content Trap - Strategist's Guide to Digital Change
The Hive Think Tank: The Content Trap - Strategist's Guide to Digital Change
 
Deep Visual Understanding from Deep Learning by Prof. Jitendra Malik
Deep Visual Understanding from Deep Learning by Prof. Jitendra MalikDeep Visual Understanding from Deep Learning by Prof. Jitendra Malik
Deep Visual Understanding from Deep Learning by Prof. Jitendra Malik
 
The Hive Think Tank: Heron at Twitter
The Hive Think Tank: Heron at TwitterThe Hive Think Tank: Heron at Twitter
The Hive Think Tank: Heron at Twitter
 
The Hive Think Tank: Unpacking AI for Healthcare
The Hive Think Tank: Unpacking AI for Healthcare The Hive Think Tank: Unpacking AI for Healthcare
The Hive Think Tank: Unpacking AI for Healthcare
 
The Hive Think Tank: Translating IoT into Innovation at Every Level by Prith ...
The Hive Think Tank: Translating IoT into Innovation at Every Level by Prith ...The Hive Think Tank: Translating IoT into Innovation at Every Level by Prith ...
The Hive Think Tank: Translating IoT into Innovation at Every Level by Prith ...
 
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
 

Recently uploaded

Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKJago de Vreede
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 

Recently uploaded (20)

Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 

The Hive Think Tank: Ceph + RocksDB by Sage Weil, Red Hat.

  • 1. CEPH AND ROCKSDB SAGE WEIL HIVEDATA ROCKSDB MEETUP - 2016.02.03
  • 2. 2 OUTLINE ● Ceph background ● FileStore - why POSIX failed us ● BlueStore – a new Ceph OSD backend ● RocksDB changes – journal recycling – BlueRocksEnv – EnvMirror – delayed merge? ● Summary
  • 3. 3 CEPH ● Object, block, and file storage in a single cluster ● All components scale horizontally ● No single point of failure ● Hardware agnostic, commodity hardware ● Self-manage whenever possible ● Open source (LGPL) ● Move beyond legacy approaches – client/cluster instead of client/server – avoid ad hoc approaches HA
  • 4. 4 CEPH COMPONENTS RGW A web services gateway for object storage, compatible with S3 and Swift LIBRADOS A library allowing apps to directly access RADOS (C, C++, Java, Python, Ruby, PHP) RADOS A software-based, reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes and lightweight monitors RBD A reliable, fully-distributed block device with cloud platform integration CEPHFS A distributed file system with POSIX semantics and scale-out metadata management OBJECT BLOCK FILE
  • 5. 5 OBJECT STORAGE DAEMONS (OSDS) FS DISK OSD DISK OSD FS DISK OSD FS DISK OSD FS btrfs xfs ext4 M M M
  • 6. 6 OBJECT STORAGE DAEMONS (OSDS) FS DISK OSD DISK OSD FS DISK OSD FS DISK OSD FS btrfs xfs ext4 M M M FileStore FileStoreFileStoreFileStore
  • 7. 7 POSIX FAILS: TRANSACTIONS ● OSD carefully manages consistency of its data ● All writes are transactions (we need A+D; OSD provides C+I) ● Most are simple – write some bytes to object (file) – update object attribute (file xattr) – append to update log (leveldb insert) ...but others are arbitrarily large/complex ● Btrfs transaction hooks failed for various reasons ● But write-ahead journals work okay – write entire serialized transactions to well-optimized FileJournal – then apply it to the file system – half our disk throughput
  • 8. 8 POSIX FAILS: ENUMERATION ● Ceph objects are distributed by a 32-bit hash ● Enumeration is in hash order – scrubbing – “backfill” (data rebalancing, recovery) – enumeration via librados client API ● POSIX readdir is not well-ordered ● Need O(1) “split” for a given shard/range ● Build directory tree by hash-value prefix – split any directory when size > ~100 files – merge when size < ~20 files – read entire directory, sort in-memory ... A/A03224D3_qwer A/A247233E_zxcv ... B/8/B823032D_foo B/8/B8474342_bar B/9/B924273B_baz B/A/BA4328D2_asdf ...
  • 9. 9 WE WANT TO AVOID POSIX FILE INTERFACE ● POSIX has the wrong metadata model for us – rocksdb perfect for managing our namespace ● NewStore = rocksdb + object files ● Layering over POSIX duplicates consistency overhead – XFS/ext4 journal writes for fs consistency – rocksdb wal writes for our metadata ● BlueStore = NewStore over block HDD OSD SSD SSD OSD HDD OSD BlueStore BlueStore BlueStore RocksDBRocksDB
  • 10. 10 WHY ROCKSDB? ● Ideal key/value interface – transactions – ordered enumeration – fast commits to log/journal ● Common interface – can always swap in another KV DB if we want ● Abstract storage backend (rocksdb::Env) ● C++ interface ● Strong and active open source community
  • 11. 11 BlueStore BLUESTORE DESIGN BlueFS RocksDB BlockDeviceBlockDeviceBlockDevice BlueRocksEnv data metadata ● rocksdb – object metadata (onode) in rocksdb – write-ahead log (small writes/overwrites) – ceph key/value “omap” data – allocator metadata (free extent list) ● block device – object data ● pluggable allocator ● rocksdb shares block device(s) – BlueRocksEnv is rocksdb::Env – BlueFS is super-simple C++ “file system” ● 2x faster on HDD, more on SSD Allocator
  • 13. 13 ROCKSDB: JOURNAL RECYCLING ● Problem: 1 small (4 KB) Ceph write → 3-4 disk IOs! – BlueStore: write 4 KB of user data – rocksdb: append record to WAL ● write update block at end of log file ● fsync: XFS/ext4/BlueFS journals inode size/alloc update to its journal ● fallocate(2) doesn't help – data blocks are not pre-zeroed; fsync still has to update alloc metadata ● rocksdb LogReader only understands two modes – read until end of file (need accurate file size) – read all valid records, then ignore zeros at end (need zeroed tail)
  • 14. 14 ROCKSDB: JOURNAL RECYCLING (2) ● Put old log files on recycle list (instead of deleting them) ● LogWriter – overwrite old log data with new log data – include log number in each record ● LogReader – stop replaying when we get garbage (bad CRC) – or when we get a valid CRC but record is from a previous log incarnation ● Now we get one log append → one IO! ● Upstream, but missing a bug fix (PR #881)
  • 15. 15 ROCKSDB: BLUEROCKSENV + BLUEFS ● class BlueRocksEnv : public rocksdb::EnvWrapper – passes file IO operations to BlueFS ● BlueFS is a super-simple “file system” – all metadata loaded in RAM on start/mount – no need to store block free list; calculate it on startup – coarse allocation unit (1 MB blocks) – all metadata updates written to a journal – journal rewritten/compacted when it gets large ● Map “directories” (db/, db.wal/, db.bulk/) to different block devices – WAL on NVRAM, NVMe, SSD – level0 and hot SSTs on SSD – cold SSTs on HDD ● BlueStore periodically balances free space between itself and BlueFS BlueStore BlueFS RocksDB BlockDeviceBlockDeviceBlockDevice BlueRocksEnv data metadata Allocator
  • 16. 16 ROCKSDB: ENVMIRROR ● include/rocksdb/utilities/env_mirror.h ● class EnvMirror : public rocksdb::EnvWrapper { EnvMirror(Env* a, Env* b) ● mirrors all writes to both a and b ● sends all reads to both a and b – verifies the results are identical ● Invaluable when debugging BlueRocksEnv – validate BlueRocksEnv vs rocksdb's default PosixEnv
  • 17. 17 ROCKSDB: DELAYED LOG MERGE ● We write lots of short-lived records to log – insert wal_1 = 4 KB – insert wal_2 = 8 KB – … – insert wal_10 = 4 KB – delete wal_1 – insert wal_11 = 4 KB ● Goal – prevent short-lived records from ever getting amplified – keep, say, 2N logs – merge oldest N to new level0 SST, but also remove keys updated/deleted in newest N logs
  • 18. 18 SUMMARY ● Ceph is great ● POSIX was poor choice for storing objects ● Our new BlueStore backend is awesome ● RocksDB rocks and was easy to embed ● Log recycling speeds up commits (now upstream) ● Delayed merge will help too (coming soon)
  • 19. THANK YOU! Sage Weil CEPH PRINCIPAL ARCHITECT sage@redhat.com @liewegas