ERASURE CODING AND CACHE TIERING
SAGE WEIL – SCALE13X - 2015.02.22
2
AGENDA
● Ceph architectural overview
● RADOS background
● Cache tiering
● Erasure coding
● Project status, roadmap
ARCHITECTURE
4
CEPH MOTIVATING PRINCIPLES
● All components must scale horizontally
● There can be no single point of failure
● The solution must be hardware agnostic
● Should use commodity hardware
● Self-manage whenever possible
● Open source (LGPL)
● Move beyond legacy approaches
– Client/cluster instead of client/server
– Ad hoc HA
5
CEPH COMPONENTS
RGW: A web services gateway for object storage, compatible with S3 and Swift
LIBRADOS: A library allowing apps to directly access RADOS (C, C++, Java, Python, Ruby, PHP)
RADOS: A software-based, reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes and lightweight monitors
RBD: A reliable, fully-distributed block device with cloud platform integration
CEPHFS: A distributed file system with POSIX semantics and scale-out metadata management
(Clients: APP, HOST/VM, CLIENT)
ROBUST SERVICES BUILT ON RADOS
7
ARCHITECTURAL COMPONENTS
RGW: A web services gateway for object storage, compatible with S3 and Swift
LIBRADOS: A library allowing apps to directly access RADOS (C, C++, Java, Python, Ruby, PHP)
RADOS: A software-based, reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes and lightweight monitors
RBD: A reliable, fully-distributed block device with cloud platform integration
CEPHFS: A distributed file system with POSIX semantics and scale-out metadata management
(Clients: APP, HOST/VM, CLIENT)
8
THE RADOS GATEWAY
[Diagram: applications reach RADOSGW over REST or a local socket; each RADOSGW instance uses LIBRADOS to store objects in the RADOS cluster and its monitors (M).]
9
MULTI-SITE OBJECT STORAGE
[Diagram: two sites (US-EAST and EU-WEST), each with a web application, an app server, a Ceph Object Gateway (RGW), and its own Ceph Storage Cluster.]
11
RADOSGW MAKES RADOS WEBBY
RADOSGW:
 REST-based object storage proxy
 Uses RADOS to store objects
 Stripes large RESTful objects across many RADOS objects
 API supports buckets, accounts
 Usage accounting for billing
 Compatible with S3 and Swift applications
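Because RGW speaks the S3 API, stock S3 tooling can talk to it. A minimal sketch in Python using the boto library; the endpoint, port, credentials, and bucket name are placeholders for the example, not values from the talk:

import boto
import boto.s3.connection

# Placeholder credentials and endpoint for an RGW instance (not from the talk).
conn = boto.connect_s3(
    aws_access_key_id='RGW_ACCESS_KEY',
    aws_secret_access_key='RGW_SECRET_KEY',
    host='rgw.example.com', port=7480, is_secure=False,
    calling_format=boto.s3.connection.OrdinaryCallingFormat(),
)

# Create a bucket and store an object; RGW stripes it over RADOS objects.
bucket = conn.create_bucket('demo-bucket')
key = bucket.new_key('hello.txt')
key.set_contents_from_string('hello from RGW')
print(key.get_contents_as_string())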
12
ARCHITECTURAL COMPONENTS
RGW: A web services gateway for object storage, compatible with S3 and Swift
LIBRADOS: A library allowing apps to directly access RADOS (C, C++, Java, Python, Ruby, PHP)
RADOS: A software-based, reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes and lightweight monitors
RBD: A reliable, fully-distributed block device with cloud platform integration
CEPHFS: A distributed file system with POSIX semantics and scale-out metadata management
(Clients: APP, HOST/VM, CLIENT)
13
STORING VIRTUAL DISKS
[Diagram: a VM's virtual disk is stored in the RADOS cluster; the hypervisor accesses it through LIBRBD.]
14
KERNEL MODULE
[Diagram: a Linux host maps an RBD image with the KRBD kernel module and talks directly to the RADOS cluster.]
15
RBD FEATURES
● Stripe images across entire cluster (pool)
● Read-only snapshots
● Copy-on-write clones
● Broad integration
– Qemu
– Linux kernel
– iSCSI (STGT, LIO)
– OpenStack, CloudStack, Nebula, Ganeti, Proxmox
● Incremental backup (relative to snapshots)
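The same image operations are available programmatically through the rbd Python binding that ships with Ceph. A short sketch; the config path, pool name, and image name are assumptions for the example:

import rados
import rbd

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')   # assumed config path
cluster.connect()
ioctx = cluster.open_ioctx('rbd')                       # assumed pool name

try:
    rbd.RBD().create(ioctx, 'demo-image', 4 * 1024**3)  # 4 GiB image
    image = rbd.Image(ioctx, 'demo-image')
    try:
        image.write(b'hello rbd', 0)                    # write at offset 0
        image.create_snap('before-upgrade')             # read-only snapshot
    finally:
        image.close()
finally:
    ioctx.close()
    cluster.shutdown()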
16
ARCHITECTURAL COMPONENTS
RGW: A web services gateway for object storage, compatible with S3 and Swift
LIBRADOS: A library allowing apps to directly access RADOS (C, C++, Java, Python, Ruby, PHP)
RADOS: A software-based, reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes and lightweight monitors
RBD: A reliable, fully-distributed block device with cloud platform integration
CEPHFS: A distributed file system with POSIX semantics and scale-out metadata management
(Clients: APP, HOST/VM, CLIENT)
17
SEPARATE METADATA SERVER
[Diagram: a Linux host's kernel client sends metadata operations to a separate metadata server while reading and writing file data directly in the RADOS cluster.]
18
SCALABLE METADATA SERVERS
METADATA SERVER
 Manages metadata for a POSIX-compliant shared filesystem
 Directory hierarchy
 File metadata (owner, timestamps, mode, etc.)
 Snapshots on any directory
 Clients stripe file data in RADOS
 MDS not in data path
 MDS stores metadata in RADOS
 Dynamic MDS cluster scales to 10s or 100s
 Only required for shared filesystem
RADOS
20
ARCHITECTURAL COMPONENTS
RGW: A web services gateway for object storage, compatible with S3 and Swift
LIBRADOS: A library allowing apps to directly access RADOS (C, C++, Java, Python, Ruby, PHP)
RADOS: A software-based, reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes and lightweight monitors
RBD: A reliable, fully-distributed block device with cloud platform integration
CEPHFS: A distributed file system with POSIX semantics and scale-out metadata management
(Clients: APP, HOST/VM, CLIENT)
21
RADOS
● Flat object namespace within each pool
● Rich object API (librados)
– Bytes, attributes, key/value data
– Partial overwrite of existing data (mutable objects)
– Single-object compound operations
– RADOS classes (stored procedures)
● Strong consistency (CP system)
● Infrastructure aware, dynamic topology
● Hash-based placement (CRUSH)
● Direct client to server data path
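A minimal librados sketch in Python showing the flat per-pool namespace and a few of the object API calls; the config path, pool, and object names are invented for the example, and the pool must already exist:

import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')   # assumed config path
cluster.connect()
ioctx = cluster.open_ioctx('mypool')                    # a pre-created pool

try:
    # Bytes: write and read back a whole object.
    ioctx.write_full('greeting', b'hello world')
    print(ioctx.read('greeting'))

    # Attributes: small named metadata stored with the object.
    ioctx.set_xattr('greeting', 'lang', b'en')
    print(ioctx.get_xattr('greeting', 'lang'))

    # Partial overwrite of existing data (objects are mutable).
    ioctx.write('greeting', b'HELLO', 0)
finally:
    ioctx.close()
    cluster.shutdown()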
22
RADOS CLUSTER
[Diagram: an application talks directly to a RADOS cluster made up of OSDs and monitors (M).]
23
RADOS COMPONENTS
OSDs:
 10s to 1000s in a cluster
 One per disk (or one per SSD, RAID group…)
 Serve stored objects to clients
 Intelligently peer for replication & recovery
Monitors (M):
 Maintain cluster membership and state
 Provide consensus for distributed decision-making
 Small, odd number (e.g., 5)
 Not part of data path
24
OBJECT STORAGE DAEMONS
[Diagram: each OSD runs on top of a local filesystem (xfs, btrfs, or ext4) on its own disk; monitors (M) run alongside the OSDs.]
DATA PLACEMENT
26
WHERE DO OBJECTS LIVE?
[Diagram: an application holds an object and must decide which OSD in the cluster should store it.]
27
A METADATA SERVER?
[Diagram: one option — (1) the application asks a central metadata service for the object's location, then (2) contacts the node that holds it.]
28
CALCULATED PLACEMENT
[Diagram: the alternative — the application computes the location itself with a function F, e.g., hashing names into ranges A-G, H-N, O-T, and U-Z, with no lookup service.]
29
CRUSH
[Diagram: objects are hashed into placement groups (PGs), and CRUSH maps each PG onto OSDs in the cluster.]
30
CRUSH IS A QUICK CALCULATION
[Diagram: given an object name, CRUSH computes which OSDs in the RADOS cluster hold it — a calculation, not a lookup.]
31
CRUSH AVOIDS FAILED DEVICES
[Diagram: when an OSD fails, CRUSH maps the affected placement onto other, healthy OSDs.]
32
CRUSH: DECLUSTERED PLACEMENT
[Diagram: each PG's replicas are spread pseudorandomly across the cluster.]
● Each PG independently maps to a pseudorandom set of OSDs
● PGs that map to the same OSD generally have replicas that do not
● When an OSD fails, each PG it stored will generally be re-replicated by a different OSD
– Highly parallel recovery
– Avoids the single-disk recovery bottleneck
33
CRUSH: DYNAMIC DATA PLACEMENT
CRUSH:
 Pseudo-random placement algorithm
 Fast calculation, no lookup
 Repeatable, deterministic
 Statistically uniform distribution
 Stable mapping
 Limited data migration on change
 Rule-based configuration
 Infrastructure topology aware
 Adjustable replication
 Weighted devices (different sizes)
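The real CRUSH algorithm descends a weighted hierarchy of buckets, but the idea of placement-by-calculation can be illustrated with a deliberately simplified, non-CRUSH sketch: hash the object name to a PG, then rank OSDs with a per-PG pseudo-random score.

import hashlib

def pg_for_object(name, pg_num):
    """Map an object name to a placement group: a stable hash, no lookup."""
    return int(hashlib.md5(name.encode()).hexdigest(), 16) % pg_num

def osds_for_pg(pool_id, pg_id, osds, replicas=3):
    """Toy stand-in for CRUSH: rank OSDs by a per-PG pseudo-random score.
    Real CRUSH also honors the failure-domain hierarchy and device weights."""
    def score(osd):
        return hashlib.md5(('%d.%d.%d' % (pool_id, pg_id, osd)).encode()).hexdigest()
    return sorted(osds, key=score)[:replicas]

osds = list(range(12))                                  # pretend 12-OSD cluster
pg = pg_for_object('myobject', pg_num=256)
print(pg, osds_for_pg(pool_id=1, pg_id=pg, osds=osds))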
34
DATA IS ORGANIZED INTO POOLS
[Diagram: objects are written into named pools (A, B, C, D), and each pool contains its own set of placement groups.]
TIERED STORAGE
36
TWO WAYS TO CACHE
● Within each OSD
– Combine SSD and HDD under each OSD
– Make localized promote/demote decisions
– Leverage existing tools
● dm-cache, bcache, FlashCache
● Variety of caching controllers
– We can help with hints
● Cache on separate devices/nodes
– Different hardware for different tiers
● Slow nodes for cold data
● High performance nodes for hot data
– Add, remove, scale each tier independently
● Unlikely to choose right ratios at procurement time
[Diagram: within-OSD caching — the OSD's filesystem sits on a block device that pairs an SSD cache with an HDD.]
37
TIERED STORAGE
[Diagram: the application writes to a replicated cache pool layered above an erasure coded backing pool, both inside the Ceph storage cluster.]
38
RADOS TIERING PRINCIPLES
● Each tier is a RADOS pool
– May be replicated or erasure coded
● Tiers are durable
– e.g., replicate across SSDs in multiple hosts
● Each tier has its own CRUSH policy
– e.g., map cache pool to SSD devices/hosts only
● librados adapts to tiering topology
– Transparently direct requests accordingly
● e.g., to cache
– No changes to RBD, RGW, CephFS, etc.
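For reference, layering a cache pool over a base pool is a handful of ceph CLI calls. A sketch driving them from Python; the pool names are examples, and the exact tunables worth setting vary by release:

import subprocess

def ceph(*args):
    # Thin wrapper around the ceph CLI; assumes admin credentials on this host.
    subprocess.check_call(['ceph'] + list(args))

base, cache = 'ecpool', 'hotpool'                       # example pool names

ceph('osd', 'tier', 'add', base, cache)                 # attach cache above base
ceph('osd', 'tier', 'cache-mode', cache, 'writeback')   # absorb reads and writes
ceph('osd', 'tier', 'set-overlay', base, cache)         # redirect client traffic
ceph('osd', 'pool', 'set', cache, 'hit_set_type', 'bloom')  # enable hit sets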
39
READ (CACHE HIT)
[Diagram: the client's READ is answered directly by the writeback cache pool (SSD); the backing pool (HDD) is not touched.]
40
READ (CACHE MISS)
[Diagram: on a cache miss, the cache tier issues a PROXY READ to the backing pool (HDD) and forwards the READ REPLY to the client without promoting the object.]
42
READ (CACHE MISS)
[Diagram: alternatively, the cache tier PROMOTEs the object from the backing pool into the cache pool and then serves the READ REPLY from there.]
43
WRITE (HIT)
[Diagram: the client's WRITE lands in the writeback cache pool and is ACKed from there.]
44
WRITE (MISS)
[Diagram: on a write miss, the object is first PROMOTEd from the backing pool into the cache, then the WRITE is applied and ACKed.]
45
WRITE (MISS) (COMING SOON)
[Diagram: planned behavior — the cache tier forwards the WRITE as a PROXY WRITE to the backing pool and ACKs it without promoting the object.]
46
ESTIMATING TEMPERATURE
● Each PG constructs in-memory bloom filters
– Insert records on both read and write
– Each filter covers configurable period (e.g., 1 hour)
– Tunable false positive probability (e.g., 5%)
– Store most recent N periods on disk (e.g., last 24 hours)
● Estimate temperature
– Has object been accessed in any of the last N periods?
– ...in how many of them?
– Informs the flush/evict decision
● Estimate “recency”
– How many periods have passed since the object was last accessed?
– Informs read miss behavior: proxy vs promote
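A rough sketch of the hit-set idea in plain Python — a hand-rolled bloom filter per period, not Ceph's implementation — to show how temperature and recency fall out of a few recent filters:

import hashlib

class BloomFilter(object):
    """Hand-rolled bloom filter standing in for one hit-set period."""
    def __init__(self, nbits=8192, nhashes=4):
        self.nbits, self.nhashes = nbits, nhashes
        self.bits = bytearray(nbits // 8)

    def _positions(self, name):
        for i in range(self.nhashes):
            h = hashlib.md5(('%d:%s' % (i, name)).encode()).hexdigest()
            yield int(h, 16) % self.nbits

    def insert(self, name):
        for p in self._positions(name):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, name):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(name))

# One filter per period (e.g., per hour); index 0 is the most recent.
hit_sets = [BloomFilter() for _ in range(4)]
hit_sets[0].insert('obj.42')            # accessed this period
hit_sets[2].insert('obj.42')            # and two periods ago

temperature = sum('obj.42' in hs for hs in hit_sets)    # in how many periods?
recency = next((i for i, hs in enumerate(hit_sets) if 'obj.42' in hs), None)
print(temperature, recency)             # -> 2 0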
47
FLUSH AND/OR EVICT
[Diagram: cold data is FLUSHed from the cache pool to the backing pool (which ACKs the write-back) and then EVICTed from the cache.]
48
TIERING AGENT
● Each PG has an internal tiering agent
– Manages the PG based on administrator-defined policy
● Flush dirty objects
– When the pool reaches its target dirty ratio
– Tries to select cold objects
– Marks objects clean once they have been written back to the base pool
● Evict (delete) clean objects
– Greater “effort” as the cache pool approaches its target size
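A simplified sketch of the decision the agent makes, with thresholds standing in for the cache_target_dirty_ratio, cache_target_full_ratio, and target_max_bytes pool settings; the structure and values are illustrative, not Ceph's code:

def agent_action(dirty_bytes, total_bytes, target_max_bytes,
                 dirty_ratio=0.4, full_ratio=0.8):
    """What should the agent work on? dirty_ratio and full_ratio stand in for
    the cache_target_dirty_ratio and cache_target_full_ratio pool settings."""
    actions = []
    if dirty_bytes > dirty_ratio * target_max_bytes:
        actions.append('flush cold dirty objects to the base pool')
    if total_bytes > full_ratio * target_max_bytes:
        actions.append('evict cold clean objects')
    return actions or ['idle']

# Example: a cache sized at 100 GiB holding 90 GiB of data, 45 GiB of it dirty.
print(agent_action(dirty_bytes=45 << 30, total_bytes=90 << 30,
                   target_max_bytes=100 << 30))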
49
CACHE TIER USAGE
● Cache tier should be faster than the base tier
● Cache tier should be replicated (not erasure coded)
● Promote and flush are expensive
– Best results when object temperatures are skewed
● Most I/O goes to a small number of hot objects
– Cache should be big enough to capture most of the acting set
● Challenging to benchmark
– Need a realistic workload (e.g., not 'dd') to determine how it will perform in practice
– Takes a long time to “warm up” the cache
ERASURE CODING
52
ERASURE CODING
[Diagram: a replicated pool stores full copies of the object on several OSDs; an erasure coded pool stores data chunks 1-4 plus coding chunks X and Y.]
Replication: full copies of stored objects
 Very high durability
 3x (200% overhead)
 Quicker recovery
Erasure coding: one copy plus parity
 Cost-effective durability
 1.5x (50% overhead)
 Expensive recovery
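The overhead numbers follow directly from k and m: a k+m code stores (k+m)/k bytes per byte of user data, so k=4, m=2 gives the 1.5x figure above. A quick check:

def ec_overhead(k, m):
    # Raw bytes stored per byte of user data for a k+m erasure code.
    return float(k + m) / k

print(ec_overhead(4, 2))    # 1.5 -> the 50% overhead on the slide
print(ec_overhead(10, 4))   # 1.4 -> a wider code costs even less
# 3x replication stores 3.0 bytes per byte: 200% overhead.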
53
ERASURE CODING SHARDS
[Diagram: an object written to the erasure coded pool is split into data shards 1-4 and coding shards X and Y, each stored on a different OSD.]
54
ERASURE CODING SHARDS
[Diagram: object data is striped round-robin across the four data shards, and each stripe's coding chunks (A-E, A'-E') are stored on the X and Y shards.]
● Variable stripe size (e.g., 4 KB)
● Zero-fill shards (logically) in partial tail stripe
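A toy illustration of the striping and zero-fill just described — not Ceph's encoder — using k=4 data shards and a single XOR parity shard for simplicity; a real pool would use a jerasure or ISA-L code with m coding shards:

def stripe_and_encode(data, k=4, chunk=4):
    """Split data into stripes of k chunks, zero-filling the tail stripe,
    and add one XOR parity chunk per stripe (toy code with m=1)."""
    stripe_bytes = k * chunk
    pad = (-len(data)) % stripe_bytes
    data = data + b'\x00' * pad                      # logical zero-fill
    shards = [bytearray() for _ in range(k + 1)]     # k data shards + 1 parity
    for off in range(0, len(data), stripe_bytes):
        stripe = data[off:off + stripe_bytes]
        chunks = [stripe[i * chunk:(i + 1) * chunk] for i in range(k)]
        parity = bytearray(chunk)
        for c in chunks:
            for i, b in enumerate(c):
                parity[i] ^= b
        for shard, c in zip(shards, chunks + [bytes(parity)]):
            shard.extend(c)
    return shards

for i, shard in enumerate(stripe_and_encode(b'erasure coding example payload')):
    print(i, bytes(shard))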
55
PRIMARY COORDINATES
[Diagram: one OSD of the erasure coded PG acts as the primary and coordinates reads and writes across the other shards.]
56
EC READ
[Diagram: the client sends a READ for an object in the erasure coded pool to the PG's primary OSD.]
57
EC READ
[Diagram: the primary issues sub-READS to enough shards to reconstruct the requested data.]
58
EC READ
[Diagram: the primary decodes the shards and returns the READ REPLY to the client.]
59
EC WRITE
[Diagram: the client sends a WRITE for an object in the erasure coded pool to the PG's primary OSD.]
60
EC WRITE
[Diagram: the primary encodes the update and sends sub-WRITES to all data and coding shards.]
61
EC WRITE
[Diagram: once the shards have applied their sub-writes, the primary ACKs the WRITE to the client.]
62
EC WRITE: DEGRADED
[Diagram: with one shard's OSD down, the primary still sends sub-WRITES to the remaining shards and the write completes in a degraded state.]
63
EC WRITE: PARTIAL FAILURE
[Diagram: the primary sends sub-WRITES but fails partway through, so only some shards receive the update.]
64
EC WRITE: PARTIAL FAILURE
[Diagram: the result — some shards hold the new version (B) of the stripe while others still hold the old version (A).]
65
EC RESTRICTIONS
● Overwrite in place will not work in general
● Log and 2PC would increase complexity, latency
● We chose to restrict allowed operations
– create
– append (on stripe boundary)
– remove (keep previous generation of object for some time)
● These operations can all easily be rolled back locally
– create → delete
– append → truncate
– remove → roll back to previous generation
● Object attrs preserved in existing PG logs (they are small)
● Key/value data is not allowed on EC pools
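Seen from librados, the restriction amounts to "create, append, remove only". A sketch of client code that stays within those operations; the pool name is an example, and note the caveat above that appends are expected to land on stripe boundaries:

import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')   # assumed config path
cluster.connect()
ioctx = cluster.open_ioctx('ecpool')                    # an erasure coded pool

try:
    ioctx.write_full('logobj', b'first record\n')       # create (full-object write)
    ioctx.append('logobj', b'second record\n')          # append-only growth
    # Partial overwrites (ioctx.write at an offset) are not allowed on EC pools,
    # and per the slide, appends should fall on stripe boundaries.
    ioctx.remove_object('logobj')                       # remove
finally:
    ioctx.close()
    cluster.shutdown()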
66
EC WRITE: PARTIAL FAILURE
[Diagram: after the partial failure, the shards are left holding a mix of new (B) and old (A) versions.]
67
EC WRITE: PARTIAL FAILURE
[Diagram: recovery rolls the updated shards back so that all shards again agree on the old version (A).]
68
EC RESTRICTIONS
● The allowed operations are a small subset of the full librados operation set
– Notably, clients cannot (over)write arbitrary extents
● Coincidentally, the unsupported operations are also inefficient for erasure codes
– They generally require read/modify/write of the affected stripe(s)
● Some clients can consume EC directly
– RGW (no object data update in place)
● Others can combine EC with a cache tier (RBD, CephFS)
– Replication for warm/hot data
– Erasure coding for cold data
– Tiering agent skips objects with key/value data
69
WHICH ERASURE CODE?
● The EC algorithm and implementation are pluggable
– jerasure/gf-complete (free, open, and very fast)
– ISA-L (Intel library; optimized for modern Intel procs)
– LRC (local recovery code – layers over existing plugins)
– SHEC (trades extra storage for recovery efficiency – new from Fujitsu)
● Parameterized
– Pick “k” and “m”, stripe size
● OSD handles data path, placement, rollback, etc.
● Erasure plugin handles
– Encode and decode math
– Given these available shards, which ones should I fetch to satisfy a read?
– Given these available shards and these missing shards, which ones should I fetch to recover?
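Choosing a code in practice means creating an erasure-code profile and a pool that uses it. A sketch of the relevant ceph CLI calls driven from Python; the profile name, pool name, and PG counts are examples:

import subprocess

def ceph(*args):
    # Thin wrapper around the ceph CLI; assumes admin credentials on this host.
    subprocess.check_call(['ceph'] + list(args))

# Profile: jerasure plugin with k=4 data chunks and m=2 coding chunks.
ceph('osd', 'erasure-code-profile', 'set', 'myprofile',
     'k=4', 'm=2', 'plugin=jerasure')

# Create an erasure coded pool that uses the profile (128 PGs as an example).
ceph('osd', 'pool', 'create', 'ecpool', '128', '128', 'erasure', 'myprofile')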
70 COST OF RECOVERY
[Diagram: a 1 TB OSD fails; its contents must be rebuilt elsewhere in the cluster.]
72 COST OF RECOVERY (REPLICATION)
[Diagram: with replication, rebuilding the lost 1 TB means reading 1 TB of surviving replicas — either 1 TB from a single OSD, or roughly .01 TB from each of ~100 OSDs in parallel thanks to declustered placement.]
75 COST OF RECOVERY (EC)
[Diagram: with erasure coding, rebuilding the same 1 TB requires reading a full stripe from the surviving shards — several times as much data (e.g., 4 x 1 TB for k=4) — so recovery is much more expensive.]
76
LOCAL RECOVERY CODE (LRC)
[Diagram: in addition to the data shards (1-4) and coding shards (X, Y), an LRC pool stores extra local parity shards (A, B, C) so that a single lost shard can be rebuilt from a small local group of OSDs rather than by reading the whole stripe.]
77
BIG THANKS TO
● Ceph
– Loic Dachary (CloudWatt, FSF France, Red Hat)
– Andreas Peters (CERN)
– Sam Just (Inktank / Red Hat)
– David Zafman (Inktank / Red Hat)
● jerasure / gf-complete
– Jim Plank (University of Tennessee)
– Kevin Greenan (Box.com)
● Intel (ISA-L plugin)
● Fujitsu (SHEC plugin)
ROADMAP
79
WHAT'S NEXT
● Erasure coding
– Allow (optimistic) client reads directly from shards
– ARM optimizations for jerasure
● Cache pools
– Better agent decisions (when to flush or evict)
– Supporting different performance profiles
● e.g., slow / “cheap” flash can read just as fast
– Complex topologies
● Multiple readonly cache tiers in multiple sites
● Tiering
– Support “redirects” to (very) cold tier below base pool
– Enable dynamic spin-down, dedup, and other features
80
OTHER ONGOING WORK
● Performance optimization (SanDisk, Intel, Mellanox)
● Alternative OSD backends
– New backend: hybrid key/value and file system
– leveldb, rocksdb, LMDB
● Messenger (network layer) improvements
– RDMA support (libxio – Mellanox)
– Event-driven TCP implementation (UnitedStack)
● CephFS
– Online consistency checking and repair tools
– Performance, robustness
● Multi-datacenter RBD, RADOS replication
81
FOR MORE INFORMATION
● http://ceph.com
● http://github.com/ceph
● http://tracker.ceph.com
● Mailing lists
– ceph-users@ceph.com
– ceph-devel@vger.kernel.org
● irc.oftc.net
– #ceph
– #ceph-devel
● Twitter
– @ceph
THANK YOU!
Sage Weil
CEPH PRINCIPAL ARCHITECT
sage@redhat.com
@liewegas