February 18, 2014
Ian Colle
Director of Engineering, Inktank
ian@inktank.com
@ircolle
www.linkedin.com/in/ircolle
ircolle on freenode

inktank.com | ceph.com
AGENDA

WHY CEPH?

 Why has Ceph become the de facto storage choice for OpenStack implementations?
http://www.openstack.org/blog/2013/11/openstack-usersurvey-october-2013/
CEPH
CEPH UNIFIED STORAGE

OBJECT STORAGE
 S3 & Swift
 Multi-tenant
 Keystone
 GeoReplication
 Native API

BLOCK STORAGE
 Snapshots
 Clones
 OpenStack
 Linux Kernel
 iSCSI

FILE SYSTEM
 POSIX
 Linux Kernel
 CIFS/NFS
 HDFS
 Distributed Metadata
CEPH OVERVIEW
PHILOSOPHY
 Failure is normal
 Self managing
 Scale out on commodity hardware
 Everything runs in software

HISTORY
 2004: Project starts at UCSC
 2006: Open sourced for the first time
 2010: Included in Linux kernel
 2012: Integrated into OpenStack
 TODAY
TRADITIONAL STORAGE VS. CEPH
(Comparison diagram: traditional enterprise storage vs. Ceph)
STRONG & GROWING COMMUNITY
(Chart: IRC chat lines, mailing-list messages, and commits per quarter, 2011-Q3 through 2013-Q3)
ARCHITECTURE

INTERFACES
 S3/Swift (object storage)
 Host/Hypervisor, iSCSI (block storage)
 CIFS/NFS (file system)
 SDK

STORAGE CLUSTERS
 Object Storage Daemons (OSDs)
 Monitors
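The SDK interface above corresponds to librados, Ceph's native object API. A minimal sketch using the Python binding, assuming a reachable cluster; the pool name "data" and the conffile path are placeholders:

```python
import rados

# Connect with the standard client config/keyring on this host.
cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()

# Write and read back one object in an existing pool.
ioctx = cluster.open_ioctx("data")
ioctx.write_full("hello-object", b"Hello, RADOS!")
print(ioctx.read("hello-object"))

ioctx.close()
cluster.shutdown()
```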
CRUSH

hash(object name) % num pg
CRUSH(pg, cluster state, rule set)

An object maps to a placement group (PG) by hashing its name modulo the number of PGs; CRUSH then maps that PG onto OSDs using the current cluster state and the configured rule set.
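A toy Python sketch of that two-step mapping: hash the object name onto a PG, then pick OSDs for the PG with a repeatable score. The MD5-based scoring, the 512-PG pool, and the eight-OSD cluster are illustrative stand-ins, not Ceph's actual CRUSH implementation:

```python
import hashlib

def stable_hash(s):
    # Deterministic hash (Python's built-in hash() is salted per process).
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

def object_to_pg(object_name, num_pgs):
    # Step 1: hash(object name) % num pg
    return stable_hash(object_name) % num_pgs

def choose_osds(pg, osds, replicas=3):
    # Step 2 (stand-in for CRUSH): rank OSDs by a repeatable pseudo-random
    # score derived from (pg, osd) and take the top N -- no lookup table.
    return sorted(osds, key=lambda osd: stable_hash(f"{pg}-{osd}"), reverse=True)[:replicas]

osds = [f"osd.{i}" for i in range(8)]          # hypothetical cluster
pg = object_to_pg("rbd_data.1234.0000000000000007", 512)
print(pg, choose_osds(pg, osds))
```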
CRUSH (Controlled Replication Under Scalable Hashing)
 Pseudo-random placement algorithm
   Fast calculation, no lookup
   Repeatable, deterministic
 Statistically uniform distribution
 Stable mapping (see the sketch below)
   Limited data migration on change
 Rule-based configuration
   Infrastructure topology aware
   Adjustable replication
   Weighting
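To illustrate the stable-mapping bullet above: with rendezvous-style scoring like the previous sketch, adding a ninth OSD only remaps the PGs that now rank the new device highest, so data migration stays limited. A self-contained toy, not CRUSH itself:

```python
import hashlib

def score(pg, osd):
    # Repeatable pseudo-random score for a (placement group, OSD) pair.
    return int(hashlib.md5(f"{pg}-{osd}".encode()).hexdigest(), 16)

def choose(pg, osds, replicas=3):
    # Rendezvous-style stand-in for CRUSH: highest-scoring OSDs win.
    return sorted(osds, key=lambda o: score(pg, o), reverse=True)[:replicas]

before = [f"osd.{i}" for i in range(8)]
after = before + ["osd.8"]                   # grow the cluster by one OSD

num_pgs = 512
moved = sum(1 for pg in range(num_pgs) if choose(pg, before) != choose(pg, after))
print(f"{moved} of {num_pgs} placement groups remapped")   # most stay where they were
```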
HOW DO YOU SPIN UP HUNDREDS OF VMs INSTANTLY AND EFFICIENTLY?

(Diagram sequence: copy-on-write clones)
 instant copy = 144
 write, write, write, write = 148
 read, read, read = 148
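One way to get those instant copies is RBD's copy-on-write cloning. A hedged sketch driving the rbd CLI from Python; it assumes format-2 images with layering enabled, and the pool/image names (images/base-image, vms/vm-NNN) are placeholders:

```python
import subprocess

def rbd(*args):
    # Thin wrapper around the rbd CLI; assumes a configured Ceph client.
    subprocess.check_call(["rbd", *args])

# Snapshot the golden image once and protect it so clones can reference it.
rbd("snap", "create", "images/base-image@golden")
rbd("snap", "protect", "images/base-image@golden")

# Each clone is a copy-on-write child: nothing is copied up front, writes land
# in the clone, and unwritten reads fall through to the shared parent.
for i in range(100):
    rbd("clone", "images/base-image@golden", f"vms/vm-{i:03d}")
```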
OPENSTACK AND CEPH

ARCHITECTURAL COMPONENTS
(Diagram: how an APP, a HOST/VM, and a CLIENT each reach the Ceph cluster)
CEPH WITH OPENSTACK
(Diagram: Glance images as COW clones, Nova VM root disks, and the Cinder backend on Ceph)
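A sketch of the cluster-side preparation usually done before pointing Glance, Cinder, and Nova at RBD: dedicated pools plus a client key. The pool names, PG count, client name, and capability string follow common examples and are assumptions, not the only valid setup:

```python
import subprocess

def ceph(*args):
    # Thin wrapper around the ceph CLI on a node with admin credentials.
    return subprocess.check_output(["ceph", *args]).decode()

# Dedicated pools for Glance images, Cinder volumes, and Nova ephemeral disks.
for pool in ("images", "volumes", "vms"):
    ceph("osd", "pool", "create", pool, "128")

# One client key the OpenStack services can share (capabilities kept simple here).
print(ceph("auth", "get-or-create", "client.openstack",
           "mon", "allow r",
           "osd", "allow rwx pool=images, allow rwx pool=volumes, allow rwx pool=vms"))
```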
PROPOSED ICEHOUSE ADDITIONS
 Swift RADOS backend? (possibly)
 DevStack Ceph – in progress
 Enable cloning for rbd-backed ephemeral disks – in review

WHAT’S NEXT FOR CEPH?
CEPH ROADMAP
 Firefly
 Giant
 H-Release
CACHE TIERING - WRITEBACK
 500TB writeback cache
 5 PB HDD OBJECT STORAGE
CACHE TIERING - READONLY
 200TB readonly cache
 150TB readonly cache
 150TB readonly cache
 5 PB HDD OBJECT STORAGE
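A sketch of wiring up cache tiers like the two pictured above, using the tiering commands that arrive with Firefly; the pool names (ssd-cache, hdd-objects) are placeholders:

```python
import subprocess

def ceph(*args):
    subprocess.check_call(["ceph", *args])

# Writeback tier: a fast pool sits in front of the big HDD-backed pool and
# absorbs writes, flushing them to the base tier in the background.
ceph("osd", "tier", "add", "hdd-objects", "ssd-cache")
ceph("osd", "tier", "cache-mode", "ssd-cache", "writeback")
ceph("osd", "tier", "set-overlay", "hdd-objects", "ssd-cache")

# Read-only tier: same association, but the cache mode changes:
#   ceph osd tier cache-mode <cache-pool> readonly
```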
(Storage cost comparison)
 Costs you 30MB of storage
 Costs you ~14MB of storage
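Those figures are consistent with comparing 3x replication against an erasure-coded pool (another Firefly-era feature); assuming that is the comparison, here is a sketch of creating one. The profile name and k/m values are illustrative:

```python
import subprocess

def ceph(*args):
    subprocess.check_call(["ceph", *args])

# 10 data chunks + 4 coding chunks: roughly 1.4x raw overhead per object,
# versus 3x for triple replication.
ceph("osd", "erasure-code-profile", "set", "ec-10-4", "k=10", "m=4")
ceph("osd", "pool", "create", "ec-objects", "128", "128", "erasure", "ec-10-4")
```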
NEXT STEPS

WHAT NOW?
 Read about the latest version of Ceph: http://ceph.com/docs
 Deploy a test cluster using ceph-deploy: http://ceph.com/qsg
 Deploy a test cluster on the AWS free tier using Juju: http://ceph.com/juju
 Ansible playbooks for Ceph: https://www.github.com/alfredodeza/ceph-ansible
 Download the code: http://www.github.com/ceph
 The tracker manages bugs and feature requests. Register and start looking around at http://tracker.ceph.com
 Doc updates and suggestions are always welcome. Learn how to contribute docs at http://ceph.com/docwriting
 Most discussion happens on the mailing lists ceph-devel and ceph-users. Join or view archives at http://ceph.com/list
 IRC is a great place to get help (or help others!): #ceph and #ceph-devel. Details and logs at http://ceph.com/irc
Ian R. Colle
Director of Engineering

ian@inktank.com
@ircolle
www.linkedin.com/in/ircolle
ircolle on freenode

Ceph and OpenStack - Feb 2014

Editor's Notes

  • #12 Controlled Replication Under Scalable Hashing
  • #20 But what happens when we add more storage?
  • #29 Glance images (COW clones), Nova VM root disk, Cinder backend.
  • #30 Can COW clone a Nova image and send it to Glance; don't have to copy in and out of local disk.
  • #33 CERN, or HFT
  • #34 House of Cards latest episode; Ice Dancing Finals