Ceph and CloudStack
11 April 2014
Ian Colle
Director of Engineering, Inktank
ian@inktank.com
@ircolle
www.linkedin.com/in/ircolle
ircolle on oftc/freenode
inktank.com | ceph.com
AGENDA
CEPH
CEPH UNIFIED STORAGE

OBJECT STORAGE: S3 & Swift, multi-tenant, native API, geo-replication, CloudStack
BLOCK STORAGE: snapshots, clones, Linux kernel, iSCSI, CloudStack
FILE SYSTEM: POSIX, Linux kernel, CIFS/NFS, HDFS, distributed metadata

Copyright © 2014 by Inktank
CEPH OVERVIEW

HISTORY
2004: Project starts at UCSC
2006: Open sourced for the first time
2010: Included in the Linux kernel
2012: Integrated into CloudStack

PHILOSOPHY
Failure is normal
Self managing
Scale out on commodity hardware
Everything runs in software
TRADITIONAL STORAGE VS. CEPH
TRADITIONAL
ENTERPRISE STORAGE
STRONG & GROWING COMMUNITY
[Chart: quarterly community activity, 2011-Q3 through 2013-Q3. IRC chatlines: 8,172 to 37,946; ML messages: 2,888 to 11,500; commits: 1,418 to 2,715]
ARCHITECTURE

INTERFACES: S3/Swift (object storage), host/hypervisor and iSCSI (block storage), CIFS/NFS (file system), SDK
STORAGE CLUSTERS: monitors and object storage daemons (OSDs)
CRUSH

hash(object name) % pg_num → placement group
CRUSH(pg, cluster state, rule set) → ordered list of OSDs
CRUSH:
• Pseudo-random placement algorithm
• Fast calculation, no lookup
• Repeatable, deterministic
• Statistically uniform distribution
• Stable mapping: limited data migration on change
• Rule-based configuration: infrastructure topology aware, adjustable replication, weighting
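The properties above can be illustrated with a toy version of the first mapping step (object name to placement group). The object name and PG count below are made up, and POSIX `cksum` stands in for Ceph's actual rjenkins hash; on a live cluster, `ceph osd map <pool> <object>` reports the real PG and OSD set.

```shell
# Toy sketch of the object -> PG step: a stable hash mod the PG count.
# cksum stands in for Ceph's rjenkins hash (illustration only).
pg_num=256
obj="vm-disk-1"
h=$(printf '%s' "$obj" | cksum | cut -d' ' -f1)
pg=$((h % pg_num))
echo "object '$obj' -> pg $pg of $pg_num"
# The second step, PG -> OSDs, is what CRUSH computes from the cluster
# state and rule set; there is no lookup table to consult or maintain.
```

Because the hash is deterministic, any client can recompute the same placement independently, which is the "fast calculation, no lookup" property.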
HOW DO YOU SPIN UP HUNDREDS OF VMs INSTANTLY AND EFFICIENTLY?
instant copy = 144 objects (the clone shares all of the parent image's objects)
4 writes = 148 objects (copy-on-write: each write to the clone allocates a new object)
reads = 148 objects (reads of unmodified data fall through to the parent image)
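The instant-copy behaviour in these slides is what RBD's copy-on-write clones provide. A hedged sketch of that workflow with the standard `rbd` CLI; the pool name, image name, and size are illustrative assumptions:

```shell
# Build a golden image once, snapshot it, and clone it per-VM.
rbd create cloudstack/golden-vm --size 10240        # 10 GB base image (hypothetical)
# ... install the guest OS into golden-vm ...
rbd snap create cloudstack/golden-vm@base
rbd snap protect cloudstack/golden-vm@base          # clones require a protected snapshot
for i in $(seq 1 100); do
  rbd clone cloudstack/golden-vm@base cloudstack/vm-$i   # instant, copy-on-write
done
```

Each clone initially shares every object with the parent; only writes allocate new objects, which is why a hundred clones appear instantly and consume almost no additional space.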
CEPH AND CLOUDSTACK

• Wido den Hollander (wido@42on.com)
• If you don’t know him, you should
• Developed:
  • rados-java
  • libvirt support for RBD
  • CloudStack integration

THIS IS THE HOUSE THAT WIDO BUILT
ARCHITECTURAL COMPONENTS
Clients: APP, HOST/VM, CLIENT
CEPH WITH CLOUDSTACK
WHAT DO I NEED?

• Ceph (Firefly in rc): http://www.ceph.com/download/
• libvirt (1.2.3, released 01 Apr 2014): http://libvirt.org/sources/
• QEMU (2.0 in rc, or 1.7.1): http://wiki.qemu.org/Download
• CloudStack (4.2 or newer): http://cloudstack.apache.org/downloads.html
WHAT DO I DO?

• Set up a Ceph cluster: http://ceph.com/docs/master/start/
• Install/configure QEMU: http://ceph.com/docs/master/rbd/qemu-rbd/
• Install/configure libvirt: http://ceph.com/docs/master/rbd/libvirt/
• Create your CloudStack pool: ceph osd pool create cloudstack
• Add Primary Storage; for Protocol, select RBD
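The pool step can be fleshed out a little. The PG count, client name, and capability string below are illustrative assumptions in the style of the Ceph RBD/libvirt docs linked above, not values from the slide:

```shell
# Create the pool with an explicit PG count (sized to your cluster).
ceph osd pool create cloudstack 128
# A dedicated CloudStack user, restricted to that pool:
ceph auth get-or-create client.cloudstack \
    mon 'allow r' \
    osd 'allow rwx pool=cloudstack'
# The resulting key is stored as a libvirt secret so QEMU can attach
# RBD volumes on the hypervisor:
#   virsh secret-define --file secret.xml
#   virsh secret-set-value --secret <uuid> --base64 <key>
```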
WHAT’S NEXT FOR CEPH?
CEPH ROADMAP

Firefly → Giant → H-Release
CACHE TIERING - WRITEBACK

5 PB HDD OBJECT STORAGE + 500 TB writeback cache
CACHE TIERING - READONLY

5 PB HDD OBJECT STORAGE + readonly caches of 200 TB, 150 TB, and 150 TB
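In Firefly, a tier like the ones sketched above is wired up with the `ceph osd tier` commands; the pool names here are hypothetical:

```shell
# Attach an SSD pool as a writeback cache in front of an HDD pool.
ceph osd tier add cold-storage hot-cache
ceph osd tier cache-mode hot-cache writeback
ceph osd tier set-overlay cold-storage hot-cache    # route client I/O through the cache
# For the read-only layout, use:
#   ceph osd tier cache-mode hot-cache readonly
```

Writeback mode absorbs both reads and writes and flushes to the cold pool; read-only mode only caches reads, so writes go straight to the backing pool.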
Replication: costs you 30MB of storage
Erasure coding: costs you ~14MB of storage
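Those two figures are consistent with storing a 10MB object under 3× replication versus an erasure code with 10 data and 4 coding chunks; the object size and EC profile are assumptions, not stated on the slide:

```shell
size_mb=10          # assumed object size
replicas=3          # replicated pool, size 3
k=10; m=4           # assumed erasure-code profile (10 data + 4 coding chunks)
rep_cost=$((size_mb * replicas))
ec_cost=$((size_mb * (k + m) / k))
echo "replicated: ${rep_cost}MB, erasure-coded: ${ec_cost}MB"
# prints: replicated: 30MB, erasure-coded: 14MB
```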
NEXT STEPS
WHAT NOW?
• Read about the latest version of Ceph: http://ceph.com/docs
• Deploy a test cluster using ceph-deploy: http://ceph.com/qsg
• Deploy a test cluster on the AWS free tier using Juju: http://ceph.com/juju
• Ansible playbooks for Ceph: https://github.com/ceph/ceph-ansible
• Most discussion happens on the mailing lists ceph-devel and ceph-users. Join or view archives at http://ceph.com/list
• IRC is a great place to get help (or help others!): #ceph and #ceph-devel. Details and logs at http://ceph.com/irc
• Download the code: http://www.github.com/ceph
• The tracker manages bugs and feature requests. Register and start looking around at http://tracker.ceph.com
• Doc updates and suggestions are always welcome. Learn how to contribute docs at http://ceph.com/docwriting
Ian R. Colle
Director of Engineering
ian@inktank.com
@ircolle
www.linkedin.com/in/ircolle
ircolle on oftc/freenode
