Inktank
OpenStack with Ceph
Who is this guy?

 Ian Colle
 Ceph Program Manager, Inktank

 ian@inktank.com
 @ircolle
 www.linkedin.com/in/ircolle
 ircolle on freenode

 inktank.com | ceph.com
Selecting the Best Cloud Storage System
People need storage solutions that…

•  …are open

•  …are easy to manage

•  …satisfy their requirements
        - performance
        - functional
        - financial (cha’ ching!)
Hard Drives Are Tiny Record Players and They Fail Often
jon_a_ross, Flickr / CC BY 2.0
[Diagram: a rack's worth of drives (D), multiplied out to 1 million drives, fails about 55 times per day]
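The arithmetic behind that figure can be sketched quickly (the ~2% annualized failure rate is an assumption for illustration, not a number from the slide):

```python
# Hypothetical back-of-envelope math: how many drives fail per day in a
# fleet of 1 million, assuming a ~2% annualized failure rate (AFR)?
def expected_daily_failures(num_drives: int, annual_failure_rate: float) -> float:
    return num_drives * annual_failure_rate / 365

print(round(expected_daily_failures(1_000_000, 0.02)))  # ≈ 55
```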
I got it!
“That’s why I use Swift in my OpenStack implementation”


Hmmm, what about block storage?
Benefits of Block Storage
• Persistent
        - More familiar to users

• Not tied to a single host
        - Decouples compute and storage
        - Enables Live migration

• Extra capabilities of storage system
        - Efficient snapshots
        - Different types of storage available
        - Cloning for fast restore or scaling
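The snapshot-and-clone benefit can be illustrated with a toy copy-on-write model (illustrative only, not Ceph's actual data structures): a clone records just the blocks written after it was taken, so creating one for restore or scale-out is effectively instant.

```python
class ToyImage:
    """Toy copy-on-write image: a clone shares unwritten blocks with its parent."""
    def __init__(self, parent=None):
        self.blocks = {}        # only locally-written blocks live here
        self.parent = parent

    def write(self, idx: int, data: bytes):
        self.blocks[idx] = data

    def read(self, idx: int) -> bytes:
        if idx in self.blocks:
            return self.blocks[idx]
        # fall through to the parent image for untouched blocks
        return self.parent.read(idx) if self.parent else b"\0"

    def clone(self) -> "ToyImage":
        return ToyImage(parent=self)   # O(1): no data is copied

golden = ToyImage()
golden.write(0, b"base OS")
vm = golden.clone()                    # instant restore / scale-out
vm.write(1, b"vm-local data")
print(vm.read(0))                      # b'base OS'  (served by the parent)
```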
Ceph over Swift
Ceph has reduced administration costs
       - “Intelligent Devices” that use a peer-to-peer mechanism to
       detect failures and react automatically – rapidly ensuring
       replication policies are still honored if a node becomes
       unavailable.
       - Swift requires an operator to notice a failure and update the
       ring configuration before redistribution of data is started.

Ceph guarantees the consistency of your data
       - Even with large volumes of data, Ceph ensures clients get a
       consistent copy from any node within a region.
       - Swift’s replication system means that users may get stale
       data, even with a single site, due to slow asynchronous
       replication as the volume of data builds up.
Swift over Ceph
Swift has quotas, we do not (coming this Fall)

Swift has object expiration, we do not (coming this Fall)
Total Solution Comparison
Ceph
        Ceph provides object AND block storage in a single system that
        is compatible with the Swift and Cinder APIs and is self-healing
        without operator intervention.

Swift
        If you use Swift, you still have to provision and manage a totally
        separate system to handle your block storage (in addition to
        paying the poor guy to go update the ring configuration)
OpenStack I know, but what is Ceph?
philosophy                  design

OPEN SOURCE                 SCALABLE
COMMUNITY-FOCUSED           NO SINGLE POINT OF FAILURE
                            SOFTWARE BASED
                            SELF-MANAGING
APP            APP            HOST/VM        CLIENT

LIBRADOS — A library allowing apps to directly access RADOS, with support
for C, C++, Java, Python, Ruby, and PHP

RGW (RADOS Gateway) — A bucket-based REST gateway, compatible with S3
and Swift

RBD (RADOS Block Device) — A reliable and fully-distributed block device,
with a Linux kernel client and a QEMU/KVM driver

CEPH FS — A POSIX-compliant distributed file system, with a Linux kernel
client and support for FUSE

RADOS — A Reliable, Autonomous, Distributed Object Store comprised of
self-healing, self-managing, intelligent storage nodes
Monitors (M):
       • Maintain the cluster map
       • Provide consensus for distributed decision-making
       • Must have an odd number
       • Do not serve stored objects to clients

OSDs:
       • One per disk (recommended)
       • At least three in a cluster
       • Serve stored objects to clients
       • Intelligently peer to perform replication tasks
       • Support object classes
[Diagram: five OSDs, each backed by its own filesystem (btrfs, xfs, or ext4) on its own disk, alongside three monitors (M)]
[Diagram: a human administrator, facing only the three monitors (M)]
LIBRADOS:
       • Provides direct access to RADOS for applications
       • C, C++, Python, PHP, Java
       • No HTTP overhead
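A minimal librados sketch (the Python `rados` binding ships with Ceph; it needs a reachable cluster and, here, an assumed pool named `data`, so treat this as an outline rather than a runnable demo):

```python
# Sketch of direct RADOS access via the Python binding -- no HTTP gateway
# in the path; the client speaks the native protocol to monitors and OSDs.
try:
    import rados                      # provided by the Ceph packages
except ImportError:                   # binding not installed: sketch only
    rados = None

def put_and_get(conffile: str = "/etc/ceph/ceph.conf", pool: str = "data") -> bytes:
    cluster = rados.Rados(conffile=conffile)
    cluster.connect()
    ioctx = cluster.open_ioctx(pool)  # I/O context bound to one pool
    try:
        ioctx.write_full("greeting", b"hello ceph")
        return ioctx.read("greeting")
    finally:
        ioctx.close()
        cluster.shutdown()
```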
[Diagram: an APP links LIBRADOS and speaks the native protocol directly to the monitors (M) and OSDs]
[Diagram: apps speak REST to one of several RGW instances; each RGW uses LIBRADOS to speak the native protocol to the monitors (M) and OSDs]
RADOS Gateway:
   • REST-based interface to RADOS
   • Supports buckets, accounting
   • Compatible with S3 and Swift applications
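Because RGW speaks the S3 protocol, stock S3 clients work against it unchanged; under the hood they sign every request. A sketch of the AWS signature-v2 scheme that S3-compatible gateways accept (the key and bucket names are made up):

```python
import base64
import hashlib
import hmac

def sign_s3_v2(secret_key: str, method: str, date: str, bucket: str, key: str) -> str:
    # AWS signature v2: HMAC-SHA1 over a canonical string, base64-encoded.
    # (Minimal form: no Content-MD5, Content-Type, or x-amz-* headers.)
    string_to_sign = f"{method}\n\n\n{date}\n/{bucket}/{key}"
    mac = hmac.new(secret_key.encode(), string_to_sign.encode(), hashlib.sha1)
    return base64.b64encode(mac.digest()).decode()

sig = sign_s3_v2("secret", "GET", "Thu, 17 Nov 2005 18:49:58 GMT", "mybucket", "obj")
print(f"Authorization: AWS ACCESSKEY:{sig}")
```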
[Diagram: a VM whose virtualization container (e.g. QEMU/KVM) links LIBRBD and LIBRADOS to reach the monitors (M) and OSDs]
RADOS Block Device:
   • Storage of virtual disks in RADOS
   • Allows decoupling of VMs and containers
   • Live migration!
   • Images are striped across the cluster
   • Boot support in QEMU, KVM, and OpenStack Nova (more on that later!)
   • Mount support in the Linux kernel
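"Striped across the cluster" means an image is chopped into fixed-size RADOS objects (4 MB by default), each placed independently. A toy sketch of the idea (the `rbd_data.<name>.<hex index>` naming pattern here is illustrative, not a spec):

```python
def stripe_objects(image_name: str, image_size: int, object_size: int = 4 * 2**20):
    # An RBD image of `image_size` bytes becomes ceil(size / 4 MB) RADOS
    # objects; CRUSH then scatters those objects across the OSDs, so I/O
    # to one image is spread over many disks.
    count = -(-image_size // object_size)          # ceiling division
    return [f"rbd_data.{image_name}.{i:016x}" for i in range(count)]

objs = stripe_objects("vm-disk", 10 * 2**20)       # a 10 MB image
print(len(objs))                                   # 3 objects
```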
What Makes Ceph Unique?
Part one: CRUSH
[Diagram: an APP facing a column of storage nodes — which one should hold each object? (??)]
[Diagram: the APP connected directly to every storage node]
[Diagram: nodes grouped by a naming scheme (A-G, H-N, O-T, U-Z); the APP looks up where an object (F*) belongs]
Step 1:  hash(object name) % num_pg  →  placement group (PG)

Step 2:  CRUSH(pg, cluster state, rule set)  →  the set of OSDs that store the PG
CRUSH:
  • Pseudo-random placement algorithm
  • Ensures even distribution
  • Repeatable, deterministic
  • Rule-based configuration
       • Replica count
       • Infrastructure topology
       • Weighting
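The two-step mapping (hash to a PG, then CRUSH from PG to OSDs) can be sketched with rendezvous hashing standing in for the real algorithm — a simplification, since actual CRUSH descends a weighted infrastructure hierarchy defined by the rule set:

```python
import hashlib

def _h(*parts) -> int:
    # Deterministic pseudo-random score derived from the parts' hash.
    data = ":".join(map(str, parts)).encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

def place(obj_name: str, osds: list, num_pg: int = 128, replicas: int = 3) -> list:
    pg = _h(obj_name) % num_pg                     # step 1: object -> PG
    # step 2: PG -> OSDs. Ranking every OSD by a deterministic score is
    # repeatable, spreads PGs evenly, and needs no central lookup table.
    return sorted(osds, key=lambda osd: _h(pg, osd), reverse=True)[:replicas]

osds = [f"osd.{i}" for i in range(10)]
print(place("my-object", osds))
```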
What Makes Ceph Unique?
Part two: thin provisioning
HOW DO YOU
      SPIN UP
THOUSANDS OF VMs
    INSTANTLY
       AND
  EFFICIENTLY?
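Part of the answer is thin provisioning: a freshly created volume claims no real space until blocks are actually written. A toy model of the idea (illustrative, not RBD's on-disk format):

```python
class ThinVolume:
    """A thinly provisioned volume: space is consumed only for written blocks."""
    BLOCK = 4096

    def __init__(self, virtual_size: int):
        self.virtual_size = virtual_size
        self.written = {}                 # sparse map: block index -> data

    def write(self, idx: int, data: bytes):
        self.written[idx] = data

    def used_bytes(self) -> int:
        return len(self.written) * self.BLOCK

vol = ThinVolume(100 * 2**30)             # a "100 GB" volume, created instantly
vol.write(0, b"boot sector")
print(vol.virtual_size, vol.used_bytes()) # huge virtual size, tiny real usage
```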
How Does Ceph Work with OpenStack?
Ceph / OpenStack Integration
RBD support was initially added in Cactus

Features and integration have increased with each subsequent release

You can use both the Swift (object/blob store) and Keystone (identity
service) APIs to talk to RGW

Cinder, block storage as a service, talks directly to RBD

Nova, the cloud computing controller, talks to RBD via the hypervisor

Coming in Havana – the ability to create a volume from an RBD image via
the Horizon UI
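The Cinder-to-RBD hookup is a driver setting; a hypothetical `cinder.conf` fragment (option names as documented for the Grizzly-era RBD driver — verify against your release's documentation):

```ini
[DEFAULT]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
rbd_pool = volumes
rbd_user = cinder
# placeholder value: the libvirt secret holding the cephx key for rbd_user
rbd_secret_uuid = 00000000-0000-0000-0000-000000000000
```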
What is Inktank?
I really like your polo shirt, please tell me what it means!
Who?
The majority of Ceph contributors

Formed by Sage Weil (CTO), the creator of Ceph, in 2011

Funded by DreamHost and other investors (Mark Shuttleworth, etc.)
Why?
To ensure the long-term success of Ceph

To help companies adopt Ceph through services, support, training, and
consulting
What?
Guide the Ceph roadmap
        - Hosting a virtual Ceph Design Summit in early May
Standardize the Ceph development and release schedule
       - Quarterly stable releases, interim releases every 2 weeks
               * May 2013 – Cuttlefish
                       RBD Incremental Snapshots!
               * Aug 2013 – Dumpling
                       Disaster Recovery (Multisite)
                       Admin API
               * Nov 2013 – Some really cool cephalopod name that
               starts with an E
Ensure Quality
       - Maintain Teuthology test suite
       - Harden each stable release via extensive manual and
       automated testing
Develop reference and custom architectures for implementation
Inktank/Dell Partnership

• Inktank is a Strategic partner for Dell in Emerging Solutions
• The Emerging Solutions Ecosystem Partner Program is designed to
deliver complementary cloud components
• As part of this program, Dell and Inktank provide:
      > Ceph Storage Software
        - Adds scalable cloud storage to the Dell OpenStack-powered cloud
        - Uses Crowbar to provision and configure a Ceph cluster (Yeah
        Crowbar!)
      > Professional Services, Support, and Training
        - Collaborative Support for Dell hardware customers
      > Joint Solution
        - Validated against Dell Reference Architectures via the
        Technology Partner program
What do we want from you?
Try Ceph and tell us what you think!
http://ceph.com/resources/downloads/

http://ceph.com/resources/mailing-list-irc/
        - Ask, if you need help.
        - Help others, if you can!

Ask your company to start dedicating dev resources to the project!
http://github.com/ceph

Find a bug (http://tracker.ceph.com) and fix it!

Participate in our Ceph Design Summit!
One final request…
We’re planning the next release of Ceph and would love your input.

What features would you like us to include?

       iSCSI?

       Live Migration?



     Questions?

     Ian Colle
     Ceph Program Manager, Inktank

     ian@inktank.com
     @ircolle
     www.linkedin.com/in/ircolle
     ircolle on freenode

     inktank.com | ceph.com
