New Features for Ceph with Cinder and Beyond

Published in: Technology

Transcript of "New Features for Ceph with Cinder and Beyond"

  1. New Features for Ceph with Cinder and Beyond
  2. [Image slide; no text]
  3. Why Ceph?
       • Low cost
       • Flexible
       • Scalable
       • Open source
  4. Ceph architecture overview:
       • LIBRADOS (used by APP): A library allowing apps to directly access RADOS, with support for C, C++, Java, Python, Ruby, and PHP
       • RADOSGW (used by APP): A bucket-based REST gateway, compatible with S3 and Swift
       • RBD (used by HOST/VM): A reliable and fully-distributed block device, with a Linux kernel client and a QEMU/KVM driver
       • CEPH FS (used by CLIENT): A POSIX-compliant distributed file system, with a Linux kernel client and support for FUSE
       • RADOS: A reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes
  5. [Diagram: five OSDs, each on a filesystem (btrfs, xfs, or ext4) on its own disk, alongside three monitors (M)]
  6. [Diagram: a human and three monitors (M)]
  7. Monitors (M):
       • Maintain cluster map
       • Provide consensus for distributed decision-making
       • Must have an odd number
       • Do not serve stored objects to clients
     OSDs:
       • One per disk (recommended)
       • At least three in a cluster
       • Serve stored objects to clients
       • Intelligently peer to perform replication tasks
       • Support object classes
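The odd-number advice for monitors follows from majority voting. The sketch below assumes that a decision needs a strict majority of monitors; the function names are hypothetical and only illustrate the arithmetic.

```python
def quorum_size(num_monitors: int) -> int:
    """Smallest strict majority of the monitors."""
    return num_monitors // 2 + 1


def failures_tolerated(num_monitors: int) -> int:
    """How many monitors can be lost while a majority remains."""
    return num_monitors - quorum_size(num_monitors)


if __name__ == "__main__":
    # Print monitors, quorum size, and tolerated failures for small clusters.
    for n in range(1, 8):
        print(n, quorum_size(n), failures_tolerated(n))
```

Under this assumption, going from three monitors to four does not let the cluster survive any additional failure, which is why odd counts are the usual choice.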
  8. [Diagram: APP and a grid of storage nodes]
  9. [Diagram: APP and a grid of storage nodes, continued]
  10. [Diagram: APP, object F, and storage nodes grouped by name range (A-G, H-N, O-T, U-Z)]
  11. [Diagram: objects mapped to placement groups, then to storage nodes] hash(object name) % num pg, then CRUSH(pg, cluster state, rule set)
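The first step on this slide, hash(object name) % num pg, can be sketched as follows. This is a minimal illustration, not Ceph's code: `zlib.crc32` stands in for whatever hash Ceph actually uses, and `object_to_pg` is a hypothetical name.

```python
import zlib


def object_to_pg(object_name: str, num_pgs: int) -> int:
    """Map an object name to a placement group index.

    Assumption: crc32 is only a stand-in hash for illustration.
    """
    if num_pgs <= 0:
        raise ValueError("num_pgs must be positive")
    return zlib.crc32(object_name.encode("utf-8")) % num_pgs


if __name__ == "__main__":
    for name in ("vm-disk-1", "vm-disk-2", "photo.jpg"):
        print(name, "->", object_to_pg(name, 64))
```

Because the mapping depends only on the name and the placement group count, any client can compute it without asking a central lookup service.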
  12. [Diagram: objects and placement groups, continued]
  13. CRUSH
       • Pseudo-random placement algorithm
       • Ensures even distribution
       • Repeatable, deterministic
       • Rule-based configuration
           • Replica count
           • Infrastructure topology
           • Weighting
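The properties listed for CRUSH can be illustrated with a much simpler technique. The sketch below is not CRUSH: it uses weighted rendezvous (highest-random-weight) hashing as a stand-in, plus a single hard-coded rule (at most one replica per host). All names and the host/weight layout are assumptions made for the example.

```python
import hashlib
import math


def _unit_hash(pg: int, osd: str) -> float:
    """Deterministic pseudo-random value strictly between 0 and 1."""
    digest = hashlib.sha256(f"{pg}:{osd}".encode("utf-8")).digest()
    value = int.from_bytes(digest[:8], "big") >> 12  # keep 52 bits
    return (value + 0.5) / 2**52


def place(pg: int, osds: dict, replicas: int) -> list:
    """Pick `replicas` OSDs for a placement group.

    `osds` maps an OSD name to a (host, weight) pair. Higher weights win
    more often; no two chosen OSDs share a host.
    """
    def score(osd: str) -> float:
        _, weight = osds[osd]
        return -weight / math.log(_unit_hash(pg, osd))

    chosen, used_hosts = [], set()
    for osd in sorted(osds, key=score, reverse=True):
        host, _ = osds[osd]
        if host in used_hosts:
            continue  # topology rule: spread replicas across hosts
        chosen.append(osd)
        used_hosts.add(host)
        if len(chosen) == replicas:
            break
    return chosen


if __name__ == "__main__":
    cluster = {
        "osd.0": ("host-a", 1.0), "osd.1": ("host-a", 1.0),
        "osd.2": ("host-b", 2.0), "osd.3": ("host-b", 1.0),
        "osd.4": ("host-c", 1.0), "osd.5": ("host-c", 1.0),
    }
    for pg in range(4):
        print(pg, place(pg, cluster, replicas=3))
```

The point of the sketch is that placement is computed from the inputs alone, so every client and storage node that knows the cluster layout arrives at the same answer.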
  14. [Diagram: CLIENT ??]
  15. [Image slide; no text]
  16. [Diagram: CLIENT ??]
  17. [Image slide; no text]
  18. [Architecture overview repeated from slide 4]
  19. [Diagram: APP linked with LIBRADOS, talking to the cluster (M M M) over the native protocol]
  20. LIBRADOS
       • Provides direct access to RADOS for applications
       • C, C++, Python, PHP, Java
       • No HTTP overhead
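As a rough illustration of direct access from Python, the sketch below writes one object and reads it back through the `rados` bindings that ship with Ceph. The helper name, the pool and object names, and the configuration path are assumptions; it needs a reachable cluster and client credentials to do anything useful.

```python
def store_and_fetch(pool, name, data, conffile="/etc/ceph/ceph.conf"):
    """Write `data` to object `name` in `pool`, then read it back."""
    import rados  # imported lazily so the sketch loads without Ceph installed

    cluster = rados.Rados(conffile=conffile)
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx(pool)
        try:
            ioctx.write_full(name, data)  # replace the whole object
            return ioctx.read(name, length=len(data))
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()
```

A call such as `store_and_fetch("data", "greeting", b"hello")` would talk to the storage nodes over the native protocol, with no HTTP gateway in between.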
  21. [Architecture overview repeated from slide 4]
  22. [Diagram: APPs speak REST to RADOSGW instances, which use LIBRADOS to talk to the cluster (M M M) over the native protocol]
  23. RADOS Gateway:
       • REST-based interface to RADOS
       • Supports buckets, accounting
       • Compatible with S3 and Swift applications
  24. [Architecture overview repeated from slide 4]
  25. [Diagram: VM in a virtualization container using LIBRBD on top of LIBRADOS, with monitors (M M M)]
  26. [Diagram: one VM and two containers, each container with LIBRBD on top of LIBRADOS, with monitors (M M M)]
  27. [Diagram: HOST using KRBD (kernel module) and LIBRADOS, with monitors (M M M)]
  28. RADOS Block Device:
       • Storage of virtual disks in RADOS
       • Allows decoupling of VMs and containers
       • Live migration!
       • Images are striped across the cluster
       • Thin-provisioning
       • Snapshots and cloning
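Striping and thin provisioning can be pictured with a small in-memory model. This is a sketch under assumptions, not RBD's implementation: the image is cut into fixed-size objects (4 MiB is assumed as the default size), and an object only exists once something has been written to it. `ThinImage` is a hypothetical name.

```python
class ThinImage:
    """Toy block image backed by lazily created fixed-size objects."""

    def __init__(self, size, object_size=4 * 1024 * 1024):
        self.size = size
        self.object_size = object_size
        self.objects = {}  # object index -> bytearray, created on first write

    def _check(self, offset, length):
        if offset < 0 or length < 0 or offset + length > self.size:
            raise ValueError("access outside the image")

    def write(self, offset, data):
        self._check(offset, len(data))
        pos = 0
        while pos < len(data):
            index, within = divmod(offset + pos, self.object_size)
            chunk = min(len(data) - pos, self.object_size - within)
            obj = self.objects.setdefault(index, bytearray(self.object_size))
            obj[within:within + chunk] = data[pos:pos + chunk]
            pos += chunk

    def read(self, offset, length):
        self._check(offset, length)
        out = bytearray()
        pos = 0
        while pos < length:
            index, within = divmod(offset + pos, self.object_size)
            chunk = min(length - pos, self.object_size - within)
            obj = self.objects.get(index)
            # Objects that were never written read back as zeros.
            out += obj[within:within + chunk] if obj is not None else bytes(chunk)
            pos += chunk
        return bytes(out)
```

In this model a one-terabyte image that has received a single small write holds only the one or two objects that write touched, which is the essence of thin provisioning.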
  29. [Diagram repeated from slide 25: VM in a virtualization container using LIBRBD on top of LIBRADOS]
  30. How do you spin up thousands of VMs instantly and efficiently?
  31. Instant copy: 144 + 0 + 0 + 0 + 0 = 144
  32. Write: the CLIENT writes to a copy: 144 + 4 = 148
  33. Read: the CLIENT reads: 144 + 4 = 148
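The arithmetic on these three slides matches a copy-on-write clone. The sketch below assumes the numbers count stored objects (the slides do not name the unit) and simplifies writes to whole objects; the class names are hypothetical.

```python
class Snapshot:
    """Read-only parent image: object index -> data."""

    def __init__(self, objects):
        self.objects = dict(objects)


class Clone:
    """Copy-on-write child that stores only the objects it has written."""

    def __init__(self, parent):
        self.parent = parent
        self.objects = {}

    def write(self, index, data):
        self.objects[index] = data  # the parent is never modified

    def read(self, index):
        if index in self.objects:
            return self.objects[index]
        return self.parent.objects.get(index)  # fall through to the parent


def stored_objects(parent, *clones):
    """Total objects held by the parent and all of its clones."""
    return len(parent.objects) + sum(len(c.objects) for c in clones)


if __name__ == "__main__":
    parent = Snapshot({i: b"base" for i in range(144)})
    clones = [Clone(parent) for _ in range(4)]
    print(stored_objects(parent, *clones))  # four new clones cost nothing
    for i in range(4):
        clones[0].write(i, b"new")
    print(stored_objects(parent, *clones))  # only the written objects are added
```

Reads of untouched regions are served from the parent, so a clone is usable the moment it is created.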
  34. Old-style VM image creation: Nova compute reads image X from Glance (templates) onto local disk (VM images)
       • Ephemeral
       • Expensive to create
  35. Why use block storage?
       • Persistent
           • More familiar to users
       • Not tied to a single host
           • Decouples compute and storage
           • Enables live migration
       • Extra capabilities of storage system
           • Efficient snapshots
           • Different types of storage available
           • Cloning for fast restore or scaling
  36. Cinder volume creation (participants: Cinder API, Cinder volume, volume driver, Glance templates)
       • Steps: create image from X, locate X, location of X, read X, reference to X
       • Flexibility in where VM images are stored
  37. Efficient volume creation (participants: Cinder API, Cinder volume, volume driver, Glance templates)
       • Steps: create image from X, locate X, location of X, clone X (fast CoW clone), complete, reference to X
  38. What's new in Bobtail: Improved OSD threading
       • Filesystem- and journal-related locks are now more fine-grained
       • Boosted single disk IOPS from 6k to 22k
       • Restructured how map updates are handled, letting each placement group process them independently
  39. What's new in Bobtail: Recovery QoS
       • Message priority system reworked to prevent starvation
       • Recovery operations can be lower priority than client I/O without starving
       • Requests to access an object can increase recovery priority for that object
  40. What's new in Bobtail: Block Device Cloning
       • Instantly create new volumes based on templates (snapshots)
       • Integrated with Cinder in Folsom
       • Grizzly adds the ability to copy (not clone) non-raw images to RBD
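The split between cloning and copying can be sketched as a small decision function. Everything here is an assumption made for illustration: the `rbd://<fsid>/<pool>/<image>/<snapshot>` location layout, the rule that a clone requires a raw image stored in the same cluster, and the function names. It is not the Cinder driver's code.

```python
from urllib.parse import urlparse


def parse_rbd_location(url):
    """Split an assumed rbd://fsid/pool/image/snapshot URL into its parts.

    Returns None when the URL does not have that shape.
    """
    parsed = urlparse(url)
    if parsed.scheme != "rbd":
        return None
    pieces = (parsed.netloc + parsed.path).split("/")
    if len(pieces) != 4 or not all(pieces):
        return None
    return tuple(pieces)


def plan_volume_creation(location, disk_format, cluster_fsid):
    """Return "clone" when a copy-on-write clone looks possible, else "copy"."""
    parts = parse_rbd_location(location)
    if parts and parts[0] == cluster_fsid and disk_format == "raw":
        return "clone"  # fast path: snapshot-based clone
    return "copy"       # fallback: download and write the image data
```

For example, `plan_volume_creation("rbd://abc/images/img1/snap", "qcow2", "abc")` falls back to a copy because the image is not raw.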
  41. What's new in Bobtail: Keystone Integration
       • RADOS gateway can talk to Keystone to authenticate Swift API requests
       • Let Keystone manage your users
       • Supported by the Ceph Juju charm
  42. What's next: Cuttlefish
       • Incremental backup for block devices
       • On-disk encryption
       • REST management API for RADOS gateway
       • More performance improvements (especially for small I/O)
       • More! (http://www.inktank.com/about-inktank/roadmap/)
  43. What's next: Dumpling
       • Geo-replication for RADOS gateway
       • REST management API for Ceph cluster
       • ...
     (Virtual) Ceph Developer Summit: May 6
  44. Questions? Josh Durgin, josh.durgin@inktank.com, jdurgin on freenode, inktank.com | ceph.com