Storing VMs with Cinder and Ceph RBD.pdf

  1. 1. Storing VMs with Cinder and Ceph RBD
  2. 2. Growing With Hardware Appliances. First PB: • Proprietary storage hardware • Well-known storage vendor • $14 b’zillion. Second PB: • Proprietary storage hardware • Same storage vendor • Another $14 b’zillion
  3. 3. [Diagram: grid of C/D node pairs, labeled C++]
  4. 4. [Diagram: same grid of C/D node pairs labeled C++, with an X mark]
  5. 5. [Diagram: grid of C/D node pairs with HUMAN [DEVELOPER] and "!!"]
  6. 6. Hard Drives Are Tiny Record Players and They Fail Often (photo: jon_a_ross, Flickr / CC BY 2.0)
  7. 7. [Diagram: disks (D) x 1 MILLION = 55 times / day]
  8. 8. [Figure: no text recovered]
  9. 9. Philosophy: OPEN SOURCE, COMMUNITY-FOCUSED. Design: SCALABLE, NO SINGLE POINT OF FAILURE, SOFTWARE BASED, SELF-MANAGING
  10. 10. [Architecture diagram] Clients: APP, APP, HOST/VM, CLIENT. RADOSGW: a bucket-based REST gateway, compatible with S3 and Swift. RBD: a reliable and fully-distributed block device, with a Linux kernel client and a QEMU/KVM driver. CEPH FS: a POSIX-compliant distributed file system, with a Linux kernel client and support for FUSE. LIBRADOS: a library allowing apps to directly access RADOS, with support for C, C++, Java, Python, Ruby, and PHP. RADOS: a reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes
  11. 11. [Diagram: five OSDs, each on a file system (btrfs, xfs, or ext4) on a DISK; three monitors (M)]
  12. 12. [Diagram: HUMAN and three monitors (M)]
  13. 13. Monitors (M): • Maintain cluster map • Provide consensus for distributed decision-making • Must have an odd number • Do not serve stored objects to clients. OSDs: • One per disk (recommended) • At least three in a cluster • Serve stored objects to clients • Intelligently peer to perform replication tasks • Support object classes
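The "odd number" bullet follows from majority arithmetic. A minimal sketch, assuming consensus needs a strict majority of monitors (the function names are illustrative, not part of Ceph):

```python
def quorum_size(monitors):
    """Smallest strict majority of `monitors`."""
    return monitors // 2 + 1


def tolerated_failures(monitors):
    """How many monitors can fail while a majority is still reachable."""
    return monitors - quorum_size(monitors)


# Print monitor count, quorum size, and tolerated failures side by side.
for n in range(1, 8):
    print(n, quorum_size(n), tolerated_failures(n))
```

Under this assumption, going from 3 to 4 monitors adds a machine without tolerating any additional failure, which is why odd counts are the usual choice.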
  14. 14. [Diagram: APP with "??" facing a grid of C/D node pairs]
  15. 15. [Diagram: APP facing a grid of C/D node pairs]
  16. 16. [Diagram: APP looking up "F*" among C/D node pairs grouped by name range: A-G, H-N, O-T, U-Z]
  17. 17. [Diagram: objects (bit patterns) mapped to placement groups by hash(object name) % num pg, then placed by CRUSH(pg, cluster state, rule set)]
  18. 18. [Diagram: bit-pattern objects only; no other text]
  19. 19. CRUSH: • Pseudo-random placement algorithm • Ensures even distribution • Repeatable, deterministic • Rule-based configuration: replica count, infrastructure topology, weighting
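The two-step placement from slides 17 and 19 can be sketched as follows. This is a hedged illustration only: step 2 uses plain rendezvous hashing as a stand-in and is not the real CRUSH algorithm (no rule sets, topology, or weighting), and the object name, OSD names, and `pg_num` value are made up.

```python
import hashlib


def stable_hash(text):
    """Deterministic 64-bit hash (Python's built-in hash() is salted per process)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def object_to_pg(object_name, pg_num):
    """Step 1 from the slide: hash(object name) % num pg."""
    return stable_hash(object_name) % pg_num


def pg_to_osds(pg, osds, replicas):
    """Step 2, simplified: rank OSDs by a hash of (pg, osd) and keep the top few.

    Stand-in for CRUSH: repeatable and pseudo-random, but it ignores
    rule sets, infrastructure topology, and weights.
    """
    ranked = sorted(osds, key=lambda osd: stable_hash(f"{pg}:{osd}"), reverse=True)
    return ranked[:replicas]


osds = [f"osd.{i}" for i in range(6)]          # hypothetical OSD names
pg = object_to_pg("vm-disk-0001", pg_num=128)  # hypothetical object name
placement = pg_to_osds(pg, osds, replicas=3)
print(pg, placement)
```

Because every client can compute the same answer from the object name and the cluster description, no central lookup table is needed, which is the point the "CLIENT ??" slides build toward.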
  20. 20. [Diagram: CLIENT with "??"]
  21. 21. [Figure: no text recovered]
  22. 22. [Diagram: CLIENT with "??"]
  23. 23. [Figure: no text recovered]
  24. 24. [Architecture overview repeated from slide 10: RADOSGW, RBD, CEPH FS, and LIBRADOS on top of RADOS]
  25. 25. [Diagram: APP using LIBRADOS over the native protocol; three monitors (M)]
  26. 26. LIBRADOS (L): • Provides direct access to RADOS for applications • C, C++, Python, PHP, Java • No HTTP overhead
  27. 27. [Architecture overview repeated from slide 10: RADOSGW, RBD, CEPH FS, and LIBRADOS on top of RADOS]
  28. 28. [Diagram: two APPs speaking REST to two RADOSGW instances, each using LIBRADOS over the native protocol; three monitors (M)]
  29. 29. RADOS Gateway: • REST-based interface to RADOS • Supports buckets, accounting • Compatible with S3 and Swift applications
  30. 30. [Architecture overview repeated from slide 10: RADOSGW, RBD, CEPH FS, and LIBRADOS on top of RADOS]
  31. 31. [Diagram: VM in a VIRTUALIZATION CONTAINER using LIBRBD and LIBRADOS; three monitors (M)]
  32. 32. [Diagram: VM between two CONTAINERs, each with LIBRBD and LIBRADOS; three monitors (M)]
  33. 33. [Diagram: HOST using KRBD (KERNEL MODULE) and LIBRADOS; three monitors (M)]
  34. 34. RADOS Block Device: • Storage of virtual disks in RADOS • Allows decoupling of VMs and containers • Live migration! • Images are striped across the cluster • Thin-provisioning • Snapshots and cloning
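A rough model of the "striped" and "thin-provisioning" bullets, assuming an image is cut into fixed-size objects that are only allocated on first write. The `ThinImage` class and the 4 MiB object size are assumptions made for this sketch; it is an in-memory toy, not the librbd API.

```python
OBJECT_SIZE = 4 * 1024 * 1024  # assumed object size (4 MiB)


class ThinImage:
    """Toy block image backed by a sparse map of fixed-size objects."""

    def __init__(self, size_bytes):
        self.size_bytes = size_bytes
        self.objects = {}  # object index -> bytearray, allocated on first write

    def write(self, offset, data):
        if offset < 0 or offset + len(data) > self.size_bytes:
            raise ValueError("write outside image")
        pos = 0
        while pos < len(data):
            # Map the byte offset to (object index, offset inside that object).
            index, start = divmod(offset + pos, OBJECT_SIZE)
            chunk = data[pos:pos + OBJECT_SIZE - start]
            obj = self.objects.setdefault(index, bytearray(OBJECT_SIZE))
            obj[start:start + len(chunk)] = chunk
            pos += len(chunk)

    def read(self, offset, length):
        out = bytearray()
        while length > 0:
            index, start = divmod(offset, OBJECT_SIZE)
            n = min(length, OBJECT_SIZE - start)
            obj = self.objects.get(index)
            # Objects that were never written read back as zeros.
            out += obj[start:start + n] if obj is not None else bytes(n)
            offset += n
            length -= n
        return bytes(out)

    def used_bytes(self):
        return len(self.objects) * OBJECT_SIZE


image = ThinImage(10 * 1024 ** 3)        # 10 GiB virtual size
image.write(OBJECT_SIZE - 2, b"abcd")    # straddles objects 0 and 1
print(image.used_bytes(), image.read(OBJECT_SIZE - 2, 4))
```

In this model the 10 GiB image consumes space for only the two objects that were touched, and each object can live on a different part of the cluster.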
  35. 35. [Diagram: VM in a VIRTUALIZATION CONTAINER using LIBRBD and LIBRADOS; three monitors (M)]
  36. 36. HOW DO YOU SPIN UP THOUSANDS OF VMs INSTANTLY AND EFFICIENTLY?
  37. 37. Instant copy [Diagram: 144, 0, 0, 0, 0 = 144]
  38. 38. [Diagram: CLIENT issues four writes; 144, 4 = 148]
  39. 39. [Diagram: CLIENT issues reads; 144, 4 = 148]
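One way to read the numbers on slides 37 to 39 is as stored-object counts under copy-on-write cloning: a clone starts with no objects of its own, writes land in the clone, and reads of untouched objects fall through to the parent. The sketch below models that reading; the `Image` class is hypothetical, and it skips details of real RBD layering such as cloning from a protected snapshot.

```python
class Image:
    """Toy image: a map of object index -> data, with an optional parent."""

    def __init__(self, objects=None, parent=None):
        self.objects = dict(objects or {})  # objects owned by this image only
        self.parent = parent

    def clone(self):
        """Instant copy: the child references the parent and owns nothing yet."""
        return Image(parent=self)

    def write(self, index, data):
        """Writes always land in this image, never in the parent."""
        self.objects[index] = data

    def read(self, index):
        """Serve from own objects first, then fall through to the parent."""
        if index in self.objects:
            return self.objects[index]
        if self.parent is not None:
            return self.parent.read(index)
        return None


def stored_objects(*images):
    """Total number of objects actually stored across the given images."""
    return sum(len(image.objects) for image in images)


golden = Image({i: b"base" for i in range(144)})
clones = [golden.clone() for _ in range(4)]
after_clone = stored_objects(golden, *clones)   # 144 + 0 + 0 + 0 + 0

for i in range(4):
    clones[0].write(i, b"changed")
after_writes = stored_objects(golden, *clones)  # 144 + 4

print(after_clone, after_writes, clones[0].read(0), clones[1].read(0))
```

Creating the four clones costs no extra objects, and only the four objects that the client overwrites are stored in addition to the original 144.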
  40. 40. Old-style VM image creation [Diagram: Nova compute reads template X from Glance (templates) onto local disk (VM images)] • Ephemeral • Expensive to create
  41. 41. Why use block storage? • Persistent: more familiar to users • Not tied to a single host: decouples compute and storage; enables live migration • Extra capabilities of storage system: efficient snapshots; different types of storage available; cloning for fast restore or scaling
  42. 42. Cinder volume creation [Diagram: Cinder API, Cinder volume, volume driver, Glance (templates). Steps: create image from X; locate X; location of X; read X; reference to X] • Flexibility in where VM images are stored
  43. 43. Efficient volume creation [Diagram: Cinder API, Cinder volume, volume driver, Glance (templates). Steps: create image from X; locate X; location of X; clone X to X (fast CoW clone); complete; reference to X]
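The difference between slides 42 and 43 is a decision made once the image location is known: clone when the template already lives in the same Ceph cluster, otherwise read and copy it. A minimal sketch of that decision, assuming the location is a URL of the form `rbd://<fsid>/<pool>/<image>/<snapshot>`; the function names, the example fsid, and the returned labels are hypothetical, not the Cinder driver's API.

```python
from urllib.parse import unquote


def parse_rbd_location(location):
    """Return (fsid, pool, image, snapshot), or None if the URL has another shape.

    Assumed format: rbd://<fsid>/<pool>/<image>/<snapshot>
    """
    prefix = "rbd://"
    if not location.startswith(prefix):
        return None
    parts = [unquote(part) for part in location[len(prefix):].split("/")]
    if len(parts) != 4 or not all(parts):
        return None
    return tuple(parts)


def plan_volume_creation(location, local_fsid):
    """Pick "clone" (slide 43) or "copy" (slide 42) for a given image location."""
    parsed = parse_rbd_location(location)
    if parsed is not None and parsed[0] == local_fsid:
        return "clone"  # same cluster: fast CoW clone, no data is read
    return "copy"       # anything else: read the template and write a full copy


LOCAL_FSID = "cluster-a"  # hypothetical cluster id
print(plan_volume_creation("rbd://cluster-a/images/template-x/snap", LOCAL_FSID))
print(plan_volume_creation("rbd://cluster-b/images/template-x/snap", LOCAL_FSID))
print(plan_volume_creation("http://example.com/template-x.img", LOCAL_FSID))
```

Falling back to a full copy keeps the flexibility noted on slide 42: templates stored elsewhere still work, they are just slower to turn into volumes.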
  44. 44. Questions? Josh Durgin, josh.durgin@inktank.com, jdurgin on freenode, inktank.com | ceph.com
