New features for Ceph with Cinder and Beyond
Upcoming SlideShare
Loading in...5
×
 

Like this? Share it with your network

Share

New features for Ceph with Cinder and Beyond

on

  • 2,631 views

Ceph block device lead engineer presents his slides to the OpenStack community at ODS spring 2013.

Ceph block device lead engineer presents his slides to the OpenStack community at ODS spring 2013.

Statistics

Views

Total Views
2,631
Views on SlideShare
1,796
Embed Views
835

Actions

Likes
6
Downloads
80
Comments
0

5 Embeds 835

http://www.scoop.it 507
http://www.inktank.com 192
http://dev.gluesys.com 106
http://www.linkedin.com 29
http://plus.url.google.com 1

Accessibility

Categories

Upload Details

Uploaded via as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

New features for Ceph with Cinder and Beyond Presentation Transcript

  • 1. New Features for Ceph with Cinder and Beyond
  • 2. 73
  • 3. 60 Why Ceph? • Low cost • Flexible • Scalable • Open source
  • 4. APP APP HOST/VM CLIENT RADOSGW RBD CEPH FS LIBRADOS A bucket-based REST A reliable and fully- A POSIX-compliant A library allowing gateway, compatible distributed block distributed file apps to directly with S3 and Swift device, with a Linux system, with a Linux access RADOS, kernel client and a kernel client and with support for QEMU/KVM driver support for FUSE C, C++, Java, Python, Ruby, and PHPRADOSA reliable, autonomous, distributed object store comprised of self-healing, self-managing,intelligent storage nodes 79
  • 5. OSD OSD OSD OSD OSD btrfsFS FS FS FS FS xfs ext4DISK DISK DISK DISK DISK M M M 81
  • 6. HUMAN MM M 82
  • 7. Monitors:M •  Maintain cluster map •  Provide consensus for distributed decision-making •  Must have an odd number •  These do not serve stored objects to clients OSDs: •  One per disk (recommended) •  At least three in a cluster •  Serve stored objects to clients •  Intelligently peer to perform replication tasks •  Supports object classes 83
  • 8. C D C D C D C D C DAPP C D C D C D C D C D C D C D
  • 9. C D C D C D C D C DAPP C D C D C D C D C D C D C D
  • 10. C D C D A-G C D C D C D H-N FAPP * C D C D C D O-T C D C D C D U-Z C D
  • 11. 10 10 01 01 10 10 01 11 01 10 hash(object name) % num pg10 10 01 01 10 10 01 11 01 10 CRUSH(pg, cluster state, rule set) 107
  • 12. 10 10 01 01 10 10 01 11 01 1010 10 01 01 10 10 01 11 01 10 108
  • 13. CRUSH•  Pseudo-random placement algorithm•  Ensures even distribution•  Repeatable, deterministic•  Rule-based configuration •  Replica count •  Infrastructure topology •  Weighting 109
  • 14. CLIENT ?? 110
  • 15. 112
  • 16. CLIENT ?? 113
  • 17. 111
  • 18. APP APP HOST/VM CLIENT RADOSGW RBD CEPH FS LIBRADOS A bucket-based REST A reliable and fully- A POSIX-compliant A library allowing gateway, compatible distributed block distributed file apps to directly with S3 and Swift device, with a Linux system, with a Linux access RADOS, kernel client and a kernel client and with support for QEMU/KVM driver support for FUSE C, C++, Java, Python, Ruby, and PHPRADOSA reliable, autonomous, distributed object store comprised of self-healing, self-managing,intelligent storage nodes 84
  • 19. APP LIBRADOS native MM M 85
  • 20. LIBRADOSL •  Provides direct access to RADOS for applications •  C, C++, Python, PHP, Java •  No HTTP overhead
  • 21. APP APP HOST/VM CLIENT RADOSGW RBD CEPH FS LIBRADOS A bucket-based REST A reliable and fully- A POSIX-compliant A library allowing gateway, compatible distributed block distributed file apps to directly with S3 and Swift device, with a Linux system, with a Linux access RADOS, kernel client and a kernel client and with support for QEMU/KVM driver support for FUSE C, C++, Java, Python, Ruby, and PHPRADOSA reliable, autonomous, distributed object store comprised of self-healing, self-managing,intelligent storage nodes 87
  • 22. APP APP RESTRADOSGW RADOSGW LIBRADOS LIBRADOS native M M M 88
  • 23. RADOS Gateway:•  REST-based interface to RADOS•  Supports buckets, accounting•  Compatible with S3 and Swift applications 89
  • 24. APP APP HOST/VM CLIENT RADOSGW RBD CEPH FS LIBRADOS A bucket-based REST A reliable and fully- A POSIX-compliant A library allowing gateway, compatible distributed block distributed file apps to directly with S3 and Swift device, with a Linux system, with a Linux access RADOS, kernel client and a kernel client and with support for QEMU/KVM driver support for FUSE C, C++, Java, Python, Ruby, and PHPRADOSA reliable, autonomous, distributed object store comprised of self-healing, self-managing,intelligent storage nodes 90
  • 25. VMVIRTUALIZATION CONTAINER LIBRBD LIBRADOS M M M 91
  • 26. CONTAINER VM CONTAINER LIBRBD LIBRBD LIBRADOS LIBRADOS M M M 92
  • 27. HOST KRBD (KERNEL MODULE) LIBRADOS MM M 93
  • 28. RADOS Block Device:• Storage of virtual disks in RADOS• Allows decoupling of VMs and containers• Live migration!• Images are striped across the cluster• Thin-provisioning• Snapshots and cloning
  • 29. VMVIRTUALIZATION CONTAINER LIBRBD LIBRADOS M M M 115
  • 30. HOW DO YOU SPIN UPTHOUSANDS OF VMs INSTANTLY AND EFFICIENTLY? 116
  • 31. instant copy144 0 0 0 0 = 144 117
  • 32. write CLIENT write write write144 4 = 148 118
  • 33. read read CLIENT read144 4 = 148 119
  • 34. old-style VM image creationlocal disk Nova Glance(VM images) compute (templates) read X● ephemeral● expensive to create X X 29
  • 35. Why use block storage?• Persistent • More familiar to users• Not tied to a single host • Decouples compute and storage • Enables Live migration• Extra capabilities of storage system • Efficient snapshots • Different types of storage available • Cloning for fast restore or scaling
  • 36. Cinder volume creationCinder Cinder volume Glance API volume driver (templates) create image from X locate X location of X read X X flexibility in where VM images are stored X reference to X 31
  • 37. Efficient volume creationCinder Cinder volume Glance API volume driver (templates) create image from X locate X location of X clone X to X X fast CoW clone X X complete reference to X 32
  • 38. 54 Whats new in Bobtail: Improved OSD threading • Filesystem and journal related-locks are now more fine-grained • Boosted single disk IOPS from 6k to 22k • Restructured how map updates are handled, letting each placement group process them independently
  • 39. 55 Whats new in Bobtail: Recovery QoS • Message priority system reworked to prevent starvation • Recovery operations can be lower priority than client I/O without starving • Requests to access an object can increase recovery priority for that object
  • 40. 56 Whats new in Bobtail: Block Device Cloning • Instantly create new volumes based on templates (snapshots) • Integrated with Cinder in Folsom • Grizzly adds the ability to copy (not clone) non-raw images to RBD
  • 41. 57 Whats new in Bobtail: Keystone Integration • RADOS gateway can talk to keystone to authenticate swift api requests • Let keystone manage your users • Supported by the Ceph juju charm
  • 42. 58 Whats next: Cuttlefish • Incremental backup for block devices • On-disk encryption • REST management API for RADOS gateway • More performance improvements (especially for small I/O) • More! (http://www.inktank.com/about- inktank/roadmap/)
  • 43. 59 Whats next: Dumpling • Geo-replication for RADOS gateway • REST management API for Ceph cluster • ... (virtual) Ceph Developer Summit May 6
  • 44. Questions?Josh Durginjosh.durgin@inktank.comjdurgin on freenodeinktank.com | ceph.com