CEPH
STORAGE
Architecture & Administration
By Liam Dao
01 Preparing for Red Hat Ceph Storage
Chapter Overview
Goal ● Identify challenges faced by traditional storage and
explain how Ceph addresses them.
Objectives ● Summarize the challenges faced by traditional
storage solutions and explain the use case for
software-defined storage in general and Ceph in
particular.
● Describe the architecture of Red Hat Ceph Storage,
explain how it distributes and organizes data, and
list the methods that clients can use to access that
data.
Data Storage Challenges
1. Data Keeps Growing
2. Integration of Emerging Technologies
3. Legacy Storage Infrastructure
4. Responsiveness
5. Performance, Resilience, and DR
6. Opportunity Cost
7. Different Types of Storage
Software Defined Storage: Before SDS (diagram)
Software Defined Storage: After SDS (diagram)
Ceph Storage
Definition
-> An open source, petabyte-scale distributed storage
system, used primarily for object-based, block-based, and
file-based storage
Ceph Storage
Design goals:
● Be scalable for every component
● Provide no single point of failure
● Be software-based (not an appliance) and open
source (no vendor lock-in)
● Run on readily available hardware
● Be self-managed wherever possible, minimizing
user intervention
Ceph Storage
Use cases:
● Storing images and virtual block device storage
for an OpenStack environment (using Glance,
Cinder, and Nova)
● Applications that use standard APIs to access
object-based storage
● Persistent storage for containers
● Rich media applications
Ceph Storage
Features:
● Multisite and disaster recovery options
● Flexible storage policies
● Data durability via erasure coding or replication
● Deployment in a containerized environment
Architecture
RADOS
-> the Ceph storage back end
-> based on four types of daemons:
● Monitors (MONs)
● Object Storage Devices (OSDs)
● Managers (MGRs)
● Metadata Servers (MDSs)
RADOS Daemons
● MONs (Monitors): maintain maps of the cluster state and are used
to help the other daemons coordinate with each other
● OSDs (Object Storage Devices): store data and handle data
replication, recovery, and rebalancing
● MGRs (Managers): keep track of runtime metrics and expose
cluster information
● MDSs (Metadata Servers): store metadata used by CephFS
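To make the daemon roles concrete, here is a minimal sketch (an assumption: the python-rados bindings are installed and /etc/ceph/ceph.conf plus a client keyring are available) in which a client contacts the MONs to join the cluster and then reads back basic cluster state:

import rados

# Connecting reads ceph.conf, contacts the monitors, and pulls the current
# cluster maps (requires a valid client keyring).
cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()

print("cluster FSID:", cluster.get_fsid())
print("usage:", cluster.get_cluster_stats())   # kb, kb_used, kb_avail, num_objects
print("pools:", cluster.list_pools())

cluster.shutdown()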
Access Methods
Ceph Native API
(librados)
-> allows applications to work
directly with RADOS to access objects
stored by the Ceph cluster
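A minimal librados sketch, assuming the python-rados bindings and a hypothetical existing pool named 'mypool':

import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()

# An I/O context is bound to a single pool; all object operations go through it.
ioctx = cluster.open_ioctx('mypool')

# Store an object (RADOS addresses it purely by name) and read it back.
ioctx.write_full('greeting', b'hello from librados')
print(ioctx.read('greeting'))

ioctx.close()
cluster.shutdown()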
Access Methods
Ceph Block Device
(RBD)
-> Provides block storage within a
Ceph cluster through RBD images.
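A sketch with the python rbd bindings (assumptions: python-rbd is installed and a pool named 'rbd' exists) that creates a 1 GiB image and writes to it:

import rados
import rbd

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('rbd')                    # RBD images are stored inside a RADOS pool

rbd.RBD().create(ioctx, 'demo-image', 1 * 1024**3)   # 1 GiB, thin-provisioned

image = rbd.Image(ioctx, 'demo-image')
image.write(b'block data', 0)                        # write at byte offset 0
print(image.size())
image.close()

ioctx.close()
cluster.shutdown()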
Access Methods
Ceph Object Gateway
(RADOS Gateway - RGW)
-> an object storage interface that
provides applications with a RESTful API
gateway (compatible with the Amazon S3
and OpenStack Swift APIs)
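Because RGW exposes an S3-compatible REST API, any S3 client can talk to it. A sketch with boto3; the endpoint and credentials below are placeholders for a gateway user (created, for example, with radosgw-admin):

import boto3

s3 = boto3.client(
    's3',
    endpoint_url='http://rgw.example.com:8080',   # hypothetical RGW endpoint
    aws_access_key_id='ACCESS_KEY',
    aws_secret_access_key='SECRET_KEY',
)

s3.create_bucket(Bucket='demo-bucket')
s3.put_object(Bucket='demo-bucket', Key='hello.txt', Body=b'hello via RGW')
print(s3.get_object(Bucket='demo-bucket', Key='hello.txt')['Body'].read())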
Access Methods
Ceph File System
(CephFS)
-> a parallel file system that provides
a scalable, single-hierarchy shared
disk
Access Methods
Four access methods:
● Ceph Native API
● Ceph Object Gateway
● Ceph Block Device
● Ceph File System
Data Distribution & Organization
Pools
-> Logical partitions of the Ceph
storage cluster
-> Each pool is assigned a number of
hash buckets called placement groups (PGs)
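A small pool-handling sketch through python-rados (note: create_pool here falls back to the cluster's default PG count; in practice pg_num is usually chosen per pool, for example via the ceph CLI or the PG autoscaler):

import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()

# Pools are logical partitions of the cluster; every object lives in exactly one pool.
if not cluster.pool_exists('demo-pool'):
    cluster.create_pool('demo-pool')

print(cluster.list_pools())
cluster.shutdown()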
Data Distribution & Organization
Placement Group (PG)
-> aggregates a series of objects
into a hash bucket, or group, and is
mapped to a set of OSDs
CRUSH algorithm
-> Used to select the OSDs hosting the
data for a pool
Data Distribution & Organization
Mapping an Object to Its Associated OSDs:
1. The Ceph client gets the latest copy of the cluster map from a monitor
2. The client calculates the PG ID for the object
○ derived from the object's name and its storage pool
3. The CRUSH algorithm determines the acting set
○ Acting set: the OSDs responsible for a PG
○ The first OSD in the acting set is the current primary OSD; all the others are
secondary OSDs
○ The Ceph client can then work directly with the primary OSD to access the
object
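The mapping can be illustrated with a deliberately simplified sketch. Real Ceph uses the rjenkins hash and the full CRUSH algorithm over the cluster map; this toy version only mimics the shape of the calculation (stable hashing of the object name into a PG, then a deterministic PG-to-OSD placement whose first member is the primary):

import hashlib

def pg_for_object(obj_name: str, pool_id: int, pg_num: int) -> str:
    # Toy stand-in for the object->PG step (Ceph uses rjenkins, not MD5).
    h = int(hashlib.md5(obj_name.encode()).hexdigest(), 16)
    return f"{pool_id}.{h % pg_num:x}"            # formatted like a real PG ID, e.g. "2.3f"

def acting_set(pg_id: str, osds: list[int], size: int = 3) -> list[int]:
    # Toy stand-in for CRUSH: a deterministic, pseudo-random pick of `size` OSDs.
    ranked = sorted(osds, key=lambda osd: hashlib.md5(f"{pg_id}:{osd}".encode()).hexdigest())
    return ranked[:size]

pg = pg_for_object("myobject", pool_id=2, pg_num=128)
osds = acting_set(pg, osds=list(range(12)))
print(f"PG {pg} -> acting set {osds}, primary = osd.{osds[0]}")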
Next Chapter
Chapter 2: Deploying Red Hat Ceph Storage
Goal ● Deploy and expand a new Red Hat Ceph Storage
cluster
Objectives ● Plan a Red Hat Ceph Storage deployment based on the
software's prerequisites.
● Describe supported configurations for Red Hat Ceph
Storage.
● Deploy a Red Hat Ceph Storage cluster using Ansible.
● Add OSDs to nodes in an existing cluster to increase
capacity and performance.
