"Dude, where's my volume? A guide to storage backup, migration, and replication with OpenStack Cinder"
OpenStack Cinder now has a wide variety of options for moving and copying storage volumes, but it's not always clear which API calls are designed for which use cases. In this talk, we'll review the storage management workflows for disaster recovery, performance management, and day-to-day operational maintenance using Ceph as an example storage backend. We'll cover both single-site and multi-site options, for end users and OpenStack administrators alike, so attendees should sleep easier at night knowing how to look after their data.
https://openstacksummitmay2015vancouver.sched.org/event/de8516a550835a338d09634143bed655
2. Today’s Presenters
▪ Neil Levine - Director of Product Management, Ceph, Red Hat
▪ Sean Cohen - Principal Product Manager, OpenStack, Red Hat
▪ Gorka Eguileor - Software Engineer, Cinder & Manila, Red Hat
5. OpenStack Disaster Recovery
▪ DR Configurations:
– Active - Cold standby
– Active - Hot standby
– Active - Active
▪ Different disaster recovery topologies and configurations come with different RPO/RTO levels. Site Topologies:
– Stretched Cluster
– One OpenStack Cluster
– Two OpenStack Clusters
OpenStack Summit May 2015 - Vancouver
6. OpenStack Disaster Recovery
▪ What does disaster recovery for OpenStack involve?
– Capturing the metadata relevant for the protected workloads/resources via component APIs.
– Ensuring that the required VM images are present at the target/destination cloud (limited to a single cluster).
– Replication of the workload data using storage replication, application-level replication, or backup/restore.
8. OpenStack Cinder
▪ Metadata database and volumes
▪ Topology
– HA pairs, but within a single site
– No inherent multi-site/DR architecture
▪ APIs
– Volume Migration API
– Volume Backup API
– Volume Replication API
9. OpenStack Glance
▪ Metadata database and images
▪ Topology
– HA pairs, but within a single site
– No inherent multi-site/DR architecture
▪ APIs
– glance-api
10. OpenStack Nova
▪ Metadata database and volume
▪ Topology
– HA pairs, but within a single site
– No inherent multi-site/DR architecture
▪ Cattle:
– Shouldn’t be backing up ephemeral volumes…
– Put snapshots in Glance if you need them.
16. Can I run a single [OpenStack/Ceph] cluster?
▪ Not recommended
▪ OpenStack not designed for high-latency links
– Possible for campus environments
▪ Ceph not designed for high-latency links
– Possible for campus environments
– Pay attention to monitor placement and read-affinity settings
17. Use-Case #1: User Ctrl-Z
▪ Only duplicate the backup storage cluster:
− 1 OpenStack cluster (i.e. one logical Cinder service)
− 2 Ceph clusters in different physical locations
▪ Undo accidental volume deletion
▪ Uses Cinder Backup service:
− Easy configuration
− Fine granularity
▪ Backups controlled by end-user or cloud admin
18. Use-Case #1: Single “Stretched” Topology
[Diagram: a single Cinder service in Site A; Cinder volumes live on the Site A Ceph RBD cluster, while Cinder-Backup targets a second Ceph RBD cluster in Site B]
19. Use-Case #1: Cinder Backup
▪ Tightly coupled to Cinder Volume
▪ Multiple available backends: RBD, RGW/Swift, NFS…
− Incremental backups by default with RBD
▪ Backup metadata is required to restore volumes
▪ Usage: Horizon, CLI, cinder-client API
▪ Some limitations:
− Single backend
− Individual and manual process
− Only available volumes
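The CLI path mentioned above boils down to a couple of `cinder` client calls. A minimal sketch, building the commands as strings for illustration; the exact flag names (`--volume-id` in particular) are from the Kilo-era python-cinderclient and may differ in later releases:

```python
# Sketch of the cinder CLI calls behind the "user Ctrl-Z" workflow.
# Commands are built as strings here; in practice you would run them
# via the shell or call python-cinderclient directly.

def backup_create(volume, name):
    """Back up an *available* volume (in-use volumes are a noted limitation)."""
    return f"cinder backup-create --name {name} {volume}"

def backup_restore(backup_id, volume_id=None):
    """Restore a backup, optionally onto an existing volume."""
    cmd = f"cinder backup-restore {backup_id}"
    if volume_id:
        cmd += f" --volume-id {volume_id}"
    return cmd

print(backup_create("vol-123", "nightly"))
print(backup_restore("bak-456"))
```

Restoring without `--volume-id` creates a brand-new volume, which is exactly the "undo accidental deletion" case from the previous slide.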
20. Use-Case #1: Cinder Backup’s Future
▪ Next cycle:
− Decoupling from Cinder Volume
− Snapshot backups
− Scheduling
▪ In the meantime: script it
− Automatic multiple backups
− Control backup visibility for users
− Back up in-use volumes
− Limit the number of backups to keep per volume
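The last scripted workaround above, limiting how many backups to keep per volume, reduces to sorting a volume's backups by age and deleting the oldest. A minimal sketch of that pruning logic (the backup records here are hypothetical `(id, created_at)` pairs, not the real cinder-client objects):

```python
# Given the backup records for one volume, decide which backups to
# delete so that only the newest `keep` remain. ISO-8601 timestamps
# sort correctly as plain strings, so no date parsing is needed.

def backups_to_prune(backups, keep=7):
    ordered = sorted(backups, key=lambda b: b[1], reverse=True)  # newest first
    return [backup_id for backup_id, _ in ordered[keep:]]        # prune the rest

backups = [
    ("bak-1", "2015-05-01T02:00:00"),
    ("bak-2", "2015-05-02T02:00:00"),
    ("bak-3", "2015-05-03T02:00:00"),
]
print(backups_to_prune(backups, keep=2))  # → ['bak-1']
```

A wrapper script would feed this from `cinder backup-list` and call `cinder backup-delete` on each returned ID.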
23. Use-Case #2: The Admin Warehouse
▪ One OpenStack cluster
o No OpenStack services in Site B
▪ Two Ceph clusters
▪ Less to deploy in Site B, longer recovery time
▪ Backups controlled by admin, not user
▪ Restore everything in the event of total data loss
▪ Equivalent to a tape backup
24. Use-Case #2: Topology
[Diagram: Cinder and Glance in Site A on a Ceph RBD cluster; "MySQL dump" produces cinder.sql and glance.sql, and "rbd export" copies volume data to a second Ceph RBD cluster in Site B]
25. Use-Case #2: Configuration
▪ mysqldump --databases cinder glance
▪ Automated RBD Export script:
o https://www.rapide.nl/blog/ceph_-_rbd_replication
▪ Limitations:
o No snapshot clean up
o Ensure backups complete in a day
▪ Restore
o Reverse the streams
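The automated export in the linked script amounts to taking a daily RBD snapshot and shipping an incremental `rbd export-diff` against the previous day's snapshot. A rough sketch of the command sequence it generates (pool, image, and snapshot names are illustrative):

```python
# Build the daily export commands for one RBD image.
# The first run has no previous snapshot, so export-diff emits a full diff.

def daily_export_cmds(pool, image, today, yesterday=None):
    snap = f"{pool}/{image}@{today}"
    cmds = [f"rbd snap create {snap}"]
    diff = f"rbd export-diff {snap} {image}-{today}.diff"
    if yesterday:
        diff = f"rbd export-diff --from-snap {yesterday} {snap} {image}-{today}.diff"
    cmds.append(diff)
    # Note the limitation called out above: no snapshot clean-up here;
    # old snapshots must be removed separately (rbd snap rm).
    return cmds

for cmd in daily_export_cmds("volumes", "vol-123", "2015-05-19", "2015-05-18"):
    print(cmd)
```

On the Site B cluster, "reversing the streams" means replaying the diffs in order with `rbd import-diff`, plus restoring cinder.sql and glance.sql into MySQL.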
27. Use-Case #3: The Failover Site
▪ Two OpenStack clusters, two Ceph clusters
▪ Backups controlled by admin
▪ Active/Passive
▪ Use low-level tools to handle backups
o MySQL Replication
o RBD Exports
28. Use-Case #3: Topology
[Diagram: Cinder and Glance deployed in both sites; MySQL replication keeps the Site B Cinder and Glance databases in sync, rbd exports copy the volume data, and the backup Ceph cluster uses the same fsid as Site A]
29. Use-Case #3: Configuration
▪ Replicate the databases to Site B, but do not include the replica in the HA pair
▪ Unlike Active-Active configurations, consistency between the volume data and the databases is not guaranteed.
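The MySQL side of this setup is ordinary master/slave replication restricted to the OpenStack databases. A hedged sketch of the relevant my.cnf fragment on the Site A master (server IDs are assumptions; the database names match the slides, and GTID-based setups differ):

```ini
# Site A master: enable the binary log for the OpenStack databases only
[mysqld]
server-id    = 1
log_bin      = mysql-bin
binlog_do_db = cinder
binlog_do_db = glance
```

The Site B replica needs its own unique `server-id` and is pointed at the master with `CHANGE MASTER TO ...; START SLAVE;`. Keeping it out of the HA pair, as noted above, means a Site A failover never promotes the remote copy by accident.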
30. Use-Case #4: OpenStack Live Disaster Recovery Site
31. Use-Case #4: Topology (Future)
[Diagram: Cinder and Glance in both sites; rbd mirroring keeps the two Ceph RBD clusters in sync (same fsid on the backup cluster), with cinder replication and glance replication handling the service metadata]
32. Use-Case #4: Future Options
▪ Glance-Replicator
o Run Glance in 2nd site and push image copies
33. What’s coming up in Liberty: Cinder Volume Replication V2
▪ Replication between Cinders
o Currently we only have basic replication within a single Cinder deployment.
▪ Consistency Group data replication
o Align the CG design with the volume-replication spec: one CG could support different volume-types, where the volume-type decides which volume replication is going to be created and added to the CG.
34. Summary
▪ Today:
o Simple:
▪ Use-Case #1 - Ctrl-Z
o Medium:
▪ Use-Case #2 - Admin Warehouse
o Advanced:
▪ Use-Case #3 - Active/Passive Infrastructure
▪ Future:
o Use-Case #4 - Active/Passive OpenStack