Newest developments in being able to use Bareos for backing up and restoring data of a Gluster or/and CEPH storage cluster. For this we use GFAPI for Gluster and librados/libcephfs (2 plugins) for CEPH. Next to that we also added support for
using Gluster and CEPH as a backing store for storing backup data using the native APIs not a FUSE mount (so without all the additional overhead.)
What do we support now and what is planned. (Currently some is internal development and some is as technology preview in our current bleeding edge code base.)
SpotFlow: Tracking Method Calls and States at Runtime
Open Source Backup Conference 2014: Bakup to and of the cloud, by Marco van Wieringen
1. Sep 23, 2014
OSBCONF 2014
Cloud backup with Bareos
2. OSBCONF 23/09/2014
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
● Content:
– Who am I
– Quick overview of Cloud solutions
– Bareos and Backup/Restore using Cloud Storage
– Bareos and Backup/Restore of Cloud Storage
3. Who am I
● Marco van Wieringen
● Studied datacommunications from 1989-1995
● Author of Linux diskquotas and BSD
accounting.
● Employed as Senior Consultant for 15+ years
● Mostly worked in the banking sector on Solaris
systems (new environments and migrations)
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
4. Who am I (2)
● Active with Bacula since 2008
● Forked the codebase in 2012
● #3 independent contributor to Bacula
● One of 5 founders of Bareos GmbH & Co KG
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
5. Cloud storage landscape
● Started looking into the cloud storage
landscape in February 2014 after FOSDEM
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
● Two aspects:
– Using as backup storage
– Backup content of Cloud storage
● No particular favorite
● Performance important
● Only looked at API and usability
6. Cloud storage landscape (2)
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
● GlusterFS
● CEPH
● Amazon S3
● Swift
● Commercial OpenSource like RHS
– Now GlusterFS
– Future CEPH ?
7. GlusterFS
● POSIX-Like Distributed File System
● No Metadata Server
● Network Attached Storage (NAS)
● Heterogeneous Commodity Hardware
● Aggregated Storage and Memory
● Standards-Based – Clients, Applications,
Networks
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
8. GlusterFS (2)
● Flexible and Agile Scaling
– Capacity – Petabytes and beyond
– Performance – Thousands of Clients
● Single Global Namespace
● Run fully in userspace
● Next to network connectivity also support for
InfiniBand.
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
9. Data Access Overview
● GlusterFS Native Client
– Filesystem in Userspace (FUSE)
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
● NFS
– Built-in Service
● SMB/CIFS
– Samba server required
● Unified File and Object (UFO)
– Simultaneous object-based access
10. Use as backup storage
● Using FUSE Gluster volume can be mounted
as POSIX filesystem and used as file storage
today.
● A faster datapath is wanted to integrate into
the storage server so we bypass the FUSE
layer.
● Linus stake on FUSE is that its used for toy
filesystems and not production.
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
11. Data Access via FUSE
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
12. Data Access via GFAPI
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
13. CEPH
● Most interesting things already said by
previous speaker.
● Distributed Storage System
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
– Large scale
– Fault tolerant
● No single point of failure
● Commodity hardware
– Self-managing, self-healing
14. CEPH (2)
● Unified storage platform
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
– Objects
● Rich native API
● Compatible RESTful API
– Block
● thin provisioning, snapshots, cloning
– File
● strong consistency, snapshots
15. Data Access Overview
AAPPPP AAPPPP HHOOSSTT/V/VMM CCLLIEIENNTT
LIBRADOS
A library allowing
apps to directly
access RADOS,
with support for
C, C++, Java,
Python, Ruby,
and PHP
RADOSGW
A bucket-based REST
gateway, compatible
with S3 and Swift
RADOS – the Ceph object store
RADOS – the Ceph object store
RBD
A reliable and fully-distributed
block
device, with a Linux
kernel client and a
QEMU/KVM driver
CEPH FS
A POSIX-compliant
distributed file
system, with a Linux
kernel client and
support for FUSE
A reliable, autonomous, distributed object store comprised of self-healing, self-managing,
intelligent storage nodes
A reliable, autonomous, distributed object store comprised of self-healing, self-managing,
intelligent storage nodes
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
LIBRADOS
A library allowing
apps to directly
access RADOS,
with support for
C, C++, Java,
Python, Ruby,
and PHP
RBD
A reliable and fully-distributed
block
device, with a Linux
kernel client and a
QEMU/KVM driver
CEPH FS
A POSIX-compliant
distributed file
system, with a Linux
kernel client and
support for FUSE
RADOSGW
A bucket-based REST
gateway, compatible
with S3 and Swift
16. Data Access Overview (2)
● Filesystem on top of RBD (Rados Block
Device)
● CEPH FS via either FUSE or native kernel
module.
● RADOSGW (S3/Swift)
● Librados (direct low level access)
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
17. Data Access Overview (3)
AAPPPP AAPPPP HHOOSSTT/V/VMM CCLLIEIENNTT
LIBRADOS
A library allowing
apps to directly
access RADOS,
with support for
C, C++, Java,
Python, Ruby,
and PHP
LIBRADOS
A library allowing
apps to directly
access RADOS,
with support for
C, C++, Java,
Python, Ruby,
and PHP
RBD
RBD
A reliable and fully-distributed
A reliable and fully-distributed
block
block
device, with a Linux
kernel client and a
QEMU/KVM driver
device, with a Linux
kernel client and a
QEMU/KVM driver
RADOSGW
RADOSGW
A bucket-based REST
gateway, compatible
with S3 and Swift
A bucket-based REST
gateway, compatible
with S3 and Swift
CEPH FS
CEPH FS
A POSIX-compliant
distributed file system,
with a Linux kernel
client and support for
FUSE
A POSIX-compliant
distributed file system,
with a Linux kernel
client and support for
FUSE
NEARLY
AWESOME
AWESOME AWESOME
AWESOME
AWESOME
A reliable, autonomous, distributed object store comprised of self-healing, self-managing,
intelligent storage nodes
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
RADOS
RADOS
A reliable, autonomous, distributed object store comprised of self-healing, self-managing,
intelligent storage nodes
18. Use as backup storage
● Create filesystem on a RDB device and use as
filestorage.
● CEPHFS via FUSE or native kernel module as
filestorage.
● Native storage using librados
● Native storage using libcephfs
● Last two preferable.
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
19. Object Storage
● Multiple vendors
● Most API's for new style languages
(PHP/Python/Java)
● HTTP/HTTPS based protcol
● Only able to PUT full new object not
appendable
● Most implementations use CURL as lowlevel
library
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
20. Object Storage (2)
● Most promising library for native support
seems to be libdroplet of Scality
– Abstraction for different object store providers
– Native C/C++ API
– POSIX file storage implemented but unusable due
to the low level protocol not supporting POSIX
operations.
– Missing a local caching layer for file operations.
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
21. Use as backup storage
● Use S3FS(C++) or S3QL (Python) and mount
via FUSE.
● Use S3backer which has some interesting
caching but is only one file.
● Caution with costs because of storage and
upload/download costs involved with Amazon
S3.
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
22. Cloud Storage and Bareos
● Use cloud storage as backing store
– Storage Daemon can save data to cloud
– Already possible today with filesystem abstraction.
● Backup content of the Cloud
– Backing up full cloud probably not feasible
– Subset of data can already be backup without
support in Bareos today.
– Doesn't use specific features available.
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
23. Cloud Storage and Bareos
● Cloud storage as backing store for data
– Close integration with Storage Daemon
– Shortest low level path for storage
– Minimum number of data copying
– For now POSIX interface only
– No data format changes just save data as normal
disk based volume format.
– Future may bring other formats and other clever
tricks.
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
24. Storage Daemon Changes
● Support for multiple storage backends in
Storage Daemon.
● Abstraction of different type of devices
● Seperate packaging of different backends
● Analog to the already existing database
backends dynamically loaded on reference in
the config.
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
25. Storage Daemon Changes
● Available as technology preview in 14.2.1
– File backend (always compiled in)
– Tape backend (in separate package)
– Fifo backend
– GFAPI backend (Gluster)
– RADOS backend (CEPH)
– CEPHFS backend (CEPH)
– OBJECT backend (S3/Swift using droplet)
● Not working needs more work on droplet library.
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
26. Storage Daemon Changes
● Media stored on CEPH and Gluster have
same format as normal file based volumes
● By using normal tools of CEPH (rados get)
and gluster (mount glusterfs) the file volumes
can be copied to normal disk storage and
used as file storage.
● Replication of backup data can be done using
native tools of Gluster (Geo Replication) etc.
● Possibly in future allow read-only devices to
access data in remote datacenter
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
27. Future of Cloud backends
● Work out Object Storage plugin
– Probably some chunking method
– Local caching
– Seems to support blobs
– Elasto library may fit
(https://code.google.com/p/elasto/)
Bacula is a registered trademark of Kern Sibbald
● Windows Azure
Bareos is a registered trademark of Bareos GmbH & Co. KG
28. Cloud Storage and Bareos
● Backup content of the Cloud
– Due to dependecies implement as Plugins
– Plugin interface as exists today to limited to write
proper backup and restore filesystem like data
– Implement cloud backup and restore functionality
in File Daemon
– Reuses configure logic added for Storage Daemon
Backends
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
29. File Daemon Changes
● Generic Hardlink handling also for plugins
● ACL Backup and Restore handling for plugins
● XATTR Backup and Restore handling for
plugins
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
30. File Daemon Changes
● Hardlink code refactored and available in
master branch.
– Dropped duplicate hash table code
– Reused generic htable for all hash tables
● ACL and XATTR extensions for plugin mostly
coded and on its way for review and merge
into master tree. (Including Python plugin
code)
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
31. RADOS FD Plugin
● Initial coding done.
– Uses librados
– Uses enumerator to list objects in certain RADOS
pool
– Snapshots rados pool before starting backup
– Early stage of testing
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
32. GFAPI FD Plugin
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
● For GlusterFS
● Initial coding done
– Uses Glusters GFAPI
– POSIX API
– First phase only regular files
● Currently scans filesystem for changed files
– Analog to what the normal filedaemon does
– Including support for accurate etc.
● Future maybe support for changelog tracking
33. CEPHFS FD Plugin
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
● For CEPHFS
● Initial coding done
– Uses CEPH libcephfs
– POSIX API (almost the same as GFAPI)
– First phase only regular files
● Currently scans filesystem for changed files
– Analog to what the normal filedaemon does
– Including support for accurate etc.
34. Future of Cloud FD plugins
● Merge new code after further testing.
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
● Beta program
● Get some sponsoring
● When we work out object storage as backend
maybe create a object store FD plugin.
35. Questions ?
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG
36. Links
● https://code.google.com/p/elasto/
● https://code.google.com/p/s3fs/
● https://fedorahosted.org/s3fs/
● https://code.google.com/p/s3backer/
● https://github.com/scality/Droplet
● https://forge.gluster.org/
● http://ceph.com/
Bacula is a registered trademark of Kern Sibbald
Bareos is a registered trademark of Bareos GmbH & Co. KG