Bricks and Translators:
GlusterFS as SWIFT replacement in OpenStack
Dr. Udo Seidel
Linux-Strategy @ Amadeus
Agenda
● Introduction
● High level overview
● Storage inside
● Team up with OpenStack
● Summary

GlusterFS Day London 2013
Introduction

Me ;-)
● Teacher of mathematics & physics
● PhD in experimental physics
● Started with Linux in 1996
● Linux/UNIX trainer
● Solution engineer in HPC and CAx environments
● Head of the Linux Strategy team @ Amadeus

Distributed Storage Systems
● 'Recent' attention on distributed storage
  – Cloud hype
  – Big Data
● See also
  – CEPH
  – XtreemFS
  – FhGFS
  – ...

Distributed storage:
Paradigm changes
● Block -> Object
● Central -> Distributed
  – Few -> Many
  – Big -> Small
● Server <-> Storage

Distributed storage – Now what?!?
● Several implementations
● Different functions
● Support models
● Storage vendor initiatives
● Relation to Linux distributions

Here and now ==> GlusterFS

High level overview

History
● Gluster founded in 2005
● Gluster = GNU + cluster
● Acquisition by Red Hat in 2011
● Community project
  – 3.2 in 2011
  – 3.3 in 2012
  – 3.4 in 2013
● Commercial product: Red Hat Storage Server

The Client
● Native
  – 'speaks' GLUSTERFS
  – FUSE-based
  – Not part of the Linux kernel
● NFS
  – Normal NFS client stack
● libgfapi (since 3.4)

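The two client paths above can be sketched as mounts; server and volume names here are made-up examples:

```shell
# Native (FUSE-based) client, speaking the GlusterFS protocol
mount -t glusterfs server1:/repl-vol /mnt/gluster

# Plain NFSv3 client stack against the built-in NFS server
mount -t nfs -o vers=3 server1:/repl-vol /mnt/nfs
```
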
The Server
● Data
  – Translators
  – Bricks
  – Volumes -> exported/served to the client
● Meta-data
  – No dedicated instance
  – Distributed hashing approach

The 'traditional' picture

Storage inside

The Brick
● Trust each other (trusted storage pool)
● Interconnect
  – TCP/IP and/or RDMA/InfiniBand
● Dedicated file systems on the GlusterFS server
  – XFS recommended, EXT4 works too
  – Extended attributes a must
● Two main processes/daemons
  – glusterd and glusterfsd

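A rough sketch of setting up bricks and the trusted pool; hostnames, devices, and paths are made-up examples:

```shell
# On server1: add server2 to the trusted storage pool
gluster peer probe server2
gluster peer status

# A brick is a dedicated local file system; XFS is recommended,
# and extended-attribute support is mandatory
mkfs.xfs /dev/sdb1
mount /dev/sdb1 /export/brick1
```
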
The Translator
● One per purpose
  – Replication
  – POSIX
  – Quota
  – I/O behaviour
  – ...
● Chained -> brick graph
● Technically: configuration

The Volume
● Service unit
● Layer of configuration
  – distributed, replicated, striped, ...
  – NFS
  – Cache
  – Permissions
  – ...

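Creating volumes of the types discussed above might look like this; server names and brick paths are made-up examples:

```shell
# Distributed volume: files are spread across the bricks
gluster volume create dist-vol server1:/export/brick1 server2:/export/brick1

# Replicated volume: every file is mirrored on both bricks
gluster volume create repl-vol replica 2 \
    server1:/export/brick2 server2:/export/brick2

gluster volume start repl-vol
gluster volume info repl-vol
```
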
The Distributed Volume

The Replicated Volume

The Distributed-Replicated Volume

And more ...
● Striped
● Striped-Distributed
● Striped-Replicated
● ...

Meta Data
● 2 kinds
  – More of local file system style
  – Related to distributed nature
● Some stored in backend file system
  – Time stamps
  – Permissions
  – Distribution/replication
● Some calculated on the fly
  – Brick location

Elastic Hash Algorithm
● Based on file names & paths
● Name space divided
● Full brick handled via relinking
● Stored in extended attributes
● Client needs to know topology

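The hash ranges of the divided name space can be inspected on a brick's backend file system; the path below is a made-up example:

```shell
# Extended attributes on a brick directory hold the DHT layout and GFID
getfattr -m . -d -e hex /export/brick1/somedir
# Expect entries such as trusted.glusterfs.dht (the hash range owned
# by this brick) and trusted.gfid (the cluster-wide file/dir ID)
```
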
Distributed Hash Tables

Self-Healing
● On demand vs. scheduled
● File-based
● Based on extended attributes
● Split-brain
  – Quorum function
  – Sometimes: manual intervention

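On-demand healing is driven from the CLI; the volume name is a made-up example:

```shell
# Heal only the files that are flagged as needing it
gluster volume heal repl-vol

# Crawl the whole volume and heal everything
gluster volume heal repl-vol full

# Inspect what is still pending, including split-brain victims
gluster volume heal repl-vol info
gluster volume heal repl-vol info split-brain
```
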
Geo replication
● Asynchronous
● Based on rsync/ssh
● Master-slave
● If needed: cascading
● One-way street
● Clocks in sync!
● Coming: parallel geo-replication

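Starting a one-way master-to-slave session might look like this; volume and host names are made-up examples, and the exact slave URL syntax varies between releases:

```shell
# Replicate master-vol asynchronously to a volume on the slave side
gluster volume geo-replication master-vol slave-host::slave-vol start
gluster volume geo-replication master-vol slave-host::slave-vol status
```
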
From files to objects
● Introduced with version 3.3
● Hard links with some hierarchy
  – Re-uses GFID (inode number)
● UFO
  – Unified File and Object
  – Combination with RESTful API
  – S3- and Swift-compatible

Operations:
Growth, shrinkage ... failures
● A must!
● Easy
● Rebalance!
● Backup?!?
● Order of servers important

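Growing and shrinking boil down to a few CLI steps; server names and paths are made-up examples (for a replica-2 volume, bricks are added and removed in pairs):

```shell
# Growing: add a brick pair, then spread existing data onto it
gluster volume add-brick repl-vol server3:/export/brick1 server4:/export/brick1
gluster volume rebalance repl-vol start
gluster volume rebalance repl-vol status

# Shrinking: drain the bricks before finally removing them
gluster volume remove-brick repl-vol \
    server3:/export/brick1 server4:/export/brick1 start
```
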
What else ...?
● Encryption :-|
● Compression :-(
● Snapshots :-(
● Hadoop connector :-)
● Locking granularity :-|
● File system statistics :-)
● Monitoring :-(

Team up with OpenStack

OpenStack
● Infrastructure as a Service (IaaS)
● 'Open-source version' of AWS :-)
● New versions every 6 months
  – Previous (2013.1) is called Grizzly
  – Current (2013.2) is called Havana
● Managed by the OpenStack Foundation

OpenStack Architecture

OpenStack Components
● Keystone – identity
● Glance – image
● Nova – compute
● Cinder – block storage
● Swift – object storage
● Quantum – network
● Horizon – dashboard

About Swift
● Replaces Amazon S3
  – Scalable
  – Redundant
● OpenStack object store
  – Proxy
  – Object
  – Container
  – Account
  – Auth

Why GlusterFS in the first place?
● Scale-out storage
● HA + self-healing
● Easy integration on O/S level
● Already exposed to similar workloads

Replacing Swift – a story in 3 acts
● Sneaking in
● Step in and throw out
● Teaming up

Replacing Swift – the 'lazy' way
● Before 3.3
● Just mounting
● Works for Glance too ...
● ... and even for Cinder

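The 'lazy' way is literally a mount under the service's data directory; server and volume names are made-up examples, the target paths are the usual Glance/Cinder defaults:

```shell
# Put the services' backing store on GlusterFS volumes
mount -t glusterfs server1:/glance-vol /var/lib/glance/images
mount -t glusterfs server1:/cinder-vol /var/lib/cinder/volumes
```
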
Replacing Swift – the next version
● With 3.3
● Changes to Swift code
● UFO (Unified File and Object)
● Mainly proxy server
● Helper tools
● Meta data -> extended attributes
● One volume per tenant

Replacing Swift – limitations
● Dependent on Swift release
● Missing/late new development
● Packaging

Replacing Swift – Gone!
● Since 3.4 and Grizzly
● Red Hat a big player in OpenStack
● UFO renamed to G4O

And Cinder?
● Block storage (since Folsom)
● Integration similar to Swift
  – Mounting
● OR, since Grizzly & GlusterFS 3.4:
  – /etc/cinder/cinder.conf
  – /etc/cinder/shares.conf
  – Still a FUSE-mounted volume

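The two configuration files above might be wired up as follows; the server and volume name are made-up examples:

```ini
# /etc/cinder/cinder.conf (excerpt): select the GlusterFS driver
# and point it at the list of shares
volume_driver=cinder.volume.drivers.glusterfs.GlusterfsDriver
glusterfs_shares_config=/etc/cinder/shares.conf

# /etc/cinder/shares.conf: one GlusterFS volume per line, e.g.
#   server1:/cinder-vol
```
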
What else?
● Storage for VM images (Glance)
  – 'Lazy' mode -> one volume
  – Better way with future versions
● NAS type

New with Havana release
● Glance can point to the Cinder interface
● Cinder can use libgfapi
  – Disk-space savings
  – Performance
● Nova integration via libgfapi
  – Performance
  – QEMU-assisted snapshotting

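With libgfapi, QEMU can talk to a volume directly over the gluster:// protocol, with no FUSE mount in the path; host, volume, and image names are made-up examples:

```shell
# Create and inspect a qcow2 disk image directly on a GlusterFS volume
qemu-img create -f qcow2 gluster://server1/nova-vol/disk.qcow2 10G
qemu-img info gluster://server1/nova-vol/disk.qcow2
```
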
Down the road
● Manila
  – OpenStack's shared file system service
  – Different possible frontends
● Savanna
  – Elastic Hadoop on OpenStack
  – Analytics as a Service

Reviewed: Why GlusterFS?
● Previous arguments still valid
● One way to cover different storage entities
● Modular usage
● Separation of duties possible
● Co-location with other/foreign workloads

Summary

Take aways
● Thin distributed file system layer
● Modular architecture
● Operationally ready
● Good integration in OpenStack
● Active development and community

References
● http://www.gluster.org
● http://www.sxc.hu (pictures)

Thank you!
