
Erasure codes and storage tiers on gluster


In this session, we'll discuss new volume types in Red Hat Gluster Storage: erasure-coded volumes and storage tiers, and how they can work together. We will also touch on future directions, including rule-based classifiers and data transformations.

You will learn about:

How erasure codes lower the cost of storage.
How to configure and manage an erasure coded volume.
How to tune Gluster and Linux to optimize erasure code performance.
Using erasure codes for archival workloads.
How to utilize an SSD inexpensively as a storage tier.
Gluster's erasure code and storage tiering design.


Erasure codes and storage tiers on gluster

  1. Erasure Codes and Storage Tiers on Gluster. Dan Lambright, SA summit, Sep 23, 2014
  2. AGENDA
     ● Why erasure codes (ec) in Gluster
     ● How ec works
     ● Brief peek at underlying mathematics
     ● Storage tiering in gluster
     ● Demo
     ● “One more thing”
  3. Why erasure codes in gluster?
     ● Desire protection from double failure
     ● RAID6 controllers are expensive
     ● Imagine a 64 node volume
     ● Each brick on a separate bare metal machine
     ● Cost is 64 x the price of one LSI MegaRAID controller, about $20K in total
  4. Why erasure codes in gluster?
     ● Triplication (3 way replication) is expensive
     ● Two redundant disks for every data disk
     ● 200% overhead! :(
  5. Erasure codes
     ● Store m disks' worth of data on k disks (k > m)
     ● n redundant disks (n = k - m)
     ● Can pick n to choose the failure tolerance
     ● A generalization of RAID6
     ● Distributed across nodes
  6. Overhead analysis
     ● Can also consider mean time before failure

     k (total disks) | n (failures admitted) | m (data disks) | Capacity overhead (n/k) | RAID level
     3               | 1                     | 2              | 33.33%                  | 5
     5               | 1                     | 4              | 20%                     | 5
     6               | 2                     | 4              | 33.33%                  | 6
     7               | 3                     | 4              | 42.86%                  | E
     9               | 1                     | 8              | 11.11%                  | 5
     10              | 2                     | 8              | 20%                     | 6
     11              | 3                     | 8              | 27.27%                  | E
     12              | 4                     | 8              | 33.33%                  | E
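The overhead figures can be reproduced with a few lines of arithmetic. The sketch below is illustrative only; the function names are made up for this example. Note that the table measures overhead as n/k (redundancy as a share of raw capacity), while the "200% overhead" quoted for triplication on slide 4 is n/m (redundancy relative to the data stored).

```python
def capacity_overhead(m: int, n: int) -> float:
    """Redundancy as a fraction of raw capacity: n / k, with k = m + n."""
    return n / (m + n)


def relative_overhead(m: int, n: int) -> float:
    """Redundancy relative to the data stored: n / m."""
    return n / m


# 3-way replication: one data disk plus two redundant copies (m=1, n=2).
print(f"triplication: n/k = {capacity_overhead(1, 2):.2%}, "
      f"n/m = {relative_overhead(1, 2):.0%}")

# A few (m, n) pairs taken from the table.
for m, n in [(2, 1), (4, 2), (4, 3), (8, 3), (8, 4)]:
    print(f"m={m} n={n} k={m + n}: n/k = {capacity_overhead(m, n):.2%}")
```

With m = 8 and n = 3, three disk failures are tolerated at 27.27% of raw capacity, where triplication tolerates only two at 66.67%.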
  7. ERASURE CODES PRIMER
  8. ERASURE CODE TERMS
     ● m data disks
     ● n parity disks
     ● k total number of disks = m + n
     ● Symbol: smallest data unit, w bits
     ● Typically w = 8, i.e. a byte
     ● Chunk (aka fragment): r symbols per disk
     ● Stripe: collection of m + n chunks across k disks
     ● Unit of manipulation for recovery
     ● Also known as a “slice”
  9. ERASURE CODE TERMS
     ● Example: r=6, m=4, n=2, k=6, w=1
     ● (Figure labels: symbol; fragment, e.g. 011010; “Stripe” of 6 fragments)
  10. Systematic
     ● m data chunks, n coding chunks
     ● (can stripe parity and data chunks on the same disk)
     ● Reads are simple, only decode on repairs
     ● (Figure: Slice 1, Slice 2, Slice 3)
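A systematic code is easiest to see with the simplest possible case: single XOR parity (m = 4, n = 1), as in RAID 5. This is only a toy sketch to show why reads need no decoding; it is not the code Gluster uses, and the function names are invented for the example.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length chunks."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data_chunks):
    """Systematic encode: data chunks are stored verbatim, plus one parity chunk."""
    parity = reduce(xor_bytes, data_chunks)
    return list(data_chunks) + [parity]


def repair(stripe, lost):
    """Rebuild the chunk at index `lost` by XOR-ing all surviving chunks."""
    survivors = [chunk for i, chunk in enumerate(stripe) if i != lost]
    return reduce(xor_bytes, survivors)


data = [b"glus", b"ter ", b"ec!!", b"demo"]
stripe = encode(data)

# Reads are simple: the first m chunks are the data itself.
assert stripe[:4] == data
# Decoding work is only needed when a chunk has been lost.
assert repair(stripe, 2) == data[2]
```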
  11. Non-Systematic
     ● All k chunks in a stripe are coded
     ● Do not distinguish data servers from code servers
     ● Encode/decode on writes and reads
     ● (Figure: Slice 1, Slice 2, Slice 3)
  12. Encoding / Decoding Overhead
     ● Network RTT dominates the encode/decode overhead
     ● Packages exist to implement the math
     ● Intel has fast routines for inverse, dot product, encoding, decoding, etc.
     ● Jerasure library from academia
     ● Gluster's is purpose built and fast
  13. GLUSTER IMPLEMENTATION
  14. GLUSTERFS “Disperse Volumes”
     ● Developed by Xavier Hernandez at Datalab corp.
     ● Use case: archiving medical records
     ● Developed over the last 2 years
     ● Now part of gluster upstream
  15. CLI
     Two new options have been added to the 'create' command of the CLI:
       gluster volume create <name> disperse <count> redundancy <count>
     ● Disperse is “k” (the total number of bricks, i.e. disks)
     ● Redundancy is “n”
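As a rough illustration of how the two options fit together, the sketch below assembles the argument list for such a command. The helper name, volume name, and brick paths are hypothetical. The check that the disperse count must exceed twice the redundancy reflects my understanding of upstream Gluster's rule and should be treated as an assumption, as should the trailing brick list, which the slide omits.

```python
def disperse_create_args(name, disperse, redundancy, bricks):
    """Build an argument list for 'gluster volume create' with disperse options.

    disperse is k (total bricks per disperse set); redundancy is n.
    """
    if redundancy < 1:
        raise ValueError("redundancy must be at least 1")
    # Assumed rule: the data bricks (k - n) must outnumber the redundancy.
    if disperse <= 2 * redundancy:
        raise ValueError("disperse count must be greater than 2 * redundancy")
    if not bricks or len(bricks) % disperse != 0:
        raise ValueError("brick count must be a multiple of the disperse count")
    return ["gluster", "volume", "create", name,
            "disperse", str(disperse),
            "redundancy", str(redundancy),
            *bricks]


# Hypothetical 4+2 volume spread over six servers.
bricks = [f"server{i}:/bricks/ec" for i in range(1, 7)]
print(" ".join(disperse_create_args("ecvol", 6, 2, bricks)))
```

The list form can be passed to `subprocess.run` on a node where the gluster CLI is installed.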
  16. “Disperse volumes” design choices
     ● The “symbols” are bytes: w = 8
     ● The fragment size r = 128
     ● Algorithm: Reed-Solomon
     ● Generator matrix: Vandermonde
     ● Non-systematic
     ● Encoding / decoding done on client side
     ● Modeled after AFR
     ● Concurrent writes must be processed in order
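To make these choices concrete, here is a minimal sketch of a non-systematic Reed-Solomon code with a Vandermonde generator matrix over GF(2^8), so w = 8. It is not Gluster's implementation: the field polynomial (0x11d), the evaluation points (1..k), and the one-symbol-per-fragment layout are assumptions chosen to keep the example short.

```python
from itertools import combinations

# GF(2^8) antilog/log tables for the primitive polynomial 0x11d (assumed).
EXP = [0] * 510
LOG = [0] * 256
_x = 1
for _i in range(255):
    EXP[_i] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= 0x11D
for _i in range(255, 510):
    EXP[_i] = EXP[_i - 255]


def gf_mul(a, b):
    """Multiply two field elements; addition in this field is XOR."""
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def gf_inv(a):
    """Multiplicative inverse of a nonzero field element."""
    return EXP[255 - LOG[a]]


def vandermonde(k, m):
    """k x m matrix with rows [1, x, x^2, ...] for distinct points x = 1..k."""
    rows = []
    for x in range(1, k + 1):
        row, power = [], 1
        for _ in range(m):
            row.append(power)
            power = gf_mul(power, x)
        rows.append(row)
    return rows


def encode(data, k):
    """Non-systematic encode: every output symbol is a mix of all m data symbols."""
    out = []
    for row in vandermonde(k, len(data)):
        acc = 0
        for coef, symbol in zip(row, data):
            acc ^= gf_mul(coef, symbol)
        out.append(acc)
    return out


def decode(fragments, m, k):
    """Recover the m data symbols from any m fragments given as {index: symbol}."""
    if len(fragments) < m:
        raise ValueError("need at least m fragments")
    idx = sorted(fragments)[:m]
    v = vandermonde(k, m)
    a = [v[i][:] + [fragments[i]] for i in idx]  # augmented matrix
    for col in range(m):  # Gauss-Jordan elimination over GF(2^8)
        pivot = next(r for r in range(col, m) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        inv = gf_inv(a[col][col])
        a[col] = [gf_mul(inv, val) for val in a[col]]
        for r in range(m):
            if r != col and a[r][col] != 0:
                f = a[r][col]
                a[r] = [val ^ gf_mul(f, p) for val, p in zip(a[r], a[col])]
    return [a[r][m] for r in range(m)]


data = [0x47, 0x4C, 0x55, 0x53]      # m = 4 data symbols
coded = encode(data, 6)              # k = 6, so n = 2
for keep in combinations(range(6), 4):   # any 4 of the 6 fragments suffice
    assert decode({i: coded[i] for i in keep}, 4, 6) == data
```

Because every read has to run `decode`, a non-systematic design pays the arithmetic cost on reads as well as writes, which matches the encode/decode trade-off described on slide 11.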
  17. STORAGE TIERS
  18. Storage Tiers
     ● Different “subvolume” tiers presented as a single volume
     ● HDD, SSD, tape, “persistent memory”, etc.
     ● Plug-in policy describes how data moves between tiers
     ● V1 policy: Cache
     ● Slow and fast tiers
     ● CLI to add/remove cache tier from existing volume
  19. Example: Erasure codes + SSD
     ● User sees one volume
     ● SSD “caches” ec data
     ● (Figure: tiered volume with a hot “cache” tier on SSD and a cold ec tier on HDD; files are promoted to the hot tier and demoted to the cold tier)
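The promote/demote cycle can be modelled in a few lines. The slides do not specify the cache policy, so everything below is an assumed toy policy: the class, the hit threshold, and the "no access during a cycle means demote" rule are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TieredVolume:
    """Toy two-tier volume: a hot (SSD) tier in front of a cold (ec on HDD) tier."""
    promote_hits: int = 3                      # hypothetical promotion threshold
    hot: set = field(default_factory=set)
    cold: set = field(default_factory=set)
    hits: dict = field(default_factory=dict)

    def create(self, name):
        """New files land on the cold tier in this sketch."""
        self.cold.add(name)

    def access(self, name):
        """Count an access; promote a cold file once it is hot enough."""
        self.hits[name] = self.hits.get(name, 0) + 1
        if name in self.cold and self.hits[name] >= self.promote_hits:
            self.cold.remove(name)
            self.hot.add(name)

    def end_cycle(self):
        """Demote hot files that saw no access this cycle, then reset counters."""
        for name in list(self.hot):
            if self.hits.get(name, 0) == 0:
                self.hot.remove(name)
                self.cold.add(name)
        self.hits.clear()

    def tier_of(self, name):
        return "hot" if name in self.hot else "cold"


vol = TieredVolume(promote_hits=2)
vol.create("scan-001.img")
vol.access("scan-001.img")
vol.access("scan-001.img")
print(vol.tier_of("scan-001.img"))   # promoted after the second access
```

The user-facing namespace never changes; only the tier that holds the file does, which is what lets the volume appear as one.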
  20. Future: Data classification (DC)
     ● Add rules to storage graph
     ● Rule determines subvolume
     ● File name
     ● Attribute (size, content)
     ● Etc.
     ● (Figure: rule “Filename = *.lock?” with Yes / No branches leading to the “Secure / Encrypted” and “HDD” subvolumes)
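A rule-based classifier of this kind could look roughly like the sketch below. The rule table, subvolume names, and the choice of which branch maps to which subvolume are assumptions made for illustration; the slide describes future work and does not define a rule format.

```python
from fnmatch import fnmatchcase

# Hypothetical rule table: first matching filename pattern wins.
RULES = [
    ("*.lock", "secure-encrypted"),
]
DEFAULT_SUBVOLUME = "hdd"


def classify(filename: str) -> str:
    """Return the subvolume that a file would be placed on."""
    for pattern, subvolume in RULES:
        if fnmatchcase(filename, pattern):
            return subvolume
    return DEFAULT_SUBVOLUME


for name in ["db.lock", "notes.txt"]:
    print(name, "->", classify(name))
```

Rules on attributes such as size or content would follow the same shape, with a predicate function in place of the filename pattern.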
  21. Future flexibility
     ● Many use cases
     ● Compliance
     ● Multi-tenancy
     ● Rack-aware placement (for performance)
     ● Policies described by language
     ● Arbitrary number of tiers, rules, subvolumes, ...
     ● Template based
  22. DEMO
  23. ONE MORE THING..
  24. Bitrot
     ● A daemon that scans gluster volumes
     ● Finds corrupted data
     ● Digest associated with each file
     ● Alert / recover on mismatch
     ● “Plug-ins” to daemon may do other things..
     ● Tuning parameters to be non-intrusive to performance
     ● Encryption
     ● Compression
     ● Etc.
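The digest-and-compare idea can be sketched as follows. This is a standalone illustration, not the Gluster daemon: the choice of SHA-256, the in-memory signature table, and the function names are assumptions.

```python
import hashlib
import tempfile
from pathlib import Path


def digest(path: Path) -> str:
    """Content digest of one file (SHA-256 chosen for this sketch)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def sign(paths):
    """Record a digest for each file while it is known to be good."""
    return {p: digest(p) for p in paths}


def scrub(signatures):
    """Rescan the files and return those whose content no longer matches."""
    return [p for p, expected in signatures.items() if digest(p) != expected]


with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good.dat"
    bad = Path(tmp) / "bad.dat"
    good.write_bytes(b"archived record")
    bad.write_bytes(b"archived record")

    signatures = sign([good, bad])
    bad.write_bytes(b"archived recorc")      # simulate silent corruption

    corrupted = scrub(signatures)
    print("mismatch:", [p.name for p in corrupted])
```

On a mismatch, a real scrubber would raise an alert or trigger recovery from the volume's redundancy; here the corrupted paths are simply returned.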
  25. Do it!
     ● Learn the math:
     ● http://web.eecs.utk.edu/~plank/plank/papers/FAST-2013-Tutorial.html
     ● Get the bits:
     ● https://forge.gluster.org/disperse
  26. Thank You!
     ● dlambright@redhat.com
     ● RHS: www.redhat.com/storage/
     ● GlusterFS: www.gluster.org
     ● @Glusterorg @RedHatStorage
     ● Slides available on Mojo
