Lt2013 glusterfs.talk

Bricks and Translators - The distributed file system made by Red Hat (spring 2013)


  1. Bricks and Translators: The distributed file system made by Red Hat
     Dr. Udo Seidel, Linux-Strategy @ Amadeus
  2. LinuxTag 2013. To my Mum
  3. Agenda
     ● Introduction
     ● High level overview
     ● Storage inside
     ● Use cases
     ● Summary
  4. Introduction
  5. Me ;-)
     ● Teacher of mathematics & physics
     ● PhD in experimental physics
     ● Started with Linux in 1996
     ● Linux/UNIX trainer
     ● Solution engineer in HPC and CAx environment
     ● Head of the Linux Strategy team @ Amadeus
  6. Storage: History
     ● Reviewing storage task responsibilities
     ● Block allocation
     ● Space management
     ● Extension of SCSI standard
     ● Object based storage
     ● Meta-data handling separated from data management
  7. Object based storage
     ● Storage objects quite general
     ● Partition, file, ...
     ● Unique identifier
     ● OSD (Object based Storage Device)
     ● Hardware -> original trigger
     ● Software -> common implementation
     ● Main component of distributed file systems
  8. Distributed storage: Paradigm changes
     ● Block -> Object
     ● Central -> Distributed
     ● Few -> Many
     ● Big -> Small
     ● Server <-> Storage
  9. Distributed File Systems
     ● Recent attention on distributed storage
     ● Cloud hype
     ● Big Data
     ● See also Ceph talks
  10. Distributed storage – Now what?!?
     ● Several implementations
     ● Different functions
     ● Support models
     ● Storage vendors' initiatives
     ● Relation to Linux distributions
     Here and now ==> GlusterFS
  11. High level overview
  12. History
     ● Gluster founded in 2005
     ● Gluster = GNU + cluster
     ● Acquisition by Red Hat in 2011
     ● Community project
     ● 3.2 in 2011
     ● 3.3 in 2012
     ● Commercial product: Red Hat Storage Server
  13. The Client
     ● Native
     ● Speaks GLUSTERFS
     ● Not part of the Linux kernel
     ● FUSE-based
     ● NFS
     ● Normal NFS client stack
     ● S3/Swift compatible
     ● Proxy needed
  14. The Server
     ● Data
     ● Bricks
     ● Translators
     ● Volumes -> exported/served to the client
     ● Meta-data
     ● No dedicated instance
     ● Distributed hashing approach
  15. The picture
  16. Storage inside
  17. The Brick
     ● Trust each other
     ● Interconnect
     ● TCP/IP and/or RDMA/Infiniband
     ● Dedicated file systems on GlusterFS server
     ● XFS recommended, EXT4 works too
     ● Extended attributes a must
     ● Two main processes/daemons
     ● glusterd and glusterfsd
  18. The Translator
     ● One per purpose
     ● Replication
     ● POSIX
     ● Quota
     ● I/O behaviour
     ● Chained -> brick graph
     ● Technically: configuration
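The "chained -> brick graph" idea on slide 18 can be illustrated with a toy model. This is only a sketch and not GlusterFS code: the `Translator` class, `build_graph` helper and the three translator names are hypothetical, chosen for illustration. Each translator handles one purpose and hands the request on to the next one in the chain.

```python
class Translator:
    """Toy translator: records that it saw the request, then passes it on."""

    def __init__(self, name, next_translator=None):
        self.name = name
        self.next = next_translator

    def write(self, path, data, trace):
        trace.append(self.name)  # stand-in for this translator's real work
        if self.next is not None:
            return self.next.write(path, data, trace)
        return len(data)  # last translator in the chain "stores" the data


def build_graph(names):
    """Chain translators so that the first name sits at the top of the graph."""
    graph = None
    for name in reversed(names):
        graph = Translator(name, graph)
    return graph


# Hypothetical chain: caching on top, quota in the middle, POSIX at the bottom.
graph = build_graph(["io-cache", "quota", "posix"])
trace = []
written = graph.write("/docs/a.txt", b"hello", trace)
print(trace, written)
```

Because the chain is built from a plain list of names, changing the order or adding a step is purely a matter of configuration, which matches the "Technically: configuration" bullet.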
  19. The Volume
     ● Service unit
     ● Layer of configuration
     ● Distributed, replicated, striped, ...
     ● NFS
     ● Cache
     ● Permissions
     ● ...
  20. The Striped Volume
  21. The Distributed Volume
  22. The Replicated Volume
  23. The Distributed-Replicated Volume
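A distributed-replicated volume combines both layouts: files are distributed across replica sets, and each set holds full copies. The sketch below assumes that consecutive bricks in the listed order form one replica set, which is one reason the order of servers matters. The `replica_sets` function and the brick names are hypothetical and only illustrate the grouping.

```python
def replica_sets(bricks, replica):
    """Group bricks into replica sets of size `replica`, in listed order."""
    if replica < 1 or len(bricks) % replica != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    return [bricks[i:i + replica] for i in range(0, len(bricks), replica)]


# Hypothetical bricks on four servers, replica count 2.
bricks = [
    "server1:/export/b1",
    "server2:/export/b1",
    "server3:/export/b1",
    "server4:/export/b1",
]
sets = replica_sets(bricks, 2)
for number, members in enumerate(sets):
    print(number, members)
```

Listing two bricks of the same server next to each other would, under this assumption, place both copies of a file on one machine, so alternating servers in the brick list is the safer ordering.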
  24. Meta Data
     ● 2 kinds
     ● More of local file system style
     ● Related to distributed nature
     ● Some stored in backend file system
     ● Permissions
     ● Time stamps
     ● Distribution/replication
     ● Some calculated on the fly
     ● Brick location
  25. Elastic Hash Algorithm
     ● Based on file names
     ● Name space divided
     ● Full brick handled via relinking
     ● Stored in extended attributes
     ● Client needs to know topology
  26. Distributed Hash Tables
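The hashing approach from slides 25 and 26 can be sketched as follows: the hash space is divided into ranges, each brick owns one range, and a file's brick is computed from its name, so no metadata server has to be asked. This is a simplified illustration under stated assumptions: `zlib.crc32` stands in for the real hash function, and `assign_ranges`, `locate` and the brick names are hypothetical.

```python
import zlib

HASH_SPACE = 2 ** 32  # crc32 yields an unsigned 32-bit value in Python 3


def assign_ranges(bricks):
    """Divide the hash space into contiguous ranges, one per brick."""
    step = HASH_SPACE // len(bricks)
    ranges = []
    for i, brick in enumerate(bricks):
        start = i * step
        # The last brick takes whatever remains so the space is fully covered.
        end = HASH_SPACE - 1 if i == len(bricks) - 1 else (i + 1) * step - 1
        ranges.append((start, end, brick))
    return ranges


def locate(filename, ranges):
    """Compute the brick for a file purely from its name."""
    h = zlib.crc32(filename.encode("utf-8"))
    for start, end, brick in ranges:
        if start <= h <= end:
            return brick
    raise RuntimeError("hash space not fully covered")


ranges = assign_ranges(["brick-a", "brick-b", "brick-c"])
for name in ["report.odt", "photo.jpg", "notes.txt"]:
    print(name, "->", locate(name, ranges))
```

Every client that knows the same ranges computes the same location, which is why the client needs to know the topology.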
  27. Self-Healing
     ● On demand vs. scheduled
     ● File based
     ● Based on extended attributes
     ● Split-brain
     ● Quorum function
     ● Sometimes: manual intervention
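A quorum function guards against split-brain by allowing writes only while enough replicas can see each other. The sketch below uses a plain majority rule as an assumption; the exact rule GlusterFS applies is not given in the slides, and `has_quorum` is a hypothetical helper.

```python
def has_quorum(reachable, total):
    """Return True if strictly more than half of the replicas are reachable."""
    if not 0 <= reachable <= total:
        raise ValueError("reachable must be between 0 and total")
    return reachable * 2 > total


# With three replicas, one failure is tolerated.
print(has_quorum(2, 3))
# With two replicas, a single failure already loses the majority.
print(has_quorum(1, 2))
```

Under a strict majority rule, two partitions can never both hold quorum at the same time, so they cannot both accept conflicting writes.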
  28. Geo replication
     ● Asynchronous
     ● Based on rsync/ssh
     ● Master-Slave
     ● If needed: cascading
     ● One way street
     ● Clocks in sync!
  29. From files to objects
     ● Introduced with version 3.3
     ● Hard links with some hierarchy
     ● Re-uses GFID (inode number)
     ● UFO
     ● Unified File and Object
     ● Combination with RESTful API
     ● S3 and Swift compatible
  30. Operations: Growth, shrinkage ... failures
     ● A must!
     ● Easy
     ● Rebalance!
     ● Order of servers important
  31. What else ...?
     ● Encryption :-|
     ● Compression :-(
     ● Snapshots :-(
     ● Hadoop connector :-)
     ● Locking granularity :-|
     ● File system statistics :-)
  32. Use cases
  33. NAS replacement
     ● NFS as 1:1
     ● Server: GlusterFS
     ● Client: NFS
     ● NFS as such
     ● Server: GlusterFS
     ● Client: GlusterFS
  34. Storage back-end for KVM and Co
     ● Stacked (indirect)
     ● Not smart
     ● Workable for main hypervisors
     ● Direct
     ● QEMU
     ● libvirt
     ● oVirt/RHEV
  35. SAN replacement
     ● Not quite advanced (yet)
     ● New translator needed
     ● Development started
     ● Presenting GlusterFS as block device
     ● Additional items needed
     ● Locking
     ● ...
  36. Summary
  37. Takeaways
     ● Thin distributed file system layer
     ● Modular architecture
     ● Operationally ready
     ● Still some surprises
     ● Active development and community
  38. References
     ● http://www.gluster.org
     ● http://www.sxc.hu (pictures)
  39. Thank you!
  40. Bricks and Translators: The distributed file system made by Red Hat
     Dr. Udo Seidel, Linux-Strategy @ Amadeus
