
vBACD - Distributed Petabyte-Scale Cloud Storage with GlusterFS - 2/28

GlusterFS is an open source scale-out NAS solution. The software is a powerful and flexible solution that simplifies the task of managing unstructured file data, whether you have a few terabytes of storage or multiple petabytes. It’s no secret that unstructured data is growing like crazy. Gluster provides a solution that scales capacity and performance as you need it, and it is an ideal fit for an IT environment that is increasingly virtualized and moving to the cloud.

There are two key ways that GlusterFS is beneficial for cloud builders:

1. Storage layer for VMs. If you're deploying Xen or KVM VMs on a private cloud, storing them on GlusterFS gives you the ability to migrate to different hypervisors, suspend and resume quickly (even on another hypervisor), scale out far beyond what other filesystems will allow, and utilize N-way replication for DR and HA.

2. Unified storage layer for applications. With GlusterFS 3.3, you will be able to access your application data stores from an object (S3, Swift-style) interface, as well as a traditional POSIX-compatible NAS interface. This unified approach gives developers and admins the ability to access the same data store using a variety of different methods.

In this session, attendees will learn steps for deployment and some common use cases.

Speaker Bio

John Mark is an experienced veteran of all things open source and a self-described agitprop, agitator and advocate for those who volunteer countless, unpaid hours for a particular project or community. He first fell down the slippery slope of open source as a web developer at VA Linux Systems and eventually switched to the community team, beginning a career that has now lasted over ten years. Along the way, John Mark made stops at young, up-and-coming startups, such as Groundwork, Hyperic and then Gluster (later acquired by Red Hat). In between, there was a brief interlude at IDG World Expo, where he was the conference director for LinuxWorld, GridWorld and OSBC. His advice for companies that want to "do community" is to trust your community and give them the space to "just try s***." John Mark loves to perform community karaoke, and is available for weddings, funerals and Bar/Bat Mitzvahs.

Published in: Technology

vBACD - Distributed Petabyte-Scale Cloud Storage with GlusterFS - 2/28

  1. Distributed Petabyte-Scale Cloud Storage with GlusterFS
     The Future of GlusterFS and Gluster.org
     John Mark Walker, GlusterFS Community Guy, Red Hat, Inc.
     February 28, 2012
  2. The Roots of GlusterFS
     ● Distributed storage solutions difficult to find
     ● Decided to write their own
     ● No filesystem experts
       – Pro & Con
     ● Applied lessons from microkernel architecture
       – GNU Hurd
  3. The Roots of GlusterFS
     ● All storage solutions were either
       ● Too expensive, or…
       ● Not scalable, or…
       ● Single purpose, or…
       ● Don’t support legacy apps, or…
       ● Don’t support new apps, or…
       ● Do some combo of the above, but not very well
  4. The Roots of GlusterFS
     ● The challenge:
       ● Create a storage system that was…
         – Scalable
         – Seamlessly integrated in the data center
         – Future-proof
     ● The solution: GlusterFS
       ● Scalable, with DHT
       ● POSIX-compliant
       ● Stackable
       ● User-space
  5. GlusterFS Client Architecture
     ● Creating a file system in user space
       ● Utilizes fuse module
         – Kernel goes through fuse, which hands off to glusterd
     [Diagram labels: glusterd, Applications, Linux kernel, Fuse, Ext4, …]
  6. No Centralized Metadata
     [Diagram: Client A, Client B, Client C; Server X, Server Y, Server Z, each with Extended Attr. and Files]
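The idea behind "no centralized metadata" (and "Scalable, with DHT" on slide 4) can be sketched as follows: every client computes a file's location from a hash of its name, so no metadata server has to be consulted. This is only an illustration under stated assumptions. The server names, the MD5 hash, and the equal-sized hash ranges are choices made for this sketch, not the hash function or layout that GlusterFS itself uses.

```python
import hashlib


def pick_server(filename, servers):
    """Map a file name to one server using only its hash (no lookup table)."""
    digest = hashlib.md5(filename.encode("utf-8")).digest()
    # Interpret the first 4 bytes of the digest as a 32-bit hash value.
    value = int.from_bytes(digest[:4], "big")
    # Split the 32-bit hash space into equal contiguous ranges, one per server.
    range_size = 2**32 // len(servers)
    # Clamp so the few values left over by integer division go to the last server.
    index = min(value // range_size, len(servers) - 1)
    return servers[index]


# Hypothetical server names, matching the labels in the slide's diagram.
servers = ["server-x", "server-y", "server-z"]

for name in ["report.pdf", "vm-image.qcow2", "notes.txt"]:
    print(name, "->", pick_server(name, servers))
```

Because the result depends only on the file name and the server list, clients A, B and C all reach the same answer independently.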
  7. What is a Translator?
     ● Add/remove layers
     ● Reorder layers
     ● Move layers between client and server
     ● Implement new layers
       ● e.g. encryption
     ● Replace old layers
       ● e.g. replication
     [Diagram, layer stack from top to bottom: FUSE Interface Layer, Performance Layer, Distribution Layer, Replication Layer, Protocol Layer, Local Filesystem Layer]
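A toy model may make the stacking idea concrete: each layer exposes the same interface and forwards calls to the layer below, so layers can be added, removed or reordered without changing their neighbours. The class names (`Translator`, `MemoryStore`, `XorCipher`, `Replicate`) are hypothetical names for this sketch. Real GlusterFS translators are C shared objects with a much larger interface, and XOR is used here only as a stand-in for encryption.

```python
class Translator:
    """One layer in the stack; by default it passes calls to the layer below."""

    def __init__(self, child=None):
        self.child = child

    def write(self, path, data):
        return self.child.write(path, data)

    def read(self, path):
        return self.child.read(path)


class MemoryStore(Translator):
    """Bottom layer: an in-memory stand-in for the local filesystem."""

    def __init__(self):
        super().__init__()
        self.files = {}

    def write(self, path, data):
        self.files[path] = data

    def read(self, path):
        return self.files[path]


class XorCipher(Translator):
    """Toy 'encryption' layer. XOR with a fixed key is NOT real encryption."""

    def __init__(self, child, key=0x5A):
        super().__init__(child)
        self.key = key

    def _xor(self, data):
        return bytes(b ^ self.key for b in data)

    def write(self, path, data):
        return self.child.write(path, self._xor(data))

    def read(self, path):
        return self._xor(self.child.read(path))


class Replicate(Translator):
    """Replication layer: writes to every child, reads from the first."""

    def __init__(self, children):
        super().__init__(children[0])
        self.children = children

    def write(self, path, data):
        for child in self.children:
            child.write(path, data)


# Build a stack: cipher on top of replication on top of two stores.
store_a, store_b = MemoryStore(), MemoryStore()
stack = XorCipher(Replicate([store_a, store_b]))
stack.write("/hello.txt", b"hello")
print(stack.read("/hello.txt"))  # the caller sees the original bytes
```

Dropping the cipher layer is a one-line change (`stack = Replicate([store_a, store_b])`), which is the point of the design: behaviour is composed from layers rather than built into one monolith.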
  8. Some Features
     ● Distributed, replicated and/or striped volumes
     ● Global namespace
     ● High availability
     ● Geo-replication
     ● Rebalancing
     ● Remove or replace bricks
     ● Self healing
     ● volume profile and top metrics
  9. No one ever expects the Red Hat acquisition
  10. Red Hat Invests in GlusterFS
     ● Unstructured data volume to grow 44x by 2020
     ● Cloud and virtualization are driving scale-out storage growth
     ● Scale-out storage shipments to exceed 63,000 PB by 2015 (74% CAGR)
     ● 40% of core cloud spend related to storage
     ● GlusterFS-based solutions up to 50% less than other storage systems
  11. Red Hat Invests in GlusterFS
     ● GlusterFS adds to the Red Hat stack
       ● Complements other Red Hat offerings
       ● Many integration points
     ● More engineers hacking on GlusterFS than ever before
     [Diagram labels: JBoss, RHEL, Bare Metal, RHEV, Clouds, GlusterFS Unified Storage]
  12. Red Hat Invests in GlusterFS
     ● Acceleration of community investment
       ● GlusterFS needs to be “bigger than Red Hat”
       ● Transformation of GlusterFS from product to project
         – From “open core” to upstream
       ● More resources for engineering and community outreach
       ● Red Hat’s success rests on economies of scale
         – Critical mass of users and developers
  13. Join a Winning Team
     “Join me, and together, we can rule the galaxy...”
     ● We’re hiring hackers and engineers
     ● Looking for community collaborators
       ● ISVs, students, IT professionals, fans, et al.
  14. The Immediate Future
  15. The Gluster Community
     Global adoption
     ● 300,000+ downloads
     ● ~35,000/month
     ● >300% increase Y/Y
     ● 1000+ deployments
     ● 45 countries
     ● 2,000+ registered users
     ● Mailing lists, Forums, etc.
  16. The Gluster Community
     ● Why are we changing?
       ● Only 1 non-Red Hat core contributor
         – There were 2, but he acquired us
       ● Want to be the software standard for distributed storage
       ● Want to be more inclusive, more community-driven
     Goal: create global ecosystem that supports ISVs, service providers and more
  17. Towards “Real” Open Source
     ● GlusterFS, prior to acquisition
       ● “Open Core”
       ● Tied directly to Gluster products
         – No differentiation
       ● Very little outside collaboration
       ● Contributors had to assign copyright to Gluster
         – Discouraged would-be contributors
  18. Towards “Real” Open Source
     “Open Core”
     ● All engineering controlled by project/product sponsor
     ● No innovation outside of core engineering team
     ● All open source features also in commercial product
     ● Many features in commercial product not in open source code
     [Diagram labels: Commercial Product, Open Source Code]
  19. Towards “Real” Open Source
     “Real” Open Source
     ● Many points of collaboration and innovation in open source project
     ● Engineering team from multiple sources
     ● Project and product do not completely overlap
     ● Commercial products are hardened, more secure and thoroughly tested
     [Diagram labels: Open Source Code, Commercial Products]
  20. Towards “Real” Open Source
     “Real” Open Source
     ● Enables more innovation on the fringes
     ● Engineering team from multiple sources
     ● Open source project is “upstream” from commercial product
     ● “Downstream” products are hardened, more secure and thoroughly tested
     [Diagram labels: Fedora Linux, RHEL]
  21. Towards “Real” Open Source
     “Real” Open Source
     ● Enables more innovation on the fringes
     ● Engineering team from multiple sources
     ● Open source project is “upstream” from commercial product
     ● “Downstream” products are hardened, more secure and thoroughly tested
     [Diagram labels: GlusterFS, Red Hat Storage]
  22. Project Roadmaps
  23. What’s New in GlusterFS 3.3
     ● New features
       ● Unified File & Object access
       ● Hadoop / HDFS compatibility
     ● New Volume Type
       ● Replicated + striped (+ distributed) volumes
     ● Enhancements to Distributed volumes (DHT translator)
       ● Rebalance can migrate open files
       ● Remove-brick can migrate data to remaining bricks
     ● Enhancements to Replicated volumes (AFR translator)
       ● Change replica count on an active volume, add replication to distribute-only volumes
       ● Granular locking
         – Much faster self-healing for large files
       ● Proactive self-heal process starts without FS stat
       ● Round-trip reduction for lower latency
       ● Quorum enforcement
         – Avoid split-brain scenarios
     GlusterFS 3.3 ETA in Q2/Q3 2012
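The reasoning behind quorum enforcement can be shown with a few lines of arithmetic: if writes are accepted only when a strict majority of replicas is reachable, two network partitions can never both accept writes to the same file, because two disjoint groups cannot each hold more than half of the replicas. The function below is a simple-majority sketch written for this transcript. The name `has_quorum` is hypothetical, and the exact rule GlusterFS 3.3 applies (in particular for two-way replicas) may differ.

```python
def has_quorum(reachable, replica_count):
    """Return True when a strict majority of replicas is reachable.

    Simple-majority rule assumed for illustration only.
    """
    if replica_count <= 0 or not 0 <= reachable <= replica_count:
        raise ValueError("reachable must be between 0 and replica_count")
    return 2 * reachable > replica_count


# A 3-way replica split by a network partition into groups of 2 and 1:
print(has_quorum(2, 3))  # majority side may keep writing
print(has_quorum(1, 3))  # minority side must refuse writes

# With only 2 replicas, a 1/1 split leaves neither side with a majority.
print(has_quorum(1, 2))
```

The trade-off is availability: under this rule a two-way replica stops accepting writes as soon as either copy is unreachable, which is why quorum is usually paired with three or more replicas.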
  24. File and Object Storage
     ● Traditional SAN / NAS support either file or block storage
     ● New storage methodologies implement RESTful APIs over HTTP
     ● Demand for unifying the storage infrastructure increasing
     ● Treats files as objects and volumes as buckets
     ● Available now in 3.3 betas
     ● Soon to be backported to 3.2.x
     ● Contributing to OpenStack project
       ● Re-factored Swift API
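"Treats files as objects and volumes as buckets" amounts to a naming convention: an object addressed as bucket plus key is the same data as a file at a path inside a volume. The sketch below illustrates that mapping only. The mount root `/mnt/gluster` and the function `object_to_path` are assumptions made for this example and are not part of the GlusterFS or Swift APIs.

```python
from pathlib import PurePosixPath

# Hypothetical directory under which each volume is assumed to be mounted.
MOUNT_ROOT = PurePosixPath("/mnt/gluster")


def object_to_path(bucket, key):
    """Translate an object address (bucket, key) into a POSIX file path.

    The bucket is treated as the volume name and the key as the path
    inside that volume.
    """
    if not bucket or "/" in bucket or bucket in (".", ".."):
        raise ValueError(f"invalid bucket name: {bucket!r}")
    parts = [p for p in key.split("/") if p]
    # Reject empty keys and path traversal out of the volume.
    if not parts or any(p in (".", "..") for p in parts):
        raise ValueError(f"invalid object key: {key!r}")
    return MOUNT_ROOT.joinpath(bucket, *parts)


# The same data, seen through the object view and the file view.
print(object_to_path("photos", "2012/02/cat.jpg"))
```

An application that uploads `2012/02/cat.jpg` into the bucket `photos` over HTTP and a legacy tool that opens the corresponding file over NFS or a FUSE mount would, under this convention, be working on the same bytes.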
  25. Technology Integrations
     GlusterFS used as VM storage system
     ● Pause and re-start VMs, even on another hypervisor
     ● HA and DR for VMs
     ● Faster VM deployment
     ● V-motion-like capability
     Shared storage ISOs and appliances
     ● oVirt / RHEV
     ● CloudStack
     ● OpenStack
     Goal: The standard for cloud storage
     [Diagram labels: Mobile Apps, Web Clients, Enterprise Software Ecosystem; API Layer; …; Compute; Unified File & Object Storage; OpenStack Imaging Services]
  26. HDFS/Hadoop Compatibility
     ● HDFS compatibility library
       ● Simultaneous file and object access within Hadoop
     ● Benefits
       ● Legacy app access to MapReduce applications
       ● Enables data storage consolidation
     ● Simplify and unify storage deployments
     ● Provide users with file level access to data
     ● Enable legacy applications to access data via NFS
       ● Analytic apps can access data without modification
  27. The Gluster Community
     ● What is changing?
       ● HekaFS / CloudFS being folded into Gluster project
         – HekaFS == GlusterFS + multi-tenancy and SSL for auth and data encryption
         – HekaFS.org
         – ETA ~9 months
  28. What else?
  29. GlusterFS Advisory Board
     ● Advisory board
       ● Consists of industry and community leaders from Facebook, Citrix, Fedora, and OpenStack
         – Richard Wareing, Storage Engineer, Facebook
         – Jeff Darcy, Filesystem Engineer, Red Hat; Founder, HekaFS Project
         – AB Periasamy, Co-Founder, GlusterFS project
         – Ewan Mellor, Xen Engineer, Citrix; Member, OpenStack project
         – David Nalley, CloudStack Community Mgr; Fedora Advisory Board
         – Louis Zuckerman, Sr. System Administrator, Picture Marketing
         – Joe Julian, Sr. System Administrator, Ed Wyse Beauty Products
         – Greg DeKoenigsberg, Community VP, Eucalyptus; co-founder, Fedora
         – John Mark Walker, Gluster.org Community Guy (Chair)
  30. Gluster.org Web Site
     ● Services for users and developers
       ● Developer section with comprehensive docs
       ● Collaborative project hosting
       ● Continuing development of end user documentation and interactive tools
     ● Published roadmaps
       ● Transparent feature development
  31. GlusterFS Downloads
     ● Where’s the code?
       ● GlusterFS 3.3
         – Simultaneous file + object
         – HDFS compatibility
         – Improved self-healing + VM hosting
           ● Granular locking
         – Beta 3 due Feb/Mar 2012
         – http://download.gluster.org/pub/gluster/glusterfs
  32. Gluster.org Services
     ● Gluster.org
       ● Portal into all things GlusterFS
     ● Community.gluster.org
       ● Self-support site; Q&A; HOWTOs; tutorials
     ● Patch review, CI
       ● review.gluster.com
     ● #gluster
       ● IRC channel on Freenode
  33. Development Process
     ● Source code
       ● Hosted at github.com/gluster
     ● Bugs and Feature Requests
       ● Bugzilla.redhat.com – select GlusterFS from menu
     ● Patches
       ● Submit via Gerrit at review.gluster.com
     ● See Development Work Flow doc:
       ● gluster.org/community/documentation/index.php/Development_Work_Flow
  34. Thank You
     ● GlusterFS contacts
       ● Gluster.org/interact/mailinglists
       ● @RedHatStorage & @GlusterOrg
       ● #gluster on Freenode
     ● My contact info
       ● johnmark@redhat.com
       ● Twitter & identi.ca: @johnmark