This document summarizes the history and development of the Gluster distributed storage system: how Gluster began in Bangalore and grew to support large-scale deployments. Key developments included creating a unified global namespace, adding features through a stackable translator architecture, and developing templates to simplify common use cases. Gluster now provides a simple, scalable, low-cost storage solution, well suited to the growing data demands of cloud computing through its scale-out architecture and open-source model.
1. Gluster: Where We've Been
AB Periasamy
Office of the CTO, Red Hat
John Mark Walker
Gluster Community Guy
2. Topics
The Big Idea
Humble beginnings
From Bangalore to Milpitas
Scale-out + Open source == WINNING
User-space, no metadata server, stackable
Cloud and commoditization
3. A Data Explosion!
74% == Unstructured data annual growth
63,000 PB == Scale-out storage in 2015
40% == Storage-related expense for cloud
44x == Unstructured data volume growth by 2020
4. [Photos: Conference Room at US Head Office; Bengaluru Office]
7. What Can You Store?
Media – Docs, Photos, Video
VM Filesystem – VM Disk Images
Big Data – Log Files, RFID Data
Objects – Long Tail Data
8. The big idea: storage should be simple
Simple, scalable, low-cost
9. What is GlusterFS, Really?
Gluster is a unified, distributed storage system
DHT, stackable, POSIX, Swift, HDFS
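A note on the "no metadata server" point above: in a DHT-style design, any client can compute which brick holds a file by hashing its name, so there is no lookup service to consult. A minimal Python sketch of the idea (illustrative only; Gluster's real elastic-hash translator assigns hash ranges to bricks via directory extended attributes, not a simple modulo, and the brick names below are hypothetical):

import hashlib

# Hypothetical brick list; in GlusterFS these would be exported server directories.
BRICKS = ["server1:/export/brick", "server2:/export/brick", "server3:/export/brick"]

def brick_for(filename):
    # Hash the file name and map it to a brick. Every client computes the
    # same answer independently, so no central metadata server is needed.
    digest = hashlib.md5(filename.encode()).digest()
    return BRICKS[int.from_bytes(digest[:4], "big") % len(BRICKS)]

print(brick_for("photo-001.jpg"))  # deterministic placement, no lookup service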
10. Phase 1: Lego Kit for Storage
"People who think that userspace filesystems are realistic for anything but toys are just misguided" – Linus Torvalds
Goal: create a global namespace
# Server-side volume file: each translator stacks on the one below it.
volume testvol-posix
    type storage/posix              # bottom of the stack: store data in a local directory
    option directory /media/datastore
    option volume-id 329e31c1-04cc-4386-8bb8-xxxx
end-volume

volume testvol-access-control
    type features/access-control    # enforce POSIX permission checks
    subvolumes testvol-posix
end-volume

volume testvol-locks
    type features/locks             # POSIX and internal locking support
    subvolumes testvol-access-control
end-volume

volume testvol-io-threads
    type performance/io-threads     # parallelize file operations across worker threads
    subvolumes testvol-locks
end-volume
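The stack above is the server side of a single brick. The global namespace comes from a client-side volume file that layers a cluster translator over several such bricks; a hand-built sketch in the same style (the host names and volume names here are illustrative, not from the slides):

volume brick1
    type protocol/client             # connect to the stack exported by server1
    option transport-type tcp
    option remote-host server1
    option remote-subvolume testvol-io-threads
end-volume

volume brick2
    type protocol/client             # same exported stack on a second server
    option transport-type tcp
    option remote-host server2
    option remote-subvolume testvol-io-threads
end-volume

volume testvol-dht
    type cluster/distribute          # hash files across the bricks: one namespace, no metadata server
    subvolumes brick1 brick2
end-volume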
12. Versions 1.x – 2.x
Hand-crafted volume definition files
See the volume file examples above
Simple configuration files
Faster than tape? It's good!
19. And now for something completely different
Commoditization and the changing economics of storage
Why we're winning
20. Simple Economics
Simplicity, scalability, lower cost
Virtualized → Scale on Demand
Multi-Tenant → In the Cloud
Automated → Scale Out
Commoditized → Open Source
21. Simplicity Bias
FC, FCoE, iSCSI → HTTP, Sockets
Modified BSD OS → Linux / User Space / C, Python & Java
Appliance based → Application based
23. Thank you!
AB Periasamy
Office of the CTO, Red Hat
ab@redhat.com
John Mark Walker
Gluster Community Guy
johnmark@redhat.com
Editor's Notes
Add examples where complexity has been bad: EMC, Cisco, Brocade et al. made a business out of complexity via certification; if it's too complicated, it doesn't scale.
Discuss approach: how GlusterFS is unique and different from other approaches. Lessons from GNU Hurd: a user-space distributed storage operating system; reimplemented some parts of the OS (scheduler, POSIX locking, RDMA, memory management; cf. JVM, Python, etc.); no metadata separation.
If you have a bunch of files, it should be as simple as an FTP server. In user space, this required FUSE, a POSIX translator, a NAS protocol, and a cluster translator.
Learned about missing features. Found the largest problem and wanted to solve it; patterns emerged: scalable unstructured data storage was the #1 problem people wanted to solve. Had a clearer idea of where we wanted to go: clear direction.
Standalone NFS replacement; active-active replicated storage; scalable, distributed storage… and then scalable, replicated, distributed storage, plus other combos.
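In volume-file terms these combinations were just deeper stacks. A hand-built distributed-replicated layout could be sketched like this, assuming brick1 through brick4 are protocol/client volumes defined as in the earlier client-side example:

volume repl1
    type cluster/replicate           # active-active mirror of the first brick pair
    subvolumes brick1 brick2
end-volume

volume repl2
    type cluster/replicate           # mirror of the second pair
    subvolumes brick3 brick4
end-volume

volume testvol-dist-repl
    type cluster/distribute          # distribute files across the two mirrored pairs
    subvolumes repl1 repl2
end-volume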
Elastic features driven by cloud and virtualization usage: shared storage for virtual guests; flexible, self-service storage; elastic volume management became a requirement; automated provisioning of storage with the CLI (native NFS server? Or 3.2?).
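For the CLI-driven provisioning mentioned above, the 3.x-era gluster command line looked roughly like this (host names and brick paths are placeholders):

gluster volume create testvol replica 2 transport tcp server1:/export/brick server2:/export/brick
gluster volume start testvol
# Grow the volume elastically by adding another replica pair:
gluster volume add-brick testvol server3:/export/brick server4:/export/brick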
Marker framework: the story of why it's necessary. Backup of data in other locales; don't need an entire snapshot. Users wanted continuous, unlimited replication without sysadmin intervention, on demand. It queries the FS to find which files have changed and manages a queue, telling rsync exactly which files to sync. Inotify doesn't scale: if the daemon crashes, it stops tracking changes, so we would have had to write a journaling feature to maintain the change queue. Geo-replication can work on high-latency, flaky networks.
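On the CLI side, starting geo-replication against a remote slave looked roughly like this in the 3.2 era (the volume name and slave location are placeholders):

gluster volume geo-replication testvol remotehost:/data/remote_dir start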