Lifecycle of a Gluster Volume
Shreyas Siravara
Production Engineer
Automating GlusterFS @ Facebook
Stages of a Gluster Volume
1. Creation
2. Maintenance
• Software Upgrades
• Hardware Repairs
3. Decommission
Creation
• Homogeneous hardware
• Bricks are the same size
• Exact same CPU, memory configuration
• Easy to debug problems
Validate Hardware
Creation
Layout Management
• Rack failure resilient layout
• Spread replicas across racks
• Automate entire process to avoid human error
• Layout of replicas supports large-scale maintenance
• Avoid data unavailability
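A minimal sketch of the rack-resilient layout idea, assuming exactly one rack per replica and the same number of bricks in every rack. The function names and the brick/rack labels are hypothetical and are not taken from Facebook's tooling.

```python
def build_replica_sets(bricks_by_rack, replica_count=3):
    """Group bricks into replica sets so that no set has two bricks in one rack.

    Assumption for this sketch: one rack per replica, equal brick counts.
    """
    racks = sorted(bricks_by_rack)
    if len(racks) != replica_count:
        raise ValueError("this sketch expects exactly one rack per replica")
    if len({len(bricks_by_rack[r]) for r in racks}) != 1:
        raise ValueError("every rack must hold the same number of bricks")
    # The i-th brick of every rack forms the i-th replica set.
    return [list(group) for group in zip(*(bricks_by_rack[r] for r in racks))]


def is_rack_resilient(replica_sets, rack_of):
    """True if every replica set spans as many racks as it has bricks."""
    return all(len({rack_of[b] for b in rs}) == len(rs) for rs in replica_sets)


layout = {
    "rack1": ["brick1", "brick4", "brick7"],
    "rack2": ["brick2", "brick5", "brick8"],
    "rack3": ["brick3", "brick6", "brick9"],
}
replica_sets = build_replica_sets(layout)
rack_of = {brick: rack for rack, bricks in layout.items() for brick in bricks}
```

With a layout like this, losing or servicing a whole rack removes at most one brick from each replica set.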
Maintenance
Hardware Repair
• What happens if a brick needs repair?
• Some manual effort for physical repairs
• This is done with the local gluster daemons not running
• What happens if a brick comes back empty?
• Multiple replaced drives in a RAID
• SHD automatically “discovers” that the brick is empty & heals it
Maintenance
Hardware Repair
• What happens if the root drive is replaced?
• Fresh OS install
• Automated “restore” flow
• Facebook automation installs the OS
• Install Gluster
• Restore the node's prior UUID & restore the peer list
• SHD cleans up the pending heals
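The UUID step of the restore flow could look roughly like the sketch below. It assumes the daemon's identity file stores the node ID on a `UUID=<value>` line (stock installs typically keep this in `/var/lib/glusterd/glusterd.info`, but verify on your version), that the old UUID was saved elsewhere before the reinstall, and that the Gluster daemons are stopped while the file is edited. `restore_uuid` is a hypothetical helper, not part of Gluster.

```python
from pathlib import Path


def restore_uuid(info_path, saved_uuid):
    """Rewrite the UUID line of an identity file, keeping any other lines.

    Assumption: the file uses simple KEY=VALUE lines, one of which is UUID=.
    """
    path = Path(info_path)
    lines = path.read_text().splitlines() if path.exists() else []
    kept = [line for line in lines if not line.startswith("UUID=")]
    path.write_text("\n".join([f"UUID={saved_uuid}"] + kept) + "\n")
```

Restoring the peer list would follow the same pattern: copy back the saved peer files before starting the daemon again.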
Maintenance
Software Upgrades: Goals
• Goals:
• Push quickly and safely
• Avoid quorum loss & split-brains
• The customer should not know we’re doing a push
• Halt the push if we find something critical
• Code changes should not result in incompatibility between servers & clients
Maintenance
Software Upgrades: Batching
• Create batches based on layout
• Every rack becomes a “batch”
• Batches are scheduled serially
• Concurrency within the batch
Batch 1 (Rack 1): Brick 1, Brick 4, Brick 7
Batch 2 (Rack 2): Brick 2, Brick 5, Brick 8
Batch 3 (Rack 3): Brick 3, Brick 6, Brick 9
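The batching rule above (one batch per rack) can be sketched as follows. The helper names are hypothetical; the safety check assumes the replica sets are known and that a batch is safe when it takes at most one brick from each set.

```python
def batches_by_rack(rack_of):
    """Return one batch per rack, ordered by rack name."""
    batches = {}
    for brick, rack in rack_of.items():
        batches.setdefault(rack, []).append(brick)
    return [sorted(batches[rack]) for rack in sorted(batches)]


def batch_is_safe(batch, replica_sets):
    """True if the batch contains at most one brick of every replica set."""
    members = set(batch)
    return all(len(members & set(rs)) <= 1 for rs in replica_sets)


rack_of = {
    "brick1": "rack1", "brick4": "rack1", "brick7": "rack1",
    "brick2": "rack2", "brick5": "rack2", "brick8": "rack2",
    "brick3": "rack3", "brick6": "rack3", "brick9": "rack3",
}
batches = batches_by_rack(rack_of)
```

Because replicas are spread across racks, a per-rack batch never touches two copies of the same data at once.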
Maintenance
Software Upgrades: Host Procedure
• Single Host Procedure:
1. Check for quorum margin
2. Wait for pending heals to drop
3. Stop Gluster & install the new version
4. Start Gluster
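A hedged sketch of the single-host procedure. All callables passed to `upgrade_host` are hypothetical hooks into whatever tooling checks quorum, counts pending heals, and manages the daemons; the quorum rule shown (a strict majority of each replica set must stay online) is an assumption, not a statement of how Facebook computes its margin.

```python
import time


def has_quorum_margin(replica_sets, online_bricks, host_bricks):
    """True if every replica set keeps a majority online without this host."""
    host_bricks = set(host_bricks)
    for rs in replica_sets:
        if not host_bricks & set(rs):
            continue  # this host holds no brick of the set
        remaining = [b for b in rs if b in online_bricks and b not in host_bricks]
        if len(remaining) < len(rs) // 2 + 1:
            return False
    return True


def upgrade_host(host, quorum_ok, pending_heals, stop, install, start,
                 max_pending=0, poll_seconds=30, sleep=time.sleep):
    """Run the four steps in order; refuse to start without quorum margin."""
    if not quorum_ok(host):
        raise RuntimeError(f"{host}: no quorum margin, refusing to upgrade")
    while pending_heals(host) > max_pending:
        sleep(poll_seconds)  # wait for pending heals to drop
    stop(host)
    install(host)
    start(host)
```

Injecting the hooks keeps the procedure itself small and easy to test without a live cluster.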
Maintenance
Software Upgrades: Volume Procedure
• Volume Procedure:
• Upgrade every host in the batch
• Health-check
• Run the next batch
Batch 1 (Rack 1): Brick 1, Brick 4, Brick 7
Batch 2 (Rack 2): Brick 2, Brick 5, Brick 8
Batch 3 (Rack 3): Brick 3, Brick 6, Brick 9
Legend: Pending, Upgraded
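The volume-level procedure (batches run serially, hosts within a batch run concurrently, health check between batches) might be sketched like this. `upgrade_host` and `health_check` are hypothetical callables supplied by the caller, and the thread pool is just one convenient way to get concurrency within a batch.

```python
from concurrent.futures import ThreadPoolExecutor


def upgrade_volume(batches, upgrade_host, health_check, max_workers=8):
    """Upgrade batches one after another; halt the push on any failure."""
    completed = []
    for batch in batches:
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            # list() forces iteration so an exception from any host surfaces here.
            list(pool.map(upgrade_host, batch))
        if not health_check():
            raise RuntimeError(f"health check failed after batch {batch}; push halted")
        completed.append(batch)
    return completed
```

Raising on a failed health check stops the loop before the next batch starts, which matches the goal of halting the push when something critical is found.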
Maintenance
Software Upgrades: Advantages & Potential Improvements
• Advantages:
• Maintain quorum
• Clients don’t need to know that a volume is being upgraded
• We should:
• Correctly drain traffic when we stop Gluster daemons
• Stop listening for new requests
• Complete outstanding I/O
Decommission
Requirements & Challenges
• Requirement:
• Replace 100% of the hardware in a Gluster volume
• Challenges:
• Volume size
• Data Integrity
• No customer impact
• SLA: No errors, low latency
Decommission
Simple Strategy: Replace-brick
• Replace bricks one replica at a time, wait for rebuilds
• Use gluster volume replace-brick
• Good for smaller volumes, with low numbers of files
• Scales poorly with 10s of millions of files per brick
• Self-heal daemon is not yet fast enough
• Even with multi-threaded SHD
Decommission
Improved Strategy: “Block” copy + Replace-brick
xfsdump: Source Brick → Dest Brick
gluster volume replace-brick: Source Brick → Dest Brick
Decommission
Improved Strategy: “Block” copy + Replace-brick
• Advantages:
• 100s of MB/s to run the first copy
• Self-heal daemon just has to “top-up” the node
• Heals only the data that changed while the node was offline
• Easy to automate
• Predictable, fixed procedure
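The two steps could be scripted roughly as below. This sketch only builds the commands; it does not run them. The deck does not give flags, so the ones shown are assumptions to check against the installed tools: `xfsdump -J - <src> | xfsrestore -J - <dst>` is the commonly documented way to pipe a dump straight into a restore, and `commit force` is the form of `replace-brick` accepted by recent Gluster releases. Paths, volume name, and brick names are hypothetical.

```python
import shlex


def block_copy_command(src_mount, dst_mount):
    """Shell pipeline that streams an XFS dump of src into dst (assumed flags)."""
    return (f"xfsdump -J - {shlex.quote(src_mount)} "
            f"| xfsrestore -J - {shlex.quote(dst_mount)}")


def replace_brick_command(volume, src_brick, dst_brick):
    """Argument list for swapping the brick once the bulk copy is done."""
    return ["gluster", "volume", "replace-brick",
            volume, src_brick, dst_brick, "commit", "force"]


copy_cmd = block_copy_command("/data/old_brick", "/data/new_brick")
swap_cmd = replace_brick_command(
    "myvol", "host1:/data/old_brick", "host2:/data/new_brick")
```

After the swap, the self-heal daemon only needs to reconcile whatever changed during the copy.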
Final Thoughts
• Layout is important
• Data unavailability can be avoided
• Decompose into host-level & volume-level procedures
• Keep the procedures simple & predictable
• Avoid overly-complex automation with many edge-cases
Automating Gluster @ Facebook - Shreyas Siravara

From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
Generating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using SmithyGenerating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using Smithy
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 

Automating Gluster @ Facebook - Shreyas Siravara

  • 2. Lifecycle of a Gluster Volume
    Shreyas Siravara, Production Engineer
    Automating GlusterFS @ Facebook
  • 3. Stages of a Gluster Volume
    1. Creation
    2. Maintenance
      • Software Upgrades
      • Hardware Repairs
    3. Decommission
  • 4. Creation
    • Homogeneous hardware
      • Bricks are the same size
      • Exact same CPU and memory configuration
    • Easy to debug problems
    Validate Hardware
  • 5. Creation Layout Management
    • Rack failure resilient layout
    • Spread replicas across racks
    • Automate entire process to avoid human error
    • Layout of replicas supports large-scale maintenance
    • Avoid data unavailability
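One way to get the rack-resilient layout described above is to build each replica set from exactly one brick per rack. The following is a minimal sketch, not Facebook's actual tooling: the function name `build_replica_sets` and the rack and brick labels are hypothetical, and it assumes every rack contributes the same number of bricks.

```python
def build_replica_sets(bricks_by_rack):
    """Return replica sets that each take exactly one brick from every rack.

    bricks_by_rack maps a rack name to an ordered list of brick names
    (hypothetical labels). Losing any single rack then removes at most
    one replica from each set.
    """
    racks = sorted(bricks_by_rack)
    sizes = {len(bricks_by_rack[rack]) for rack in racks}
    if len(sizes) != 1:
        raise ValueError("every rack must contribute the same number of bricks")
    per_rack = sizes.pop()
    return [
        tuple(bricks_by_rack[rack][i] for rack in racks)
        for i in range(per_rack)
    ]


layout = build_replica_sets({
    "rack1": ["brick1", "brick4", "brick7"],
    "rack2": ["brick2", "brick5", "brick8"],
    "rack3": ["brick3", "brick6", "brick9"],
})
```

Because the grouping is computed rather than typed by hand, the same input always yields the same layout, which is the point of automating the process.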
  • 6. Maintenance Hardware Repair
    • What happens if a brick needs repair?
      • Some manual effort for physical repairs
      • This is done with the local gluster daemons not running
    • What happens if a brick comes back empty?
      • Multiple replaced drives in a RAID
      • SHD automatically “discovers” that the brick is empty & heals it
  • 7. Maintenance Hardware Repair
    • What happens if the root drive is replaced?
      • Fresh OS install
      • Automated “restore” flow
      • Facebook automation installs the OS
      • Install Gluster
      • Restore the node's prior UUID & restore the peer list
      • SHD cleans up the pending heals
  • 8. Maintenance Software Upgrades: Goals
    • Goals:
      • Push quickly and safely
      • Avoid quorum loss & split-brains
      • The customer should not know we’re doing a push
      • Halt the push if we find something critical
      • Code changes should not result in incompatibility between servers & clients
  • 9. Maintenance Software Upgrades: Batching
    • Create batches based on layout
    • Every rack becomes a “batch”
    • Batches are scheduled serially
    • Concurrency within the batch
    [Diagram]
    Batch 1 = Rack 1: Brick 1, Brick 4, Brick 7
    Batch 2 = Rack 2: Brick 2, Brick 5, Brick 8
    Batch 3 = Rack 3: Brick 3, Brick 6, Brick 9
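The batching rule above can be sketched as a grouping by rack, plus a safety check that no batch touches more than one member of any replica set. This is an illustrative sketch only: `batches_by_rack`, `batch_is_safe`, and the labels are hypothetical names, and the replica sets (1,2,3), (4,5,6), (7,8,9) are an assumption inferred from the brick numbering in the diagram.

```python
from collections import defaultdict


def batches_by_rack(rack_of_brick):
    """Group bricks into one batch per rack, ordered by rack name."""
    grouped = defaultdict(list)
    for brick, rack in sorted(rack_of_brick.items()):
        grouped[rack].append(brick)
    return [grouped[rack] for rack in sorted(grouped)]


def batch_is_safe(batch, replica_sets):
    """True if the batch contains at most one member of every replica set."""
    members = set(batch)
    return all(len(members & set(rs)) <= 1 for rs in replica_sets)


rack_of_brick = {
    "brick1": "rack1", "brick4": "rack1", "brick7": "rack1",
    "brick2": "rack2", "brick5": "rack2", "brick8": "rack2",
    "brick3": "rack3", "brick6": "rack3", "brick9": "rack3",
}
# Assumed replica sets: one brick from each rack.
replica_sets = [
    ("brick1", "brick2", "brick3"),
    ("brick4", "brick5", "brick6"),
    ("brick7", "brick8", "brick9"),
]
batches = batches_by_rack(rack_of_brick)
```

With replicas spread across racks, every rack-sized batch passes `batch_is_safe`, which is why a whole rack can be upgraded concurrently.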
  • 10. Maintenance Software Upgrades: Host Procedure
    • Single Host Procedure:
      1. Check for quorum margin
      2. Wait for pending heals to drop
      3. Stop Gluster & install the new version
      4. Start Gluster
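The four host-level steps can be sketched as a single function. Everything here is hypothetical scaffolding: `ops` stands for whatever object wraps the real checks and commands (its method names are invented for illustration), and the heal threshold, timeout, and poll interval are assumed values that the slides do not specify.

```python
import time


class UpgradeAborted(RuntimeError):
    """Raised when a precondition for upgrading a host is not met."""


def upgrade_host(host, ops, heal_threshold=0, timeout_s=3600, poll_s=30,
                 sleep=time.sleep):
    """Run the single-host upgrade procedure against a hypothetical `ops`."""
    # 1. Check for quorum margin: taking this host down must not lose quorum.
    if not ops.has_quorum_margin(host):
        raise UpgradeAborted(f"{host}: no quorum margin")

    # 2. Wait for pending heals to drop to the threshold, up to a timeout.
    waited = 0
    while ops.pending_heals(host) > heal_threshold:
        if waited >= timeout_s:
            raise UpgradeAborted(f"{host}: pending heals did not drop in time")
        sleep(poll_s)
        waited += poll_s

    # 3. Stop Gluster and install the new version.
    ops.stop_gluster(host)
    ops.install_new_version(host)

    # 4. Start Gluster.
    ops.start_gluster(host)
```

Passing the operations in as an object keeps the procedure itself small and testable without a live cluster.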
  • 11. Maintenance Software Upgrades: Volume Procedure
    • Volume Procedure:
      • Upgrade every host in the batch
      • Health-check
      • Run the next batch
    [Diagram, each batch marked as either Pending or Upgraded]
    Batch 1 = Rack 1: Brick 1, Brick 4, Brick 7
    Batch 2 = Rack 2: Brick 2, Brick 5, Brick 8
    Batch 3 = Rack 3: Brick 3, Brick 6, Brick 9
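The volume-level loop can be sketched as serial batches with concurrency inside each batch and a health check between batches. This is a sketch under assumptions: `upgrade_volume` and `PushHalted` are hypothetical names, `upgrade_host` and `health_check` are caller-supplied callables, and a thread pool is just one possible way to get the in-batch concurrency.

```python
from concurrent.futures import ThreadPoolExecutor


class PushHalted(RuntimeError):
    """Raised when the health check fails and the push must stop."""


def upgrade_volume(batches, upgrade_host, health_check, max_workers=8):
    """Upgrade batches one after another; hosts within a batch run concurrently.

    Returns the list of batches that were upgraded and passed the health check.
    """
    completed = []
    for batch in batches:
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            # list() forces all results, re-raising any host-level failure.
            list(pool.map(upgrade_host, batch))
        if not health_check():
            raise PushHalted(f"health check failed after batch {batch}")
        completed.append(batch)
    return completed
```

Raising as soon as a health check fails leaves the remaining batches untouched, which matches the goal of halting the push when something critical is found.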
  • 12. Maintenance Software Upgrades: Advantages & Potential Improvements
    • Advantages:
      • Maintain quorum
      • Clients don’t need to know that a volume is being upgraded
    • We should:
      • Correctly drain traffic when we stop Gluster daemons
      • Stop listening for new requests
      • Complete outstanding I/O
  • 13. Decommission Requirements & Challenges
    • Requirement:
      • Replace 100% of the hardware in a Gluster volume
    • Challenges:
      • Volume size
      • Data Integrity
      • No customer impact
      • SLA: No errors, low latency
  • 14. Decommission Simple Strategy: Replace-brick
    • Replace bricks one replica at a time, wait for rebuilds
    • Use gluster volume replace-brick
    • Good for smaller volumes with low numbers of files
    • Scales poorly with 10s of millions of files per brick
    • Self-heal daemon is not yet fast enough
    • Even with multi-threaded SHD
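The simple strategy can be sketched as a loop that issues one `gluster volume replace-brick` at a time and waits for the rebuild before moving on. The helper names, the `run` and `wait_for_heal` callables, and the volume and brick names are hypothetical. The `commit force` suffix is the form documented for recent Gluster releases; verify it against your installed version.

```python
def replace_brick_cmd(volume, source_brick, dest_brick):
    """Build the argv for replacing one brick (bricks are host:/path strings)."""
    return ["gluster", "volume", "replace-brick", volume,
            source_brick, dest_brick, "commit", "force"]


def replace_one_replica_at_a_time(volume, replacements, run, wait_for_heal):
    """Replace bricks serially, waiting for the rebuild after each one.

    `run` executes an argv list and `wait_for_heal` blocks until the volume
    has finished healing; both are supplied by the caller (hypothetical).
    """
    for source_brick, dest_brick in replacements:
        run(replace_brick_cmd(volume, source_brick, dest_brick))
        wait_for_heal(volume)
```

In this approach the self-heal daemon has to rebuild each new brick from scratch, which is where the poor scaling with tens of millions of files comes from.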
  • 15. Decommission Improved Strategy: “Block” copy + Replace-brick
    [Diagram]
    Step 1, xfsdump: Source Brick -> Dest Brick
    Step 2, gluster volume replace-brick: Source Brick -> Dest Brick
  • 16. Decommission Improved Strategy: “Block” copy + Replace-brick
    • Advantages:
      • 100s of MB/s to run the first copy
      • Self-heal daemon just has to “top-up” the node
      • Heals only the data that changed while the node was offline
      • Easy to automate
      • Predictable, fixed procedure
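The two-step procedure can be sketched as a fixed plan of shell commands: a bulk `xfsdump` copy followed by `gluster volume replace-brick`. This is an illustration under assumptions, not the exact commands used at Facebook. The slides give no flags, so the level-0 dump piped into `xfsrestore` follows the pipe idiom from the xfsdump manual page. The sketch also assumes both filesystems are mounted on the same host; copying across hosts would need the restore side run remotely. All paths and names are hypothetical.

```python
import shlex


def block_copy_cmd(source_mount, dest_mount):
    """Shell pipeline that streams a level-0 XFS dump into the destination."""
    src = shlex.quote(source_mount)
    dst = shlex.quote(dest_mount)
    # -l 0: full dump, -J: skip the dump inventory, "-": stream via stdout/stdin
    return f"xfsdump -l 0 -J - {src} | xfsrestore -J - {dst}"


def migration_plan(volume, source_brick, dest_brick, source_mount, dest_mount):
    """Return the ordered commands: bulk copy first, then swap the brick."""
    swap = ["gluster", "volume", "replace-brick", volume,
            source_brick, dest_brick, "commit", "force"]
    return [block_copy_cmd(source_mount, dest_mount), shlex.join(swap)]


plan = migration_plan("vol0", "old1:/data/brick", "new1:/data/brick",
                      "/data/brick", "/mnt/new_brick")
```

Because the plan is a short fixed list, it is easy to automate and review; the self-heal daemon then only has to top up whatever changed after the copy.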
  • 17. Final Thoughts
    • Layout is important
    • Data unavailability can be avoided
    • Decompose into host-level & volume-level procedures
    • Keep the procedures simple & predictable
    • Avoid overly-complex automation with many edge-cases