June-21-2019
MongoDB HA, what can go wrong?
About Me
{"name": "Igor Donchovski",
"live_in": "Skopje",
"email": "donchovski@pythian.com",
"current_role": "Lead database consultant",
"education": [{"type": "College", "name": "FEIT", "graduated": "2008", "university": "UKIM"},
{"type": "Master", "name": "FINKI", "graduated": "2013", "university": "UKIM"}],
"work": [{"role": "Web developer", "start": "2007", "end": "2012", "company": "Gord Systems"},
{"role": "DBA", "start": "2012", "end": "2014", "company": "NOVP"},
{"role": "Database consultant", "start": "2014", "end": "2016", "company": "Pythian"},
{"role": "Lead database consultant", "start": "2016", "company": "Pythian"}],
"certificates": [{"name": "C100DBA", "year": "2016", "description": "MongoDB certified DBA"}],
"social": [{"network": "LinkedIn", "link": "www.linkedin.com/in/igorle"},
{"network": "Twitter", "link": "https://twitter.com/igorle", "handle": "@igorle"}],
"interests": ["Hiking", "Biking", "Traveling"],
"hobbies": ["Painting", "Photography", "Cooking"],
"proud_of": ["Volunteering", "Helping the Community"]}
© 2019 Pythian. Confidential
Overview
• What is a replica set, how replication works
• Replication concept
• Replica set features, deployment architectures
• Hidden nodes, Arbiter nodes, Priority 0 nodes
• Production failures
• Monitoring replica set
• Q&A
Time
Replication
Replica Set
• Group of mongod processes that maintain the same data set
• Redundancy and high availability
• Increased read capacity (scaling reads)
• Automatic failover
# Members   # Nodes Required to Elect New Primary   Fault Tolerance
3           2                                       1
4           3                                       1
5           3                                       2
6           4                                       2
7           4                                       3
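The table above follows from simple arithmetic: a majority is more than half of the voting members, and fault tolerance is whatever is left over. A minimal sketch, where the function name electionMath is an illustrative assumption and not a MongoDB API:

```javascript
// Majority needed to elect a Primary, and the resulting fault tolerance,
// for a replica set with `votingMembers` voting members.
function electionMath(votingMembers) {
  const majority = Math.floor(votingMembers / 2) + 1;
  return {
    members: votingMembers,
    majority: majority,
    faultTolerance: votingMembers - majority,
  };
}

// Reproduce the rows of the table for 3 to 7 members.
for (let n = 3; n <= 7; n++) {
  const row = electionMath(n);
  console.log(
    `${row.members} members: majority ${row.majority}, fault tolerance ${row.faultTolerance}`
  );
}
```

Note that going from 3 to 4 members (or 5 to 6) raises the majority without raising fault tolerance, which is why an odd number of voting members is usually preferred.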
[Diagram: three-member replica set; each member has priority:1 votes:1]
Replication Concept
1. Write operations go to the Primary node
2. All changes are recorded in the operations log (oplog)
3. Replication to the Secondaries is asynchronous
4. Secondaries copy the Primary oplog and apply it
5. A Secondary can use another Secondary as its sync source*
*settings.chainingAllowed (true by default)
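The five steps above can be illustrated with a toy in-memory model. This is only a sketch of the idea, not how mongod is implemented; makeNode, writeToPrimary and syncFrom are hypothetical names invented for the illustration.

```javascript
// Toy model: each node keeps an oplog array and a data map.
function makeNode(name) {
  return { name: name, oplog: [], data: {} };
}

// Applying an entry sets a key to a value and records the entry.
function applyEntry(node, entry) {
  node.data[entry.key] = entry.value;
  node.oplog.push(entry);
}

// Steps 1-2: the Primary applies the write and records it in its oplog.
function writeToPrimary(primary, key, value) {
  const ts = primary.oplog.length + 1; // simplistic logical timestamp
  applyEntry(primary, { ts: ts, key: key, value: value });
}

// Steps 3-5: a Secondary pulls entries newer than its last applied one
// from its sync source, which may be the Primary or another Secondary.
function syncFrom(secondary, source) {
  const lastTs = secondary.oplog.length
    ? secondary.oplog[secondary.oplog.length - 1].ts
    : 0;
  source.oplog
    .filter((entry) => entry.ts > lastTs)
    .forEach((entry) => applyEntry(secondary, entry));
}

const primary = makeNode("primary");
const s1 = makeNode("secondary1");
const s2 = makeNode("secondary2");
writeToPrimary(primary, "a", 1);
writeToPrimary(primary, "b", 2);
syncFrom(s1, primary); // replicate from the Primary
syncFrom(s2, s1);      // chained: replicate from another Secondary
console.log(s2.data);
```

Because replication is asynchronous, a Secondary that has not synced yet simply holds older data until its next pull.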
Replica Set Oplog
• Special capped collection that keeps a rolling record of all operations that modify the data stored in the databases
• Oplog entries are idempotent: applying an entry once or several times gives the same result
• Default oplog size (Unix and Windows systems):
Storage Engine   Default Oplog Size       Lower Bound   Upper Bound
In-memory        5% of physical memory    50MB          50GB
WiredTiger       5% of free disk space    990MB         50GB
MMAPv1           5% of free disk space    990MB         50GB
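The default sizing rule in the table is a 5% figure clamped between a lower and an upper bound. A rough sketch of that rule, assuming 1GB = 1024MB; the names OPLOG_DEFAULTS and defaultOplogSizeMB are illustrative and not part of MongoDB:

```javascript
// Bounds from the table, expressed in megabytes (assuming 1GB = 1024MB).
const OPLOG_DEFAULTS = {
  inMemory:   { basis: "physical memory", lowerMB: 50,  upperMB: 50 * 1024 },
  wiredTiger: { basis: "free disk space", lowerMB: 990, upperMB: 50 * 1024 },
  mmapv1:     { basis: "free disk space", lowerMB: 990, upperMB: 50 * 1024 },
};

// basisMB is physical memory (in-memory engine) or free disk space (others).
function defaultOplogSizeMB(engine, basisMB) {
  const rule = OPLOG_DEFAULTS[engine];
  if (!rule) throw new Error("unknown storage engine: " + engine);
  const fivePercent = basisMB / 20; // 5% of the basis
  return Math.min(Math.max(fivePercent, rule.lowerMB), rule.upperMB);
}

console.log(defaultOplogSizeMB("wiredTiger", 100 * 1024)); // 100GB free disk
console.log(defaultOplogSizeMB("wiredTiger", 10 * 1024));  // small disk hits the lower bound
```

On large disks the upper bound applies, so a busy deployment may still end up with a short oplog window unless the size is set explicitly.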
Configuration
Configuration Options
• Up to 50 members per replica set (at most 7 voting members)
• Arbiter node
• Priority 0 node
• Hidden node
• Delayed node
Arbiter Node
• Does not hold copy of data
• Votes in elections
Arbiter
arbiterOnly : true
Priority 0 Node
Priority - floating point (i.e. decimal) number between 0 and 1000
• Cannot become primary, cannot trigger election
• Visible to application (can serve reads; writes still go to the Primary)
• Votes in elections
Secondary
priority : 0
Hidden Node
• Not visible to application
• Never becomes primary, but can vote in elections
• Use cases
○ Reporting
○ Backups
Secondary
hidden : true
priority : 0
Delayed Node
• Must be priority 0 member
• Should be hidden member (not mandatory)
• Mainly used for backups (historical snapshot of data)
• Recovery in case of human error
Secondary
slaveDelay : 3600
priority : 0
hidden : true
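The member types above can be put together in one replica set configuration document. The sketch below uses hypothetical hostnames and a hypothetical helper, validateConfig, that checks the rules stated on these slides (member limits, hidden and delayed members must be priority 0). Field names follow the configuration format of the MongoDB versions covered in this deck; slaveDelay was renamed secondaryDelaySecs in MongoDB 5.0.

```javascript
// Hypothetical hosts; this is the shape of the document passed to rs.initiate().
const config = {
  _id: "rs0",
  members: [
    { _id: 0, host: "db1.example.net:27017", priority: 1, votes: 1 },
    { _id: 1, host: "db2.example.net:27017", priority: 1, votes: 1 },
    // Priority 0: serves reads and votes, but can never become Primary.
    { _id: 2, host: "db3.example.net:27017", priority: 0, votes: 1 },
    // Hidden: invisible to the application, useful for reporting or backups.
    { _id: 3, host: "db4.example.net:27017", priority: 0, votes: 1, hidden: true },
    // Delayed by one hour, hidden and non-voting.
    { _id: 4, host: "db5.example.net:27017", priority: 0, votes: 0, hidden: true, slaveDelay: 3600 },
    // Arbiter: votes but holds no data.
    { _id: 5, host: "arb1.example.net:27017", arbiterOnly: true, votes: 1 },
  ],
};

// Illustrative checker, not a MongoDB function.
function validateConfig(cfg) {
  const problems = [];
  const voting = cfg.members.filter((m) => (m.votes === undefined ? 1 : m.votes) > 0);
  if (cfg.members.length > 50) problems.push("more than 50 members");
  if (voting.length > 7) problems.push("more than 7 voting members");
  for (const m of cfg.members) {
    const priority = m.priority === undefined ? 1 : m.priority;
    if (m.hidden && priority !== 0) {
      problems.push(m.host + ": hidden member must have priority 0");
    }
    if (m.slaveDelay > 0 && priority !== 0) {
      problems.push(m.host + ": delayed member must have priority 0");
    }
  }
  return problems;
}

console.log(validateConfig(config));
```

In this example five members vote (the delayed member does not), which keeps the number of voting members odd.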
Everyone on the same page?
Failures
Small Oplog Size
1. Primary/Secondary node down
○ Node failure
○ Planned maintenance
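2. Automatic Failover
… (several hours later)
3. Oplog on the new Primary rolls over and overwrites the entries the failed node still needs
4. Failed node can no longer catch up and needs a full resync
MongoDB >= 3.6 (size in megabytes): db.adminCommand({replSetResizeOplog: 1, size: 32000})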
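Whether a node survives an outage without a resync depends on the oplog window: roughly the oplog size divided by how fast the oplog fills. A back-of-the-envelope sketch, where oplogWindowHours and needsResync are hypothetical helper names and the write rate is an assumed, steady figure:

```javascript
// Approximate number of hours of operations the oplog can hold.
// oplogSizeMB: configured oplog size; oplogMBPerHour: observed oplog growth.
function oplogWindowHours(oplogSizeMB, oplogMBPerHour) {
  if (oplogMBPerHour <= 0) return Infinity; // no writes: the oplog never rolls over
  return oplogSizeMB / oplogMBPerHour;
}

// A node that is down longer than the oplog window cannot catch up.
function needsResync(downtimeHours, oplogSizeMB, oplogMBPerHour) {
  return downtimeHours > oplogWindowHours(oplogSizeMB, oplogMBPerHour);
}

// Example: 6 hours of maintenance at an assumed 500MB of oplog per hour.
console.log(needsResync(6, 990, 500));   // default lower-bound oplog
console.log(needsResync(6, 32000, 500)); // oplog resized to 32000MB
```

Real write rates are bursty, so it is safer to size the oplog for the worst observed hour rather than the average.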
Arbiter Nodes
● Votes in elections
● Does not hold copy of data
● If 2 nodes are down, no majority to elect new Primary
● Fault tolerance is still 1 node
● 4 data nodes + 1 Arbiter makes more sense
[Diagram: replica set members connected by Heartbeat]
Priority 0 Nodes
● Application driver sends writes to Primary
● Reads go to Primary by default
● Secondaries can serve reads
● Read preference
○ primary (default)
○ primaryPreferred
○ secondary
○ secondaryPreferred
○ nearest
Priority 0 Nodes
• Primary node fails
• Replica set starts election for new Primary
• Zero nodes eligible for Primary
• Application cannot send writes
• Database is read only*
*depends on read preference setting
Hidden Nodes
● Application driver sends writes to Primary
● Reads go to Primary by default
● Hidden Secondaries cannot serve reads
● Read preference
○ primary
Hidden Nodes
• Primary node fails
• Replica set starts election for new Primary
• Zero nodes eligible for Primary (priority:0)
• Application cannot send writes/reads
• Downtime
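Both failure scenarios come down to the same question: if the current Primary disappears, is any other member allowed to take over? A small sketch of that check against a configuration document; electableMembers and canFailOver are hypothetical helpers, and the hostnames are made up. It only looks at eligibility, and a real election additionally needs a majority of votes.

```javascript
// Members that may become Primary: data-bearing and priority greater than 0.
function electableMembers(cfg) {
  return cfg.members.filter((m) => {
    const priority = m.priority === undefined ? 1 : m.priority;
    return !m.arbiterOnly && priority > 0;
  });
}

// True if some member other than the current Primary is electable.
function canFailOver(cfg, currentPrimaryHost) {
  return electableMembers(cfg).some((m) => m.host !== currentPrimaryHost);
}

// One regular member plus two hidden, priority 0 members: no failover possible.
const risky = {
  _id: "rs0",
  members: [
    { _id: 0, host: "db1.example.net:27017", priority: 1 },
    { _id: 1, host: "db2.example.net:27017", priority: 0, hidden: true },
    { _id: 2, host: "db3.example.net:27017", priority: 0, hidden: true },
  ],
};
console.log(canFailOver(risky, "db1.example.net:27017"));
```

Running such a check whenever the configuration changes is one way to catch a set that has quietly ended up with a single eligible Primary.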
Hardware
• Primary node fails
• Secondary elected as new Primary
• Working set does not fit in memory on the new Primary
• Performance degradation
• Application stalls
Primary: 64GB RAM, 16 CPU
Secondaries: 32GB RAM, 8 CPU each
Hardware
• Dataset grows
• No disk space left on the smaller Secondary
• mongod process fails
• Replica set is down to 2 nodes
• Zero tolerance for further failures
Disk: 300GB, 300GB, 200GB (one per node)
Network
● Heartbeat lost
● Primary steps down
● New Primary election
● Application timeout*
● Rollback
Best Practice: Test Primary step down for your application
*Retryable writes since MongoDB 3.6
Cloud Deployment
• All replica set members deployed in a single Availability Zone
• Availability Zone #1 goes down
• Downtime
[Diagram: Cloud, Region #1, Availability Zone #1]
Cloud Deployment
● Availability Zone #1 goes down
○ New Primary elected from AZ #2
● Availability Zone #2 goes down
○ Database is read only
[Diagram: Cloud, Region #1, AZ#1 and AZ#2]
Cloud Deployment
• Region #1 goes down
• Downtime
[Diagram: Cloud, Region #1, AZ#1, AZ#2 and AZ#3]
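The Availability Zone scenarios above can be checked mechanically: for each zone, remove its voting members and see whether the rest still form a majority. A minimal sketch, where zonesWhoseLossBreaksMajority is a hypothetical helper and the placements are illustrative. The same arithmetic applies to regions or to VMs sharing one physical host.

```javascript
// placement maps a zone name to the number of voting members deployed there.
// Returns the zones whose loss leaves the replica set without a majority.
function zonesWhoseLossBreaksMajority(placement) {
  const total = Object.values(placement).reduce((sum, n) => sum + n, 0);
  const majority = Math.floor(total / 2) + 1;
  return Object.keys(placement).filter(
    (zone) => total - placement[zone] < majority
  );
}

console.log(zonesWhoseLossBreaksMajority({ az1: 3 }));                 // single AZ
console.log(zonesWhoseLossBreaksMajority({ az1: 1, az2: 2 }));         // two AZs
console.log(zonesWhoseLossBreaksMajority({ az1: 1, az2: 1, az3: 1 })); // three AZs
```

With three members, only the one-per-zone layout survives the loss of any single zone.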
Virtualization
● VM2 goes down
○ Primary node has majority on VM1
● VM1 goes down
○ Database is read only
[Diagram: VMware VMs VM1 and VM2 on one Physical Server]
Version Upgrades
● Replica set major version upgrade (3.6 to 4.0)
● Driver v3.6 not compatible with DB v4.0
● Compatibility changes
● Application cannot send requests
● Downtime
● Rollback to previous DB version
[Diagram: members on MongoDB 3.6.4]
Version Upgrades
● Replica set major version upgrade
● Promote new version as Primary
● Confirm application works
● Forget to upgrade Secondaries
● Start using new features
● New Primary elected
● Application errors
[Diagram: one member on MongoDB 4.0, two members on MongoDB 3.6]
Version Upgrades
● Minor version upgrade
● Promote new version as Primary
● Confirm application works
● Forget to upgrade Secondaries
● Bug fixes in minor release
● New Primary elected
● Application errors
[Diagram: one member on MongoDB 3.6.8, two members on MongoDB 3.6.4]
Version Upgrades
[Diagram: large deployment with mixed versions; most members on MongoDB 3.6.8, three still on MongoDB 3.6.3]
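Forgotten members are easy to spot if the version of every node is collected and compared. A sketch of that comparison, assuming the versions were already gathered (for example by running db.version() on each node); versionReport and the hostnames are hypothetical.

```javascript
// members: array of { host, version }.
// Groups hosts by version and reports whether all members run the same one.
function versionReport(members) {
  const byVersion = {};
  for (const m of members) {
    if (!byVersion[m.version]) byVersion[m.version] = [];
    byVersion[m.version].push(m.host);
  }
  return {
    consistent: Object.keys(byVersion).length <= 1,
    byVersion: byVersion,
  };
}

const report = versionReport([
  { host: "db1.example.net:27017", version: "3.6.8" },
  { host: "db2.example.net:27017", version: "3.6.8" },
  { host: "db3.example.net:27017", version: "3.6.3" },
]);
console.log(report.consistent, report.byVersion);
```

Running the check at the end of every upgrade makes the "forget to upgrade Secondaries" step much harder to miss.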
DDL Operation
● Adding index on a collection
● Connect to the Primary node
○ db.people.createIndex( { zipcode: 1 }, { background: true } )
DDL Operation
● Stop one Secondary
● Restart it as a standalone on a different port
Secondary
--port=27777
DDL Operation
● Add the index
● Rejoin the replica set
● Promote Secondary as Primary
● Forget the other nodes
Secondary
--port=27777
db.people.createIndex({zipcode:1})
Backups
● Pick one Secondary
● db.fsyncLock()
● Take snapshot
● db.fsyncUnlock()
● Unlock fails
● Secondary starts lagging
● Primary overwrites oplog
● Secondary needs initial sync
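One way to reduce the risk of a Secondary staying locked is to wrap the snapshot in try/finally and to check the result of the unlock. The sketch below is an assumption about how a backup script could be structured: snapshotWithLock, takeSnapshot and alert are hypothetical names, and db stands for any object exposing fsyncLock() and fsyncUnlock() (the built-in db object in the mongo shell; a stub is used here so the example runs on its own).

```javascript
// Lock, snapshot, and always attempt to unlock, even if the snapshot throws.
function snapshotWithLock(db, takeSnapshot, alert) {
  db.fsyncLock();
  try {
    return takeSnapshot();
  } finally {
    const res = db.fsyncUnlock();
    // Raise an alert instead of silently leaving the node locked.
    if (!res || res.ok !== 1) {
      alert("fsyncUnlock failed; Secondary is still locked");
    }
  }
}

// Stub standing in for the shell's db object, for illustration only.
let lockCount = 0;
const stubDb = {
  fsyncLock: function () { lockCount += 1; return { ok: 1 }; },
  fsyncUnlock: function () { lockCount -= 1; return { ok: 1, lockCount: lockCount }; },
};

const snapshotId = snapshotWithLock(
  stubDb,
  function () { return "snapshot-001"; }, // hypothetical snapshot step
  function (msg) { console.log("ALERT: " + msg); }
);
console.log(snapshotId, lockCount);
```

The alert matters as much as the unlock attempt: a locked Secondary keeps falling behind until someone notices, which is exactly the failure described above.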
Sharded Clusters
Monitoring Replica Set
• Replica set has no Primary
• Number of unhealthy members is above threshold
• Replication lag is above threshold
• Replica set elected new Primary
• Host of any type has restarted
• Host of type Secondary is recovering
• Host of any type is down
• Host of any type has experienced Rollback
• Network issues between members of the replica set or cluster
• Monitoring backup status
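Several of these alerts can be derived from the output of rs.status(). As an illustration, the sketch below computes replication lag per Secondary from the name, stateStr and optimeDate fields of the members array; replicationLagSeconds is a hypothetical helper and the sample document is made up.

```javascript
// status: an object shaped like the output of rs.status().
// Returns lag in seconds per Secondary, or null when there is no Primary.
function replicationLagSeconds(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) return null; // "replica set has no Primary" is its own alert
  const lag = {};
  for (const m of status.members) {
    if (m.stateStr !== "SECONDARY") continue;
    lag[m.name] = (primary.optimeDate.getTime() - m.optimeDate.getTime()) / 1000;
  }
  return lag;
}

const sampleStatus = {
  set: "rs0",
  members: [
    { name: "db1:27017", stateStr: "PRIMARY", optimeDate: new Date("2019-06-21T10:00:30Z") },
    { name: "db2:27017", stateStr: "SECONDARY", optimeDate: new Date("2019-06-21T10:00:28Z") },
    { name: "db3:27017", stateStr: "SECONDARY", optimeDate: new Date("2019-06-21T09:55:30Z") },
  ],
};
console.log(replicationLagSeconds(sampleStatus));
```

A delayed member will always show a lag of roughly its configured delay, so it needs its own threshold.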
Summary
• Replica set with odd number of voting members
• Hidden or Delayed member for dedicated functions (reporting, backups …)
• Have more than one eligible Primary in the replica set
• Use multi-AZ for Cloud deployments
• Don’t deploy more than one mongod process per node/host
• Run all replica set members on the same hardware
• Run all replica set members on the same MongoDB version
• Monitor your replica set status and nodes
• Monitor replication lag and Oplog size
Questions?
We’re Hiring!
https://www.pythian.com/careers/
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
 
Analog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog ConverterAnalog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog Converter
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptx
 

MongoDB HA - what can go wrong

  • 2. About Me
      {"name": "Igor Donchovski",
       "live_in": "Skopje",
       "email": "donchovski@pythian.com",
       "current_role": "Lead database consultant",
       "education": [{"type": "College", "name": "FEIT", "graduated": "2008", "university": "UKIM"},
                     {"type": "Master", "name": "FINKI", "graduated": "2013", "university": "UKIM"}],
       "work": [{"role": "Web developer", "start": "2007", "end": "2012", "company": "Gord Systems"},
                {"role": "DBA", "start": "2012", "end": "2014", "company": "NOVP"},
                {"role": "Database consultant", "start": "2014", "end": "2016", "company": "Pythian"},
                {"role": "Lead database consultant", "start": "2016", "company": "Pythian"}],
       "certificates": [{"name": "C100DBA", "year": "2016", "description": "MongoDB certified DBA"}],
       "social": [{"network": "LinkedIn", "link": "www.linkedin.com/in/igorle"},
                  {"network": "Twitter", "link": "https://twitter.com/igorle", "handle": "@igorle"}],
       "interests": ["Hiking", "Biking", "Traveling"],
       "hobbies": ["Painting", "Photography", "Cooking"],
       "proud_of": ["Volunteering", "Helping the Community"]}
      © 2019 Pythian. Confidential
  • 3. Overview
      • What is a replica set, how replication works
      • Replication concept
      • Replica set features, deployment architectures
      • Hidden nodes, Arbiter nodes, Priority 0 nodes
      • Production failures
      • Monitoring replica set
      • Q&A
  • 4. Replication
  • 5. Replica Set
      • Group of mongod processes that maintain the same data set
      • Redundancy and high availability
      • Increased read capacity (scaling reads)
      • Automatic failover
      # Members   # Nodes Required to Elect New Primary   Fault Tolerance
      3           2                                       1
      4           3                                       1
      5           3                                       2
      6           4                                       2
      7           4                                       3
      Diagram: three members, each with priority:1 votes:1
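The numbers in the table on slide 5 follow from the majority rule. A minimal sketch, assuming every member has one vote; the function names are made up for illustration and are not part of any MongoDB API:

```javascript
// Votes needed to elect a Primary: a strict majority of the voting members.
function votesNeededForMajority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// Number of voting members that can be lost while a majority is still reachable.
function faultTolerance(votingMembers) {
  return votingMembers - votesNeededForMajority(votingMembers);
}

for (const n of [3, 4, 5, 6, 7]) {
  console.log(`${n} members: majority ${votesNeededForMajority(n)}, fault tolerance ${faultTolerance(n)}`);
}
```

Going from an odd member count to the next even one raises the required majority without raising the fault tolerance, which is why the table shows the same tolerance for 3 and 4 members and for 5 and 6.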
  • 10. Replication Concept
      1. Write operations go to the Primary node
      2. All changes are recorded into the operations log (oplog)
      3. Asynchronous replication to the Secondaries
      4. Secondaries copy the Primary oplog
      5. A Secondary can use another Secondary as its sync source*
      * settings.chainingAllowed (true by default)
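Step 5 is controlled by settings.chainingAllowed in the replica set configuration. The sketch below only edits a plain object shaped like the document returned by rs.conf(). The helper name and the sample config are hypothetical, and in a real shell the result would still have to be passed to rs.reconfig() on the Primary.

```javascript
// Returns a copy of a replica set config with settings.chainingAllowed set.
// The input object is left untouched.
function setChainingAllowed(cfg, allowed) {
  return { ...cfg, settings: { ...(cfg.settings || {}), chainingAllowed: allowed } };
}

// Hypothetical config, shaped like the document returned by rs.conf().
const exampleCfg = { _id: "rs0", version: 1, members: [], settings: { chainingAllowed: true } };
const noChainingCfg = setChainingAllowed(exampleCfg, false);
console.log(noChainingCfg.settings.chainingAllowed);
```

With chaining disabled, every Secondary syncs from the Primary only.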
  • 11. Replica Set Oplog
      • Special capped collection that keeps a rolling record of all operations that modify the data stored in the databases
      • Idempotent
      • Default oplog size (Unix and Windows systems)
      Storage Engine   Default Oplog Size        Lower Bound   Upper Bound
      In-memory        5% of physical memory     50MB          50GB
      WiredTiger       5% of free disk space     990MB         50GB
      MMAPv1           5% of free disk space     990MB         50GB
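The default-size table on slide 11 amounts to "5% of the base figure, clamped between the lower and upper bound". A rough sketch of that rule follows. The function and constant names are invented for illustration, and treating 50GB as 50 * 1024 MB is an assumption.

```javascript
// Bounds from the slide, in MB. 50GB is taken as 50 * 1024 MB (assumption).
const OPLOG_BOUNDS_MB = {
  inMemory: { lower: 50, upper: 50 * 1024 },
  wiredTiger: { lower: 990, upper: 50 * 1024 },
  mmapv1: { lower: 990, upper: 50 * 1024 },
};

// baseMB is physical memory for the in-memory engine,
// free disk space for WiredTiger and MMAPv1.
function defaultOplogSizeMB(engine, baseMB) {
  const { lower, upper } = OPLOG_BOUNDS_MB[engine];
  const fivePercent = baseMB / 20;
  return Math.min(Math.max(fivePercent, lower), upper);
}

console.log(defaultOplogSizeMB("wiredTiger", 100000)); // 5% of 100000 MB free disk
```

On a small disk the lower bound applies, and on a very large one the 50GB cap does.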
  • 12. Configuration
  • 13. Configuration Options
      • Up to 50 members per replica set (at most 7 voting members)
      • Arbiter node
      • Priority 0 node
      • Hidden node
      • Delayed node
  • 14. Arbiter Node
      • Does not hold a copy of the data
      • Votes in elections
      Diagram: Arbiter, hidden : true
  • 15. Priority 0 Node
      • Priority is a floating point (i.e. decimal) number between 0 and 1000
      • Cannot become primary, cannot trigger election
      • Visible to application (accepts reads)
      • Votes in elections
      Diagram: Secondary, priority : 0
  • 16. Hidden Node
      • Not visible to application
      • Never becomes primary, but can vote in elections
      • Use cases
          ○ Reporting
          ○ Backups
      Diagram: Secondary, hidden : true, priority : 0
  • 17. Delayed Node
      • Must be priority 0 member
      • Should be hidden member (not mandatory)
      • Mainly used for backups (historical snapshot of data)
      • Recovery in case of human error
      Diagram: Secondary, slaveDelay : 3600, priority : 0, hidden : true
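The member options from slides 13 to 17 can be combined in one configuration document. The sketch below uses the field names shown on the slides (priority, hidden, slaveDelay) plus arbiterOnly. The host names and the checker function are hypothetical, and the checker covers only the rules stated on the slides, not everything the server validates.

```javascript
// Checks a config object against the rules listed on the slides.
// Returns a list of human-readable problems (empty when none are found).
function checkMemberRules(cfg) {
  const problems = [];
  if (cfg.members.length > 50) problems.push("more than 50 members");
  const voting = cfg.members.filter((m) => (m.votes ?? 1) > 0).length;
  if (voting > 7) problems.push("more than 7 voting members");
  for (const m of cfg.members) {
    const priority = m.priority ?? 1;
    if (m.hidden && priority !== 0) problems.push(`${m.host}: hidden member must have priority 0`);
    if ((m.slaveDelay ?? 0) > 0 && priority !== 0) problems.push(`${m.host}: delayed member must have priority 0`);
  }
  return problems;
}

// Hypothetical five-member set: two electable nodes, one priority 0 node,
// one hidden delayed node (one hour behind), and one arbiter.
const sampleConfig = {
  _id: "rs0",
  members: [
    { _id: 0, host: "db1.example.net:27017" },
    { _id: 1, host: "db2.example.net:27017" },
    { _id: 2, host: "db3.example.net:27017", priority: 0 },
    { _id: 3, host: "db4.example.net:27017", priority: 0, hidden: true, slaveDelay: 3600 },
    { _id: 4, host: "arb1.example.net:27017", arbiterOnly: true },
  ],
};
console.log(checkMemberRules(sampleConfig));
```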
  • 18. Everyone on the same page?
  • 19. Failures
  • 20. Small Oplog Size
      1. Primary/Secondary node down
          ○ Node failure
          ○ Planned maintenance
      2. Automatic failover
         …… (several hours later)
      3. The Primary's oplog rolls over, overwriting entries the failed node has not yet replicated
      4. Failed node needs a full resync
      MongoDB >= 3.6: db.adminCommand({replSetResizeOplog: 1, size: 32000}) (size is in megabytes)
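Whether a node survives an outage without a resync depends on how many hours of writes the oplog can hold. A back-of-the-envelope sketch, assuming a steady oplog write rate; the function names and the sample figures are illustrative only:

```javascript
// Hours of history the oplog can hold at a steady write rate.
function oplogWindowHours(oplogSizeMB, oplogMBPerHour) {
  return oplogSizeMB / oplogMBPerHour;
}

// A node that stays down longer than the oplog window can no longer catch up
// from the oplog and needs a resync.
function needsResyncAfter(downtimeHours, oplogSizeMB, oplogMBPerHour) {
  return downtimeHours > oplogWindowHours(oplogSizeMB, oplogMBPerHour);
}

// Illustrative figures: a 990 MB oplog versus a 32000 MB oplog, six hours of downtime.
console.log(needsResyncAfter(6, 990, 495));
console.log(needsResyncAfter(6, 32000, 4000));
```

Real write rates are rarely steady, so the window should be sized against the busiest period rather than the average.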
  • 21. Arbiter Nodes
      • Votes in election
      • Does not hold copy of data
      • If 2 nodes are down, no majority to elect new Primary
      • Fault tolerance is still 1 node
      • 4 data nodes + 1 Arbiter makes more sense
  • 22. Priority 0 Nodes
      • Application driver sends writes to Primary
      • Reads go to Primary by default
      • Secondaries can serve reads
      • Read preference
          ○ primary (default)
          ○ primaryPreferred
          ○ secondary
          ○ secondaryPreferred
          ○ nearest
  • 23. Priority 0 Nodes
      • Primary node fails
      • Replica set starts election for new Primary
      • Zero nodes eligible for Primary
      • Application cannot send writes
      • Database is read only*
      * depends on read preference setting
  • 24. Hidden Nodes
      • Application driver sends writes to Primary
      • Reads go to Primary by default
      • Secondaries cannot serve reads
      • Read preference
          ○ primary
  • 25. Hidden Nodes
      • Primary node fails
      • Replica set starts election for new Primary
      • Zero nodes eligible for Primary (priority:0)
      • Application cannot send writes/reads
      • Downtime
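The two failure scenarios above (priority 0 and hidden Secondaries) share one cause: after the Primary fails, no remaining member is electable. A simplified check is sketched below. It assumes an election needs a majority of votes among reachable members plus at least one reachable data-bearing member with priority above 0. The function name and host names are hypothetical.

```javascript
// Returns true when the members that are still up could elect a Primary.
function canElectPrimary(members, downHosts) {
  const down = new Set(downHosts);
  const votes = (m) => m.votes ?? 1;
  const totalVotes = members.reduce((sum, m) => sum + votes(m), 0);
  const upVotes = members
    .filter((m) => !down.has(m.host))
    .reduce((sum, m) => sum + votes(m), 0);
  const hasMajority = upVotes > totalVotes / 2;
  const hasCandidate = members.some(
    (m) => !down.has(m.host) && !m.arbiterOnly && (m.priority ?? 1) > 0
  );
  return hasMajority && hasCandidate;
}

// One electable node plus two priority 0 Secondaries, as in the failure scenario.
const riskySet = [
  { host: "db1:27017" },
  { host: "db2:27017", priority: 0 },
  { host: "db3:27017", priority: 0, hidden: true },
];
console.log(canElectPrimary(riskySet, ["db1:27017"]));
```

Running the check against each single-node failure before rollout shows whether the set has more than one eligible Primary.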
  • 26. Hardware
      • Primary node fails
      • Secondary elected as new Primary
      • Working set does not fit in memory
      • Performance degradation
      • Application stalls
      Diagram: nodes with 64GB RAM, 16 CPU; 32GB RAM, 8 CPU; 32GB RAM, 8 CPU
  • 27. Hardware
      • Dataset grows
      • No disk space left on Secondary
      • mongod process fails
      • Replica set is down to 2 nodes
      • Zero tolerance for further failures
      Diagram: nodes with Disk: 300GB; Disk: 300GB; Disk: 200GB
  • 28. Network
      • Heartbeat lost
      • Primary step down
      • New Primary election
      • Application timeout*
      • Rollback
      Best Practice: Test Primary step down for your application
      * Retryable writes since MongoDB 3.6
  • 29. Cloud Deployment
      • All replica set members deployed in single Availability Zone
      • Availability Zone #1 goes down
      • Downtime
      Diagram: Cloud, Region #1, Availability Zone #1
  • 30. Cloud Deployment
      • Availability Zone #1 goes down
          ○ New Primary elected from AZ #2
      • Availability Zone #2 goes down
          ○ Database is read only
      Diagram: Cloud, Region #1, AZ#1, AZ#2
  • 31. Cloud Deployment
      • Region #1 goes down
      • Downtime
      Diagram: Cloud, Region #1, AZ#1, AZ#2, AZ#3
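The cloud scenarios can be checked ahead of time by asking, for each zone, whether the members outside it still hold a majority of votes. A small sketch follows. The zone field is a made-up annotation for this example, not a MongoDB config field, and the function name is hypothetical.

```javascript
// Returns the zones whose loss would leave the replica set without a voting majority.
function zonesWhoseLossBlocksElection(members) {
  const totalVotes = members.reduce((sum, m) => sum + (m.votes ?? 1), 0);
  const zones = [...new Set(members.map((m) => m.zone))];
  return zones.filter((zone) => {
    const survivingVotes = members
      .filter((m) => m.zone !== zone)
      .reduce((sum, m) => sum + (m.votes ?? 1), 0);
    return survivingVotes <= totalVotes / 2;
  });
}

// Two-zone layout: losing the zone that holds two of the three members leaves the set read only.
const twoZoneLayout = [
  { host: "db1", zone: "az1" },
  { host: "db2", zone: "az2" },
  { host: "db3", zone: "az2" },
];
console.log(zonesWhoseLossBlocksElection(twoZoneLayout));
```

The same function can be applied with regions (or physical hosts, for virtualized deployments) in place of zones.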
  • 32. Virtualization
      • VM2 goes down
          ○ Primary node has majority on VM1
      • VM1 goes down
          ○ Database is read only
      Diagram: VMware, VM1 and VM2 on one Physical Server
  • 33. Version Upgrades
      • Replica set major version upgrade (3.6 to 4.0)
      • Driver v3.6 not compatible with DB v4.0
      • Compatibility changes
      • Application cannot send requests
      • Downtime
      • Rollback to previous DB version
      Diagram: MongoDB: 3.6.4, MongoDB: 3.6.4
  • 34. Version Upgrades
      • Replica set major version upgrade
      • Promote new version as Primary
      • Confirm application works
      • Forget to upgrade Secondaries
      • Start using new features
      • New Primary elected
      • Application errors
      Diagram: MongoDB: 3.6, MongoDB: 3.6, MongoDB: 4.0
  • 35. Version Upgrades
      • Minor version upgrade
      • Promote new version as Primary
      • Confirm application works
      • Forget to upgrade Secondaries
      • Bug fixes in minor release
      • New Primary elected
      • Application errors
      Diagram: MongoDB: 3.6.4, MongoDB: 3.6.4, MongoDB: 3.6.8
  • 36. Version Upgrades
      Diagram: node labels, twelve reading MongoDB: 3.6.8 and three reading MongoDB: 3.6.3
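Forgotten nodes are easy to spot once the version of every member is collected in one place. The sketch below assumes the versions have already been gathered into a plain object keyed by host (for example by running db.version() on each node). The function names and host names are illustrative.

```javascript
// Groups hosts by the MongoDB version they report.
function hostsByVersion(versionByHost) {
  const groups = {};
  for (const [host, version] of Object.entries(versionByHost)) {
    if (!groups[version]) groups[version] = [];
    groups[version].push(host);
  }
  return groups;
}

// True when the members do not all run the same version.
function isMixedVersion(versionByHost) {
  return Object.keys(hostsByVersion(versionByHost)).length > 1;
}

// Hypothetical result after a partial upgrade.
const reported = { "db1:27017": "3.6.8", "db2:27017": "3.6.4", "db3:27017": "3.6.4" };
console.log(isMixedVersion(reported), hostsByVersion(reported));
```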
  • 37. DDL Operation
      • Adding index on a collection
      • Connect to the Primary node
          ○ db.people.createIndex( { zipcode: 1 }, { background: true } )
  • 38. DDL Operation
      • Stop one Secondary
      • Restart on different port
      Diagram: Secondary --port=27777
  • 39. DDL Operation
      • Add the index
      • Rejoin the replica set
      • Promote Secondary as Primary
      • Forget the other nodes
      Diagram: Secondary --port=27777, db.people.createIndex({zipcode:1})
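After a rolling index build, comparing the index names on every node catches the "forget the other nodes" step. The sketch below assumes the names were collected per host beforehand (for example from db.people.getIndexes() on each node). The function name and host names are hypothetical.

```javascript
// For each host, lists the index names that exist on some other host but not on this one.
function missingIndexesByHost(indexNamesByHost) {
  const allNames = new Set(Object.values(indexNamesByHost).flat());
  const missing = {};
  for (const [host, names] of Object.entries(indexNamesByHost)) {
    const absent = [...allNames].filter((name) => !names.includes(name));
    if (absent.length > 0) missing[host] = absent;
  }
  return missing;
}

// Hypothetical state after the index was built on one node only.
const collectedIndexes = {
  "db1:27017": ["_id_", "zipcode_1"],
  "db2:27017": ["_id_"],
  "db3:27017": ["_id_"],
};
console.log(missingIndexesByHost(collectedIndexes));
```

An empty result means every node has the same set of index names.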
  • 40. Backups
      • Pick one Secondary
      • db.fsyncLock()
      • Take snapshot
      • db.fsyncUnlock()
      • Unlock fails
      • Secondary starts lagging
      • Primary overwrites oplog
      • Secondary needs initial sync
  • 42. Sharded Clusters
  • 43. Sharded Clusters
  • 44. Monitoring Replica Set
      • Replica set has no Primary
      • Number of unhealthy members is above threshold
      • Replication lag is above threshold
      • Replica set elected new Primary
      • Host of any type has restarted
      • Host of type Secondary is recovering
      • Host of any type is down
      • Host of any type has experienced Rollback
      • Network issues between members of the replica set or cluster
      • Monitoring backup status
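Two of the alerts above, "no Primary" and "replication lag above threshold", can be derived from a status document shaped like the output of rs.status(). The sketch relies only on the members[].name, members[].stateStr and members[].optimeDate fields. The function name and the sample data are made up for illustration.

```javascript
// Reports whether a Primary exists and how far each Secondary is behind it, in seconds.
function replicationLagReport(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) return { hasPrimary: false, lagSeconds: {} };
  const lagSeconds = {};
  for (const m of status.members) {
    if (m.stateStr === "SECONDARY") {
      // Subtracting two Date objects yields milliseconds.
      lagSeconds[m.name] = (primary.optimeDate - m.optimeDate) / 1000;
    }
  }
  return { hasPrimary: true, lagSeconds };
}

// Hypothetical status document with one lagging Secondary.
const sampleStatus = {
  members: [
    { name: "db1:27017", stateStr: "PRIMARY", optimeDate: new Date("2019-06-21T10:00:30Z") },
    { name: "db2:27017", stateStr: "SECONDARY", optimeDate: new Date("2019-06-21T10:00:30Z") },
    { name: "db3:27017", stateStr: "SECONDARY", optimeDate: new Date("2019-06-21T10:00:00Z") },
  ],
};
console.log(replicationLagReport(sampleStatus));
```

A monitoring job could compare each lag value against its alert threshold and against the oplog window.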
  • 45. Summary
      • Replica set with odd number of voting members
      • Hidden or Delayed member for dedicated functions (reporting, backups …)
      • Have more than one eligible Primary in the replica set
      • Use multi-AZ for Cloud deployments
      • Don’t deploy more than one mongod process per node/host
      • Run replica set members with same hardware for all nodes
      • Run replica set members with the same MongoDB version
      • Monitor your replica set status and nodes
      • Monitor replication lag and Oplog size