SlideShare a Scribd company logo
June-21-2019
MongoDB HA, what can go wrong?
{"name": "Igor Donchovski",
"live_in": "Skopje",
"email": "donchovski@pythian.com",
"current_role": "Lead database consultant",
"education": [{"type": "College", "name": "FEIT", "graduated": "2008", "university": "UKIM"},
{"type": "Master", "name": "FINKI", "graduated": "2013", "university": "UKIM"}],
"work": [{"role": "Web developer", "start": "2007", "end": "2012", "company": "Gord Systems"},
{"role": "DBA", "start": "2012", "end": "2014", "company": "NOVP"},
{"role": "Database consultant", "start": "2014", "end": "2016", "company": "Pythian"},
{"role": "Lead database consultant", "start": "2016", "company": "Pythian"}],
"certificates": [{"name": "C100DBA", "year": "2016", "description": "MongoDB certified DBA"}],
"social": [{"network": "LinkedIn", "link": "www.linkedin.com/in/igorle"},
{"network": "Twitter", "link": "https://twitter.com/igorle", "handle": "@igorle"}],
"interests": ["Hiking", "Biking", "Traveling"],
"hobbies": ["Painting", "Photography", "Cooking"],
"proud_of": ["Volunteering", "Helping the Community"]}
About Me
© 2019 Pythian. Confidential
• What is replica set, how replication works
• Replication concept
• Replica set features, deployment architectures
• Hidden nodes, Arbiter nodes, Priority 0 nodes
• Production failures
• Monitoring replica set
• QA
Overview
© 2019 Pythian. Confidential
Time
© 2019 Pythian. Confidential
Replication
• Group of mongod processes that maintain the same data set
• Redundancy and high availability
• Increased read capacity (scaling reads)
• Automatic failover
Replica Set
# Members # Nodes Required to Elect New Primary Fault Tolerance
3 2 1
4 3 1
5 3 2
6 4 2
7 4 3
© 2019 Pythian. Confidential
priority:1 votes:1
priority:1 votes:1 priority:1 votes:1
Replication Concept
1. Write operations go to the Primary node
2. All changes are recorded into operations log
3. Asynchronous replication to Secondary
4. Secondaries copy the Primary oplog
5. Secondary can use sync source Secondary
1.
© 2019 Pythian. Confidential
Replication Concept
1. Write operations go to the Primary node
2. All changes are recorded into operations log
3. Asynchronous replication to Secondary
4. Secondaries copy the Primary oplog
5. Secondary can use sync source Secondary
2. oplog
1.
© 2019 Pythian. Confidential
Replication Concept
1. Write operations go to the Primary node
2. All changes are recorded into operations log
3. Asynchronous replication to Secondary
4. Secondaries copy the Primary oplog
5. Secondary can use sync source Secondary
2. oplog
1.
3. 3.
© 2019 Pythian. Confidential
Replication Concept
1. Write operations go to the Primary node
2. All changes are recorded into operations log
3. Asynchronous replication to Secondary
4. Secondaries copy the Primary oplog
5. Secondary can use sync source Secondary
© 2018 Pythian. Confidential
2. oplog
1.
3. 3.
4. 4.
Replication Concept
1. Write operations go to the Primary node
2. All changes are recorded into operations log
3. Asynchronous replication to Secondary
4. Secondaries copy the Primary oplog
5. Secondary can use sync source Secondary*
*settings.chainingAllowed (true by default)
2. oplog
1.
3. 3.
4. 4.
5.
© 2019 Pythian. Confidential
Replica Set Oplog
• Special capped collection that keeps a rolling record of all operations that
modify the data stored in the databases
• Idempotent
• Default oplog size
For Unix and Windows systems
Storage Engine Default Oplog Size Lower Bound Upper Bound
In-memory 5% of physical memory 50MB 50GB
WiredTiger 5% of free disk space 990MB 50GB
MMAPv1 5% of free disk space 990MB 50GB
© 2019 Pythian. Confidential
© 2019 Pythian. Confidential
Configuration
Configuration Options
• 50 members per replica set (7 voting members)
• Arbiter node
• Priority 0 node
• Hidden node
• Delayed node
© 2019 Pythian. Confidential
• Does not hold copy of data
• Votes in elections
Arbiter Node
hidden : true
Arbiter
© 2019 Pythian. Confidential
Priority 0 Node
Priority - floating point (i.e. decimal) number between 0 and 1000
• Cannot become primary, cannot trigger election
• Visible to application (accepts reads/writes)
• Votes in elections
Secondary
priority : 0
© 2019 Pythian. Confidential
Hidden Node
• Not visible to application
• Never becomes primary, but can vote in elections
• Use cases
○ Reporting
○ Backups
hidden : truehidden: true priority:0
Secondary
hidden : true priority : 0
© 2019 Pythian. Confidential
Delayed Node
• Must be priority 0 member
• Should be hidden member (not mandatory)
• Mainly used for backups (historical snapshot of data)
• Recovery in case of human error
Secondary
slaveDelay : 3600
priority : 0
hidden : true
© 2019 Pythian. Confidential
© 2019 Pythian. Confidential
Everyone on the same page?
© 2019 Pythian. Confidential
Failures
Small Oplog Size
1. Primary/Secondary node down
○ Node failure
○ Planned maintenance
2. Automatic Failover
…… (several hours later)
3. New Primary overwrites latest oplog
4. Failed Node needs resync
MongoDB >= 3.6: db.adminCommand({replSetResizeOplog: 1, size: 32000})
© 2019 Pythian. Confidential
Arbiter Nodes
● Votes in election
● Does not hold copy of data
● If 2 nodes are down, no majority to elect
new Primary
● Fault tolerance is still 1 node
● 4 data nodes + 1 Arbiter makes more
sense
Heartbeat
© 2019 Pythian. Confidential
Priority 0 Nodes
● Application driver sends writes to Primary
● Reads go to Primary by default
● Secondaries can serve reads
● Read preference
○ primary (default)
○ primaryPreferred
○ secondary
○ secondaryPreferred
○ nearest
© 2019 Pythian. Confidential
• Primary node fails
• Replica set starts election for new Primary
• Zero nodes eligible for Primary
• Application can not send writes
• Database is read only*
*depends on read preference setting
Priority 0 Nodes
© 2019 Pythian. Confidential
Hidden Nodes
● Application driver sends writes to Primary
● Reads go to Primary by default
● Secondaries cannot serve reads
● Read preference
○ primary
© 2019 Pythian. Confidential
• Primary node fails
• Replica set starts election for new Primary
• Zero nodes eligible for Primary (priority:0)
• Application can not send writes/reads
• Downtime
Hidden Nodes
© 2019 Pythian. Confidential
• Primary node fails
• Secondary elected as new Primary
• Working set does not fit in memory
• Performance degradation
• Application stalls
Hardware
64GB RAM, 16 CPU
32GB RAM, 8 CPU 32GB RAM, 8 CPU
© 2019 Pythian. Confidential
• Dataset grows
• No Disk space on Secondary
• mongod process fails
• 2 nodes replica set
• Zero tolerance for failures
Hardware
Disk: 300GB
Disk: 300GB Disk: 200GB
© 2019 Pythian. Confidential
● Heartbeat lost
● Primary step down
● New Primary election
● Application timeout*
● Rollback
Best Practice: Test Primary step
down for your application
*Retryable writes since MongoDB 3.6
Network
© 2019 Pythian. Confidential
• All replica set members deployed in single Availability Zone
• Availability Zone #1 goes down
• Downtime
Cloud
Cloud Deployment
Region #1
Availability Zone #1
© 2019 Pythian. Confidential
● Availability Zone #1 goes down
○ New Primary elected from AZ #2
● Availability Zone #2 goes down
○ Database is read only
Cloud Deployment
© 2019 Pythian. Confidential
Cloud
Region #1
AZ#1 AZ#2
• Region #1 goes down
• Downtime
Cloud Deployment
© 2019 Pythian. Confidential
Cloud
Region #1
AZ#1 AZ#2 AZ#3
● VM2 goes down
○ Primary node has majority on VM1
● VM1 goes down
○ Database is read only
Virtualization
VMWARE
VM1 VM2
Physical Server
© 2019 Pythian. Confidential
● Replica set major version upgrade (3.6>4.0)
● Driver v3.6 not compatible with DB v4.0
● Compatibility changes
● Application cannot send requests
● Downtime
● Rollback to previous DB version
Version Upgrades
MongoDB: 3.6.4 MongoDB: 3.6.4
© 2019 Pythian. Confidential
● Replica set major version upgrade
● Promote new version as Primary
● Confirm application works
● Forget to upgrade Secondaries
● Start using new features
● New Primary elected
● Application errors
Version Upgrades
MongoDB: 3.6 MongoDB: 3.6
MongoDB: 4.0
© 2019 Pythian. Confidential
● Minor version upgrade
● Promote new version as Primary
● Confirm application works
● Forget to upgrade Secondaries
● Bug fixes in minor release
● New Primary elected
● Application errors
Version Upgrades
MongoDB: 3.6.4 MongoDB: 3.6.4
MongoDB: 3.6.8
© 2019 Pythian. Confidential
Version Upgrades
MongoDB: 3.6.8MongoDB: 3.6.8MongoDB: 3.6.8
MongoDB: 3.6.8
MongoDB: 3.6.8
MongoDB: 3.6.3
MongoDB: 3.6.3
MongoDB: 3.6.8
MongoDB: 3.6.8MongoDB: 3.6.8
MongoDB: 3.6.8
MongoDB: 3.6.8
MongoDB: 3.6.8
MongoDB: 3.6.8
© 2019 Pythian. Confidential
MongoDB: 3.6.3
● Adding index on a collection
● Connect to the Primary node
○ db.people.createIndex( { zipcode: 1 }, { background: true } )
DDL Operation
© 2019 Pythian. Confidential
● Stop one Secondary
● Restart on different port
DDL Operation
Secondary
--port=27777
© 2019 Pythian. Confidential
● Add the Index
● Rejoin to replica
● Promote Secondary as Primary
● Forget the other nodes
DDL Operation
Secondary
--port=27777
db.people.createIndex({zipcode:1})
© 2019 Pythian. Confidential
● Pick one Secondary
● db.fsyncLock()
● Take snapshot
● db.fsyncUnlock()
● Unlock fails
● Secondary starts lagging
● Primary overwrites oplog
● Secondary needs initial sync
Backups
© 2019 Pythian. Confidential
© 2019 Pythian. Confidential
Sharded Clusters
© 2019 Pythian. Confidential
Sharded Clusters
© 2019 Pythian. Confidential
Monitoring Replica Set
• Replica set has no Primary
• Number of unhealthy members is above threshold
• Replication lag is above threshold
• Replica set elected new Primary
• Host of any type has restarted
• Host of type Secondary is recovering
• Host of any type is down
• Host of any type has experienced Rollback
• Network issues between members of the replica set or cluster
• Monitoring backup status
© 2019 Pythian. Confidential
Summary
• Replica set with odd number of voting members
• Hidden or Delayed member for dedicated functions (reporting, backups …)
• Have more than one eligible Primary in the replica set
• Use multi-AZ for Cloud deployments
• Don’t deploy more than one mongod process per node/host
• Run replica set members with same hardware for all nodes
• Run replica set members with same mongo version
• Monitor your replica set status and nodes
• Monitor replication lag and Oplog size
© 2019 Pythian. Confidential
Questions?
© 2019 Pythian. Confidential
We’re Hiring!
https://www.pythian.com/careers/
© 2019 Pythian. Confidential

More Related Content

What's hot

How To Connect Spark To Your Own Datasource
How To Connect Spark To Your Own DatasourceHow To Connect Spark To Your Own Datasource
How To Connect Spark To Your Own Datasource
MongoDB
 
MongoDB and Spark
MongoDB and SparkMongoDB and Spark
MongoDB and Spark
Norberto Leite
 
Webinar: Schema Patterns and Your Storage Engine
Webinar: Schema Patterns and Your Storage EngineWebinar: Schema Patterns and Your Storage Engine
Webinar: Schema Patterns and Your Storage Engine
MongoDB
 
2014 05-07-fr - add dev series - session 6 - deploying your application-2
2014 05-07-fr - add dev series - session 6 - deploying your application-22014 05-07-fr - add dev series - session 6 - deploying your application-2
2014 05-07-fr - add dev series - session 6 - deploying your application-2MongoDB
 
NoSQL Analytics: JSON Data Analysis and Acceleration in MongoDB World
NoSQL Analytics: JSON Data Analysis and Acceleration in MongoDB WorldNoSQL Analytics: JSON Data Analysis and Acceleration in MongoDB World
NoSQL Analytics: JSON Data Analysis and Acceleration in MongoDB World
Ajay Gupte
 
Webinar: Architecting Secure and Compliant Applications with MongoDB
Webinar: Architecting Secure and Compliant Applications with MongoDBWebinar: Architecting Secure and Compliant Applications with MongoDB
Webinar: Architecting Secure and Compliant Applications with MongoDB
MongoDB
 
Migrating to MongoDB: Best Practices
Migrating to MongoDB: Best PracticesMigrating to MongoDB: Best Practices
Migrating to MongoDB: Best Practices
MongoDB
 
Back to Basics Webinar 6: Production Deployment
Back to Basics Webinar 6: Production DeploymentBack to Basics Webinar 6: Production Deployment
Back to Basics Webinar 6: Production Deployment
MongoDB
 
Challenges with MongoDB
Challenges with MongoDBChallenges with MongoDB
Challenges with MongoDB
Stone Gao
 
MongodB Internals
MongodB InternalsMongodB Internals
MongodB Internals
Norberto Leite
 
Sharding
ShardingSharding
Sharding
MongoDB
 
MongoDB - External Authentication
MongoDB - External AuthenticationMongoDB - External Authentication
MongoDB - External Authentication
Jason Terpko
 
How Thermo Fisher is Reducing Data Analysis Times from Days to Minutes with M...
How Thermo Fisher is Reducing Data Analysis Times from Days to Minutes with M...How Thermo Fisher is Reducing Data Analysis Times from Days to Minutes with M...
How Thermo Fisher is Reducing Data Analysis Times from Days to Minutes with M...
MongoDB
 
MongoDB Sharding Fundamentals
MongoDB Sharding Fundamentals MongoDB Sharding Fundamentals
MongoDB Sharding Fundamentals
Antonios Giannopoulos
 
Sharding
ShardingSharding
Sharding
MongoDB
 
How sitecore depends on mongo db for scalability and performance, and what it...
How sitecore depends on mongo db for scalability and performance, and what it...How sitecore depends on mongo db for scalability and performance, and what it...
How sitecore depends on mongo db for scalability and performance, and what it...
Antonios Giannopoulos
 
High Performance Applications with MongoDB
High Performance Applications with MongoDBHigh Performance Applications with MongoDB
High Performance Applications with MongoDB
MongoDB
 
Mongo db dhruba
Mongo db dhrubaMongo db dhruba
Mongo db dhruba
Dhrubaji Mandal ♛
 
MongoDB Europe 2016 - Big Data meets Big Compute
MongoDB Europe 2016 - Big Data meets Big ComputeMongoDB Europe 2016 - Big Data meets Big Compute
MongoDB Europe 2016 - Big Data meets Big Compute
MongoDB
 
Mongo db 3.4 Overview
Mongo db 3.4 OverviewMongo db 3.4 Overview
Mongo db 3.4 Overview
Norberto Leite
 

What's hot (20)

How To Connect Spark To Your Own Datasource
How To Connect Spark To Your Own DatasourceHow To Connect Spark To Your Own Datasource
How To Connect Spark To Your Own Datasource
 
MongoDB and Spark
MongoDB and SparkMongoDB and Spark
MongoDB and Spark
 
Webinar: Schema Patterns and Your Storage Engine
Webinar: Schema Patterns and Your Storage EngineWebinar: Schema Patterns and Your Storage Engine
Webinar: Schema Patterns and Your Storage Engine
 
2014 05-07-fr - add dev series - session 6 - deploying your application-2
2014 05-07-fr - add dev series - session 6 - deploying your application-22014 05-07-fr - add dev series - session 6 - deploying your application-2
2014 05-07-fr - add dev series - session 6 - deploying your application-2
 
NoSQL Analytics: JSON Data Analysis and Acceleration in MongoDB World
NoSQL Analytics: JSON Data Analysis and Acceleration in MongoDB WorldNoSQL Analytics: JSON Data Analysis and Acceleration in MongoDB World
NoSQL Analytics: JSON Data Analysis and Acceleration in MongoDB World
 
Webinar: Architecting Secure and Compliant Applications with MongoDB
Webinar: Architecting Secure and Compliant Applications with MongoDBWebinar: Architecting Secure and Compliant Applications with MongoDB
Webinar: Architecting Secure and Compliant Applications with MongoDB
 
Migrating to MongoDB: Best Practices
Migrating to MongoDB: Best PracticesMigrating to MongoDB: Best Practices
Migrating to MongoDB: Best Practices
 
Back to Basics Webinar 6: Production Deployment
Back to Basics Webinar 6: Production DeploymentBack to Basics Webinar 6: Production Deployment
Back to Basics Webinar 6: Production Deployment
 
Challenges with MongoDB
Challenges with MongoDBChallenges with MongoDB
Challenges with MongoDB
 
MongodB Internals
MongodB InternalsMongodB Internals
MongodB Internals
 
Sharding
ShardingSharding
Sharding
 
MongoDB - External Authentication
MongoDB - External AuthenticationMongoDB - External Authentication
MongoDB - External Authentication
 
How Thermo Fisher is Reducing Data Analysis Times from Days to Minutes with M...
How Thermo Fisher is Reducing Data Analysis Times from Days to Minutes with M...How Thermo Fisher is Reducing Data Analysis Times from Days to Minutes with M...
How Thermo Fisher is Reducing Data Analysis Times from Days to Minutes with M...
 
MongoDB Sharding Fundamentals
MongoDB Sharding Fundamentals MongoDB Sharding Fundamentals
MongoDB Sharding Fundamentals
 
Sharding
ShardingSharding
Sharding
 
How sitecore depends on mongo db for scalability and performance, and what it...
How sitecore depends on mongo db for scalability and performance, and what it...How sitecore depends on mongo db for scalability and performance, and what it...
How sitecore depends on mongo db for scalability and performance, and what it...
 
High Performance Applications with MongoDB
High Performance Applications with MongoDBHigh Performance Applications with MongoDB
High Performance Applications with MongoDB
 
Mongo db dhruba
Mongo db dhrubaMongo db dhruba
Mongo db dhruba
 
MongoDB Europe 2016 - Big Data meets Big Compute
MongoDB Europe 2016 - Big Data meets Big ComputeMongoDB Europe 2016 - Big Data meets Big Compute
MongoDB Europe 2016 - Big Data meets Big Compute
 
Mongo db 3.4 Overview
Mongo db 3.4 OverviewMongo db 3.4 Overview
Mongo db 3.4 Overview
 

Similar to MongoDB HA - what can go wrong

MongoDB World 2019: Unleash the Power of the MongoDB Aggregation Framework
MongoDB World 2019: Unleash the Power of the MongoDB Aggregation FrameworkMongoDB World 2019: Unleash the Power of the MongoDB Aggregation Framework
MongoDB World 2019: Unleash the Power of the MongoDB Aggregation Framework
MongoDB
 
Building A Self Service Streaming Platform at Pinterest - Steven Bairos-Novak...
Building A Self Service Streaming Platform at Pinterest - Steven Bairos-Novak...Building A Self Service Streaming Platform at Pinterest - Steven Bairos-Novak...
Building A Self Service Streaming Platform at Pinterest - Steven Bairos-Novak...
Flink Forward
 
Sidiq Permana - Building For The Next Billion Users
Sidiq Permana - Building For The Next Billion UsersSidiq Permana - Building For The Next Billion Users
Sidiq Permana - Building For The Next Billion Users
Dicoding
 
MongoDB @ Fiverr: The Road to Atlas
MongoDB @ Fiverr: The Road to AtlasMongoDB @ Fiverr: The Road to Atlas
MongoDB @ Fiverr: The Road to Atlas
MongoDB
 
Angular v2 et plus : le futur du développement d'applications en entreprise
Angular v2 et plus : le futur du développement d'applications en entrepriseAngular v2 et plus : le futur du développement d'applications en entreprise
Angular v2 et plus : le futur du développement d'applications en entreprise
LINAGORA
 
Scalable Application Development @ Picnic
Scalable Application Development @ PicnicScalable Application Development @ Picnic
Scalable Application Development @ Picnic
Sander Mak (@Sander_Mak)
 
Splunk Phantom, the Endpoint Data Model & Splunk Security Essentials App!
Splunk Phantom, the Endpoint Data Model & Splunk Security Essentials App!Splunk Phantom, the Endpoint Data Model & Splunk Security Essentials App!
Splunk Phantom, the Endpoint Data Model & Splunk Security Essentials App!
Harry McLaren
 
Modernisation of legacy PHP applications using Symfony2 - PHP Northeast Confe...
Modernisation of legacy PHP applications using Symfony2 - PHP Northeast Confe...Modernisation of legacy PHP applications using Symfony2 - PHP Northeast Confe...
Modernisation of legacy PHP applications using Symfony2 - PHP Northeast Confe...
Fabrice Bernhard
 
Industrialiser spark
Industrialiser sparkIndustrialiser spark
Industrialiser spark
Lucien Fregosi
 
Deploying MariaDB for HA on Google Cloud Platform
Deploying MariaDB for HA on Google Cloud PlatformDeploying MariaDB for HA on Google Cloud Platform
Deploying MariaDB for HA on Google Cloud Platform
MariaDB plc
 
Open Social Summit Korea Overview
Open Social Summit Korea OverviewOpen Social Summit Korea Overview
Open Social Summit Korea Overview
Chris Schalk
 
Implementing MySQL Database-as-a-Service using open source tools
Implementing MySQL Database-as-a-Service using open source toolsImplementing MySQL Database-as-a-Service using open source tools
Implementing MySQL Database-as-a-Service using open source tools
All Things Open
 
Android best practices 2015
Android best practices 2015Android best practices 2015
Android best practices 2015
Sean Katz
 
Conquering Data Migration from Oracle to Postgres
Conquering Data Migration from Oracle to PostgresConquering Data Migration from Oracle to Postgres
Conquering Data Migration from Oracle to Postgres
EDB
 
IRJET- Industry Production Manager using Raspberry Pi
IRJET-  	  Industry Production Manager using Raspberry PiIRJET-  	  Industry Production Manager using Raspberry Pi
IRJET- Industry Production Manager using Raspberry Pi
IRJET Journal
 
From Java Code to Java Heap: Understanding the Memory Usage of Your App - Ch...
From Java Code to Java Heap: Understanding the Memory Usage of Your App  - Ch...From Java Code to Java Heap: Understanding the Memory Usage of Your App  - Ch...
From Java Code to Java Heap: Understanding the Memory Usage of Your App - Ch...
jaxLondonConference
 
A Planet-Scale Database for Low Latency Transactional Apps by Yugabyte
A Planet-Scale Database for Low Latency Transactional Apps by YugabyteA Planet-Scale Database for Low Latency Transactional Apps by Yugabyte
A Planet-Scale Database for Low Latency Transactional Apps by Yugabyte
Carlos Andrés García
 
A Planet-Scale Database for Low Latency Transactional Apps by Yugabyte
A Planet-Scale Database for Low Latency Transactional Apps by YugabyteA Planet-Scale Database for Low Latency Transactional Apps by Yugabyte
A Planet-Scale Database for Low Latency Transactional Apps by Yugabyte
VMware Tanzu
 
Kubernetes as data platform
Kubernetes as data platformKubernetes as data platform
Kubernetes as data platform
Lars Albertsson
 
DevOpsDaysRiga 2018: Eric Skoglund, Lars Albertsson - Kubernetes as data plat...
DevOpsDaysRiga 2018: Eric Skoglund, Lars Albertsson - Kubernetes as data plat...DevOpsDaysRiga 2018: Eric Skoglund, Lars Albertsson - Kubernetes as data plat...
DevOpsDaysRiga 2018: Eric Skoglund, Lars Albertsson - Kubernetes as data plat...
DevOpsDays Riga
 

Similar to MongoDB HA - what can go wrong (20)

MongoDB World 2019: Unleash the Power of the MongoDB Aggregation Framework
MongoDB World 2019: Unleash the Power of the MongoDB Aggregation FrameworkMongoDB World 2019: Unleash the Power of the MongoDB Aggregation Framework
MongoDB World 2019: Unleash the Power of the MongoDB Aggregation Framework
 
Building A Self Service Streaming Platform at Pinterest - Steven Bairos-Novak...
Building A Self Service Streaming Platform at Pinterest - Steven Bairos-Novak...Building A Self Service Streaming Platform at Pinterest - Steven Bairos-Novak...
Building A Self Service Streaming Platform at Pinterest - Steven Bairos-Novak...
 
Sidiq Permana - Building For The Next Billion Users
Sidiq Permana - Building For The Next Billion UsersSidiq Permana - Building For The Next Billion Users
Sidiq Permana - Building For The Next Billion Users
 
MongoDB @ Fiverr: The Road to Atlas
MongoDB @ Fiverr: The Road to AtlasMongoDB @ Fiverr: The Road to Atlas
MongoDB @ Fiverr: The Road to Atlas
 
Angular v2 et plus : le futur du développement d'applications en entreprise
Angular v2 et plus : le futur du développement d'applications en entrepriseAngular v2 et plus : le futur du développement d'applications en entreprise
Angular v2 et plus : le futur du développement d'applications en entreprise
 
Scalable Application Development @ Picnic
Scalable Application Development @ PicnicScalable Application Development @ Picnic
Scalable Application Development @ Picnic
 
Splunk Phantom, the Endpoint Data Model & Splunk Security Essentials App!
Splunk Phantom, the Endpoint Data Model & Splunk Security Essentials App!Splunk Phantom, the Endpoint Data Model & Splunk Security Essentials App!
Splunk Phantom, the Endpoint Data Model & Splunk Security Essentials App!
 
Modernisation of legacy PHP applications using Symfony2 - PHP Northeast Confe...
Modernisation of legacy PHP applications using Symfony2 - PHP Northeast Confe...Modernisation of legacy PHP applications using Symfony2 - PHP Northeast Confe...
Modernisation of legacy PHP applications using Symfony2 - PHP Northeast Confe...
 
Industrialiser spark
Industrialiser sparkIndustrialiser spark
Industrialiser spark
 
Deploying MariaDB for HA on Google Cloud Platform
Deploying MariaDB for HA on Google Cloud PlatformDeploying MariaDB for HA on Google Cloud Platform
Deploying MariaDB for HA on Google Cloud Platform
 
Open Social Summit Korea Overview
Open Social Summit Korea OverviewOpen Social Summit Korea Overview
Open Social Summit Korea Overview
 
Implementing MySQL Database-as-a-Service using open source tools
Implementing MySQL Database-as-a-Service using open source toolsImplementing MySQL Database-as-a-Service using open source tools
Implementing MySQL Database-as-a-Service using open source tools
 
Android best practices 2015
Android best practices 2015Android best practices 2015
Android best practices 2015
 
Conquering Data Migration from Oracle to Postgres
Conquering Data Migration from Oracle to PostgresConquering Data Migration from Oracle to Postgres
Conquering Data Migration from Oracle to Postgres
 
IRJET- Industry Production Manager using Raspberry Pi
IRJET-  	  Industry Production Manager using Raspberry PiIRJET-  	  Industry Production Manager using Raspberry Pi
IRJET- Industry Production Manager using Raspberry Pi
 
From Java Code to Java Heap: Understanding the Memory Usage of Your App - Ch...
From Java Code to Java Heap: Understanding the Memory Usage of Your App  - Ch...From Java Code to Java Heap: Understanding the Memory Usage of Your App  - Ch...
From Java Code to Java Heap: Understanding the Memory Usage of Your App - Ch...
 
A Planet-Scale Database for Low Latency Transactional Apps by Yugabyte
A Planet-Scale Database for Low Latency Transactional Apps by YugabyteA Planet-Scale Database for Low Latency Transactional Apps by Yugabyte
A Planet-Scale Database for Low Latency Transactional Apps by Yugabyte
 
A Planet-Scale Database for Low Latency Transactional Apps by Yugabyte
A Planet-Scale Database for Low Latency Transactional Apps by YugabyteA Planet-Scale Database for Low Latency Transactional Apps by Yugabyte
A Planet-Scale Database for Low Latency Transactional Apps by Yugabyte
 
Kubernetes as data platform
Kubernetes as data platformKubernetes as data platform
Kubernetes as data platform
 
DevOpsDaysRiga 2018: Eric Skoglund, Lars Albertsson - Kubernetes as data plat...
DevOpsDaysRiga 2018: Eric Skoglund, Lars Albertsson - Kubernetes as data plat...DevOpsDaysRiga 2018: Eric Skoglund, Lars Albertsson - Kubernetes as data plat...
DevOpsDaysRiga 2018: Eric Skoglund, Lars Albertsson - Kubernetes as data plat...
 

Recently uploaded

Heap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTS
Heap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTSHeap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTS
Heap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTS
Soumen Santra
 
Modelagem de um CSTR com reação endotermica.pdf
Modelagem de um CSTR com reação endotermica.pdfModelagem de um CSTR com reação endotermica.pdf
Modelagem de um CSTR com reação endotermica.pdf
camseq
 
Low power architecture of logic gates using adiabatic techniques
Low power architecture of logic gates using adiabatic techniquesLow power architecture of logic gates using adiabatic techniques
Low power architecture of logic gates using adiabatic techniques
nooriasukmaningtyas
 
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines
Harnessing WebAssembly for Real-time Stateless Streaming PipelinesHarnessing WebAssembly for Real-time Stateless Streaming Pipelines
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines
Christina Lin
 
PPT on GRP pipes manufacturing and testing
PPT on GRP pipes manufacturing and testingPPT on GRP pipes manufacturing and testing
PPT on GRP pipes manufacturing and testing
anoopmanoharan2
 
Online aptitude test management system project report.pdf
Online aptitude test management system project report.pdfOnline aptitude test management system project report.pdf
Online aptitude test management system project report.pdf
Kamal Acharya
 
Hierarchical Digital Twin of a Naval Power System
Hierarchical Digital Twin of a Naval Power SystemHierarchical Digital Twin of a Naval Power System
Hierarchical Digital Twin of a Naval Power System
Kerry Sado
 
一比一原版(Otago毕业证)奥塔哥大学毕业证成绩单如何办理
一比一原版(Otago毕业证)奥塔哥大学毕业证成绩单如何办理一比一原版(Otago毕业证)奥塔哥大学毕业证成绩单如何办理
一比一原版(Otago毕业证)奥塔哥大学毕业证成绩单如何办理
dxobcob
 
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
obonagu
 
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
thanhdowork
 
Ethernet Routing and switching chapter 1.ppt
Ethernet Routing and switching chapter 1.pptEthernet Routing and switching chapter 1.ppt
Ethernet Routing and switching chapter 1.ppt
azkamurat
 
ACEP Magazine edition 4th launched on 05.06.2024
ACEP Magazine edition 4th launched on 05.06.2024ACEP Magazine edition 4th launched on 05.06.2024
ACEP Magazine edition 4th launched on 05.06.2024
Rahul
 
Unbalanced Three Phase Systems and circuits.pptx
Unbalanced Three Phase Systems and circuits.pptxUnbalanced Three Phase Systems and circuits.pptx
Unbalanced Three Phase Systems and circuits.pptx
ChristineTorrepenida1
 
Fundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptxFundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptx
manasideore6
 
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Sreedhar Chowdam
 
Recycled Concrete Aggregate in Construction Part III
Recycled Concrete Aggregate in Construction Part IIIRecycled Concrete Aggregate in Construction Part III
Recycled Concrete Aggregate in Construction Part III
Aditya Rajan Patra
 
[JPP-1] - (JEE 3.0) - Kinematics 1D - 14th May..pdf
[JPP-1] - (JEE 3.0) - Kinematics 1D - 14th May..pdf[JPP-1] - (JEE 3.0) - Kinematics 1D - 14th May..pdf
[JPP-1] - (JEE 3.0) - Kinematics 1D - 14th May..pdf
awadeshbabu
 
Literature Review Basics and Understanding Reference Management.pptx
Literature Review Basics and Understanding Reference Management.pptxLiterature Review Basics and Understanding Reference Management.pptx
Literature Review Basics and Understanding Reference Management.pptx
Dr Ramhari Poudyal
 
digital fundamental by Thomas L.floydl.pdf
digital fundamental by Thomas L.floydl.pdfdigital fundamental by Thomas L.floydl.pdf
digital fundamental by Thomas L.floydl.pdf
drwaing
 
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
bakpo1
 

Recently uploaded (20)

Heap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTS
Heap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTSHeap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTS
Heap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTS
 
Modelagem de um CSTR com reação endotermica.pdf
Modelagem de um CSTR com reação endotermica.pdfModelagem de um CSTR com reação endotermica.pdf
Modelagem de um CSTR com reação endotermica.pdf
 
Low power architecture of logic gates using adiabatic techniques
Low power architecture of logic gates using adiabatic techniquesLow power architecture of logic gates using adiabatic techniques
Low power architecture of logic gates using adiabatic techniques
 
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines
Harnessing WebAssembly for Real-time Stateless Streaming PipelinesHarnessing WebAssembly for Real-time Stateless Streaming Pipelines
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines
 
PPT on GRP pipes manufacturing and testing
PPT on GRP pipes manufacturing and testingPPT on GRP pipes manufacturing and testing
PPT on GRP pipes manufacturing and testing
 
Online aptitude test management system project report.pdf
Online aptitude test management system project report.pdfOnline aptitude test management system project report.pdf
Online aptitude test management system project report.pdf
 
Hierarchical Digital Twin of a Naval Power System
Hierarchical Digital Twin of a Naval Power SystemHierarchical Digital Twin of a Naval Power System
Hierarchical Digital Twin of a Naval Power System
 
一比一原版(Otago毕业证)奥塔哥大学毕业证成绩单如何办理
一比一原版(Otago毕业证)奥塔哥大学毕业证成绩单如何办理一比一原版(Otago毕业证)奥塔哥大学毕业证成绩单如何办理
一比一原版(Otago毕业证)奥塔哥大学毕业证成绩单如何办理
 
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
 
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
 
Ethernet Routing and switching chapter 1.ppt
Ethernet Routing and switching chapter 1.pptEthernet Routing and switching chapter 1.ppt
Ethernet Routing and switching chapter 1.ppt
 
ACEP Magazine edition 4th launched on 05.06.2024
ACEP Magazine edition 4th launched on 05.06.2024ACEP Magazine edition 4th launched on 05.06.2024
ACEP Magazine edition 4th launched on 05.06.2024
 
Unbalanced Three Phase Systems and circuits.pptx
Unbalanced Three Phase Systems and circuits.pptxUnbalanced Three Phase Systems and circuits.pptx
Unbalanced Three Phase Systems and circuits.pptx
 
Fundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptxFundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptx
 
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
 
Recycled Concrete Aggregate in Construction Part III
Recycled Concrete Aggregate in Construction Part IIIRecycled Concrete Aggregate in Construction Part III
Recycled Concrete Aggregate in Construction Part III
 
[JPP-1] - (JEE 3.0) - Kinematics 1D - 14th May..pdf
[JPP-1] - (JEE 3.0) - Kinematics 1D - 14th May..pdf[JPP-1] - (JEE 3.0) - Kinematics 1D - 14th May..pdf
[JPP-1] - (JEE 3.0) - Kinematics 1D - 14th May..pdf
 
Literature Review Basics and Understanding Reference Management.pptx
Literature Review Basics and Understanding Reference Management.pptxLiterature Review Basics and Understanding Reference Management.pptx
Literature Review Basics and Understanding Reference Management.pptx
 
digital fundamental by Thomas L.floydl.pdf
digital fundamental by Thomas L.floydl.pdfdigital fundamental by Thomas L.floydl.pdf
digital fundamental by Thomas L.floydl.pdf
 
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
 

MongoDB HA - what can go wrong

  • 2. {"name": "Igor Donchovski", "live_in": "Skopje", "email": "donchovski@pythian.com", "current_role": "Lead database consultant", "education": [{"type": "College", "name": "FEIT", "graduated": "2008", "university": "UKIM"}, {"type": "Master", "name": "FINKI", "graduated": "2013", "university": "UKIM"}], "work": [{"role": "Web developer", "start": "2007", "end": "2012", "company": "Gord Systems"}, {"role": "DBA", "start": "2012", "end": "2014", "company": "NOVP"}, {"role": "Database consultant", "start": "2014", "end": "2016", "company": "Pythian"}, {"role": "Lead database consultant", "start": "2016", "company": "Pythian"}], "certificates": [{"name": "C100DBA", "year": "2016", "description": "MongoDB certified DBA"}], "social": [{"network": "LinkedIn", "link": "www.linkedin.com/in/igorle"}, {"network": "Twitter", "link": "https://twitter.com/igorle", "handle": "@igorle"}], "interests": ["Hiking", "Biking", "Traveling"], "hobbies": ["Painting", "Photography", "Cooking"], "proud_of": ["Volunteering", "Helping the Community"]} About Me © 2019 Pythian. Confidential
  • 3. • What is replica set, how replication works • Replication concept • Replica set features, deployment architectures • Hidden nodes, Arbiter nodes, Priority 0 nodes • Production failures • Monitoring replica set • QA Overview © 2019 Pythian. Confidential Time
  • 4. © 2019 Pythian. Confidential Replication
  • 5. • Group of mongod processes that maintain the same data set • Redundancy and high availability • Increased read capacity (scaling reads) • Automatic failover Replica Set # Members # Nodes Required to Elect New Primary Fault Tolerance 3 2 1 4 3 1 5 3 2 6 4 2 7 4 3 © 2019 Pythian. Confidential priority:1 votes:1 priority:1 votes:1 priority:1 votes:1
  • 6. Replication Concept 1. Write operations go to the Primary node 2. All changes are recorded into operations log 3. Asynchronous replication to Secondary 4. Secondaries copy the Primary oplog 5. Secondary can use sync source Secondary 1. © 2019 Pythian. Confidential
  • 7. Replication Concept 1. Write operations go to the Primary node 2. All changes are recorded into operations log 3. Asynchronous replication to Secondary 4. Secondaries copy the Primary oplog 5. Secondary can use sync source Secondary 2. oplog 1. © 2019 Pythian. Confidential
  • 8. Replication Concept 1. Write operations go to the Primary node 2. All changes are recorded into operations log 3. Asynchronous replication to Secondary 4. Secondaries copy the Primary oplog 5. Secondary can use sync source Secondary 2. oplog 1. 3. 3. © 2019 Pythian. Confidential
  • 9. Replication Concept 1. Write operations go to the Primary node 2. All changes are recorded into operations log 3. Asynchronous replication to Secondary 4. Secondaries copy the Primary oplog 5. Secondary can use sync source Secondary © 2018 Pythian. Confidential 2. oplog 1. 3. 3. 4. 4.
  • 10. Replication Concept 1. Write operations go to the Primary node 2. All changes are recorded into operations log 3. Asynchronous replication to Secondary 4. Secondaries copy the Primary oplog 5. Secondary can use sync source Secondary* *settings.chainingAllowed (true by default) 2. oplog 1. 3. 3. 4. 4. 5. © 2019 Pythian. Confidential
  • 11. Replica Set Oplog • Special capped collection that keeps a rolling record of all operations that modify the data stored in the databases • Idempotent • Default oplog size For Unix and Windows systems Storage Engine Default Oplog Size Lower Bound Upper Bound In-memory 5% of physical memory 50MB 50GB WiredTiger 5% of free disk space 990MB 50GB MMAPv1 5% of free disk space 990MB 50GB © 2019 Pythian. Confidential
  • 12. © 2019 Pythian. Confidential Configuration
  • 13. Configuration Options • 50 members per replica set (7 voting members) • Arbiter node • Priority 0 node • Hidden node • Delayed node © 2019 Pythian. Confidential
  • 14. • Does not hold copy of data • Votes in elections Arbiter Node hidden : true Arbiter © 2019 Pythian. Confidential
  • 15. Priority 0 Node Priority - floating point (i.e. decimal) number between 0 and 1000 • Cannot become primary, cannot trigger election • Visible to application (accepts reads/writes) • Votes in elections Secondary priority : 0 © 2019 Pythian. Confidential
  • 16. Hidden Node • Not visible to application • Never becomes primary, but can vote in elections • Use cases ○ Reporting ○ Backups hidden : truehidden: true priority:0 Secondary hidden : true priority : 0 © 2019 Pythian. Confidential
  • 17. Delayed Node • Must be priority 0 member • Should be hidden member (not mandatory) • Mainly used for backups (historical snapshot of data) • Recovery in case of human error Secondary slaveDelay : 3600 priority : 0 hidden : true © 2019 Pythian. Confidential
  • 18. © 2019 Pythian. Confidential Everyone on the same page?
  • 19. © 2019 Pythian. Confidential Failures
  • 20. Small Oplog Size 1. Primary/Secondary node down ○ Node failure ○ Planned maintenance 2. Automatic Failover …… (several hours later) 3. New Primary overwrites latest oplog 4. Failed Node needs resync MongoDB >= 3.6: db.adminCommand({replSetResizeOplog: 1, size: 32000}) © 2019 Pythian. Confidential
  • 21. Arbiter Nodes ● Votes in election ● Does not hold copy of data ● If 2 nodes are down, no majority to elect new Primary ● Fault tolerance is still 1 node ● 4 data nodes + 1 Arbiter makes more sense Heartbeat © 2019 Pythian. Confidential
  • 22. Priority 0 Nodes ● Application driver sends writes to Primary ● Reads go to Primary by default ● Secondaries can serve reads ● Read preference ○ primary (default) ○ primaryPreferred ○ secondary ○ secondaryPreferred ○ nearest © 2019 Pythian. Confidential
  • 23. • Primary node fails • Replica set starts election for new Primary • Zero nodes eligible for Primary • Application can not send writes • Database is read only* *depends on read preference setting Priority 0 Nodes © 2019 Pythian. Confidential
  • 24. Hidden Nodes ● Application driver sends writes to Primary ● Reads go to Primary by default ● Secondaries cannot serve reads ● Read preference ○ primary © 2019 Pythian. Confidential
  • 25. • Primary node fails • Replica set starts election for new Primary • Zero nodes eligible for Primary (priority:0) • Application can not send writes/reads • Downtime Hidden Nodes © 2019 Pythian. Confidential
  • 26. • Primary node fails • Secondary elected as new Primary • Working set does not fit in memory • Performance degradation • Application stalls Hardware 64GB RAM, 16 CPU 32GB RAM, 8 CPU 32GB RAM, 8 CPU © 2019 Pythian. Confidential
  • 27. • Dataset grows • No Disk space on Secondary • mongod process fails • 2 nodes replica set • Zero tolerance for failures Hardware Disk: 300GB Disk: 300GB Disk: 200GB © 2019 Pythian. Confidential
  • 28. ● Heartbeat lost ● Primary step down ● New Primary election ● Application timeout* ● Rollback Best Practice: Test Primary step down for your application *Retryable writes since MongoDB 3.6 Network © 2019 Pythian. Confidential
  • 29. • All replica set members deployed in single Availability Zone • Availability Zone #1 goes down • Downtime Cloud Cloud Deployment Region #1 Availability Zone #1 © 2019 Pythian. Confidential
  • 30. ● Availability Zone #1 goes down ○ New Primary elected from AZ #2 ● Availability Zone #2 goes down ○ Database is read only Cloud Deployment © 2019 Pythian. Confidential Cloud Region #1 AZ#1 AZ#2
  • 31. • Region #1 goes down • Downtime Cloud Deployment © 2019 Pythian. Confidential Cloud Region #1 AZ#1 AZ#2 AZ#3
  • 32. ● VM2 goes down ○ Primary node has majority on VM1 ● VM1 goes down ○ Database is read only Virtualization VMWARE VM1 VM2 Physical Server © 2019 Pythian. Confidential
  • 33. ● Replica set major version upgrade (3.6>4.0) ● Driver v3.6 not compatible with DB v4.0 ● Compatibility changes ● Application cannot send requests ● Downtime ● Rollback to previous DB version Version Upgrades MongoDB: 3.6.4 MongoDB: 3.6.4 © 2019 Pythian. Confidential
  • 34. ● Replica set major version upgrade ● Promote new version as Primary ● Confirm application works ● Forget to upgrade Secondaries ● Start using new features ● New Primary elected ● Application errors Version Upgrades MongoDB: 3.6 MongoDB: 3.6 MongoDB: 4.0 © 2019 Pythian. Confidential
  • 35. ● Minor version upgrade ● Promote new version as Primary ● Confirm application works ● Forget to upgrade Secondaries ● Bug fixes in minor release ● New Primary elected ● Application errors Version Upgrades MongoDB: 3.6.4 MongoDB: 3.6.4 MongoDB: 3.6.8 © 2019 Pythian. Confidential
  • 36. Version Upgrades MongoDB: 3.6.8MongoDB: 3.6.8MongoDB: 3.6.8 MongoDB: 3.6.8 MongoDB: 3.6.8 MongoDB: 3.6.3 MongoDB: 3.6.3 MongoDB: 3.6.8 MongoDB: 3.6.8MongoDB: 3.6.8 MongoDB: 3.6.8 MongoDB: 3.6.8 MongoDB: 3.6.8 MongoDB: 3.6.8 © 2019 Pythian. Confidential MongoDB: 3.6.3
  • 37. ● Adding index on a collection ● Connect to the Primary node ○ db.people.createIndex( { zipcode: 1 }, { background: true } ) DDL Operation © 2019 Pythian. Confidential
  • 38. ● Stop one Secondary ● Restart on different port DDL Operation Secondary --port=27777 © 2019 Pythian. Confidential
  • 39. ● Add the Index ● Rejoin to replica ● Promote Secondary as Primary ● Forget the other nodes DDL Operation Secondary --port=27777 db.people.createIndex({zipcode:1}) © 2019 Pythian. Confidential
  • 40. ● Pick one Secondary ● db.fsyncLock() ● Take snapshot ● db.fsyncUnlock() ● Unlock fails ● Secondary starts lagging ● Primary overwrites oplog ● Secondary needs initial sync Backups © 2019 Pythian. Confidential
  • 41. © 2019 Pythian. Confidential
  • 42. Sharded Clusters © 2019 Pythian. Confidential
  • 43. Sharded Clusters © 2019 Pythian. Confidential
  • 44. Monitoring Replica Set • Replica set has no Primary • Number of unhealthy members is above threshold • Replication lag is above threshold • Replica set elected new Primary • Host of any type has restarted • Host of type Secondary is recovering • Host of any type is down • Host of any type has experienced Rollback • Network issues between members of the replica set or cluster • Monitoring backup status © 2019 Pythian. Confidential
  • 45. Summary • Replica set with odd number of voting members • Hidden or Delayed member for dedicated functions (reporting, backups …) • Have more than one eligible Primary in the replica set • Use multi-AZ for Cloud deployments • Don’t deploy more than one mongod process per node/host • Run replica set members with same hardware for all nodes • Run replica set members with same mongo version • Monitor your replica set status and nodes • Monitor replication lag and Oplog size © 2019 Pythian. Confidential