Evolution and Scaling of MongoDB Management Service Running on MongoDB
Steve Briskin, Lead Engineer, MMS Backup
John Morales, Senior Engineer, MMS Monitoring
Agenda
● What is MongoDB Management Service
● MMS Backup
o Schema evolution and optimizations
o How we scaled
● MMS Monitoring
o Read-optimized time series schema
o Write-optimized time series schema
o Benchmarks
MongoDB Management Service (MMS)
mms.mongodb.com
● Automation and Provisioning: single-click provisioning, scaling and upgrades, administrative tasks
● Monitoring: charts, dashboards, and alerts on 100+ metrics
● Backup: backup and restore, with point-in-time recovery and support for sharded clusters
MMS Backup
● Cloud Backup service
● Takes periodic snapshots
● Manages storage
● Premium features
o Point-in-time recovery
o Consistent snapshots of sharded clusters
MMS Backup Architecture
[Diagram: Customer's MongoDB and Backup Agent; MMS Backup (Ingestion, Backup Process); Oplog Store; Snapshot Store]
Oplog Store
● “Circular Buffer” of operations
o Many concurrent inserts
o Time-bound (e.g. 24 hours)
o Lifecycle: insert, read once, delete
● Concerns
o Lock contention
o Data purging
o Freelist fragmentation
[Diagram: MMS Backup Ingestion, Oplog Store]
Oplog Store
● Developed with MongoDB 2.2
● Lock Contention
o DB per customer
● Data Purging
o Use TTL Index
● Freelist fragmentation
o Use Power Of 2 Allocation
[Diagram: MMS Backup Ingestion, Oplog Store]
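The three mitigations above can be sketched in a few lines. This is a minimal illustration, not the MMS implementation: the database naming scheme, the `insertedAt` field, and both helper functions are hypothetical. It relies only on two facts: MongoDB 2.2 locks at database granularity, so one database per customer spreads write contention, and a TTL index takes an `expireAfterSeconds` option on a date field.

```javascript
// Retention window from the slide's example ("e.g. 24 hours").
const RETENTION_HOURS = 24;

// Hypothetical naming scheme: one database per customer, so that
// concurrent inserts for different customers take different locks.
function oplogDbName(customerId) {
  return `oplog_${customerId}`;
}

// Arguments for createIndex(keys, options). A TTL index lets the
// server purge documents once the date in `field` is old enough.
function ttlIndexArgs(field, retentionHours) {
  return [{ [field]: 1 }, { expireAfterSeconds: retentionHours * 3600 }];
}

const [keys, options] = ttlIndexArgs("insertedAt", RETENTION_HOURS);
console.log(oplogDbName("customerA"), keys, options);
```

In a shell session the pair would be passed as `db.oplog.createIndex(keys, options)` (`ensureIndex` in older shells), where `oplog` is an assumed collection name.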
Oplog Store on MongoDB 3.0
• MongoDB 3.0
o More granular locking
o Freelist management improvements
[Diagram: upgrade to MongoDB 3.0, 30% faster]
Snapshot Storage (Blockstore)
● Backup Snapshot Storage
o File storage in MongoDB
● Design
o Block: 64KB - 15MB of binary data
  ▪ SHA256 hash as a unique identifier
o File: List of blocks
o Insert only schema
o De-duplication and compression
[Diagram: MMS Backup Process, Snapshot Store]
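The design above is a content-addressed store: a block's hash is its key, so identical blocks are stored once. The sketch below illustrates that idea under stated assumptions. It uses an in-memory `Map` as a stand-in for the blocks collection, all function names are hypothetical, compression is left out, and the 8-byte block size is only for the demo (the slides give 64KB - 15MB).

```javascript
const crypto = require("crypto");

// Stand-in for the blocks collection: SHA-256 hex digest -> bytes.
const blocks = new Map();

function sha256Hex(buf) {
  return crypto.createHash("sha256").update(buf).digest("hex");
}

// Insert-only: a block whose hash is already present is never rewritten,
// which is where the de-duplication comes from.
function putBlock(buf) {
  const id = sha256Hex(buf);
  if (!blocks.has(id)) blocks.set(id, buf);
  return id;
}

// A file is the ordered list of the ids of its blocks.
function putFile(data, blockSize) {
  const ids = [];
  for (let off = 0; off < data.length; off += blockSize) {
    ids.push(putBlock(data.subarray(off, off + blockSize)));
  }
  return { blocks: ids };
}

// Reassemble a file by looking up each block id in order.
function getFile(file) {
  return Buffer.concat(file.blocks.map((id) => blocks.get(id)));
}

// Three 8-byte blocks, the first and last identical.
const data = Buffer.concat([
  Buffer.alloc(8, "a"),
  Buffer.alloc(8, "b"),
  Buffer.alloc(8, "a"),
]);
const file = putFile(data, 8);
console.log(file.blocks.length, blocks.size); // prints: 3 2
```

The file references three blocks but only two are stored, because the repeated block hashes to an id that already exists.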
Blockstore
● Insert Only + Power Of 2 Allocation = Wasted Space
o Example: 9k document will use 16k
o Worst case: Need 2x disk space
● Writes are sporadic
o Indexes are cold and need to be paged in
o Can be slow and I/O-expensive
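The wasted-space arithmetic can be checked with a short calculation. This is a simplified model, assuming each record is rounded up to the next power of two; it ignores record headers and any special handling of very large records, and the function names are hypothetical.

```javascript
const KB = 1024;

// Smallest power of two that is >= bytes.
function powerOf2Size(bytes) {
  let size = 1;
  while (size < bytes) size *= 2;
  return size;
}

// Fraction of the allocated record that holds no data.
function wastedFraction(bytes) {
  const allocated = powerOf2Size(bytes);
  return (allocated - bytes) / allocated;
}

console.log(powerOf2Size(9 * KB) / KB);      // prints: 16
console.log(powerOf2Size(16 * KB + 1) / KB); // prints: 32
console.log(wastedFraction(9 * KB));         // prints: 0.4375
```

A document one byte past a power-of-two boundary is the worst case: it is allocated almost twice its size. In an insert-only collection that padding is never reclaimed by later growth, which is why it counts as waste here.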
Blockstore
● Disable Power Of 2 Allocation
o MongoDB 2.2 - 2.6:
db.runCommand({collMod: "collection", usePowerOf2Sizes: false})
o MongoDB 3.0:
db.runCommand({collMod: "collection", noPadding: true})
● Warm indexes before bulk insertions
db.runCommand({touch: "collection", index: true, data: false})
Scaling - Replica Sets
• Started with a single replica set
• Split into purpose-based replica sets
o Blockstore (Large HDDs): Primary, Secondary, Secondary
o Backup Metadata (Small SSDs): Primary, Secondary, Secondary, Secondary, Arbiter
o Oplog Store (Small HDDs): Primary, Secondary, Secondary, Secondary, Arbiter
Scaling - Application Sharding
• Application sharding for horizontal scaling
• Each customer is assigned to one replica set
[Diagram: Application routes Customer A, Customer B, and Customer C to Blockstore_0, Blockstore_1, and Blockstore_2, each a replica set (Primary, Secondary, Secondary)]
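Application sharding means the application itself keeps the customer-to-replica-set mapping and routes every request with it. The slides do not say how a customer is matched to a replica set, so the least-loaded rule and the in-memory map below are assumptions; in practice the mapping would presumably live in persisted metadata.

```javascript
// Replica set names as they appear in the diagram.
const blockstores = ["Blockstore_0", "Blockstore_1", "Blockstore_2"];

// customer -> replica set. Sticky: once assigned, never changes.
const assignments = new Map();

// Assumed policy: a new customer goes to the replica set that
// currently has the fewest customers (ties go to the first listed).
function replicaSetFor(customer) {
  if (!assignments.has(customer)) {
    const counts = new Map(blockstores.map((name) => [name, 0]));
    for (const rs of assignments.values()) {
      counts.set(rs, counts.get(rs) + 1);
    }
    let best = blockstores[0];
    for (const name of blockstores) {
      if (counts.get(name) < counts.get(best)) best = name;
    }
    assignments.set(customer, best);
  }
  return assignments.get(customer);
}

for (const c of ["Customer A", "Customer B", "Customer C", "Customer D"]) {
  console.log(c, "->", replicaSetFor(c));
}
```

The cost of this approach is that rebalancing is the application's job: moving a customer means copying its data and updating the map, work that MongoDB sharding does automatically.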
Scaling - MongoDB Sharding
• MongoDB Sharding
[Diagram: Application connects through Mongos routers to Blockstore_shard_0, Blockstore_shard_1, and Blockstore_shard_2, each a replica set (Primary, Secondary, Secondary)]
MMS Monitoring Schema Evolution
Introduction to MMS Monitoring
Design Objectives circa 2012
• Fast chart load times
• Chart ~80 metrics per host
• Minute-level resolution
Inherent Advantages
• Control our own rate of samples
[Diagram: Browser Users, mms.mongodb.com, Monitoring Agent, Customer Deployment, Metric Data, Sharded Cluster]
Circa 2012: Read-Optimized Schema
{
  hid: "id",                // Host ID
  cid: ObjectId("..."),     // Group ID
  g: "network",             // Metric group
  i: "bytesOut",            // Specific metric
  mn: {                     // an hour's worth of points, stored together
    "00": {
      n: NumberLong("..."), // value
      t: 1430918626         // time
    },
    "01": {
      ...
    },
    ...,
    "59": { ... }
  }
}
● Store points for same metric together
Circa 2012: Scaling up Writes
● Write Performance when Read-Optimized
○ Updates $set the time and value sub-doc
○ Documents grow, move on disk
○ I/O mostly random
● Mitigate
○ Ensure updates always in-place (MMAPv1-only)
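One common way to keep such updates in place is to create each hour document with all 60 minute slots already filled with placeholders, so a later `$set` overwrites a slot instead of growing the document. The slides do not confirm that MMS did this; the sketch below assumes it, and the function names are hypothetical. Field names follow the schema shown earlier.

```javascript
// Minute slots are keyed "00".."59", as in the schema.
function minuteKey(minute) {
  return String(minute).padStart(2, "0");
}

// Hour document with every slot pre-filled. In a real deployment the
// placeholders need the same BSON types as real values (the schema
// uses NumberLong for n) so that overwriting them does not change size.
function preallocatedHourDoc(hid, cid, g, i) {
  const mn = {};
  for (let m = 0; m < 60; m++) {
    mn[minuteKey(m)] = { n: 0, t: 0 };
  }
  return { hid, cid, g, i, mn };
}

// Update operator that overwrites exactly one minute slot.
function pointUpdate(epochSeconds, value) {
  const minute = Math.floor(epochSeconds / 60) % 60;
  return {
    $set: { [`mn.${minuteKey(minute)}`]: { n: value, t: epochSeconds } },
  };
}

const doc = preallocatedHourDoc("id", "group", "network", "bytesOut");
console.log(Object.keys(doc.mn).length);                 // prints: 60
console.log(Object.keys(pointUpdate(1430918626, 42).$set)); // prints: [ 'mn.23' ]
```

Even with in-place updates, each metric still gets its own write per sample, so the random I/O noted above remains.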
Circa 2012 to Today: Performance
Hooray
● Average chart load time: 15ms
● Today MMS is actively monitoring 60k+ hosts
● Storing an average of 128 metrics per host
2015: What’s Next
Upcoming MMS Monitoring features
● High-resolution monitoring
● Charting more metrics
Whoops
● Read-optimized schema is inflexible
● Each new metric means a new write
Coming Soon: Write-Optimized
{
  "_id" : "...",
  "n" : {                      // network
    "bi" : NumberLong(123),    // bytesIn
    "bo" : NumberLong(234),    // bytesOut
    "r" : NumberLong(34)       // requests
  },
  "e" : {                      // page faults
    "pf" : NumberLong(3564)
  },
  "g" : {                      // queues
    "cr" : NumberLong(3564)
  },
  ...,
  // sample time for all these points
  "t" : ISODate("2015-06-02T15:35:43.189Z")
}
● Store points across metrics together
● Insert-only versus random updates
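The write-side difference between the two schemas comes down to how many write operations one sample costs. The sketch below builds a write-optimized document from the field names shown above and compares write counts; the helper names are hypothetical, and `_id` is omitted because the slide elides it.

```javascript
// One document per host per sample: every metric plus a shared time.
function sampleDoc(metrics, sampleTime) {
  return { ...metrics, t: sampleTime };
}

// Read-optimized: one update per metric per sample.
// Write-optimized: one insert per sample, however many metrics it holds.
function writesPerSample(schema, metricCount) {
  return schema === "read-optimized" ? metricCount : 1;
}

const doc = sampleDoc(
  { n: { bi: 123, bo: 234, r: 34 }, e: { pf: 3564 }, g: { cr: 3564 } },
  new Date("2015-06-02T15:35:43.189Z")
);
console.log(Object.keys(doc)); // prints: [ 'n', 'e', 'g', 't' ]

// With the deck's average of 128 metrics per host:
console.log(writesPerSample("read-optimized", 128));  // prints: 128
console.log(writesPerSample("write-optimized", 128)); // prints: 1
```

Adding a metric to the write-optimized document adds a field rather than a write, which is what makes charting more metrics affordable.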
Benchmarking and Tradeoffs
[Chart: Writes, time (millis) to ingest 4500 hosts] 200x more write throughput
[Chart: Read latency, millis to read a 24-hour chart] ~18ms latency tradeoff
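A rough count of documents read per chart suggests where the read-latency cost comes from. This is back-of-the-envelope arithmetic inferred from the two schemas, not a figure from the benchmark: it assumes one sample per minute, one read-optimized document per metric per hour, and one write-optimized document per sample.

```javascript
const MINUTES_PER_HOUR = 60;

// Documents that must be read to draw one metric over `hours` hours.
function docsForChart(schema, hours, samplesPerMinute = 1) {
  if (schema === "read-optimized") {
    return hours; // one document holds the whole hour for that metric
  }
  return hours * MINUTES_PER_HOUR * samplesPerMinute; // one per sample
}

console.log(docsForChart("read-optimized", 24));  // prints: 24
console.log(docsForChart("write-optimized", 24)); // prints: 1440
```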
Wrap Up and Q & A
● Tailoring configuration for workload
● Schema design and managing tradeoffs
● IOPS often the limiting resource

Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
RTTS
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 

Recently uploaded (20)

Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 

Evolution and Scaling of MongoDB Management Service Running on MongoDB

  • 1. Evolution and Scaling of MongoDB Management Service Running on MongoDB Steve Briskin, Lead Engineer, MMS Backup John Morales, Senior Engineer, MMS Monitoring
  • 2. 2 Agenda ● What is MongoDB Management Service ● MMS Backup o Schema evolution and optimizations o How we scaled ● MMS Monitoring o Read-optimized time series schema o Write-optimized time series schema o Benchmarks
  • 3. 3 MongoDB Management Service (MMS) mms.mongodb.com Automation and Provisioning Single-click provisioning, scaling & upgrades, administrative tasks Monitoring Charts, dashboards, and alerts on 100+ metrics Backup Backup and restore, with point-in-time recovery, and support for sharded clusters
  • 4. 4 MMS Backup ● Cloud Backup service ● Takes periodic snapshots ● Manages storage ● Premium features o Point in time recovery o Consistent snapshots of sharded clusters
  • 5. 5 MMS Backup Architecture Backup Agent MMS Backup Ingestion MMS Backup Backup Process Customer’s MongoDB Oplog Store Snapshot Store
  • 6. 6 Oplog Store ● “Circular Buffer” of operations o Many concurrent inserts o Time-bound (e.g. 24 hours) o Lifecycle: insert, read once, delete ● Concerns o Lock contention o Data purging o Freelist fragmentation MMS Backup Ingestion Oplog Store
  • 7. 7 Oplog Store ● Developed with MongoDB 2.2 ● Lock Contention o DB per customer ● Data Purging o Use TTL Index ● Freelist fragmentation o Use Power Of 2 Allocation MMS Backup Ingestion Oplog Store
  • 8. 8 Oplog Store on MongoDB 3.0 • MongoDB 3.0 o More granular locking o Freelist management improvements Upgrade Upgrade 30% Faster
  • 9. 9 Snapshot Storage (Blockstore) ● Backup Snapshot Storage o File storage in MongoDB ● Design o Block: 64KB - 15MB of binary data  SHA256 hash as a unique identifier o File: List of blocks o Insert only schema o De-duplication and compression MMS Backup Backup Process Snapshot Store
  • 10. 10 Blockstore ● Insert Only + Power Of 2 Allocation = Wasted Space o Example: 9k document will use 16k o Worst case: Need 2x disk space ● Writes are sporadic o Indexes are cold and need to be paged in o Can be slow and I/O-expensive
  • 11. 11 Blockstore
      ● Disable Power Of 2 Allocation
        o MongoDB 2.2 - 2.6: db.runCommand({collMod : "collection", usePowerOf2Sizes : false})
        o MongoDB 3.0: db.runCommand({collMod : "collection", noPadding : true})
      ● Warm indexes before bulk insertions
        db.runCommand({touch : "collection", index : true, data : false})
  • 12. 12 Scaling - Replica Sets • Started with a single replica set • Split into purpose-based replica sets Blockstore (Large HDDs) Primary Secondary Secondary Backup Metadata (Small SSDs) Primary Secondary Secondary Secondary Arbiter Oplog Store (Small HDDs) Primary Secondary Secondary Secondary Arbiter
  • 13. 13 Scaling - Application Sharding • Application sharding for horizontal scaling • Each customer is assigned to one replica set Application Customer A Customer B Customer C Blockstore_1 Primary Secondary Secondary Blockstore_2 Primary Secondary Secondary Blockstore_0 Primary Secondary Secondary
  • 14. 14 Scaling - MongoDB Sharding • MongoDB Sharding Application Blockstore_shard_1 Primary Secondary Secondary Blockstore_shard_2 Primary Secondary Secondary Blockstore_shard_0 Primary Secondary Secondary Mongos Mongos
  • 16. 16 Introduction to MMS Monitoring Design Objectives circa 2012 • Fast chart load times • Chart ~80 metrics per host • Minute-level resolution Inherent Advantages • Control our own rate of samples Browser Users Monitoring Agent Metric Data Sharded Cluster Customer Deployment mms.mongodb.com
  • 17. 17 Circa 2012: Read-Optimized Schema
      {
        hid: "id",                 // Host ID
        cid: ObjectId("..."),      // Group ID
        g: "network",              // Metric group
        i: "bytesOut",             // Specific metric
        mn: {                      // hour worth of points stored together
          "00": {
            n: NumberLong("..."),  // value
            t: 1430918626          // time
          },
          "01": { ... },
          ...,
          "59": { ... }
        }
      }
      ● Store points for same metric together
  • 18. 18 Circa 2012: Scaling up Writes ● Write Performance when Read-Optimized ○ Updates $set the time and value sub-doc ○ Documents grow, move on disk ○ I/O mostly random ● Mitigate ○ Ensure updates always in-place (MMAPv1-only)
  • 19. 19 Circa 2012 to Today: Performance Hooray ● Average chart load time: 15ms ● Today MMS actively monitoring 60k+ hosts ● Storing average of 128 metrics per host
  • 20. 20 2015: What’s Next Upcoming MMS Monitoring features ● High resolution monitoring ● Charting more metrics Whoops ● Read-optimized schema inflexible ● Each new metric means new write
  • 21. Coming Soon: Write-Optimized
      {
        "_id" : "...",
        "n" : {                      // network
          "bi" : NumberLong(123),    // bytesIn
          "bo" : NumberLong(234),    // bytesOut
          "r" : NumberLong(34),      // requests
        },
        "e" : {                      // page faults
          "pf" : NumberLong(3564),
        },
        "g" : {                      // queues
          "cr" : NumberLong(3564),
        },
        ...,
        // sample time for all these points
        "t" : ISODate("2015-06-02T15:35:43.189Z")
      }
      ● Store points across metrics together
      ● Insert-only versus random updates
  • 22. 22 Benchmarking and Tradeoffs Writes: time (millis) to ingest 4500 hosts Read Latency: millis to read 24-hour chart 200x more write throughput ~18ms latency tradeoff
  • 23. Wrap Up and Q & A ● Tailoring configuration for workload ● Schema design and managing tradeoffs ● IOPS often the limiting resource
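The wasted-space figures on the Blockstore slides (a 9k document using 16k, worst case 2x disk) follow from rounding every allocation up to the next power of 2. Below is a minimal sketch of that rounding in plain JavaScript. It is a simplified model, not MongoDB's allocator: the function names and the 32-byte minimum are assumptions made for illustration, and record headers and any special handling of very large documents are ignored.

```javascript
// Simplified model of power-of-2 record allocation.
// Assumption: ignores record headers and large-document special cases.
function powerOf2Size(docBytes) {
  let size = 32; // hypothetical minimum allocation for this sketch
  while (size < docBytes) {
    size *= 2; // keep doubling until the document fits
  }
  return size;
}

// Fraction of the allocated record that holds no document data.
function wastedFraction(docBytes) {
  const allocated = powerOf2Size(docBytes);
  return (allocated - docBytes) / allocated;
}

const KB = 1024;
console.log(powerOf2Size(9 * KB) / KB);   // 16
console.log(powerOf2Size(3.5 * KB) / KB); // 4
console.log(wastedFraction(9 * KB));      // 0.4375
```

For an insert-only collection such as the blockstore, that padding is never reused by later growth, so a document just over a power-of-2 boundary wastes close to half of its allocation.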

Editor's Notes

  1. We’re talking architecture and evolution. We are not MongoDB kernel engineers. We are internal customers and use MongoDB just like you.
  2. Automation: simplifies deployment of new clusters. Automatic provisioning of hosts in AWS and deployment and configuration of MongoDB. Orchestrates upgrades and other admin tasks. Monitoring: Captures and alerts on critical metrics. You can’t optimize what you can’t measure. Pre-empts problems. Backup: Once you have everything running, you need to back it up. Backup takes periodic snapshots of your data, offers some value added features like PIT recovery and consistent cluster snapshots, manages storage, and monitors for health. Mention Ops Manager!
  3. Cloud backup service. Capture write operations and rebuild that dataset on our servers. Take periodic snapshots. Result: fully managed backups with low overhead for the customer -- low system impact, low development and operational overhead.
  4. Many customers = many concurrent writes = lock contention. Best way to purge old data. Freelist fragmentation that leads to wasted disk space; a common issue with insert/delete patterns.
  5. Designed and developed ~3 years ago. Power of 2 allocation allocates and frees space in more predictable chunks. It was introduced in MongoDB 2.2 and made the default in 2.6. All allocations are rounded up to the next power of 2. For example, 3.5KB -> 4KB. Overall we are very happy with this decision. Over the years we upgraded to 2.4, 2.6, and recently to 3.0.
  6. March 3rd. Locking: ~30% decrease in insert latency. Freelist: reclaimed ~1TB of disk per month. Since oplogs are a circular buffer and customer activity is generally steady with a slight increase over time, we should expect storage to be mostly flat with a slight increase.
  7. We are using MongoDB to store MongoDB data files. Think of it as a filesystem backed by MongoDB. Insert only pattern -- no deletes, mark and sweep for deleting old data. Touch on other optimizations like de-duplication, compression, high availability, and scalability.
  8. In most cases power of 2 is better. It is a MUST for update/delete patterns. But this case is different.
  9. Wrap up: Different use cases require different tuning. MongoDB is flexible and provides all the power we need.
  10. Different: usage patterns, availability requirements, hardware requirements.
  11. Pros and cons. Operationally easier initially -- isolation, fewer components, fewer network hops. Small customers -- no one exceeded a single replica set. Assumption that usage would average out between customers sharing a blockstore. Load didn’t average out. Blockstores became either disk space bound or IO bound. Not hard to balance on either, very hard to balance on both.
  12. We shard every collection. Our use case is perfectly suited for this since most data already has a hashed _id, so there is no additional hashed index requirement. Load should be equally distributed. And it is! Operationally easier since it looks like one large end-point. We use automation to manage it. This is the topology that we’re currently migrating to and we’re very happy with the results thus far. That concludes the backup portion of the talk. Now I’ll hand it over to John who’ll talk about schema evolution of our monitoring time series data.
  13. Thanks, Steve. I’m going to switch gears: a tale of two schemas and their tradeoffs. Not a prescription favoring one schema design over another, but rather a history of how the design evolved to meet changing application requirements.
  14. To begin, I’ll take you back to 2012 when MMS Monitoring was first getting off the ground. Let’s build a SaaS monitoring product for MongoDB deployments. Started with the broad goals of wanting both a.) monitoring data as granular as it can be, but b.) without sacrificing responsiveness. Arrived at these 3 objectives. From there, we had to decide how we were going to get metric data in, and eventually landed on the high-level architecture shown in the diagram on the left, which has few moving pieces. One advantage of using an agent is that we control the sample rate and can make it a fixed interval. How we’re going to store this metric data to meet our objectives, and where we ended up…
  15. ...is a Read-Optimized schema, where the main idea is... Here we have this cutout of the MMS user interface showing just 4 different charts about a mongod host, and on the right how one series is laid out in the schema. But how many points? How do we ensure we have an upper bound on the number of points? Well, remember I mentioned the monitoring agent... What’s my network out over the past hour? Read only 2 documents. For every metric you see, there will be similar documents for the different metric types. (THERE’S MUCH TO TALK ABOUT ON THIS SLIDE) Load all 13 data series on this slide, 780 data points, but only read 26 documents. Great, so it looks like we can expect reads to be fast (after all, it says read-optimized right on the tin), but what about getting the data in...
  16. Write tradeoff -- existing documents means updates that create subdocuments, which means random I/O. And so now, three years later, how’d we do...
  17. 98%ile ~350ms most recently. Looking ahead..
  18. One of the things that guides evolution of a product is the feedback of users -- you guys! -- and we’re listening to what we need to do. Minute level really ought to be enough for anyone, so we locked the lowest-level resolution our SCHEMA can express at 1 minute. Each new metric means new documents per host. Like I mentioned, 60,000 is the current number of hosts, so one new metric means 60,000 more writes per minute. In general, if we want to double the number of hosts and double the number of metrics, that is 4x the number of documents. Can we do better? Quadratic with the increase of hosts and metrics.
  19. “As you can see, we put every metric into the one document, key abbreviation, etc.” Insert-only workload means a more sequential access pattern for spinning disks. Scalable design: more samples = more documents, but more metrics != more documents. Per host, 1 large insert vs many small random updates.
  20. Time to completely ingest metrics from 4500 hosts was ~10 seconds; with the new schema design it is now 50 millis. Write IOPS: 35x fewer. Only ~18ms cost to reading more documents.
  21. Now we’ve spoken about different ways to ingest data for write-heavy applications. How we tailor MongoDB configuration for our workload: we saw one case, the oplog store, where the delete pattern means usePowerOf2Sizes is essential, and another, the blockstore, where it’s undesirable. Tradeoffs between read-optimized and write-optimized schemas. Always tradeoffs: working to find balance between the two; neither approach is strictly superior. Write-heavy applications measure success in IOPS. Drawing down from a budget, so spend judiciously. Optimize your access pattern for your disks. Recruit: the more people use it, the bigger MMS needs to be. We scale with you guys and are fortunate that MongoDB has the flexibility to meet these different access patterns and use cases. MMS is a big focus of investment; if you think the contents of this talk were interesting, you could be making a career out of it. Feel free to ask me about life at MongoDB or find one of our recruiters at the MongoDB booth.
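
The write-volume and chart-read figures quoted in notes 15 and 18 come down to simple arithmetic over hosts, metrics, and samples. The sketch below restates that arithmetic in plain JavaScript. The function names are hypothetical and chosen for illustration; the model assumes one sample per host per minute, and that a trailing one-hour chart window straddles two hourly documents per series, as the notes describe.

```javascript
// Read-optimized schema: one document per host, per metric, per hour,
// and each of those documents receives one update every minute.
function readOptimizedWritesPerMinute(hosts, metricsPerHost) {
  return hosts * metricsPerHost;
}

// Write-optimized schema: one inserted document per host per sample,
// regardless of how many metrics the sample carries.
function writeOptimizedWritesPerMinute(hosts) {
  return hosts;
}

// Cost of charting the past hour with the read-optimized schema.
// Assumption: the window spans two hourly documents per series.
function readOptimizedChartCost(seriesCount) {
  const docsPerSeries = 2;
  const pointsPerHour = 60;
  return {
    documents: seriesCount * docsPerSeries,
    points: seriesCount * pointsPerHour,
  };
}

const hosts = 60000;
const metrics = 128;

// Adding one metric adds one write per host per minute.
const extra =
  readOptimizedWritesPerMinute(hosts, metrics + 1) -
  readOptimizedWritesPerMinute(hosts, metrics);
console.log(extra); // 60000

// Doubling both hosts and metrics quadruples the write count.
const growth =
  readOptimizedWritesPerMinute(2 * hosts, 2 * metrics) /
  readOptimizedWritesPerMinute(hosts, metrics);
console.log(growth); // 4

console.log(writeOptimizedWritesPerMinute(hosts)); // 60000
console.log(readOptimizedChartCost(13)); // { documents: 26, points: 780 }
```

Under these assumptions the write-optimized count depends only on the number of hosts, which is why adding metrics does not add documents in that design.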