SlideShare a Scribd company logo
FEBRUARY 15, 2018 | BELL HARBOR
#MDBlocal
Sizing MongoDB Clusters
#MDBlocal
Master Solutions
Architect
Jay Runkel
MongoDB @jayrunkel
#MDBlocal
MongoDB: 4 years
NoSQL: 9 years
Solution Architect: engineer assigned to sales
Significant experience sizing MongoDB clusters
About me
#MDBlocal
Objective
#MDBlocal
Do I need to shard?
What size servers should I use?
What will my monthly Atlas/AWS/Azure/Google costs be?
When will I need to add a new shard or upgrade my servers?
How much data can my servers support?
How many queries can my servers support?
Will we be able to meet our query latency requirements?
Sizing
#MDBlocal
• Large coffee chain: PlanetDollar
• Collect mobile app performance
• Every tap, click, gesture will generate an event
• 2 Year History
• Perform analytics
• Historical
• Near real-time (executive dashboards)
• Support usage
• 3000 – 5000 events per second
Your boss comes to you…
I need a budget for the monthly Atlas costs?
#MDBlocal
• Build a prototype
• Run performance tests using actual data and queries on hardware with
specs similar to production servers
The only accurate way to size a cluster
• EVERY OTHER APPROACH IS A GUESS
• Including the one I am presenting today
#MDBlocal
• Early in project, but
• Need to order hardware
• Estimate costs to determine “Go/No Go” decision
• Schema design
• Compare the hardware requirements for different schemas
Sometimes, it is necessary to guess ☹
#MDBlocal
MongoDB Clusters Look Like This
Config
Config
Config
Application
Driver
Primary
Secondary
Secondary
#MDBlocal
• # of shards
• Specifications of each server
• CPU
• Storage
• Size
• Performance: IOPS
• Memory
• Network
Our Solution Will Consist Of
#MDBlocal
• Sizing Objective ✓
• IOPS, Query Processing, Working Set
• Sizing Methodology with an Example
Agenda
#MDBlocal
IOPS, Query Processing,
Working Set
#MDBlocal
• # of shards
• Specifications of each server
• CPU
• Storage
• Size
• Performance: IOPS
• Memory
• Network
Our Solution Will Consist Of
#MDBlocal
• IOPS – input output units per second
• Throughput
• Random access
• Most workloads “randomly” access documents
IOPS
collection
#MDBlocal
Storage Performance
Type IOPS
7200 rpm SATA ~ 75 – 100
15000 rpm SAS ~ 175 – 210
RAID-10 (24 x 7200 RPM SAS) 2000
Amazon EBS 250 – 500
Amazon EBS Provisioned IOPS 10000 - 20000
SSD 50000
Flash Storage 100K – 400K (or more)
http://en.wikipedia.org/wiki/IOPS
#MDBlocal
• How many IOPS do we need?
• Want the real answer, run a test
• How to estimate?
Hardest Part of Sizing is IOPS
#MDBlocal
Processing a Query
Select Index
Load relevant index
entries from disk
Identify documents
using index
Retrieve documents
from disk
Filter documents
Return Documents
#MDBlocal
Processing a Query
IO
Select Index
Load relevant index
entries from disk
Identify documents
using index
Retrieve documents
from disk
Filter documents
Return Documents
#MDBlocal
But MongoDB Has a Cache
File System
indexes collections
CPU
Memory
indexes
documents
Disk access is only necessary if indexes
or documents are not in cache
Select Index
Load relevant index
entries from disk
Identify documents
using index
Retrieve documents
from disk
Filter documents
Return Documents
#MDBlocal
Working Set
Working Set = indexes plus frequently accessed
documents
If RAM greater than working set then reduced IO
Select Index
Load relevant index
entries from disk
Identify documents
using index
Retrieve documents
from disk
Filter documents
Return Documents
File System
indexes collections
CPU
Memory
indexes
documents
#MDBlocal
This is all great, but how do we estimate IOPS?
#MDBlocal
Assume
• Working Set < RAM < Data Size
• Memory contains indexes only
MongoDB Simplified Model
File System
indexes collections
CPU
Memory
indexes
documents
#MDBlocal
Assume appropriate indexes
To resolve find:
• Navigate in-memory indexes
• Retrieve document from disk
1 IOP per document returned
Find Queries With Simplified Model
File System
indexes collections
CPU
Memory
indexes
documents
#MDBlocal
Assume appropriate indexes
To resolve find:
• Navigate in-memory indexes
• Retrieve document from disk
1 IOP per document returned
Find Queries With Simplified Model
File System
indexes collections
CPU
Memory
indexes
documents
#MDBlocal
To resolve insert:
• Write document to disk
• Update each index file
IOPS = 1 + # of indexes
Inserts With Simplified Model
File System
indexes collections
CPU
Memory
indexes
documents
#MDBlocal
To resolve delete:
• Navigate in-memory indexes
• Mark document deleted
• Update each index file
IOPS = 1 + # of indexes
Deletes With Simplified Model
File System
indexes collections
CPU
Memory
indexes
documents
#MDBlocal
To resolve delete:
• Navigate in-memory indexes
• Mark document deleted
• Insert new document version
• Update each index file
IOPS = 2 + # of indexes
Updates With Simplified Model
File System
indexes collections
CPU
Memory
indexes
documents
#MDBlocal
• Working Set
• Checkpoints
• Document size relative to block size
• Indexed Arrays
• Journal, Log
The Simplified Model is too simplistic
#MDBlocal
• WiredTiger write process:
1. Update document in RAM (cache)
2. Write to journal (disk)
3. Periodically, write dirty documents to disk (checkpoint)
• 60 seconds or 2 GB (whichever comes first)
Checkpoints
Checkpoint 1 Checkpoint 2 Checkpoint 3
B C A A C A
3 writes
3 documents written
3 writes
2 documents written
#MDBlocal
• Estimate total requirements (using simplified model):
• RAM
• CPU
• Disk Space
• IOPS
• Adjust based upon working set, checkpoints, etc.
• Design (sharded) cluster that provides these totals
How are we going to get there?
#MDBlocal
Sizing Process
#MDBlocal
Methodology
Application
Requirements
Cluster Sizing
• Number of
shards
• Server specs
Magic
Happens
#MDBlocal
1.Collection Size
2.Working Set
3.Queries -> IOPS
4.Adjust based upon working set, checkpoints, etc.
5.Calculate # of shards
6.Review, iterate, repeat
Methodology (cont.)
Build a spread
sheet
Multiple
iterations may
be required
#MDBlocal
1.Assumptions
1.Data Size
1.Working Set
• Index Size
• Frequently Accessed Documents
1.Queries – IOPS
1.Shard Calculations
Sizing Spreadsheet
#MDBlocal
1.Assumptions
1.Data Size
1.Working Set
• Index Size
• Frequently Accessed Documents
2. Queries – IOPS
1. Shard Calculations
Sizing Spreadsheet
#MDBlocal
https://github.com/jayrunkel/mdbw2017Sizing
Sizing Spreadsheet Download
#MDBlocal
Assumptions
#MDBlocal
Sizing Spreadsheet
1.Assumptions
1.Data Size
1.Working Set
• Index Size
• Frequently Accessed Documents
2. Queries – IOPS
1. Shard Calculations
#MDBlocal
• # of documents
• Data size
• Index size
• WT compression
Collection Analysis
#MDBlocal
Calculate The Number of Documents
Application Description # of Documents in Collection
There will be 20M documents in the
collection by the end of 2017
20,000,000
We expect to insert 10K documents
per day with 1 year retention period
365*10,000 = 3,655,000
We have 3000 devices each
producing 1 event per minute and we
need to keep a 90 day history
3000 * 60 * 24 * 90 = 388,800,000
#MDBlocal
Calculate The Number of Documents
Application Description # of Documents in Collection
There will be 20M documents in the collection
by the end of 2017
20,000,000
We expect to insert 10K documents per day
with 1 year retention period
365*10,000 = 3,655,000
We have 3000 devices each producing 1 event
per minute and we need to keep a 90 day
history
3000 * 60 * 24 * 90 = 388,800,000
PlanetDollar:
2 year history. Each day 5000 inserts per
second for 5 hours and 3000 inserts per second
for 19 hours
2*365*(5000*5*3600 +
3000*19*3600) = 215,496,000,000
#MDBlocal
• Data Size = # of documents * Average document size
• This information is available in db.stats(), Compass, Ops Manager, Cloud
Manager, Atlas, etc.
Calculate the Data Size
#MDBlocal
• Write some code
• Programmatically generate a large data set
• 5-10% of expected size
• Measure
• Collection size
• Index size
• Compression
What if there aren’t any documents?
#MDBlocal
• Use db.collection.stats()
• Take data size, index size and extrapolate to production size
• Calculate compression ratio
db.collection.stats()
{
count: 10000
size: 70,388,956
avgObjSize: 7038
storageSize: 25341952
…
totalIndexSize: 147456
}
Determine collection and data size
Parameter Formula Value
# of documents 2.5B
avgObjSize 7038
Collection Size =2.5B * 7038 1.760E13 Bytes
WT Compression = 25341952/70388956 .36
Collection Storage =2.5B * 7038 * .36 6.33E12 Bytes
Index Size Per Doc = 147456 / 10000 15 Bytes
Collection Index Size =2.5B * 15 /1024^3 35 GB
#MDBlocal
PlanetDollar - Collection Size
#MDBlocal
Sizing Spreadsheet
1.Assumptions
1.Data Size
1.Working Set
• Index Size
• Frequently Accessed Documents
1.Queries – IOPS
1.Shard Calculations
#MDBlocal
WorkSet = Indexes plus the set of documents accessed frequently
We know the index size from previous analysis
Estimate the frequently
accessed documents
Given the queries
What are the frequently
accessed docs?
Working Set
File System
collections indexes
CPU
Memory
indexes
documents
#MDBlocal
Query Analysis
Dashboards – last minute of data
Customer support - last hour of data
Reports (run once per day) inspect last years worth of data
Active Documents = 1 hours worth of data
PlanetDollar Working Set
#MDBlocal
PlanetDollar - Working Set
#MDBlocal
1.Assumptions
1.Data Size
1.Working Set
• Index Size
• Frequently Accessed Documents
1.Queries – IOPS
1.Shard Calculations
Sizing Spreadsheet
#MDBlocal
+ # of documents returned per second
+ # of documents updated per second
+ # of indexes impacted by each update
+ # of inserts per second
+ # of indexes impacted by each insert
+ # of deletes per second (x2)
+ # of indexes impacted by each delete
- Multiple updates occurring within checkpoint
- % of find query results in cache
Total IOPS
IOPS Calculation
#MDBlocal
• 5000 inserts per second
• 5000 deletes per second
Dashboards (aggregations: 100 per minute)
• Total events per minute across all users (current minute)
• Total events per minute per region (current minute)
• Total events per store per minute (current minute)
Debugging Tool (ad hoc – 5 per second)
• Find all events for a user in last 60 minutes (100 events returned, on average)
PlanetDollar Queries
#MDBlocal
• Each insert:
• Update collection
• Update each index (3 indexes)
• Each Delete:
• Update collection
• Update each index (3 indexes)
• 5000 inserts/sec
• 5000 deletes/sec
IOPS for inserts and deletes
4 IOPS
5 IOPS
(4 * 5000) + (5 * 5000) = 45000 IOPS
#MDBlocal
• Example: Total events per minute across all users (current minute)
• How many documents will be read from disk?
IOPS for PlanetDollar Aggregations
05000 per second * 60 seconds = 300,000
Most data in cache
Some IOPS will likely be require
#MDBlocal
• Find all events for a user in last 60 minutes
• 5 per second
• 100 documents per query
• # IOPS = 5 * 100 = 500 IOPS
IOPS For Find
#MDBlocal
PlanetDollar - Queries/IOPS
#MDBlocal
• CPU utilized for:
• Compress/decompress
• Encrypt/Decrypt
• Aggregation queries
• General query processing
• In most cases, RAM requirements → large servers → many cores
• Possible exception: aggregation queries
• One core per query
• # cores >> # of simultaneous aggregation queries
How Many CPUs Do I Need?
#MDBlocal
1.Assumptions
1.Data Size
1.Working Set
• Index Size
• Frequently Accessed Documents
1.Queries – IOPS
1.Shard Calculations
Sizing Spreadsheet
#MDBlocal
• At this point you have:
1. Required storage capacity
2. Working Set Size
3. IOPS Estimate
4. Some idea about class of server (or VM) the customer plans to deploy
• Determine number of required shards
Shard Calculations
#MDBlocal
• Sum of disk space across shards
greater than required storage size
Disk Space: How Many Shards Do I Need?
Example
Data Size = 9 TB
WiredTiger Compression Ratio: .33
Storage size = 3 TB
Req Disk: 6 TB
Server disk capacity = 2 TB
3 Shards Required
Recommend providing 2X
the compressed data size in
disk
#MDBlocal
RAM: How Many Shards Do I Need?
Example
Working Set = 428 GB
Server RAM = 128 GB
428/128 = 3.34
4 Shards Required
#MDBlocal
IOPS: How Many Shards Do I Need?
Example
Require: 50K IOPS
AWS Instance: 20K IOPS
3 Shards Required
#MDBlocal
PlanetDollar - Shard Calculations
#MDBlocal
1. Calculate:
• Collection size
• Index size
2. Estimate Working Set
3. Use simplified model to estimate IOPS
4. Revise (working set coverage, checkpoints, etc.)
5. Calculate shards
Sizing Summary
#MDBlocal
Questions?
https://github.com/jayrunkel/mdbw2017Sizing

More Related Content

What's hot

MongoDB WiredTiger Internals: Journey To Transactions
MongoDB WiredTiger Internals: Journey To TransactionsMongoDB WiredTiger Internals: Journey To Transactions
MongoDB WiredTiger Internals: Journey To Transactions
Mydbops
 
RocksDB detail
RocksDB detailRocksDB detail
RocksDB detail
MIJIN AN
 
AWS DynamoDB and Schema Design
AWS DynamoDB and Schema DesignAWS DynamoDB and Schema Design
AWS DynamoDB and Schema Design
Amazon Web Services
 
The columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache ArrowThe columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache Arrow
DataWorks Summit
 
Sharding Methods for MongoDB
Sharding Methods for MongoDBSharding Methods for MongoDB
Sharding Methods for MongoDB
MongoDB
 
(DAT401) Amazon DynamoDB Deep Dive
(DAT401) Amazon DynamoDB Deep Dive(DAT401) Amazon DynamoDB Deep Dive
(DAT401) Amazon DynamoDB Deep Dive
Amazon Web Services
 
Performance Troubleshooting Using Apache Spark Metrics
Performance Troubleshooting Using Apache Spark MetricsPerformance Troubleshooting Using Apache Spark Metrics
Performance Troubleshooting Using Apache Spark Metrics
Databricks
 
Creating Continuously Up to Date Materialized Aggregates
Creating Continuously Up to Date Materialized AggregatesCreating Continuously Up to Date Materialized Aggregates
Creating Continuously Up to Date Materialized Aggregates
EDB
 
Naver속도의, 속도에 의한, 속도를 위한 몽고DB (네이버 컨텐츠검색과 몽고DB) [Naver]
Naver속도의, 속도에 의한, 속도를 위한 몽고DB (네이버 컨텐츠검색과 몽고DB) [Naver]Naver속도의, 속도에 의한, 속도를 위한 몽고DB (네이버 컨텐츠검색과 몽고DB) [Naver]
Naver속도의, 속도에 의한, 속도를 위한 몽고DB (네이버 컨텐츠검색과 몽고DB) [Naver]
MongoDB
 
MongoDB 101
MongoDB 101MongoDB 101
MongoDB 101
Abhijeet Vaikar
 
My first 90 days with ClickHouse.pdf
My first 90 days with ClickHouse.pdfMy first 90 days with ClickHouse.pdf
My first 90 days with ClickHouse.pdf
Alkin Tezuysal
 
Mongo db 최범균
Mongo db 최범균Mongo db 최범균
Mongo db 최범균
beom kyun choi
 
High-speed Database Throughput Using Apache Arrow Flight SQL
High-speed Database Throughput Using Apache Arrow Flight SQLHigh-speed Database Throughput Using Apache Arrow Flight SQL
High-speed Database Throughput Using Apache Arrow Flight SQL
ScyllaDB
 
Influxdb and time series data
Influxdb and time series dataInfluxdb and time series data
Influxdb and time series data
Marcin Szepczyński
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guide
Ryan Blue
 
Rocks db state store in structured streaming
Rocks db state store in structured streamingRocks db state store in structured streaming
Rocks db state store in structured streaming
Balaji Mohanam
 
Inside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source DatabaseInside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source Database
Mike Dirolf
 
Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...
Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...
Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...
DataWorks Summit
 
RedisConf17- Using Redis at scale @ Twitter
RedisConf17- Using Redis at scale @ TwitterRedisConf17- Using Redis at scale @ Twitter
RedisConf17- Using Redis at scale @ Twitter
Redis Labs
 
Simplifying Big Data Analytics with Apache Spark
Simplifying Big Data Analytics with Apache SparkSimplifying Big Data Analytics with Apache Spark
Simplifying Big Data Analytics with Apache SparkDatabricks
 

What's hot (20)

MongoDB WiredTiger Internals: Journey To Transactions
MongoDB WiredTiger Internals: Journey To TransactionsMongoDB WiredTiger Internals: Journey To Transactions
MongoDB WiredTiger Internals: Journey To Transactions
 
RocksDB detail
RocksDB detailRocksDB detail
RocksDB detail
 
AWS DynamoDB and Schema Design
AWS DynamoDB and Schema DesignAWS DynamoDB and Schema Design
AWS DynamoDB and Schema Design
 
The columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache ArrowThe columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache Arrow
 
Sharding Methods for MongoDB
Sharding Methods for MongoDBSharding Methods for MongoDB
Sharding Methods for MongoDB
 
(DAT401) Amazon DynamoDB Deep Dive
(DAT401) Amazon DynamoDB Deep Dive(DAT401) Amazon DynamoDB Deep Dive
(DAT401) Amazon DynamoDB Deep Dive
 
Performance Troubleshooting Using Apache Spark Metrics
Performance Troubleshooting Using Apache Spark MetricsPerformance Troubleshooting Using Apache Spark Metrics
Performance Troubleshooting Using Apache Spark Metrics
 
Creating Continuously Up to Date Materialized Aggregates
Creating Continuously Up to Date Materialized AggregatesCreating Continuously Up to Date Materialized Aggregates
Creating Continuously Up to Date Materialized Aggregates
 
Naver속도의, 속도에 의한, 속도를 위한 몽고DB (네이버 컨텐츠검색과 몽고DB) [Naver]
Naver속도의, 속도에 의한, 속도를 위한 몽고DB (네이버 컨텐츠검색과 몽고DB) [Naver]Naver속도의, 속도에 의한, 속도를 위한 몽고DB (네이버 컨텐츠검색과 몽고DB) [Naver]
Naver속도의, 속도에 의한, 속도를 위한 몽고DB (네이버 컨텐츠검색과 몽고DB) [Naver]
 
MongoDB 101
MongoDB 101MongoDB 101
MongoDB 101
 
My first 90 days with ClickHouse.pdf
My first 90 days with ClickHouse.pdfMy first 90 days with ClickHouse.pdf
My first 90 days with ClickHouse.pdf
 
Mongo db 최범균
Mongo db 최범균Mongo db 최범균
Mongo db 최범균
 
High-speed Database Throughput Using Apache Arrow Flight SQL
High-speed Database Throughput Using Apache Arrow Flight SQLHigh-speed Database Throughput Using Apache Arrow Flight SQL
High-speed Database Throughput Using Apache Arrow Flight SQL
 
Influxdb and time series data
Influxdb and time series dataInfluxdb and time series data
Influxdb and time series data
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guide
 
Rocks db state store in structured streaming
Rocks db state store in structured streamingRocks db state store in structured streaming
Rocks db state store in structured streaming
 
Inside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source DatabaseInside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source Database
 
Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...
Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...
Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...
 
RedisConf17- Using Redis at scale @ Twitter
RedisConf17- Using Redis at scale @ TwitterRedisConf17- Using Redis at scale @ Twitter
RedisConf17- Using Redis at scale @ Twitter
 
Simplifying Big Data Analytics with Apache Spark
Simplifying Big Data Analytics with Apache SparkSimplifying Big Data Analytics with Apache Spark
Simplifying Big Data Analytics with Apache Spark
 

Similar to Sizing Your MongoDB Cluster

Webinar: Best Practices for Getting Started with MongoDB
Webinar: Best Practices for Getting Started with MongoDBWebinar: Best Practices for Getting Started with MongoDB
Webinar: Best Practices for Getting Started with MongoDB
MongoDB
 
MongoDB Best Practices
MongoDB Best PracticesMongoDB Best Practices
MongoDB Best Practices
Lewis Lin 🦊
 
Capacity Planning
Capacity PlanningCapacity Planning
Capacity Planning
MongoDB
 
MongoDB .local Munich 2019: A Complete Methodology to Data Modeling for MongoDB
MongoDB .local Munich 2019: A Complete Methodology to Data Modeling for MongoDBMongoDB .local Munich 2019: A Complete Methodology to Data Modeling for MongoDB
MongoDB .local Munich 2019: A Complete Methodology to Data Modeling for MongoDB
MongoDB
 
TechEd AU 2014: Microsoft Azure DocumentDB Deep Dive
TechEd AU 2014: Microsoft Azure DocumentDB Deep DiveTechEd AU 2014: Microsoft Azure DocumentDB Deep Dive
TechEd AU 2014: Microsoft Azure DocumentDB Deep Dive
Intergen
 
MongoDB .local Toronto 2019: Finding the Right Atlas Cluster Size: Does this ...
MongoDB .local Toronto 2019: Finding the Right Atlas Cluster Size: Does this ...MongoDB .local Toronto 2019: Finding the Right Atlas Cluster Size: Does this ...
MongoDB .local Toronto 2019: Finding the Right Atlas Cluster Size: Does this ...
MongoDB
 
Time Series Databases for IoT (On-premises and Azure)
Time Series Databases for IoT (On-premises and Azure)Time Series Databases for IoT (On-premises and Azure)
Time Series Databases for IoT (On-premises and Azure)
Ivo Andreev
 
Building Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftBuilding Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon Redshift
Amazon Web Services
 
Realtime Analytics on AWS
Realtime Analytics on AWSRealtime Analytics on AWS
Realtime Analytics on AWS
Sungmin Kim
 
Dev Jumpstart: Build Your First App with MongoDB
Dev Jumpstart: Build Your First App with MongoDBDev Jumpstart: Build Your First App with MongoDB
Dev Jumpstart: Build Your First App with MongoDB
MongoDB
 
Add Redis to Postgres to Make Your Microservices Go Boom!
Add Redis to Postgres to Make Your Microservices Go Boom!Add Redis to Postgres to Make Your Microservices Go Boom!
Add Redis to Postgres to Make Your Microservices Go Boom!
Dave Nielsen
 
MongoDB Evening Austin, TX 2017
MongoDB Evening Austin, TX 2017MongoDB Evening Austin, TX 2017
MongoDB Evening Austin, TX 2017MongoDB
 
CCI2017 - Considerations for Migrating Databases to Azure - Gianluca Sartori
CCI2017 - Considerations for Migrating Databases to Azure - Gianluca SartoriCCI2017 - Considerations for Migrating Databases to Azure - Gianluca Sartori
CCI2017 - Considerations for Migrating Databases to Azure - Gianluca Sartori
walk2talk srl
 
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
Amazon Web Services
 
MongoDB .local Munich 2019: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local Munich 2019: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local Munich 2019: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local Munich 2019: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB
 
Solving your Backup Needs - Ben Cefalo mdbe18
Solving your Backup Needs - Ben Cefalo mdbe18Solving your Backup Needs - Ben Cefalo mdbe18
Solving your Backup Needs - Ben Cefalo mdbe18
MongoDB
 
Microsoft Azure Big Data Analytics
Microsoft Azure Big Data AnalyticsMicrosoft Azure Big Data Analytics
Microsoft Azure Big Data Analytics
Mark Kromer
 
MongoDB: What, why, when
MongoDB: What, why, whenMongoDB: What, why, when
MongoDB: What, why, when
Eugenio Minardi
 
MongoDB and the Internet of Things
MongoDB and the Internet of ThingsMongoDB and the Internet of Things
MongoDB and the Internet of Things
Sam_Francis
 
AWS re:Invent 2016| GAM301 | How EA Leveraged Amazon Redshift and AWS Partner...
AWS re:Invent 2016| GAM301 | How EA Leveraged Amazon Redshift and AWS Partner...AWS re:Invent 2016| GAM301 | How EA Leveraged Amazon Redshift and AWS Partner...
AWS re:Invent 2016| GAM301 | How EA Leveraged Amazon Redshift and AWS Partner...
Amazon Web Services
 

Similar to Sizing Your MongoDB Cluster (20)

Webinar: Best Practices for Getting Started with MongoDB
Webinar: Best Practices for Getting Started with MongoDBWebinar: Best Practices for Getting Started with MongoDB
Webinar: Best Practices for Getting Started with MongoDB
 
MongoDB Best Practices
MongoDB Best PracticesMongoDB Best Practices
MongoDB Best Practices
 
Capacity Planning
Capacity PlanningCapacity Planning
Capacity Planning
 
MongoDB .local Munich 2019: A Complete Methodology to Data Modeling for MongoDB
MongoDB .local Munich 2019: A Complete Methodology to Data Modeling for MongoDBMongoDB .local Munich 2019: A Complete Methodology to Data Modeling for MongoDB
MongoDB .local Munich 2019: A Complete Methodology to Data Modeling for MongoDB
 
TechEd AU 2014: Microsoft Azure DocumentDB Deep Dive
TechEd AU 2014: Microsoft Azure DocumentDB Deep DiveTechEd AU 2014: Microsoft Azure DocumentDB Deep Dive
TechEd AU 2014: Microsoft Azure DocumentDB Deep Dive
 
MongoDB .local Toronto 2019: Finding the Right Atlas Cluster Size: Does this ...
MongoDB .local Toronto 2019: Finding the Right Atlas Cluster Size: Does this ...MongoDB .local Toronto 2019: Finding the Right Atlas Cluster Size: Does this ...
MongoDB .local Toronto 2019: Finding the Right Atlas Cluster Size: Does this ...
 
Time Series Databases for IoT (On-premises and Azure)
Time Series Databases for IoT (On-premises and Azure)Time Series Databases for IoT (On-premises and Azure)
Time Series Databases for IoT (On-premises and Azure)
 
Building Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftBuilding Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon Redshift
 
Realtime Analytics on AWS
Realtime Analytics on AWSRealtime Analytics on AWS
Realtime Analytics on AWS
 
Dev Jumpstart: Build Your First App with MongoDB
Dev Jumpstart: Build Your First App with MongoDBDev Jumpstart: Build Your First App with MongoDB
Dev Jumpstart: Build Your First App with MongoDB
 
Add Redis to Postgres to Make Your Microservices Go Boom!
Add Redis to Postgres to Make Your Microservices Go Boom!Add Redis to Postgres to Make Your Microservices Go Boom!
Add Redis to Postgres to Make Your Microservices Go Boom!
 
MongoDB Evening Austin, TX 2017
MongoDB Evening Austin, TX 2017MongoDB Evening Austin, TX 2017
MongoDB Evening Austin, TX 2017
 
CCI2017 - Considerations for Migrating Databases to Azure - Gianluca Sartori
CCI2017 - Considerations for Migrating Databases to Azure - Gianluca SartoriCCI2017 - Considerations for Migrating Databases to Azure - Gianluca Sartori
CCI2017 - Considerations for Migrating Databases to Azure - Gianluca Sartori
 
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
 
MongoDB .local Munich 2019: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local Munich 2019: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local Munich 2019: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local Munich 2019: MongoDB Atlas Data Lake Technical Deep Dive
 
Solving your Backup Needs - Ben Cefalo mdbe18
Solving your Backup Needs - Ben Cefalo mdbe18Solving your Backup Needs - Ben Cefalo mdbe18
Solving your Backup Needs - Ben Cefalo mdbe18
 
Microsoft Azure Big Data Analytics
Microsoft Azure Big Data AnalyticsMicrosoft Azure Big Data Analytics
Microsoft Azure Big Data Analytics
 
MongoDB: What, why, when
MongoDB: What, why, whenMongoDB: What, why, when
MongoDB: What, why, when
 
MongoDB and the Internet of Things
MongoDB and the Internet of ThingsMongoDB and the Internet of Things
MongoDB and the Internet of Things
 
AWS re:Invent 2016| GAM301 | How EA Leveraged Amazon Redshift and AWS Partner...
AWS re:Invent 2016| GAM301 | How EA Leveraged Amazon Redshift and AWS Partner...AWS re:Invent 2016| GAM301 | How EA Leveraged Amazon Redshift and AWS Partner...
AWS re:Invent 2016| GAM301 | How EA Leveraged Amazon Redshift and AWS Partner...
 

More from MongoDB

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB
 

More from MongoDB (20)

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 

Recently uploaded

FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Product School
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
DianaGray10
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
Product School
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Product School
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 

Recently uploaded (20)

FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 

Sizing Your MongoDB Cluster

  • 1. FEBRUARY 15, 2018 | BELL HARBOR #MDBlocal Sizing MongoDB Clusters
  • 3. #MDBlocal MongoDB: 4 years NoSQL: 9 years Solution Architect: engineer assigned to sales Significant experience sizing MongoDB clusters About me
  • 5. #MDBlocal Do I need to shard? What size servers should I use? What will my monthly Atlas/AWS/Azure/Google costs be? When will I need to add a new shard or upgrade my servers? How much data can my servers support? How many queries can my servers support? Will we be able to meet our query latency requirements? Sizing
  • 6. #MDBlocal • Large coffee chain: PlanetDollar • Collect mobile app performance • Every tap, click, gesture will generate an event • 2 Year History • Perform analytics • Historical • Near real-time (executive dashboards) • Support usage • 3000 – 5000 events per second Your boss comes to you… I need a budget for the monthly Atlas costs?
  • 7. #MDBlocal • Build a prototype • Run performance tests using actual data and queries on hardware with specs similar to production servers The only accurate way to size a cluster • EVERY OTHER APPROACH IS A GUESS • Including the one I am presenting today
  • 8. #MDBlocal • Early in project, but • Need to order hardware • Estimate costs to determine “Go/No Go” decision • Schema design • Compare the hardware requirements for different schemas Sometimes, it is necessary to guess ☹
  • 9. #MDBlocal MongoDB Clusters Look Like This Config Config Config Application Driver Primary Secondary Secondary
  • 10. #MDBlocal • # of shards • Specifications of each server • CPU • Storage • Size • Performance: IOPS • Memory • Network Our Solution Will Consist Of
  • 11. #MDBlocal • Sizing Objective ✓ • IOPS, Query Processing, Working Set • Sizing Methodology with an Example Agenda
  • 13. #MDBlocal • # of shards • Specifications of each server • CPU • Storage • Size • Performance: IOPS • Memory • Network Our Solution Will Consist Of
  • 14. #MDBlocal • IOPS – input output units per second • Throughput • Random access • Most workloads “randomly” access documents IOPS collection
  • 15. #MDBlocal Storage Performance Type IOPS 7200 rpm SATA ~ 75 – 100 15000 rpm SAS ~ 175 – 210 RAID-10 (24 x 7200 RPM SAS) 2000 Amazon EBS 250 – 500 Amazon EBS Provisioned IOPS 10000 - 20000 SSD 50000 Flash Storage 100K – 400K (or more) http://en.wikipedia.org/wiki/IOPS
  • 16. #MDBlocal • How many IOPS do we need? • Want the real answer, run a test • How to estimate? Hardest Part of Sizing is IOPS
  • 17. #MDBlocal Processing a Query Select Index Load relevant index entries from disk Identify documents using index Retrieve documents from disk Filter documents Return Documents
  • 18. #MDBlocal Processing a Query IO Select Index Load relevant index entries from disk Identify documents using index Retrieve documents from disk Filter documents Return Documents
  • 19. #MDBlocal But MongoDB Has a Cache File System indexes collections CPU Memory indexes documents Disk access is only necessary if indexes or documents are not in cache Select Index Load relevant index entries from disk Identify documents using index Retrieve documents from disk Filter documents Return Documents
  • 20. #MDBlocal Working Set Working Set = indexes plus frequently accessed documents If RAM greater than working set then reduced IO Select Index Load relevant index entries from disk Identify documents using index Retrieve documents from disk Filter documents Return Documents File System indexes collections CPU Memory indexes documents
  • 21. #MDBlocal This is all great, but how do we estimate IOPS?
  • 22. #MDBlocal Assume • Working Set < RAM < Data Size • Memory contains indexes only MongoDB Simplified Model File System indexes collections CPU Memory indexes documents
  • 23. #MDBlocal Assume appropriate indexes To resolve find: • Navigate in-memory indexes • Retrieve document from disk 1 IOP per document returned Find Queries With Simplified Model File System indexes collections CPU Memory indexes documents
  • 24. #MDBlocal Assume appropriate indexes To resolve find: • Navigate in-memory indexes • Retrieve document from disk 1 IOP per document returned Find Queries With Simplified Model File System indexes collections CPU Memory indexes documents
  • 25. #MDBlocal To resolve insert: • Write document to disk • Update each index file IOPS = 1 + # of indexes Inserts With Simplified Model File System indexes collections CPU Memory indexes documents
  • 26. #MDBlocal To resolve delete: • Navigate in-memory indexes • Mark document deleted • Update each index file IOPS = 1 + # of indexes Deletes With Simplified Model File System indexes collections CPU Memory indexes documents
  • 27. #MDBlocal To resolve delete: • Navigate in-memory indexes • Mark document deleted • Insert new document version • Update each index file IOPS = 2 + # of indexes Updates With Simplified Model File System indexes collections CPU Memory indexes documents
  • 28. #MDBlocal • Working Set • Checkpoints • Document size relative to block size • Indexed Arrays • Journal, Log The Simplified Model is too simplistic
  • 29. #MDBlocal • WiredTiger write process: 1. Update document in RAM (cache) 2. Write to journal (disk) 3. Periodically, write dirty documents to disk (checkpoint) • 60 seconds or 2 GB (whichever comes first) Checkpoints Checkpoint 1 Checkpoint 2 Checkpoint 3 B C A A C A 3 writes 3 documents written 3 writes 2 documents written
  • 30. #MDBlocal • Estimate total requirements (using simplified model): • RAM • CPU • Disk Space • IOPS • Adjust based upon working set, checkpoints, etc. • Design (sharded) cluster that provides these totals How are we going to get there?
  • 33. #MDBlocal 1.Collection Size 2.Working Set 3.Queries -> IOPS 4.Adjust based upon working set, checkpoints, etc. 5.Calculate # of shards 6.Review, iterate, repeat Methodology (cont.) Build a spread sheet Multiple iterations may be required
  • 34. #MDBlocal 1.Assumptions 1.Data Size 1.Working Set • Index Size • Frequently Accessed Documents 1.Queries – IOPS 1.Shard Calculations Sizing Spreadsheet
  • 35. #MDBlocal 1.Assumptions 1.Data Size 1.Working Set • Index Size • Frequently Accessed Documents 2. Queries – IOPS 1. Shard Calculations Sizing Spreadsheet
  • 38. #MDBlocal Sizing Spreadsheet 1.Assumptions 1.Data Size 1.Working Set • Index Size • Frequently Accessed Documents 2. Queries – IOPS 1. Shard Calculations
  • 39. #MDBlocal • # of documents • Data size • Index size • WT compression Collection Analysis
  • 40. #MDBlocal Calculate The Number of Documents Application Description # of Documents in Collection There will be 20M documents in the collection by the end of 2017 20,000,000 We expect to insert 10K documents per day with 1 year retention period 365*10,000 = 3,655,000 We have 3000 devices each producing 1 event per minute and we need to keep a 90 day history 3000 * 60 * 24 * 90 = 388,800,000
  • 41. #MDBlocal Calculate The Number of Documents Application Description # of Documents in Collection There will be 20M documents in the collection by the end of 2017 20,000,000 We expect to insert 10K documents per day with 1 year retention period 365*10,000 = 3,655,000 We have 3000 devices each producing 1 event per minute and we need to keep a 90 day history 3000 * 60 * 24 * 90 = 388,800,000 PlanetDollar: 2 year history. Each day 5000 inserts per second for 5 hours and 3000 inserts per second for 19 hours 2*365*(5000*5*3600 + 3000*19*3600) = 215,496,000,000
  • 42. #MDBlocal • Data Size = # of documents * Average document size • This information is available in db.stats(), Compass, Ops Manager, Cloud Manager, Atlas, etc. Calculate the Data Size
  • 43. #MDBlocal • Write some code • Programmatically generate a large data set • 5-10% of expected size • Measure • Collection size • Index size • Compression What if there aren’t any documents?
  • 44. #MDBlocal • Use db.collection.stats() • Take data size, index size and extrapolate to production size • Calculate compression ratio db.collection.stats() { count: 10000 size: 70,388,956 avgObjSize: 7038 storageSize: 25341952 … totalIndexSize: 147456 } Determine collection and data size Parameter Formula Value # of documents 2.5B avgObjSize 7038 Collection Size =2.5B * 7038 1.760E13 Bytes WT Compression = 25341952/70388956 .36 Collection Storage =2.5B * 7038 * .36 6.33E12 Bytes Index Size Per Doc = 147456 / 10000 15 Bytes Collection Index Size =2.5B * 15 /1024^3 35 GB
  • 46. #MDBlocal Sizing Spreadsheet 1.Assumptions 1.Data Size 1.Working Set • Index Size • Frequently Accessed Documents 1.Queries – IOPS 1.Shard Calculations
  • 47. #MDBlocal WorkSet = Indexes plus the set of documents accessed frequently We know the index size from previous analysis Estimate the frequently accessed documents Given the queries What are the frequently accessed docs? Working Set File System collections indexes CPU Memory indexes documents
  • 48. #MDBlocal Query Analysis Dashboards – last minute of data Customer support - last hour of data Reports (run once per day) inspect last years worth of data Active Documents = 1 hours worth of data PlanetDollar Working Set
  • 50. #MDBlocal 1.Assumptions 1.Data Size 1.Working Set • Index Size • Frequently Accessed Documents 1.Queries – IOPS 1.Shard Calculations Sizing Spreadsheet
  • 51. #MDBlocal + # of documents returned per second + # of documents updated per second + # of indexes impacted by each update + # of inserts per second + # of indexes impacted by each insert + # of deletes per second (x2) + # of indexes impacted by each delete - Multiple updates occurring within checkpoint - % of find query results in cache Total IOPS IOPS Calculation
  • 52. #MDBlocal • 5000 inserts per second • 5000 deletes per second Dashboards (aggregations: 100 per minute) • Total events per minute across all users (current minute) • Total events per minute per region (current minute) • Total events per store per minute (current minute) Debugging Tool (ad hoc – 5 per second) • Find all events for a user in last 60 minutes (100 events returned, on average) PlanetDollar Queries
  • 53. #MDBlocal • Each insert: • Update collection • Update each index (3 indexes) • Each Delete: • Update collection • Update each index (3 indexes) • 5000 inserts/sec • 5000 deletes/sec IOPS for inserts and deletes 4 IOPS 5 IOPS (4 * 5000) + (5 * 5000) = 45000 IOPS
  • 54. #MDBlocal • Example: Total events per minute across all users (current minute) • How many documents will be read from disk? IOPS for PlanetDollar Aggregations 05000 per second * 60 seconds = 300,000 Most data in cache Some IOPS will likely be require
  • 55. #MDBlocal • Find all events for a user in last 60 minutes • 5 per second • 100 documents per query • # IOPS = 5 * 100 = 500 IOPS IOPS For Find
  • 57. #MDBlocal • CPU utilized for: • Compress/decompress • Encrypt/Decrypt • Aggregation queries • General query processing • In most cases, RAM requirements → large servers → many cores • Possible exception: aggregation queries • One core per query • # cores >> # of simultaneous aggregation queries How Many CPUs Do I Need?
  • 58. #MDBlocal 1.Assumptions 1.Data Size 1.Working Set • Index Size • Frequently Accessed Documents 1.Queries – IOPS 1.Shard Calculations Sizing Spreadsheet
  • 59. #MDBlocal • At this point you have: 1. Required storage capacity 2. Working Set Size 3. IOPS Estimate 4. Some idea about class of server (or VM) the customer plans to deploy • Determine number of required shards Shard Calculations
  • 60. #MDBlocal • Sum of disk space across shards greater than required storage size Disk Space: How Many Shards Do I Need? Example Data Size = 9 TB WiredTiger Compression Ratio: .33 Storage size = 3 TB Req Disk: 6 TB Server disk capacity = 2 TB 3 Shards Required Recommend providing 2X the compressed data size in disk
  • 61. #MDBlocal RAM: How Many Shards Do I Need? Example Working Set = 428 GB Server RAM = 128 GB 428/128 = 3.34 4 Shards Required
  • 62. #MDBlocal IOPS: How Many Shards Do I Need? Example Require: 50K IOPS AWS Instance: 20K IOPS 3 Shards Required
  • 64. #MDBlocal 1. Calculate: • Collection size • Index size 2. Estimate Working Set 3. Use simplified model to estimate IOPS 4. Revise (working set coverage, checkpoints, etc.) 5. Calculate shards Sizing Summary