#MDBW17
Jay Runkel
Principal Solutions Architect
SIZING MONGODB CLUSTERS
jay.runkel@mongodb.com
@jayrunkel
#MDBW17
AGENDA
• Sizing Objective
• IOPS, Query Processing, Working Set
• Sizing Methodology
• Sizing Example
SIZING OBJECTIVE
#MDBW17
SIZING
Do I need to shard?
What size servers should I use?
What will my monthly Atlas/AWS/Azure/Google costs be?
When will I need to add a new shard or upgrade my servers?
How much data can my servers support?
How many queries can my servers support?
Will we be able to meet our query latency requirements?
#MDBW17
YOUR BOSS COMES TO YOU…
• Large coffee chain: PlanetDollar
• Collect mobile app performance
• Every tap, click, gesture will generate an event
• 2 Year History
• Perform analytics
‒ Historical
‒ Near real-time (executive dashboards)
• Support usage
• 3000 – 5000 events per second
“I need a budget for the monthly Atlas costs.”
#MDBW17
THE ONLY ACCURATE WAY TO SIZE A CLUSTER
• Build a prototype
• Run performance tests using actual data and queries on hardware with specs similar to production servers
• EVERY OTHER APPROACH IS A GUESS
• Including the one I am presenting today
#MDBW17
SOMETIMES, IT IS NECESSARY TO GUESS
• Early in the project, but you still need to
‒ Order hardware
‒ Estimate costs for a “Go/No Go” decision
• Schema design
‒ Compare the hardware requirements for different schemas
MongoDB Clusters Look Like This
[Diagram: an application and its driver, three config servers, and a replica set with one primary and two secondaries]
#MDBW17
OUR SOLUTION WILL CONSIST OF
• # of shards
• Specifications of each server
‒ CPU
‒ Storage
o Size
o Performance: IOPS
‒ Memory
‒ Network
BACKGROUND
IOPS, Query Processing, Working Set
#MDBW17
IOPS
• IOPS – input/output operations per second
• Throughput
• Random access
• Most workloads “randomly” access documents in a collection
#MDBW17
STORAGE PERFORMANCE
Type                              IOPS
7200 rpm SATA                     ~75 – 100
15000 rpm SAS                     ~175 – 210
RAID-10 (24 x 7200 RPM SAS)       2000
Amazon EBS                        250 – 500
Amazon EBS Provisioned IOPS       10000 – 20000
SSD                               50000
Flash Storage                     100K – 400K (or more)
Source: http://en.wikipedia.org/wiki/IOPS
#MDBW17
HARDEST PART OF SIZING IS IOPS
• How many IOPS do we need?
• Want the real answer? Run a test
• How to estimate?
#MDBW17
PROCESSING A QUERY
1. Select index
2. Load relevant index entries from disk (IO)
3. Identify documents using index
4. Retrieve documents from disk (IO)
5. Filter documents
6. Return documents
#MDBW17
BUT MONGODB HAS A CACHE
[Diagram: the query steps above running on the CPU, with indexes and documents cached in memory and index and collection files in the file system]
• Disk access is only necessary if indexes or documents are not in cache
#MDBW17
WORKING SET
[Diagram: same query steps, memory cache, and file system as on the previous slide]
• Working Set = indexes plus frequently accessed documents
• If RAM is larger than the working set, disk IO is reduced
#MDBW17
THIS IS ALL GREAT, BUT HOW DO WE ESTIMATE IOPS?
#MDBW17
MONGODB SIMPLIFIED MODEL
Assume
• Working Set < RAM < Data Size
• Memory contains indexes only
[Diagram: CPU, memory holding indexes only, and a file system holding collections and indexes]
#MDBW17
FIND QUERIES WITH SIMPLIFIED MODEL
Assume appropriate indexes
To resolve find:
• Navigate in-memory indexes
• Retrieve document from disk
1 IOP per document returned
#MDBW17
INSERTS WITH SIMPLIFIED MODEL
To resolve insert:
• Write document to disk
• Update each index file
IOPS = 1 + # of indexes
#MDBW17
DELETES WITH SIMPLIFIED MODEL
To resolve delete:
• Navigate in-memory indexes
• Mark document deleted
• Update each index file
IOPS = 1 + # of indexes
#MDBW17
UPDATES WITH SIMPLIFIED MODEL
To resolve update:
• Navigate in-memory indexes
• Mark document deleted
• Insert new document version
• Update each index file
IOPS = 2 + # of indexes
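The per-operation costs of the simplified model can be collected into one small function. This is a minimal sketch: the function name `iops_per_operation` and its arguments are illustrative and not part of any MongoDB API, and the results are the slide estimates, not measured values.

```python
def iops_per_operation(op: str, num_indexes: int = 0, docs_returned: int = 1) -> int:
    """Estimated disk operations for one operation under the simplified model."""
    if op == "find":
        # Indexes are assumed to be in memory, so each returned
        # document costs one disk read.
        return docs_returned
    if op in ("insert", "delete"):
        # One write for the document plus one write per index.
        return 1 + num_indexes
    if op == "update":
        # Mark the old version deleted, write the new version,
        # then update each index.
        return 2 + num_indexes
    raise ValueError(f"unknown operation: {op}")


# Example: a collection with 3 indexes.
for op in ("insert", "delete", "update"):
    print(op, iops_per_operation(op, num_indexes=3))
```

Keeping the costs in one place makes it easy to compare schemas that differ only in the number of indexes.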
#MDBW17
THE SIMPLIFIED MODEL IS TOO SIMPLISTIC
• Working Set
• Checkpoints
• Document size relative to block size
• Indexed Arrays
• Journal, Log
#MDBW17
CHECKPOINTS
• WiredTiger write process:
1. Update document in RAM (cache)
2. Write to journal (disk)
3. Periodically, write dirty documents to disk (checkpoint)
o 60 seconds or 2 GB (whichever comes first)
Interval                       Writes     Result
Checkpoint 1 to Checkpoint 2   B, C, A    3 writes, 3 documents written
Checkpoint 2 to Checkpoint 3   A, C, A    3 writes, 2 documents written
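The checkpoint effect can be illustrated with a few lines of code. This sketch assumes that a document modified several times between two checkpoints is written to disk only once at the next checkpoint; the function name and the document labels A, B, C are illustrative.

```python
def documents_written_at_checkpoint(writes):
    """Count the distinct documents among the writes since the last checkpoint."""
    return len(set(writes))


first_interval = ["B", "C", "A"]   # three writes to three different documents
second_interval = ["A", "C", "A"]  # three writes, but A is modified twice

for writes in (first_interval, second_interval):
    print(len(writes), "writes,", documents_written_at_checkpoint(writes), "documents written")
```

Workloads that update the same documents repeatedly therefore need fewer disk writes than the simplified model predicts.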
#MDBW17
HOW ARE WE GOING TO GET THERE?
• Estimate total requirements (using simplified model):
‒ RAM
‒ CPU
‒ Disk Space
‒ IOPS
• Adjust based upon working set, checkpoints, etc.
• Design (sharded) cluster that provides these totals
SIZING PROCESS
#MDBW17
METHODOLOGY
Application Requirements → Magic Happens → Cluster Sizing (number of shards, server specs)
#MDBW17
METHODOLOGY (CONT.)
1. Collection Size
2. Working Set
3. Queries -> IOPS
4. Adjust based upon working set, checkpoints, etc.
5. Using candidate server specs, calculate # of shards
6. Review, iterate, repeat
• Build a spreadsheet
• Multiple iterations may be required
#MDBW17
SIZING SPREADSHEET
1. Assumptions
2. Data Size
3. Working Set
‒ Index Size
‒ Frequently Accessed Documents
4. Queries – IOPS
5. Shard Calculations
#MDBW17
COLLECTION ANALYSIS
‒ # of documents
‒ Data size
‒ Index size
‒ WT compression
#MDBW17
CALCULATE THE NUMBER OF DOCUMENTS
Application Description, followed by # of Documents in Collection
• There will be 20M documents in the collection by the end of 2017
‒ 20,000,000
• We expect to insert 10K documents per day with a 1 year retention period
‒ 365 * 10,000 = 3,650,000
• We have 3000 devices each producing 1 event per minute and we need to keep a 90 day history
‒ 3000 * 60 * 24 * 90 = 388,800,000
• PlanetDollar: 2 year history. Each day 5000 inserts per second for 5 hours and 3000 inserts per second for 19 hours
‒ 2 * 365 * (5000 * 5 * 3600 + 3000 * 19 * 3600) = 215,496,000,000
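The document counts above are plain arithmetic and can be checked with a short script. This is a sketch; the variable names are illustrative and the input figures are the ones stated on the slide.

```python
SECONDS_PER_HOUR = 3600

# 10K inserts per day, kept for one year.
retention_docs = 365 * 10_000

# 3000 devices, one event per minute, 90 day history.
device_docs = 3000 * 60 * 24 * 90

# PlanetDollar: 5 peak hours at 5000 inserts/sec and
# 19 off-peak hours at 3000 inserts/sec, kept for 2 years.
events_per_day = (5000 * 5 + 3000 * 19) * SECONDS_PER_HOUR
planetdollar_docs = 2 * 365 * events_per_day

print(f"{retention_docs:,}")
print(f"{device_docs:,}")
print(f"{planetdollar_docs:,}")
```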
#MDBW17
CALCULATE THE DATA SIZE
• Data Size = # of documents * Average document size
• This information is available in db.stats(), Compass, Ops Manager, Cloud Manager, Atlas, etc.
#MDBW17
WHAT IF THERE AREN’T ANY DOCUMENTS?
• Write some code
‒ Programmatically generate a large data set
o 5-10% of expected size
‒ Measure
o Collection size
o Index size
o Compression
#MDBW17
DETERMINE COLLECTION AND DATA SIZE
• Use db.collection.stats()
‒ Take data size, index size and extrapolate to production size
‒ Calculate compression ratio
db.collection.stats()
{
  count: 10000,
  size: 70388956,
  avgObjSize: 7038,
  storageSize: 25341952,
  …
  totalIndexSize: 147456
}
Parameter               Formula                    Value
# of documents                                     2.5B
avgObjSize                                         7038 Bytes
Collection Size         = 2.5B * 7038              1.760E13 Bytes
WT Compression          = 25341952 / 70388956      .36
Collection Storage      = 2.5B * 7038 * .36        6.33E12 Bytes
Index Size Per Doc      = 147456 / 10000           15 Bytes
Collection Index Size   = 2.5B * 15 / 1024^3       35 GB
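The extrapolation from a sample collection to production size can be scripted. The sketch below hard-codes the sample statistics shown above in a dictionary instead of querying a server, and the variable names are illustrative. It keeps full precision, so the index size comes out slightly below the 35 GB obtained on the slide by rounding to 15 bytes per document.

```python
# Sample statistics copied from the db.collection.stats() output above.
sample = {
    "count": 10000,
    "size": 70388956,
    "avgObjSize": 7038,
    "storageSize": 25341952,
    "totalIndexSize": 147456,
}
production_docs = 2_500_000_000  # expected number of documents in production

# Ratio of on-disk (compressed) size to uncompressed size.
compression = sample["storageSize"] / sample["size"]

collection_size = production_docs * sample["avgObjSize"]  # bytes, uncompressed
collection_storage = collection_size * compression        # bytes on disk

index_bytes_per_doc = sample["totalIndexSize"] / sample["count"]
index_size_gb = production_docs * index_bytes_per_doc / 1024**3

print(f"compression ratio:  {compression:.2f}")
print(f"collection size:    {collection_size:.3e} bytes")
print(f"collection storage: {collection_storage:.3e} bytes")
print(f"index size:         {index_size_gb:.1f} GB")
```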
#MDBW17
WORKING SET
• Working Set = indexes plus the set of documents accessed frequently
‒ We know the index size from previous analysis
• Estimate the working set
‒ Given the queries
‒ What are the frequently accessed docs?
[Diagram: CPU, memory holding indexes and documents, and a file system holding collections and indexes]
#MDBW17
PLANETDOLLAR WORKING SET
Query Analysis
• Dashboards look at the last minute of data
• Customer support debugging tools inspect the last hour's worth of data
• Reports (run once per day) inspect the last year's worth of data
Active Documents = 1 hour's worth of data
5000 * 3600 * 1 KB = 18M KB ≈ 17 GB
Run reports on secondaries
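The active-document estimate can be written as a small helper so that the time window or document size can be varied. This is a sketch; the function name is illustrative, and the 1 KB average document size is the figure used on the slide.

```python
def active_documents_gb(events_per_second: float, window_seconds: float, avg_doc_kb: float) -> float:
    """Size in GB of the documents written during the active time window."""
    total_kb = events_per_second * window_seconds * avg_doc_kb
    return total_kb / 1024**2  # KB -> GB


# PlanetDollar: peak rate of 5000 events/sec, one hour window, 1 KB documents.
active_gb = active_documents_gb(5000, 3600, 1)
print(f"active documents: {active_gb:.1f} GB")
```

The working set is this figure plus the index size from the collection analysis.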
#MDBW17
IOPS CALCULATION
+ # of documents returned per second
+ # of documents updated per second (x2)
+ # of indexes impacted by each update
+ # of inserts per second
+ # of indexes impacted by each insert
+ # of deletes per second
+ # of indexes impacted by each delete
- Multiple updates occurring within checkpoint
- % of find query results in cache
= Total IOPS
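One way to turn this checklist into a formula is sketched below, using the per-operation costs of the simplified model. The `Workload` class, its field names, and the two adjustment fractions are assumptions made for illustration; in particular, modelling the checkpoint effect as a single fraction of update IOPS is a simplification, and the example numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    docs_returned_per_sec: float = 0
    updates_per_sec: float = 0
    inserts_per_sec: float = 0
    deletes_per_sec: float = 0
    indexes_per_write: int = 0          # indexes touched by each write
    cache_hit_ratio: float = 0.0        # fraction of find results already in cache
    checkpoint_coalescing: float = 0.0  # fraction of update IO saved by checkpoints


def total_iops(w: Workload) -> float:
    """Estimated IOPS: simplified-model costs minus cache and checkpoint savings."""
    reads = w.docs_returned_per_sec * (1 - w.cache_hit_ratio)
    updates = w.updates_per_sec * (2 + w.indexes_per_write) * (1 - w.checkpoint_coalescing)
    inserts = w.inserts_per_sec * (1 + w.indexes_per_write)
    deletes = w.deletes_per_sec * (1 + w.indexes_per_write)
    return reads + updates + inserts + deletes


# Hypothetical workload, for illustration only.
example = Workload(
    docs_returned_per_sec=1000,
    updates_per_sec=200,
    inserts_per_sec=500,
    deletes_per_sec=100,
    indexes_per_write=2,
    cache_hit_ratio=0.5,
    checkpoint_coalescing=0.25,
)
print(total_iops(example))
```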
#MDBW17
PLANETDOLLAR QUERIES
• 5000 inserts per second
• 5000 deletes per second
Dashboards (aggregations: 100 per minute)
• Total events per minute across all users (current minute)
• Total events per minute per region (current minute)
• Total events per store per minute (current minute)
Debugging Tool (ad hoc – 5 per second)
• Find all events for a user in last 60 minutes (100 events returned, on average)
Analytics (reports generated once per day)
• For all stores and regions, count events per day year over year (last 2 years)
• For all stores and regions, events per day for last 365 days
#MDBW17
IOPS FOR INSERTS AND DELETES
• Each insert: 4 IOPS
‒ Update collection
‒ Update each index (3 indexes)
• Each delete: 4 IOPS
‒ Update collection
‒ Update each index (3 indexes)
• 5000 inserts/sec
• 5000 deletes/sec
(4 * 5000) + (4 * 5000) = 40000 IOPS
#MDBW17
IOPS FOR PLANETDOLLAR AGGREGATIONS
• Example: Total events per minute across all users (current minute)
• How many documents will be read from disk?
‒ 5000 per second * 60 seconds = 300,000 documents
• Most data in cache
• Some IOPS will likely be required
#MDBW17
IOPS FOR FIND
• Find all events for a user in last 60 minutes
‒ 5 per second
‒ 100 documents per query
• # IOPS = 5 * 100 = 500 IOPS
#MDBW17
HOW MANY CPUS DO I NEED?
• CPU utilized for:
‒ Compress/decompress
‒ Encrypt/Decrypt
‒ Aggregation queries
‒ General query processing
• In most cases, RAM requirements → large servers → many cores
• Possible exception: aggregation queries
‒ One core per query
‒ # cores >> # of simultaneous aggregation queries
#MDBW17
SHARD CALCULATIONS
• At this point you have:
1. Required storage capacity
2. Working Set Size
3. IOPS Estimate
4. Some idea about class of server (or VM) the customer plans to deploy
• Determine number of required shards
#MDBW17
DISK SPACE: HOW MANY SHARDS DO I NEED?
• Sum of disk space across shards > required storage size
Example
Data Size = 9 TB
WiredTiger Compression Ratio: .33
Storage size = 3 TB
Server disk capacity = 2 TB
2 Shards Required
Recommend providing 2X the compressed data size in disk
#MDBW17
RAM: HOW MANY SHARDS DO I NEED?
Example
Working Set = 428 GB
Server RAM = 128 GB
428/128 = 3.34
4 Shards Required
#MDBW17
IOPS: HOW MANY SHARDS DO I NEED?
Example
Require: 50K IOPS
AWS Instance: 20K IOPS
3 Shards Required
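The three shard calculations can be combined in one function. The sketch below assumes that the cluster must satisfy the disk, RAM, and IOPS requirements at the same time, so the largest of the three counts wins; that rule, the function name, and the parameter names are assumptions made for illustration. The inputs are taken from the three separate examples above, which do not describe a single cluster.

```python
import math


def shards_required(storage_tb, disk_tb_per_server,
                    working_set_gb, ram_gb_per_server,
                    required_iops, iops_per_server):
    """Return the shard count needed by each resource and the overall count."""
    by_disk = math.ceil(storage_tb / disk_tb_per_server)
    by_ram = math.ceil(working_set_gb / ram_gb_per_server)
    by_iops = math.ceil(required_iops / iops_per_server)
    return {
        "disk": by_disk,
        "ram": by_ram,
        "iops": by_iops,
        # Every requirement must be met, so take the largest count.
        "total": max(by_disk, by_ram, by_iops),
    }


result = shards_required(
    storage_tb=3, disk_tb_per_server=2,
    working_set_gb=428, ram_gb_per_server=128,
    required_iops=50_000, iops_per_server=20_000,
)
print(result)
```

Returning the per-resource counts alongside the total shows which resource drives the shard count, which is useful when comparing candidate server specs.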
PLANETDOLLAR EXAMPLE
https://github.com/jayrunkel/mdbw2017Sizing
#MDBW17
ASSUMPTIONS
#MDBW17
COLLECTION SIZE
#MDBW17
WORKING SET
#MDBW17
QUERIES/IOPS
#MDBW17
SHARD CALCULATIONS
#MDBW17
SIZING SUMMARY
1. Calculate:
‒ Collection size
‒ Index size
2. Estimate Working Set
3. Use simplified model to estimate IOPS
4. Revise (working set coverage, checkpoints, etc.)
5. Calculate shards
Sizing MongoDB Clusters

More Related Content

What's hot

Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
MongoDB
 
Cassandra Introduction & Features
Cassandra Introduction & FeaturesCassandra Introduction & Features
Cassandra Introduction & Features
DataStax Academy
 
Introducing MongoDB Atlas
Introducing MongoDB AtlasIntroducing MongoDB Atlas
Introducing MongoDB Atlas
MongoDB
 
An Enterprise Architect's View of MongoDB
An Enterprise Architect's View of MongoDBAn Enterprise Architect's View of MongoDB
An Enterprise Architect's View of MongoDB
MongoDB
 
MongodB Internals
MongodB InternalsMongodB Internals
MongodB Internals
Norberto Leite
 
Modeling Data and Queries for Wide Column NoSQL
Modeling Data and Queries for Wide Column NoSQLModeling Data and Queries for Wide Column NoSQL
Modeling Data and Queries for Wide Column NoSQL
ScyllaDB
 
MongoDB.pptx
MongoDB.pptxMongoDB.pptx
MongoDB.pptx
Sigit52
 
Query logging with proxysql
Query logging with proxysqlQuery logging with proxysql
Query logging with proxysql
YoungHeon (Roy) Kim
 
Apache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsApache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic Datasets
Alluxio, Inc.
 
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
confluent
 
[pgday.Seoul 2022] PostgreSQL with Google Cloud
[pgday.Seoul 2022] PostgreSQL with Google Cloud[pgday.Seoul 2022] PostgreSQL with Google Cloud
[pgday.Seoul 2022] PostgreSQL with Google Cloud
PgDay.Seoul
 
Capacity Planning For Your Growing MongoDB Cluster
Capacity Planning For Your Growing MongoDB ClusterCapacity Planning For Your Growing MongoDB Cluster
Capacity Planning For Your Growing MongoDB Cluster
MongoDB
 
CockroachDB: Architecture of a Geo-Distributed SQL Database
CockroachDB: Architecture of a Geo-Distributed SQL DatabaseCockroachDB: Architecture of a Geo-Distributed SQL Database
CockroachDB: Architecture of a Geo-Distributed SQL Database
C4Media
 
Modularized ETL Writing with Apache Spark
Modularized ETL Writing with Apache SparkModularized ETL Writing with Apache Spark
Modularized ETL Writing with Apache Spark
Databricks
 
MongoDB Performance Tuning
MongoDB Performance TuningMongoDB Performance Tuning
MongoDB Performance TuningMongoDB
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to RedisDvir Volk
 
High-speed Database Throughput Using Apache Arrow Flight SQL
High-speed Database Throughput Using Apache Arrow Flight SQLHigh-speed Database Throughput Using Apache Arrow Flight SQL
High-speed Database Throughput Using Apache Arrow Flight SQL
ScyllaDB
 
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Databricks
 
Cosco: An Efficient Facebook-Scale Shuffle Service
Cosco: An Efficient Facebook-Scale Shuffle ServiceCosco: An Efficient Facebook-Scale Shuffle Service
Cosco: An Efficient Facebook-Scale Shuffle Service
Databricks
 
MongoDB Atlas
MongoDB AtlasMongoDB Atlas
MongoDB Atlas
MongoDB
 

What's hot (20)

Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
Cassandra Introduction & Features
Cassandra Introduction & FeaturesCassandra Introduction & Features
Cassandra Introduction & Features
 
Introducing MongoDB Atlas
Introducing MongoDB AtlasIntroducing MongoDB Atlas
Introducing MongoDB Atlas
 
An Enterprise Architect's View of MongoDB
An Enterprise Architect's View of MongoDBAn Enterprise Architect's View of MongoDB
An Enterprise Architect's View of MongoDB
 
MongodB Internals
MongodB InternalsMongodB Internals
MongodB Internals
 
Modeling Data and Queries for Wide Column NoSQL
Modeling Data and Queries for Wide Column NoSQLModeling Data and Queries for Wide Column NoSQL
Modeling Data and Queries for Wide Column NoSQL
 
MongoDB.pptx
MongoDB.pptxMongoDB.pptx
MongoDB.pptx
 
Query logging with proxysql
Query logging with proxysqlQuery logging with proxysql
Query logging with proxysql
 
Apache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsApache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic Datasets
 
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
 
[pgday.Seoul 2022] PostgreSQL with Google Cloud
[pgday.Seoul 2022] PostgreSQL with Google Cloud[pgday.Seoul 2022] PostgreSQL with Google Cloud
[pgday.Seoul 2022] PostgreSQL with Google Cloud
 
Capacity Planning For Your Growing MongoDB Cluster
Capacity Planning For Your Growing MongoDB ClusterCapacity Planning For Your Growing MongoDB Cluster
Capacity Planning For Your Growing MongoDB Cluster
 
CockroachDB: Architecture of a Geo-Distributed SQL Database
CockroachDB: Architecture of a Geo-Distributed SQL DatabaseCockroachDB: Architecture of a Geo-Distributed SQL Database
CockroachDB: Architecture of a Geo-Distributed SQL Database
 
Modularized ETL Writing with Apache Spark
Modularized ETL Writing with Apache SparkModularized ETL Writing with Apache Spark
Modularized ETL Writing with Apache Spark
 
MongoDB Performance Tuning
MongoDB Performance TuningMongoDB Performance Tuning
MongoDB Performance Tuning
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
High-speed Database Throughput Using Apache Arrow Flight SQL
High-speed Database Throughput Using Apache Arrow Flight SQLHigh-speed Database Throughput Using Apache Arrow Flight SQL
High-speed Database Throughput Using Apache Arrow Flight SQL
 
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
 
Cosco: An Efficient Facebook-Scale Shuffle Service
Cosco: An Efficient Facebook-Scale Shuffle ServiceCosco: An Efficient Facebook-Scale Shuffle Service
Cosco: An Efficient Facebook-Scale Shuffle Service
 
MongoDB Atlas
MongoDB AtlasMongoDB Atlas
MongoDB Atlas
 

Similar to Sizing MongoDB Clusters

Sizing MongoDB Clusters
Sizing MongoDB Clusters Sizing MongoDB Clusters
Sizing MongoDB Clusters
MongoDB
 
AWS re:Invent 2016| GAM301 | How EA Leveraged Amazon Redshift and AWS Partner...
AWS re:Invent 2016| GAM301 | How EA Leveraged Amazon Redshift and AWS Partner...AWS re:Invent 2016| GAM301 | How EA Leveraged Amazon Redshift and AWS Partner...
AWS re:Invent 2016| GAM301 | How EA Leveraged Amazon Redshift and AWS Partner...
Amazon Web Services
 
AWS re:Invent 2016: [REPEAT] How EA Leveraged Amazon Redshift and AWS Partner...
AWS re:Invent 2016: [REPEAT] How EA Leveraged Amazon Redshift and AWS Partner...AWS re:Invent 2016: [REPEAT] How EA Leveraged Amazon Redshift and AWS Partner...
AWS re:Invent 2016: [REPEAT] How EA Leveraged Amazon Redshift and AWS Partner...
Amazon Web Services
 
Realtime Analytics on AWS
Realtime Analytics on AWSRealtime Analytics on AWS
Realtime Analytics on AWS
Sungmin Kim
 
MongoDB for Time Series Data
MongoDB for Time Series DataMongoDB for Time Series Data
MongoDB for Time Series Data
MongoDB
 
AWS APAC Webinar Week - Real Time Data Processing with Kinesis
AWS APAC Webinar Week - Real Time Data Processing with KinesisAWS APAC Webinar Week - Real Time Data Processing with Kinesis
AWS APAC Webinar Week - Real Time Data Processing with Kinesis
Amazon Web Services
 
Webinar: Best Practices for Getting Started with MongoDB
Webinar: Best Practices for Getting Started with MongoDBWebinar: Best Practices for Getting Started with MongoDB
Webinar: Best Practices for Getting Started with MongoDB
MongoDB
 
MongoDB Best Practices
MongoDB Best PracticesMongoDB Best Practices
MongoDB Best Practices
Lewis Lin 🦊
 
Webinar: MongoDB Use Cases within the Oil, Gas, and Energy Industries
Webinar: MongoDB Use Cases within the Oil, Gas, and Energy IndustriesWebinar: MongoDB Use Cases within the Oil, Gas, and Energy Industries
Webinar: MongoDB Use Cases within the Oil, Gas, and Energy Industries
MongoDB
 
Getting Started with Real-time Analytics
Getting Started with Real-time AnalyticsGetting Started with Real-time Analytics
Getting Started with Real-time Analytics
Amazon Web Services
 
Microsoft Azure Big Data Analytics
Microsoft Azure Big Data AnalyticsMicrosoft Azure Big Data Analytics
Microsoft Azure Big Data Analytics
Mark Kromer
 
Capacity Planning
Capacity PlanningCapacity Planning
Capacity Planning
MongoDB
 
SRV420 Analyzing Streaming Data in Real-time with Amazon Kinesis
SRV420 Analyzing Streaming Data in Real-time with Amazon KinesisSRV420 Analyzing Streaming Data in Real-time with Amazon Kinesis
SRV420 Analyzing Streaming Data in Real-time with Amazon Kinesis
Amazon Web Services
 
MongoDB for Time Series Data: Setting the Stage for Sensor Management
MongoDB for Time Series Data: Setting the Stage for Sensor ManagementMongoDB for Time Series Data: Setting the Stage for Sensor Management
MongoDB for Time Series Data: Setting the Stage for Sensor ManagementMongoDB
 
2017 AWS DB Day | Amazon Redshift 자세히 살펴보기
2017 AWS DB Day | Amazon Redshift 자세히 살펴보기2017 AWS DB Day | Amazon Redshift 자세히 살펴보기
2017 AWS DB Day | Amazon Redshift 자세히 살펴보기
Amazon Web Services Korea
 
Leveraging Amazon Redshift for Your Data Warehouse
Leveraging Amazon Redshift for Your Data WarehouseLeveraging Amazon Redshift for Your Data Warehouse
Leveraging Amazon Redshift for Your Data Warehouse
Amazon Web Services
 
Mongo db 2.4 time series data - Brignoli
Mongo db 2.4 time series data - BrignoliMongo db 2.4 time series data - Brignoli
Mongo db 2.4 time series data - Brignoli
Codemotion
 
Supercharging Smart Meter BIG DATA Analytics with Microsoft Azure Cloud- SRP ...
Supercharging Smart Meter BIG DATA Analytics with Microsoft Azure Cloud- SRP ...Supercharging Smart Meter BIG DATA Analytics with Microsoft Azure Cloud- SRP ...
Supercharging Smart Meter BIG DATA Analytics with Microsoft Azure Cloud- SRP ...
Mike Rossi
 
Leveraging Amazon Redshift for your Data Warehouse
Leveraging Amazon Redshift for your Data WarehouseLeveraging Amazon Redshift for your Data Warehouse
Leveraging Amazon Redshift for your Data Warehouse
Amazon Web Services
 
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
Amazon Web Services
 

Similar to Sizing MongoDB Clusters (20)

Sizing MongoDB Clusters
Sizing MongoDB Clusters Sizing MongoDB Clusters
Sizing MongoDB Clusters
 
AWS re:Invent 2016| GAM301 | How EA Leveraged Amazon Redshift and AWS Partner...
AWS re:Invent 2016| GAM301 | How EA Leveraged Amazon Redshift and AWS Partner...AWS re:Invent 2016| GAM301 | How EA Leveraged Amazon Redshift and AWS Partner...
AWS re:Invent 2016| GAM301 | How EA Leveraged Amazon Redshift and AWS Partner...
 
AWS re:Invent 2016: [REPEAT] How EA Leveraged Amazon Redshift and AWS Partner...
AWS re:Invent 2016: [REPEAT] How EA Leveraged Amazon Redshift and AWS Partner...AWS re:Invent 2016: [REPEAT] How EA Leveraged Amazon Redshift and AWS Partner...
AWS re:Invent 2016: [REPEAT] How EA Leveraged Amazon Redshift and AWS Partner...
 
Realtime Analytics on AWS
Realtime Analytics on AWSRealtime Analytics on AWS
Realtime Analytics on AWS
 
MongoDB for Time Series Data
MongoDB for Time Series DataMongoDB for Time Series Data
MongoDB for Time Series Data
 
AWS APAC Webinar Week - Real Time Data Processing with Kinesis
AWS APAC Webinar Week - Real Time Data Processing with KinesisAWS APAC Webinar Week - Real Time Data Processing with Kinesis
AWS APAC Webinar Week - Real Time Data Processing with Kinesis
 
Webinar: Best Practices for Getting Started with MongoDB
Webinar: Best Practices for Getting Started with MongoDBWebinar: Best Practices for Getting Started with MongoDB
Webinar: Best Practices for Getting Started with MongoDB
 
MongoDB Best Practices
MongoDB Best PracticesMongoDB Best Practices
MongoDB Best Practices
 
Webinar: MongoDB Use Cases within the Oil, Gas, and Energy Industries
Webinar: MongoDB Use Cases within the Oil, Gas, and Energy IndustriesWebinar: MongoDB Use Cases within the Oil, Gas, and Energy Industries
Webinar: MongoDB Use Cases within the Oil, Gas, and Energy Industries
 
Getting Started with Real-time Analytics
Getting Started with Real-time AnalyticsGetting Started with Real-time Analytics
Getting Started with Real-time Analytics
 
Microsoft Azure Big Data Analytics
Microsoft Azure Big Data AnalyticsMicrosoft Azure Big Data Analytics
Microsoft Azure Big Data Analytics
 
Capacity Planning
Capacity PlanningCapacity Planning
Capacity Planning
 
SRV420 Analyzing Streaming Data in Real-time with Amazon Kinesis
SRV420 Analyzing Streaming Data in Real-time with Amazon KinesisSRV420 Analyzing Streaming Data in Real-time with Amazon Kinesis
SRV420 Analyzing Streaming Data in Real-time with Amazon Kinesis
 
MongoDB for Time Series Data: Setting the Stage for Sensor Management
MongoDB for Time Series Data: Setting the Stage for Sensor ManagementMongoDB for Time Series Data: Setting the Stage for Sensor Management
MongoDB for Time Series Data: Setting the Stage for Sensor Management
 
2017 AWS DB Day | Amazon Redshift 자세히 살펴보기
2017 AWS DB Day | Amazon Redshift 자세히 살펴보기2017 AWS DB Day | Amazon Redshift 자세히 살펴보기
2017 AWS DB Day | Amazon Redshift 자세히 살펴보기
 
Leveraging Amazon Redshift for Your Data Warehouse
Leveraging Amazon Redshift for Your Data WarehouseLeveraging Amazon Redshift for Your Data Warehouse
Leveraging Amazon Redshift for Your Data Warehouse
 
Mongo db 2.4 time series data - Brignoli
Mongo db 2.4 time series data - BrignoliMongo db 2.4 time series data - Brignoli
Mongo db 2.4 time series data - Brignoli
 
Supercharging Smart Meter BIG DATA Analytics with Microsoft Azure Cloud- SRP ...
Supercharging Smart Meter BIG DATA Analytics with Microsoft Azure Cloud- SRP ...Supercharging Smart Meter BIG DATA Analytics with Microsoft Azure Cloud- SRP ...
Supercharging Smart Meter BIG DATA Analytics with Microsoft Azure Cloud- SRP ...
 
Leveraging Amazon Redshift for your Data Warehouse
Leveraging Amazon Redshift for your Data WarehouseLeveraging Amazon Redshift for your Data Warehouse
Leveraging Amazon Redshift for your Data Warehouse
 
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
 

More from MongoDB

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB
 
Sizing MongoDB Clusters

  • 1. #MDBW17 Jay Runkel Principal Solutions Architect SIZING MONGODB CLUSTERS jay.runkel@mongodb.com @jayrunkel
  • 2. #MDBW17 AGENDA • Sizing Objective • IOPS, Query Processing, Working Set • Sizing Methodology • Sizing Example
  • 4. #MDBW17 SIZING Do I need to shard? What size servers should I use? What will my monthly Atlas/AWS/Azure/Google costs be? When will I need to add a new shard or upgrade my servers? How much data can my servers support? How many queries can my servers support? Will we be able to meet our query latency requirements?
  • 5. #MDBW17 YOUR BOSS COMES TO YOU… • Large coffee chain: PlanetDollar • Collect mobile app performance • Every tap, click, gesture will generate an event • 2 Year History • Perform analytics ‒ Historical ‒ Near real-time (executive dashboards) • Support usage • 3000 – 5000 events per second I need a budget for the monthly Atlas costs?
  • 6. #MDBW17 THE ONLY ACCURATE WAY TO SIZE A CLUSTER • Build a prototype • Run performance tests using actual data and queries on hardware with specs similar to production servers • EVERY OTHER APPROACH IS A GUESS • Including the one I am presenting today
  • 7. #MDBW17 SOMETIMES, IT IS NECESSARY TO GUESS  • Early in project, but ‒ Need to order hardware ‒ Estimate costs to determine “Go/No Go” decision • Schema design ‒ Compare the hardware requirements for different schemas
  • 8. MongoDB Clusters Look Like This Config Config Config Application Driver Primary Secondary Secondary
  • 9. #MDBW17 OUR SOLUTION WILL CONSIST OF • # of shards • Specifications of each server ‒ CPU ‒ Storage o Size o Performance: IOPS ‒ Memory ‒ Network
  • 11. #MDBW17 OUR SOLUTION WILL CONSIST OF • # of shards • Specifications of each server ‒ CPU ‒ Storage o Size o Performance: IOPS ‒ Memory ‒ Network
  • 12. #MDBW17 IOPS • IOPS – input/output operations per second • Throughput • Random access • Most workloads “randomly” access documents in a collection
  • 13. #MDBW17 STORAGE PERFORMANCE
      Type | IOPS
      7200 rpm SATA | ~ 75 – 100
      15000 rpm SAS | ~ 175 – 210
      RAID-10 (24 x 7200 RPM SAS) | 2000
      Amazon EBS | 250 – 500
      Amazon EBS Provisioned IOPS | 10000 – 20000
      SSD | 50000
      Flash Storage | 100K – 400K (or more)
      Source: http://en.wikipedia.org/wiki/IOPS
  • 14. #MDBW17 HARDEST PART OF SIZING IS IOPS • How many IOPS do we need? • Want the real answer, run a test • How to estimate?
  • 15. #MDBW17 PROCESSING A QUERY Select Index Load relevant index entries from disk Identify documents using index Retrieve documents from disk Filter documents Return Documents
  • 16. #MDBW17 PROCESSING A QUERY Select Index Load relevant index entries from disk Identify documents using index Retrieve documents from disk Filter documents Return Documents IO
  • 17. #MDBW17 BUT MONGODB HAS A CACHE Select Index Load relevant index entries from disk Identify documents using index Retrieve documents from disk Filter documents Return Documents File System indexes collections CPU Memory indexes documents Disk access is only necessary if indexes or documents are not in cache
  • 18. #MDBW17 WORKING SET Select Index Load relevant index entries from disk Identify documents using index Retrieve documents from disk Filter documents Return Documents File System indexes collections CPU Memory indexes documents Working Set = indexes plus frequently accessed documents If RAM greater than working set then reduced IO
  • 19. #MDBW17 THIS IS ALL GREAT, BUT HOW DO WE ESTIMATE IOPS?
  • 20. #MDBW17 MONGODB SIMPLIFIED MODEL Assume • Working Set < RAM < Data Size • Memory contains indexes only File System collections indexes CPU Memory indexes
  • 21. #MDBW17 FIND QUERIES WITH SIMPLIFIED MODEL File System collections indexes CPU Memory indexes Assume appropriate indexes To resolve find: • Navigate in-memory indexes • Retrieve document from disk 1 IOP per document returned
  • 22. #MDBW17 FIND QUERIES WITH SIMPLIFIED MODEL File System collections indexes CPU Memory indexes Assume appropriate indexes To resolve find: • Navigate in-memory indexes • Retrieve document from disk 1 IOP per document returned
  • 23. #MDBW17 INSERTS WITH SIMPLIFIED MODEL To resolve insert: • Write document to disk • Update each index file IOPS = 1 + # of indexes File System collections indexes CPU Memory indexes
  • 24. #MDBW17 DELETES WITH SIMPLIFIED MODEL To resolve delete: • Navigate in-memory indexes • Mark document deleted • Update each index file IOPS = 1 + # of indexes File System collections indexes CPU Memory indexes
  • 25. #MDBW17 UPDATES WITH SIMPLIFIED MODEL To resolve update: • Navigate in-memory indexes • Mark document deleted • Insert new document version • Update each index file IOPS = 2 + # of indexes File System collections indexes CPU Memory indexes
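The per-operation costs from the simplified model (slides 21 to 25) can be written down as a minimal Python sketch. The function names are illustrative and not from the talk. The model assumes indexes are fully in memory and every document access goes to disk, so these are rough upper-bound estimates rather than measured values.

```python
# Simplified-model I/O cost per operation, as stated on the slides.
# Function names are hypothetical; only the formulas come from the deck.

def find_iops(docs_returned: int) -> int:
    """Find: indexes are in memory, so 1 I/O per document returned."""
    return docs_returned

def insert_iops(num_indexes: int) -> int:
    """Insert: write the document, then update each index file."""
    return 1 + num_indexes

def delete_iops(num_indexes: int) -> int:
    """Delete: mark the document deleted, then update each index file."""
    return 1 + num_indexes

def update_iops(num_indexes: int) -> int:
    """Update: mark old version deleted, insert new version, update each index."""
    return 2 + num_indexes

if __name__ == "__main__":
    for name, fn in [("insert", insert_iops), ("delete", delete_iops), ("update", update_iops)]:
        print(f"{name} with 3 indexes: {fn(3)} I/O operations")
```

Multiplying each result by the operation rate per second gives the IOPS contribution of that operation type.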
  • 26. #MDBW17 THE SIMPLIFIED MODEL IS TOO SIMPLISTIC • Working Set • Checkpoints • Document size relative to block size • Indexed Arrays • Journal, Log
  • 27. #MDBW17 CHECKPOINTS • WiredTiger write process: 1. Update document in RAM (cache) 2. Write to journal (disk) 3. Periodically, write dirty documents to disk (checkpoint) o 60 seconds or 2 GB (whichever comes first) [Diagram: Checkpoint 1, Checkpoint 2, Checkpoint 3; writes B, C, A then A, C, A; 3 writes, 3 documents written; 3 writes, 2 documents written]
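The checkpoint slide suggests that several writes to the same document within one checkpoint interval are flushed to disk only once. A minimal Python sketch of that idea follows. It assumes the diagram shows writes B, C, A in one interval and A, C, A in the next; that grouping is an interpretation of the flattened diagram, and the function name is hypothetical.

```python
# Sketch: documents flushed at a checkpoint = distinct documents dirtied
# since the previous checkpoint. Journal writes are ignored here.

def documents_flushed(writes_in_interval: list[str]) -> int:
    """Count distinct documents modified within one checkpoint interval."""
    return len(set(writes_in_interval))

if __name__ == "__main__":
    first_interval = ["B", "C", "A"]   # assumed grouping from the diagram
    second_interval = ["A", "C", "A"]  # document A is written twice
    print(documents_flushed(first_interval))
    print(documents_flushed(second_interval))
```

This is why the IOPS calculation later subtracts "multiple updates occurring within checkpoint": update-heavy workloads that touch the same documents repeatedly need fewer data-file writes than the simplified model predicts.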
  • 28. #MDBW17 HOW ARE WE GOING TO GET THERE? • Estimate total requirements (using simplified model): ‒ RAM ‒ CPU ‒ Disk Space ‒ IOPS • Adjust based upon working set, checkpoints, etc. • Design (sharded) cluster that provides these totals
  • 31. #MDBW17 METHODOLOGY (CONT.) 1. Collection Size 2. Working Set 3. Queries -> IOPS 4. Adjust based upon working set, checkpoints, etc. 5. Using candidate server specs, calculate # of shards 6. Review, iterate, repeat Build a spreadsheet Multiple iterations may be required
  • 32. Sizing Spreadsheet 1. Assumptions 2. Data Size 3. Working Set – Index Size – Frequently Accessed Documents 4. Queries – IOPS 5. Shard Calculations
  • 33. #MDBW17 SIZING SPREADSHEET 1. Assumptions 2. Data Size 3. Working Set ‒ Index Size ‒ Frequently Accessed Documents 4. Queries – IOPS 5. Shard Calculations
  • 34. #MDBW17 COLLECTION ANALYSIS ‒ # of documents ‒ Data size ‒ Index size ‒ WT compression
  • 35. #MDBW17 CALCULATE THE NUMBER OF DOCUMENTS Application Description # of Documents in Collection There will be 20M documents in the collection by the end of 2017 20,000,000 We expect to insert 10K documents per day with 1 year retention period 365 * 10,000 = 3,650,000 We have 3000 devices each producing 1 event per minute and we need to keep a 90 day history 3000 * 60 * 24 * 90 = 388,800,000
  • 36. #MDBW17 CALCULATE THE NUMBER OF DOCUMENTS Application Description # of Documents in Collection There will be 20M documents in the collection by the end of 2017 20,000,000 We expect to insert 10K documents per day with 1 year retention period 365 * 10,000 = 3,650,000 We have 3000 devices each producing 1 event per minute and we need to keep a 90 day history 3000 * 60 * 24 * 90 = 388,800,000 PlanetDollar: 2 year history. Each day 5000 inserts per second for 5 hours and 3000 inserts per second for 19 hours 2 * 365 * (5000 * 5 * 3600 + 3000 * 19 * 3600) = 215,496,000,000
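The document-count arithmetic on these slides is easy to get wrong by hand, so a short Python sketch that reproduces each row may help. The function and constant names are illustrative; the rates and retention periods are the ones given on the slides.

```python
# Reproduce the document-count estimates from the slides.
SECONDS_PER_HOUR = 3600
MINUTES_PER_DAY = 60 * 24

def docs_from_daily_inserts(inserts_per_day: int, retention_days: int) -> int:
    """Steady insert rate per day, kept for a fixed retention period."""
    return inserts_per_day * retention_days

def docs_from_devices(devices: int, events_per_minute: int, retention_days: int) -> int:
    """Each device emits a fixed number of events per minute."""
    return devices * events_per_minute * MINUTES_PER_DAY * retention_days

def planetdollar_docs(years: int = 2) -> int:
    """5000 inserts/s for 5 peak hours, 3000 inserts/s for the other 19 hours."""
    per_day = 5000 * 5 * SECONDS_PER_HOUR + 3000 * 19 * SECONDS_PER_HOUR
    return years * 365 * per_day

if __name__ == "__main__":
    print(f"{docs_from_daily_inserts(10_000, 365):,}")
    print(f"{docs_from_devices(3000, 1, 90):,}")
    print(f"{planetdollar_docs():,}")
```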
  • 37. #MDBW17 CALCULATE THE DATA SIZE • Data Size = # of documents * Average document size • This information is available in db.stats(), Compass, Ops Manager, Cloud Manager, Atlas, etc.
  • 38. #MDBW17 WHAT IF THERE AREN’T ANY DOCUMENTS? • Write some code ‒ Programmatically generate a large data set o 5-10% of expected size ‒ Measure o Collection size o Index size o Compression
  • 39. #MDBW17 DETERMINE COLLECTION AND DATA SIZE • Use db.collection.stats() ‒ Take data size, index size and extrapolate to production size ‒ Calculate compression ratio
      db.collection.stats() { count: 10000, size: 70388956, avgObjSize: 7038, storageSize: 25341952, …, totalIndexSize: 147456 }
      Parameter | Formula | Value
      # of documents | | 2.5B
      avgObjSize | | 7038 Bytes
      Collection Size | = 2.5B * 7038 | 1.760E13 Bytes
      WT Compression | = 25341952 / 70388956 | .36
      Collection Storage | = 2.5B * 7038 * .36 | 6.33E12 Bytes
      Index Size Per Doc | = 147456 / 10000 | 15 Bytes
      Collection Index Size | = 2.5B * 15 / 1024^3 | 35 GB
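The extrapolation in the table above can be sketched in Python. The sample values are copied from the `db.collection.stats()` output on the slide; the dictionary layout and the function name are assumptions made for illustration. The rounding of the compression ratio and the per-document index size follows the slide's own rounding (.36 and 15 bytes).

```python
# Extrapolate production sizes from a small sample collection.
# Sample figures are from the slide; 2.5B documents is the slide's target.
SAMPLE_STATS = {
    "count": 10_000,
    "size": 70_388_956,         # uncompressed data size in bytes
    "avgObjSize": 7038,
    "storageSize": 25_341_952,  # on-disk (compressed) size in bytes
    "totalIndexSize": 147_456,
}
PRODUCTION_DOCS = 2_500_000_000

def extrapolate(stats: dict, production_docs: int) -> dict:
    collection_bytes = production_docs * stats["avgObjSize"]
    compression_ratio = round(stats["storageSize"] / stats["size"], 2)
    index_bytes_per_doc = round(stats["totalIndexSize"] / stats["count"])
    return {
        "collection_bytes": collection_bytes,
        "compression_ratio": compression_ratio,
        "storage_bytes": collection_bytes * compression_ratio,
        "index_bytes_per_doc": index_bytes_per_doc,
        "index_gb": production_docs * index_bytes_per_doc / 1024**3,
    }

if __name__ == "__main__":
    for key, value in extrapolate(SAMPLE_STATS, PRODUCTION_DOCS).items():
        print(f"{key}: {value:,.2f}")
```

A sample of 10,000 documents is small; the earlier slide recommends generating 5-10% of the expected data size before trusting ratios like these.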
  • 40. #MDBW17 SIZING SPREADSHEET 1. Assumptions 2. Data Size 3. Working Set ‒ Index Size ‒ Frequently Accessed Documents 4. Queries – IOPS 5. Shard Calculations
  • 41. #MDBW17 WORKING SET • Working Set = Indexes plus the set of documents accessed frequently ‒ We know the index size from previous analysis • Estimate the working set ‒ Given the queries ‒ What are the frequently accessed docs? File System collections indexes CPU Memory indexes documents
  • 42. #MDBW17 PLANETDOLLAR WORKING SET Query Analysis • Dashboards look at last minute of data • Customer support debugging tools inspect last hour's worth of data • Reports (run once per day) inspect last year's worth of data Active Documents = 1 hour's worth of data 5000 * 3600 * 1 KB = 18M KB = 17 GB Run reports on secondaries
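The working-set estimate can be sketched as indexes plus active documents. The 5000 events per second, one-hour window, and 1 KB document size come from the slide. The index size is left as a parameter because the deck does not state it for PlanetDollar; the function names are illustrative.

```python
# Working set = index size + frequently accessed documents.
KB_PER_GB = 1024 ** 2

def active_documents_gb(events_per_second: int, window_seconds: int, doc_size_kb: float) -> float:
    """Size of the documents touched within the active time window."""
    return events_per_second * window_seconds * doc_size_kb / KB_PER_GB

def working_set_gb(index_size_gb: float, active_docs_gb: float) -> float:
    """Indexes plus active documents, both in GB."""
    return index_size_gb + active_docs_gb

if __name__ == "__main__":
    active = active_documents_gb(5000, 3600, 1)
    print(f"Active documents: {active:.1f} GB")
    # index_size_gb=0.0 is a placeholder; substitute the measured index size.
    print(f"Working set (indexes not included): {working_set_gb(0.0, active):.1f} GB")
```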
  • 43. #MDBW17 SIZING SPREADSHEET 1. Assumptions 2. Data Size 3. Working Set ‒ Index Size ‒ Frequently Accessed Documents 4. Queries – IOPS 5. Shard Calculations
  • 44. #MDBW17 IOPS CALCULATION + # of documents returned per second + # of documents updated per second + # of indexes impacted by each update + # of inserts per second + # of indexes impacted by each insert + # of deletes per second (x2) + # of indexes impacted by each delete - Multiple updates occurring within checkpoint - % of find query results in cache Total IOPS
  • 45. #MDBW17 PLANETDOLLAR QUERIES • 5000 inserts per second • 5000 deletes per second Dashboards (aggregations: 100 per minute) • Total events per minute across all users (current minute) • Total events per minute per region (current minute) • Total events per store per minute (current minute) Debugging Tool (ad hoc – 5 per second) • Find all events for a user in last 60 minutes (100 events returned, on average) Analytics (reports generated once per day) • For all stores and regions, count events per day year over year (last 2 years) • For all stores and regions, events per day for last 365 days
  • 46. #MDBW17 IOPS FOR INSERTS AND DELETES • Each insert: ‒ Update collection ‒ Update each index (3 indexes) = 4 IOPS • Each delete: ‒ Update collection ‒ Update each index (3 indexes) = 4 IOPS • 5000 inserts/sec • 5000 deletes/sec Total: (4 * 5000) + (4 * 5000) = 40000 IOPS
  • 47. #MDBW17 IOPS FOR PLANETDOLLAR AGGREGATIONS • Example: Total events per minute across all users (current minute) • How many documents will be read from disk? 5000 per second * 60 seconds = 300,000 Most data in cache Some IOPS will likely be required
  • 48. #MDBW17 IOPS FOR FIND • Find all events for a user in last 60 minutes ‒ 5 per second ‒ 100 documents per query • # IOPS = 5 * 100 = 500 IOPS
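Putting the PlanetDollar numbers from slides 46 to 48 together gives a rough total. The Python sketch below sums inserts, deletes, and finds under the simplified model. Aggregation I/O is left as an explicit parameter defaulting to zero, because the slides only say that most of that data is cached and some IOPS will likely be needed. The combined total is not stated in the deck; it is simply the sum of the slide figures.

```python
# Rough IOPS total for PlanetDollar under the simplified model.
NUM_INDEXES = 3

def write_iops(ops_per_second: int, num_indexes: int = NUM_INDEXES) -> int:
    """Insert or delete: 1 collection write + 1 write per index."""
    return ops_per_second * (1 + num_indexes)

def find_iops(queries_per_second: int, docs_per_query: int) -> int:
    """Find: 1 I/O per document returned."""
    return queries_per_second * docs_per_query

def total_iops(aggregation_iops: int = 0) -> int:
    inserts = write_iops(5000)   # 20,000
    deletes = write_iops(5000)   # 20,000
    finds = find_iops(5, 100)    # 500
    return inserts + deletes + finds + aggregation_iops

if __name__ == "__main__":
    print(f"Estimated IOPS before adjustments: {total_iops():,}")
```

The result is a starting point: the methodology then adjusts it for working-set coverage and checkpoint behavior.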
  • 49. #MDBW17 HOW MANY CPUS DO I NEED? • CPU utilized for: ‒ Compress/decompress ‒ Encrypt/Decrypt ‒ Aggregation queries ‒ General query processing • In most cases, RAM requirements → large servers → many cores • Possible exception: aggregation queries ‒ One core per query ‒ # cores >> # of simultaneous aggregation queries
  • 50. #MDBW17 SIZING SPREADSHEET 1. Assumptions 2. Data Size 3. Working Set ‒ Index Size ‒ Frequently Accessed Documents 4. Queries – IOPS 5. Shard Calculations
  • 51. #MDBW17 SHARD CALCULATIONS • At this point you have: 1. Required storage capacity 2. Working Set Size 3. IOPS Estimate 4. Some idea about class of server (or VM) the customer plans to deploy • Determine number of required shards
  • 52. #MDBW17 DISK SPACE: HOW MANY SHARDS DO I NEED? • Sum of disk space across shards must be greater than required storage size Example Data Size = 9 TB WiredTiger Compression Ratio: .33 Storage size = 3 TB Server disk capacity = 2 TB 2 Shards Required Recommend providing 2X the compressed data size in disk
  • 53. #MDBW17 RAM: HOW MANY SHARDS DO I NEED? Example Working Set = 428 GB Server RAM = 128 GB 428/128 = 3.34 4 Shards Required
  • 54. #MDBW17 IOPS: HOW MANY SHARDS DO I NEED? Example Require: 50K IOPS AWS Instance: 20K IOPS 3 Shards Required
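The three shard calculations above each divide a total requirement by per-server capacity and round up. A minimal Python sketch follows. Taking the maximum of the three results as the shard count is an assumption consistent with "design a cluster that provides these totals"; the slides present the three examples separately. Function names and the `headroom` parameter are illustrative.

```python
import math

def shards_for_disk(data_tb: float, compression_ratio: float, server_disk_tb: float,
                    headroom: float = 1.0) -> int:
    """Shards needed so total disk exceeds compressed data size times headroom."""
    return math.ceil(data_tb * compression_ratio * headroom / server_disk_tb)

def shards_for_ram(working_set_gb: float, server_ram_gb: float) -> int:
    """Shards needed so total RAM covers the working set."""
    return math.ceil(working_set_gb / server_ram_gb)

def shards_for_iops(required_iops: int, server_iops: int) -> int:
    """Shards needed so total IOPS capacity covers the estimate."""
    return math.ceil(required_iops / server_iops)

if __name__ == "__main__":
    disk = shards_for_disk(9, 0.33, 2)    # slide example: 9 TB data, 2 TB disks
    ram = shards_for_ram(428, 128)        # slide example: 428 GB working set
    iops = shards_for_iops(50_000, 20_000)
    print(f"disk={disk}, ram={ram}, iops={iops}, cluster={max(disk, ram, iops)}")
```

Setting `headroom=2.0` applies the slide's recommendation of provisioning twice the compressed data size, which raises the disk-driven shard count in the example.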
  • 61. #MDBW17 SIZING SUMMARY 1. Calculate: ‒ Collection size ‒ Index size 2. Estimate Working Set 3. Use simplified model to estimate IOPS 4. Revise (working set coverage, checkpoints, etc.) 5. Calculate shards

Editor's Notes

  1. Sharding is transparent to applications; whether there is one or one hundred shards, the application code for querying MongoDB is the same. Applications issue queries to a query router that dispatches the query to the appropriate shards. For key-value queries that are based on the shard key, the query router will dispatch the query to the shard that manages the document with the requested key. When using range-based sharding, queries that specify ranges on the shard key are only dispatched to shards that contain documents with values within the range. For queries that don’t use the shard key, the query router will dispatch the query to all shards and aggregate and sort the results as appropriate. Multiple query routers can be used with a MongoDB system, and the appropriate number is determined based on performance and availability requirements of the application.
  2. Do we really need all indexes in RAM? How big is the set of frequently accessed documents? How often will documents be in RAM?