MongoDB Performance
Optimization Strategies
Presentation outline
By
Enterprise Account Manager
Kevin Batt
Kevin.batt@enteros.com
408-207-8408
Enteros, Inc.
MongoDB
2014-03-13 Enteros, Inc.
Overview
Before going deep into performance optimization, make sure that MongoDB is the
right choice for your project: it is a document-oriented database, not a
relational one.
Map-Reduce
Map-reduce is a data processing paradigm for condensing large volumes of
data into useful aggregated results. For map-reduce operations, MongoDB
provides the mapReduce database command.
Consider the map-reduce operation on the next slide:
Performance Optimization
Update to MongoDB 2.4 or later: it uses the V8 JavaScript engine and adds
features such as security enhancements, text search (beta), and hashed
indexes. The switch to V8 improves concurrency by permitting multiple
JavaScript operations to run at the same time.
In this map-reduce operation, MongoDB applies the map phase to each input
document (i.e. the documents in the collection that match the query
condition). The map function emits key-value pairs. For those keys that have
multiple values, MongoDB applies the reduce phase, which collects and
condenses the aggregated data. MongoDB then stores the results in a
collection. Optionally, the output of the reduce function may pass through a
finalize function to further condense or process the results of the aggregation.
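The phases described above can be sketched in plain Python, without a MongoDB
server. This is only an illustrative model of the data flow, not MongoDB's
implementation; the sample documents and the field names cust_id, amount, and
status are assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical input documents (not taken from the presentation).
orders = [
    {"cust_id": "A123", "amount": 500, "status": "A"},
    {"cust_id": "A123", "amount": 250, "status": "A"},
    {"cust_id": "B212", "amount": 200, "status": "A"},
    {"cust_id": "A123", "amount": 300, "status": "D"},
]

def map_fn(doc):
    """Map phase: emit (key, value) pairs for one input document."""
    yield doc["cust_id"], doc["amount"]

def reduce_fn(key, values):
    """Reduce phase: condense all values emitted for one key."""
    return sum(values)

def finalize_fn(key, reduced):
    """Optional finalize phase: post-process each reduced value."""
    return {"total": reduced}

def map_reduce(docs, query, map_fn, reduce_fn, finalize_fn=None):
    emitted = defaultdict(list)
    for doc in docs:
        if query(doc):  # only documents matching the query are mapped
            for key, value in map_fn(doc):
                emitted[key].append(value)
    results = {}
    for key, values in emitted.items():
        # Reduce is applied only to keys that have multiple values.
        reduced = reduce_fn(key, values) if len(values) > 1 else values[0]
        results[key] = finalize_fn(key, reduced) if finalize_fn else reduced
    return results

order_totals = map_reduce(orders, lambda d: d["status"] == "A",
                          map_fn, reduce_fn, finalize_fn)
```

Here the returned dictionary plays the role of the output collection.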
1. Sharding
Sharding is a method for storing data across multiple machines. MongoDB
uses sharding to support deployments with very large data sets and high
throughput operations.
A good shard key should satisfy the following:
• It is “distributable”. The worst case is an auto-incremented value, which
causes “hot shard” behavior: all writes are routed to a single shard, which
becomes the bottleneck. An ideal shard key has as much randomness as
possible.
• It is the primary field used in your queries.
• It is easily divisible, which makes it easy for MongoDB to distribute
content among the shards. Shard keys that have a limited number of
possible values can result in chunks that are “unsplittable.”
• Unique fields in your collection should be part of the shard key.
See the MongoDB documentation on shard keys for details.
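The “hot shard” effect of an auto-incremented key can be illustrated with a
small simulation. It is a sketch under stated assumptions, not MongoDB's
actual chunk logic: four shards, an existing key space of one million ids cut
into equal ranges, and md5 used only as a stand-in hash function.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4
EXISTING_MAX_ID = 1_000_000  # assumed: ids 0..999_999 are already stored

def range_shard(key, key_space=EXISTING_MAX_ID, shards=NUM_SHARDS):
    """Range partitioning: equal contiguous key ranges, last range open-ended."""
    return min(key * shards // key_space, shards - 1)

def hashed_shard(key, shards=NUM_SHARDS):
    """Hash the key first, then range-partition the 64-bit hash space."""
    digest = hashlib.md5(str(key).encode()).digest()
    hashed = int.from_bytes(digest[:8], "big")
    return hashed * shards // 2**64

# 10,000 new inserts with auto-incremented ids above the existing maximum.
new_ids = range(EXISTING_MAX_ID, EXISTING_MAX_ID + 10_000)

range_writes = Counter(range_shard(i) for i in new_ids)
hashed_writes = Counter(hashed_shard(i) for i in new_ids)
```

With the range-partitioned key every new write lands on the last shard, while
the hashed key spreads the same writes across all four shards.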
2. Balancing
Bear in mind that moving chunks from one shard to another is a very
expensive operation (adding new shards may significantly slow down
performance).
A helpful option is to stop the balancer during “prime time”.
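One way to keep the balancer out of “prime time” is to restrict it to an
off-peak window. The sketch below only builds the settings documents and
checks the window logic in plain Python. It assumes the window is stored as
an activeWindow entry in the balancer document of the config database's
settings collection; the 23:00 to 06:00 window is a made-up example, and
applying the update to a cluster is not shown.

```python
from datetime import time

def balancer_window_update(start, stop):
    """Build the filter and update documents for the balancer's active window.

    Assumption: these would be applied as an upsert to the `settings`
    collection of the `config` database through a mongos. That call needs a
    live cluster and is not shown here.
    """
    return ({"_id": "balancer"},
            {"$set": {"activeWindow": {"start": start, "stop": stop}}})

def in_window(now, start, stop):
    """True if `now` is inside [start, stop); the window may wrap midnight."""
    if start <= stop:
        return start <= now < stop
    return now >= start or now < stop

# Hypothetical off-peak window: balance only between 23:00 and 06:00.
balancer_filter, balancer_update = balancer_window_update("23:00", "06:00")
```

The in_window helper is useful for sanity-checking a window that wraps past
midnight before it is applied.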
3. Disk Input/Output Operations
In most cases the hardware bottleneck will be the HDD (not CPU or RAM),
especially if you have several shards. As the data grows, the number of I/O
operations increases rapidly, so fast disks matter even more when you use
sharding. Also keep monitoring free disk space.
4. Locks
MongoDB uses a readers-writer lock that allows concurrent read access to a
database but gives exclusive access to a single write operation.
When a read lock exists, many read operations may use this lock.
However, when a write lock exists, a single write operation holds the lock
exclusively, and no other read or write operations may share the lock.
Locks are “writer greedy,” which means writes have preference over reads.
When both a read and write are waiting for a lock, MongoDB grants the lock
to the write.
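The semantics described above (shared reads, exclusive writes, waiting
writers served before new readers) can be sketched as a generic
writer-preferring lock in Python. This is an illustration of the concept
only, not MongoDB's internal implementation; the class and method names are
made up for the example.

```python
import threading

class WriterPreferringRWLock:
    """Many concurrent readers, one exclusive writer, writers go first."""

    def __init__(self):
        self._cond = threading.Condition()
        self._active_readers = 0
        self._writer_active = False
        self._waiting_writers = 0

    def acquire_read(self, timeout=None):
        with self._cond:
            # "Writer greedy": a reader also yields to writers that are waiting.
            ok = self._cond.wait_for(
                lambda: not self._writer_active and self._waiting_writers == 0,
                timeout)
            if ok:
                self._active_readers += 1
            return ok

    def release_read(self):
        with self._cond:
            self._active_readers -= 1
            self._cond.notify_all()

    def acquire_write(self, timeout=None):
        with self._cond:
            self._waiting_writers += 1
            try:
                # A writer needs the lock entirely to itself.
                ok = self._cond.wait_for(
                    lambda: not self._writer_active and self._active_readers == 0,
                    timeout)
                if ok:
                    self._writer_active = True
                return ok
            finally:
                self._waiting_writers -= 1
                self._cond.notify_all()

    def release_write(self):
        with self._cond:
            self._writer_active = False
            self._cond.notify_all()
```

Each acquire method returns False if the lock could not be obtained within
the timeout, which makes the blocking behavior easy to observe in a test.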
5. Fast Writes
Use Capped Collections for Fast Writes
Capped Collections are circular, fixed-size collections that keep documents
well-ordered, even without the use of an index. This means that capped
collections can receive very high-speed writes and sequential reads.
These collections are particularly useful for keeping log files but are not
limited to that purpose. Use capped collections where appropriate.
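The circular, fixed-size behavior can be modeled with a bounded deque. This
is a toy model for illustration, not a MongoDB client example: the CappedLog
class is hypothetical, and it caps the number of documents, whereas a real
capped collection is bounded by its size in bytes.

```python
from collections import deque

class CappedLog:
    """Toy capped collection: fixed capacity, insertion order preserved,
    oldest documents overwritten first."""

    def __init__(self, max_docs):
        self._docs = deque(maxlen=max_docs)

    def insert(self, doc):
        # Constant-time append; the oldest document is dropped when full.
        self._docs.append(doc)

    def find(self):
        # Documents come back in insertion order without any index.
        return list(self._docs)

log = CappedLog(max_docs=3)
for n in range(5):
    log.insert({"seq": n, "msg": f"event {n}"})
```

After five inserts into a collection capped at three documents, only the
three most recent entries remain, still in the order they were written.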
6. Fast Reads
Use natural order for fast reads. To return documents in the order they
exist on disk, sort the query results using the $natural operator. On a
capped collection, this also returns the documents in the order in which they
were written.
Natural order does not use indexes but can be fast for operations when you
want to select the first or last items on disk.
7. Query Performance
Read up on query performance, paying particular attention to indexes and
compound indexes.
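One property of compound indexes worth knowing is that a query can use the
index only through a leading prefix of its fields. The helper below is a
simplified sketch of that rule for equality queries; the function name and
the index fields customer_id, status, and created_at are hypothetical, and
the real query planner considers more than this.

```python
def usable_prefix(index_fields, query_fields):
    """Return the leading index fields that the query constrains.

    An empty result means the query does not touch the first indexed field,
    so this compound index cannot narrow the search for it.
    """
    prefix = []
    for field in index_fields:
        if field not in query_fields:
            break
        prefix.append(field)
    return prefix

# Hypothetical compound index, in index field order.
index = ["customer_id", "status", "created_at"]

full_match = usable_prefix(index, {"customer_id", "status"})
partial_match = usable_prefix(index, {"customer_id", "created_at"})
no_match = usable_prefix(index, {"status"})
```

A query on status alone gets an empty prefix here, which is why the order of
fields in a compound index should follow the most common query patterns.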
9. The Size of the Database
As you might expect, MongoDB stores a document such as
{ UserFirstAndLastName: "Mikita Manko",
LinkToUsersFacebookPage: "https://www.facebook.com/mikita.manko"
}
“as-is”. The field names “UserFirstAndLastName” and
“LinkToUsersFacebookPage” are stored in every document and take up space.
By using the “name shortening” technique you can reduce storage and memory
usage (you can get rid of something like 30-40% of unnecessary data):
{ FL: "Mikita Manko",
BFL: "https://www.facebook.com/mikita.manko"
}
This requires a “mapper” in your code: you map the shortened, unreadable
names from the database to long ones so that your code can keep using
readable field names.
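A minimal sketch of such a mapper in Python follows. The FIELD_MAP table and
the to_db and from_db helpers are hypothetical names introduced for
illustration; the per-document saving counts only the bytes of the field
names themselves, assuming ASCII names at one byte per character.

```python
# Readable name used in application code -> short name stored in the database.
FIELD_MAP = {
    "UserFirstAndLastName": "FL",
    "LinkToUsersFacebookPage": "BFL",
}
REVERSE_MAP = {short: long_name for long_name, short in FIELD_MAP.items()}

def to_db(doc):
    """Rename readable fields to their short stored names before writing."""
    return {FIELD_MAP.get(key, key): value for key, value in doc.items()}

def from_db(doc):
    """Restore the readable names after reading a stored document."""
    return {REVERSE_MAP.get(key, key): value for key, value in doc.items()}

user = {
    "UserFirstAndLastName": "Mikita Manko",
    "LinkToUsersFacebookPage": "https://www.facebook.com/mikita.manko",
}
stored = to_db(user)

# Bytes of field-name text saved in every stored document.
saved_per_doc = sum(len(long_name) - len(short)
                    for long_name, short in FIELD_MAP.items())
```

Fields that are missing from the table pass through unchanged, so the mapper
can be introduced gradually.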
C. Updates
The most obvious point is to stay on the cutting edge of the technology:
investigate and install the latest updates.
Upbeat High Load Capture
Database Root Cause and Spike Analysis for multi-tiered applications
Enteros UpBeat High Load Capture is a software framework for database problem root cause analysis of
Oracle, DB2, SQL Server, MySQL, Sybase and MongoDB database-centric multi-tiered applications. The High
Load Capture user interface visually correlates performance and system load metrics across multiple IT
production infrastructure layers. With second-by-second granularity of data analysis, High Load Capture
makes it possible to analyze even the most transient database performance spikes.
Features
• Multi-threaded, high-precision performance collection engine
• Extensible, dynamically configurable, centrally controlled collection agents
• Comprehensive library of collector agents
• Cross-tier correlation
• Safe, secure agent communication
• Load-sensitive collection controller
Supported Infrastructure, Database, Application server, OS monitoring
Database Server OS:
Linux, Sun Solaris, HP/UX, AIX, Windows Server
Client OS:
Windows, Linux
Database:
Oracle, Microsoft SQL, IBM DB2, MySQL, Sybase, MongoDB
Application Server:
Oracle (BEA) WebLogic, Oracle OAS, JBOSS, IBM WAS
Enteros, Inc
http://www.enteros.com
Enteros is an innovative software company specializing in
Performance Management and Load Testing Software for
Production Databases - RDBMS and NOSQL/Big Data
Enteros solutions enable IT professionals to identify
and remediate performance problems in business-
critical databases with unprecedented speed, accuracy
and scope.
Kevin Batt; kevin.batt@enteros.com
408-207-8408
