Ops 101 
Asya Kamsky 
Principal Community Advocate 
MongoDB
Operational Database Landscape
RDBMS 
Agility 
MongoDB 
{ 
_id : ObjectId("4c4ba5e5e8aabf3"), 
employee_name: "Dunham, Justin", 
department : "Marketing", 
title : "Product Manager, Web", 
report_up: "Neray, Graham", 
pay_band: "C", 
benefits : [ 
{ type : "Health", 
plan : "PPO Plus" }, 
{ type : "Dental", 
plan : "Standard" } 
] 
}
Document Data Model 
Relational MongoDB 
{ 
first_name: 'Paul', 
surname: 'Miller', 
city: 'London', 
location: 
[45.123,47.232], 
cars: [ 
{ model: 'Bentley', 
year: 1973, 
value: 100000, … }, 
{ model: 'Rolls Royce', 
year: 1965, 
value: 330000, … } 
] 
}
Document Model Benefits 
• Agility and flexibility 
– Data models can evolve easily 
– Companies can adapt to changes quickly 
• Intuitive, natural data representation 
– Developers are more productive 
– Many types of applications are a good fit 
• Reduces the need for joins, disk seeks 
– Programming is simpler 
– Performance can be delivered at scale
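To illustrate the "reduces the need for joins" point, here is a minimal sketch in plain JavaScript (no database required). It uses only the fields shown in the Paul Miller document above; fields elided with … on the slide are left out, and the helper name totalCarValue is hypothetical.

```javascript
// The document from the slide, limited to the fields actually shown there.
const person = {
  first_name: "Paul",
  surname: "Miller",
  city: "London",
  location: [45.123, 47.232],
  cars: [
    { model: "Bentley", year: 1973, value: 100000 },
    { model: "Rolls Royce", year: 1965, value: 330000 },
  ],
};

// Hypothetical helper: sum the value of the embedded cars.
// The cars live inside the person document, so no join against a
// separate "cars" table is needed to answer this question.
function totalCarValue(doc) {
  return doc.cars.reduce((sum, car) => sum + car.value, 0);
}

console.log(totalCarValue(person)); // 430000
```

In a relational layout the same question would typically need a join between a person table and a car table.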
Shell and Drivers 
Drivers 
Drivers for most popular 
programming languages and 
frameworks 
Shell 
Command-line shell for 
interacting directly with 
database 
> db.collection.insert({company:"10gen", 
product:"MongoDB"}) 
> 
> db.collection.findOne() 
{ 
"_id" : ObjectId("5106c1c2fc629bfe52792e86"), 
"company" : "10gen", 
"product" : "MongoDB" 
} 
Haskell
Scalability
Automatic Sharding 
• Increase or decrease capacity as you go 
• Automatic balancing 
• Three types of sharding: 
– hash-based 
– range-based 
– tag-aware
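The difference between hash-based and range-based sharding can be sketched as a toy routing function in plain JavaScript. This is an illustration only, not MongoDB's implementation: the hash function, the shard ranges, and the names toyHash, hashShard, and rangeShard are all assumptions made for the example.

```javascript
// Toy string hash (NOT the hash MongoDB uses); returns a non-negative integer.
function toyHash(key) {
  let h = 0;
  for (const ch of String(key)) {
    h = (h * 31 + ch.codePointAt(0)) >>> 0; // keep within unsigned 32 bits
  }
  return h;
}

// Hash-based: spread keys across shards by hash value, so neighbouring
// keys usually land on different shards.
function hashShard(key, shardCount) {
  return toyHash(key) % shardCount;
}

// Range-based: each shard owns a contiguous key range [min, max),
// so neighbouring keys stay together. These ranges are hypothetical.
const ranges = [
  { shard: 0, min: "a", max: "i" },
  { shard: 1, min: "i", max: "r" },
  { shard: 2, min: "r", max: "{" }, // "{" sorts just after "z"
];

function rangeShard(key) {
  const owner = ranges.find((r) => key >= r.min && key < r.max);
  return owner ? owner.shard : -1; // -1: no range covers this key
}

console.log(rangeShard("berlin"), rangeShard("london"), rangeShard("tokyo")); // 0 1 2
console.log(hashShard("london", 3)); // some shard in 0..2
```

Range-based placement favours range queries on the shard key, while hash-based placement favours an even write distribution.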
Query Routing 
• Multiple query optimization models 
• Many sharding options appropriate for different apps
High Availability
Availability Considerations 
• High Availability – Ensure application availability during 
many types of failures 
• Disaster Recovery – Address the RTO and RPO goals 
for business continuity 
• Maintenance – Perform upgrades and other 
maintenance operations with no application downtime
Replica Sets 
• Replica Set – two or more copies 
• “Self-healing” shard 
• Addresses many concerns: 
- High Availability 
- Disaster Recovery 
- Maintenance
Replica Set Benefits 
Business Needs        Replica Set Benefits 
High Availability     Automated failover 
Disaster Recovery     Hot backups offsite 
Maintenance           Rolling upgrades 
Low Latency           Locate data near users 
Workload Isolation    Read from designated nodes 
Data Consistency      Tunable consistency
Performance
Performance 
• Better Data Locality 
• In-Memory Caching 
• In-Place Updates
Performance at Scale 
• Entertainment Company: 1,400 servers 
• Craigslist: 5B documents 
• Carfax: 11B documents 
• Tier 1 Bank: 30K ops/sec 
• Major Retailer: 50K ops/sec 
• Fed Agency: 500K ops/sec 
• Wordnik: 20B documents, 35,000 ops/sec
MongoDB Performance* 
Top 5 Marketing Firm 
• Data: Key/value 
• Queries: Key-based; 1 – 100 docs/query; 80/20 read/write 
• Servers: ~250 
• Ops/sec: 1,200,000 
Government Agency 
• Data: 10+ fields, arrays, nested documents 
• Queries: Compound queries; Range queries; MapReduce; 20/80 read/write 
• Servers: ~50 
• Ops/sec: 500,000 
Top 5 Investment Bank 
• Data: 20+ fields, arrays, nested documents 
• Queries: Compound queries; Range queries; 50/50 read/write 
• Servers: ~40 
• Ops/sec: 30,000 
* These figures are provided as examples. Your application governs your performance.
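As a rough worked example, the per-server throughput implied by the figures above can be computed in a few lines of plain JavaScript. The server counts are approximate on the slide (~250, ~50, ~40), so the results are only indicative; the variable names are hypothetical.

```javascript
// Figures taken from the "MongoDB Performance" slide; server counts are approximate.
const deployments = [
  { name: "Top 5 Marketing Firm", servers: 250, opsPerSec: 1200000 },
  { name: "Government Agency", servers: 50, opsPerSec: 500000 },
  { name: "Top 5 Investment Bank", servers: 40, opsPerSec: 30000 },
];

// Divide total ops/sec by the server count to get ops/sec per server.
const perServer = deployments.map((d) => ({
  name: d.name,
  opsPerServer: d.opsPerSec / d.servers,
}));

for (const row of perServer) {
  console.log(`${row.name}: ${row.opsPerServer} ops/sec per server`);
}
// Top 5 Marketing Firm: 4800 ops/sec per server
// Government Agency: 10000 ops/sec per server
// Top 5 Investment Bank: 750 ops/sec per server
```

The spread between 750 and 10,000 ops/sec per server reflects the slide's footnote: document size, query mix, and read/write ratio govern throughput far more than server count alone.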
Key Deployment Considerations 
Capacity Planning 
• Requirements 
• Testing 
• Monitoring
Key Performance Considerations 
Capacity Planning 
• Requirements 
• Testing 
• Monitoring 
Performance Tuning 
• Understanding 
• Adjusting 
• Monitoring
Monitoring
Monitoring 
• CLI and internal status commands 
• mongostat; mongotop; db.serverStatus() 
• Plug-ins for munin, Nagios, cacti, etc. 
• Integration via SNMP to other tools 
• MMS
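Tools such as mongostat report per-second rates, while db.serverStatus() exposes cumulative counters in its opcounters section. A minimal sketch of turning two counter snapshots into rates is shown below in plain JavaScript. The helper opRates and the sample numbers are hypothetical; only the shape of the opcounters object follows the server's output.

```javascript
// Hypothetical helper: per-second rates from two cumulative counter snapshots.
function opRates(before, after, elapsedSeconds) {
  const rates = {};
  for (const op of Object.keys(after)) {
    rates[op] = (after[op] - (before[op] || 0)) / elapsedSeconds;
  }
  return rates;
}

// Made-up sample values shaped like db.serverStatus().opcounters,
// taken 60 seconds apart.
const t0 = { insert: 1000, query: 5000, update: 200, delete: 10, getmore: 0, command: 800 };
const t1 = { insert: 1600, query: 8000, update: 260, delete: 10, getmore: 0, command: 1100 };

console.log(opRates(t0, t1, 60));
// { insert: 10, query: 50, update: 1, delete: 0, getmore: 0, command: 5 }
```

In the mongo shell the two snapshots would come from calling db.serverStatus().opcounters twice; a production monitor would also need to handle counter resets after a restart, which this sketch does not.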
MongoDB Management Service 
Cloud-based suite of services for managing MongoDB deployments
MongoDB Management Service 
Cloud-based suite of services for managing MongoDB deployments 
• Charts, custom dashboards and automated alerting 
• Tracks 100+ metrics – performance, resource utilization, 
availability and response times 
• 15,000+ users
MongoDB Management Service 
Cloud-based suite of services for managing MongoDB deployments 
• Backup and restore with 
– point-in-time recovery, 
– support for sharded clusters 
• MMS On-Prem included with MongoDB Enterprise 
(backup coming soon)
A Picture Speaks a Thousand Words
Symptoms 
• High CPU use 
• Similar query pattern
Monitoring Best Practices 
• Monitor Logs 
– Alert, escalate 
– Correlate 
• Disk 
– Monitor 
• Instrument/Monitor App (including logs!) 
• Know your application and its (write) characteristics
Ops Jumpstart: MongoDB Administration 101

Editor's Notes

  • #2 MongoDB is the leading open-source document database. This talk covers the technical details of MongoDB and what makes it different from traditional relational database management systems: data storage, high availability and scaling, and deploying MongoDB in production. It also addresses operational challenges, including performance tuning and capacity planning, and how to deploy a robust, highly available cluster topology.
  • #3 The dotted line is the natural boundary of what is possible today. E.g., ORCL lives far out on the right and does things NoSQL vendors will never do. These things come at the expense of some degree of scale and performance. NoSQL was born out of wanting greater scalability and performance, but we think NoSQL vendors overreacted by giving up too much. E.g., caching layers give up many things; key-value stores are super fast, but give up a rich data model and a rich query model. MongoDB gives up some features of a relational database (joins, complex transactions) to enable greater scalability and performance. You get most of the functionality (80%) with much better scalability and performance. Start with an RDBMS and ask what we could do to scale: take out complex transactions and joins. How? Change the data model. >> Segue to data model section. May need to revise the graphic: either remove the line, or all points should be on the line. To enable horizontal scalability, reduce coordination between nodes (joins and transactions). Traditionally in an RDBMS you would denormalize the data or tell the system more about how data relates to one another. Another, more intuitive way is to use a document data model. It is more intuitive because it is closer to the way we develop applications today with object-oriented languages and platforms like Java, .NET, Ruby, Node.js, etc. The document data model is a good segue to the next section >> Data Model
  • #4 MongoDB provides agility, scalability, and performance without sacrificing the functionality of relational databases, like full index support and rich queries. Indexes: secondary, compound, text search (with MongoDB 2.4), geospatial, and more.
  • #6 Here we have greatly reduced the relational data model for this application to two tables. In reality no database has only two tables. It is much more common to have hundreds or thousands of tables. And as a developer, where do you begin when you have a complex data model? If you're building an app you're really thinking about just a handful of common things, like products, and these can be represented in a document much more easily than in a complex relational model, where the data is broken up in a way that doesn't really reflect the way you think about the data or write an application.
  • #21 Many factors affect performance. Make the right tradeoffs for your application. We can help you make the most of your MongoDB system. The following slides are examples of users and their systems.
  • #24 Servers: shards + HA requirements. Ops/sec per server: 4,800; 10,000; 750.
  • #27 Many factors affect performance. Make the right tradeoffs for your application. We can help you make the most of your MongoDB system. The following slides are examples of users and their systems.