5. Who?
• 10gen -> company behind MongoDB
• Created by Dwight & Eliot
• MongoDB is open-source & community is key
• Offices in California, NY, Dublin, London & Sydney
• $73.4 million in VC funding
Tuesday 3 July 2012
6. What? (1)
• Powerful, flexible, scalable, fast data store
• Document-oriented
• Embedded docs & arrays
• Scale out
• Easy to start & develop with
14. Getting Started
• Document -> basic unit of data ~ a row in an RDBMS
• Collection -> schema-free equivalent of a table
• Single instance can have multiple dbs
• JS Shell -> administration
• Every document has a special, unique key -> _id
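To make the bullets above concrete, here is a minimal sketch of what a document with embedded docs and arrays might look like, written as plain JavaScript rather than against a live server. The field names, the `insert` helper, and the `miniTweets` object are hypothetical stand-ins for illustration only; they are not the MongoDB API.

```javascript
// Hypothetical tweet document; all field names are illustrative.
var tweet = {
  _id: 1,                                  // unique key within its collection
  text: 'Hello MongoDB',
  user: { name: 'alice', followers: 42 },  // embedded document
  tags: ['mongodb', 'nosql']               // embedded array
};

// Stand-in for a collection: documents keyed by _id, with no fixed schema.
function insert(collection, doc) {
  if (Object.prototype.hasOwnProperty.call(collection, doc._id)) {
    throw new Error('duplicate key: ' + doc._id);
  }
  collection[doc._id] = doc;
}

var miniTweets = {};
insert(miniTweets, tweet);
// A second document with a different shape is accepted in the same collection.
insert(miniTweets, { _id: 2, text: 'No user or tags here' });
```

In the real JS shell the equivalent step would be an insert on a collection such as `db.mini_tweets`, and the server generates an `_id` when the document does not supply one.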
15. Collection
> show collections
file_tweets
mini_tweets
system.indexes
24. Replica Set Info
• Asynchronous replication (single primary)
• Automatic failover
• App-level definition of “write replication”
• Secondary nodes can replicate with a slaveDelay
• Secondary nodes can be hidden
• Max of 12 nodes, with 7 voting
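As a rough illustration of the options listed above, the sketch below builds a replica set configuration object with a hidden, delayed secondary and an arbiter, then checks it against the 12-node / 7-voter limits from the slide. The set name, host names, and the `checkLimits` helper are assumptions made for this example; only the field names (`priority`, `hidden`, `slaveDelay`, `arbiterOnly`, `votes`) follow the replica set configuration format.

```javascript
// Hypothetical set: host names and the set name 'rs0' are placeholders.
var config = {
  _id: 'rs0',
  members: [
    { _id: 0, host: 'db1.example.net:27017' },
    { _id: 1, host: 'db2.example.net:27017' },
    // Hidden secondary that applies writes one hour late; priority 0 means
    // it can never be elected primary.
    { _id: 2, host: 'db3.example.net:27017',
      priority: 0, hidden: true, slaveDelay: 3600 },
    // Arbiter: votes in elections but holds no data.
    { _id: 3, host: 'arb1.example.net:27017', arbiterOnly: true }
  ]
};

// Illustrative helper (not part of MongoDB): enforce the limits from the slide.
function checkLimits(cfg) {
  var voting = cfg.members.filter(function (m) {
    return m.votes === undefined || m.votes > 0;  // members vote by default
  }).length;
  return cfg.members.length <= 12 && voting <= 7;
}
```

In the JS shell a configuration object of this shape would typically be passed to `rs.initiate()`.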
25. Sharding
[Diagram: four mongos routers and three config DB servers in front of four shards; each shard is a replica set with one Primary and two Secondaries]
26. Sharding Notes
• Each “shard” usually a Replica Set (same options)
• Copy of meta data stored in-memory by mongos
• Data split into chunks, using range based shard key
• Chunks may be migrated between shards
• New chunks created by “splitting” old chunks
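The chunk mechanics above can be sketched in a few lines of plain JavaScript. This is a toy model under stated assumptions, not how mongos or the balancer is implemented: the shard key is a single number, `-Infinity`/`Infinity` stand in for the minimum and maximum key bounds, and `findChunk`, `splitChunk`, and `migrateChunk` are hypothetical names.

```javascript
// Each chunk owns the half-open shard-key range [min, max) and lives on one shard.
var chunks = [
  { min: -Infinity, max: 100, shard: 'shard0' },
  { min: 100, max: Infinity, shard: 'shard1' }
];

// Routing: find the chunk whose range contains the shard key value.
function findChunk(chunks, key) {
  for (var i = 0; i < chunks.length; i++) {
    if (key >= chunks[i].min && key < chunks[i].max) return chunks[i];
  }
  return null;
}

// Splitting: replace one chunk with two that cover the same range.
function splitChunk(chunks, key, splitPoint) {
  var old = findChunk(chunks, key);
  if (!old || splitPoint <= old.min || splitPoint >= old.max) {
    throw new Error('split point must fall inside the chunk');
  }
  var left = { min: old.min, max: splitPoint, shard: old.shard };
  var right = { min: splitPoint, max: old.max, shard: old.shard };
  chunks.splice(chunks.indexOf(old), 1, left, right);
}

// Migration: only the owning shard changes; the key range stays the same.
function migrateChunk(chunk, toShard) {
  chunk.shard = toShard;
}
```

For example, `splitChunk(chunks, 150, 200)` turns the second chunk into `[100, 200)` and `[200, Infinity)`, both still on `shard1`, and `migrateChunk(findChunk(chunks, 250), 'shard2')` then moves only the upper one.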
27. Shard Server in EC2 (1)
Category/Impact Low Medium High
Disk Speed x
Disk Capacity x
RAM x
CPU x
28. Shard Server in EC2 (2)
• MongoDB designed for OS defaults on 64-bit instance
• Use standard virtual memory page size
• Raise open-files (“nofile”) ulimit (20,000)
• Use RAID10 & modern f/s -> ext4, xfs etc
• Use “noatime” mount option
29. Server in EC2 (1)
• Kernel >= 2.6.23 for ext4, >= 2.6.25 for xfs
• Readahead: how much more to read than what you asked for
• If too high => possible performance impact
• Set to 0 on EBS devices
• Set to desired value on RAID device
30. Server in EC2 (2)
• Turn off atime on filesystem (pre-2.6.30 especially)
• RAID 10 is recommended everywhere
• mitigates slow EBS volumes (fail the bad volume)
• Do not use large VM pages
• Do configure swap to prevent OOM Killer
31. Config Server
• Meta Data for shard stored in ConfigDB
• Copy of meta data stored in-memory by mongos
• Config DB cluster is *not* a replica set -> run 3!!
• If a config server goes down then:
  • no splits and migrates
  • new mongos cannot be started
  • running mongos can still use cache to route r/w
32. Config Server in EC2 (1)
Category/Impact Low Medium High
Disk Speed x
Disk Capacity x
RAM x
CPU x
33. Config Server in EC2 (2)
• Use RAID10
• Use 64-bit instance
• Can run on shard servers
34. Mongos in EC2 (1)
Category/Impact Low Medium High
Disk Speed
Disk Capacity
RAM x
CPU x
35. Mongos in EC2 (2)
• Often run on application servers
• 32-bit mongos ok with 64-bit mongod
36. Arbiter in EC2 (1)
Category/Impact Low Medium High
Disk Speed x
Disk Capacity x
RAM x
CPU x
37. Arbiter in EC2 (2)
• Can use micro instance
• Elections may be slower
• Can use instance store
• Still want backups :)
38. HA in EC2
• Replica Sets
• EC2 Availability Zones
39. DR in EC2
• Replica Sets
• EC2 Regions
40. Security
• Turn on authentication
• Create a key between shards
• EC2 Security Groups
• Can reference other Security Groups
• EC2 Regions
• Follow SDLC in coding your app
41. Monitoring
• Links in with Cacti, Nagios, Munin-Node etc.
• MMS -> it’s free
44. Instances Guidelines (1)
• Use 64-bit only; 32-bit is not recommended
• Primary/Secondary should be equal*
• High CPU is not necessary
• High Memory for large mongod instances
• Network capacity is also IO capacity (EBS)
45. Instances Guidelines (2)
• Note the trade-offs - memory/network
• m1.large to m2.xlarge = 2x Mem, 0.5x Network
• Do not use micro except for testing & config
• m1.medium is usually sufficient for config DB
• m1.small can be used for Arbiters
46. Backups
• EBS Snapshots - RAID complicates things
• Single EBS volume, with journaling means:
• No fsync & lock required
• Similar applies to LVM snapshots
• EC2
• General
47. EC2/MongoDB Best Practices
• https://wiki.10gen.com/display/DOCS/Amazon+EC2
• https://wiki.10gen.com/display/DOCS/Production+Notes
49. node.js
• server-side platform written in JavaScript
• originally written for push web apps
• created by Ryan Dahl
50. Sample node.js code
var express = require('express'),
    Db = require('mongodb').Db,
    Server = require('mongodb').Server,
    Connection = require('mongodb').Connection;

var host = 'localhost';
var port = Connection.DEFAULT_PORT;
var db = new Db('node-mongo-examples',
    new Server(host, port, {}), {native_parser: false});

var app = express.createServer();
app.get('/', function(req, res) {
  res.send('Hello World');
});

// Start accepting HTTP requests only once the database connection is open
db.open(function(err, db) {
  if (err) throw err;
  app.listen(8124);
});
51. node.js with mongo
• https://github.com/christkv/node-mongodb-native
• http://www.nodebeginner.org
53.
• MongoDB Google User Group
• New MongoDB Docs & Old MongoDB Docs
• Presentations
• If you’re curious :)
Image Source: http://www.cannotstartoutlook.com/wp-content/uploads/2012/06/outlook-problems-help.jpg
54. Credits
• Credit must go to @comerford, @jonnyeight & @mikefiedler as I borrowed some of the knowledgeable slides :)