Deployment Preparedness





  • 1. #MongoDBTokyo: Deployment Preparedness. Alvin Richards, Technical Director, 10gen
  • 2. Plan A because there is no Plan B
  • 3. Part One: Before you deploy…
  • 4. Prototype, Ops Playbook, Test, Capacity Planning, Monitor. Reinventing the wheel
  • 5. Essentials
      • Disable NUMA
      • Pick appropriate file system (xfs, ext4)
      • Pick 64-bit O/S
          – Recent Linux kernel, Win2k8R2
      • More RAM
          – Spend on RAM, not Cores
      • Faster Disks
          – SSDs vs. SAN
          – Separate Journal and Data Files
  • 6. Key things to consider
      • Profiling
          – Baseline/blueprint: understand what should happen
          – Ensure good index usage
      • Monitoring
          – SNMP, munin, Zabbix, cacti, nagios
          – MongoDB Monitoring Service (MMS)
      • Sizing
          – Understand capability (RAM, IOPS)
          – Understand use cases + schema
  • 7. What is your SLA?
      • High Availability?
          – 24x7x365 operation?
          – Limited maintenance window?
      • Data Protection?
          – Failure of a Single Node?
          – Failure of a Data Center?
      • Disaster Recovery?
          – Manual or automatic failover?
          – Data Center, Region, Continent?
  • 8. Build & Test your Playbook
      • Backups
      • Restores (backups are not enough)
      • Upgrades
      • Replica Set Operations
      • Sharding Operations
  • 9. Part Two: Under the cover…
  • 10. How to see metrics
      • mongostat
      • MongoDB plug-ins for
          – munin, Zabbix, cacti, ganglia
      • Hosted Services
          – MMS - 10gen
          – Server Density, Cloudkick
      • Profiling
  • 11. Operation Counters
  • 12. Metrics in detail: opcounters
      • Counts: Insert, Update, Delete, Query, Commands
      • Operation counters are mostly straightforward: more is better
      • Some operations on a replica set primary are accounted differently on a secondary
      • getLastError(), system.status, etc. are also counted
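The operation counters are cumulative, so a rate has to be derived from two samples. A minimal sketch in plain JavaScript, assuming each sample has the shape of the `opcounters` section of `db.serverStatus()`; the function name `opcounterRates` and the sample values are hypothetical and only for illustration:

```javascript
// Compute per-second operation rates from two cumulative opcounters samples.
// Assumption: each sample looks like db.serverStatus().opcounters.
function opcounterRates(before, after, elapsedSeconds) {
  if (elapsedSeconds <= 0) {
    throw new Error("elapsedSeconds must be positive");
  }
  const rates = {};
  for (const op of Object.keys(after)) {
    const delta = after[op] - (before[op] || 0);
    // A negative delta means the counters were reset, for example by a restart.
    rates[op] = delta < 0 ? null : delta / elapsedSeconds;
  }
  return rates;
}

// Hypothetical samples taken 60 seconds apart.
const t0 = { insert: 1000, query: 5000, update: 200, delete: 10, getmore: 50, command: 900 };
const t1 = { insert: 1600, query: 8000, update: 260, delete: 10, getmore: 80, command: 1500 };
const rates = opcounterRates(t0, t1, 60);
console.log(rates);
```

Returning `null` on a negative delta is a design choice for this sketch: it keeps a restart from showing up as a misleading negative rate.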
  • 13. Resident Memory counter
  • 14. Metrics in detail: resident memory
      • Key metric: to a very high degree, the performance of a mongod is a measure of how much data fits in RAM.
      • If this quantity is consistently lower than available physical memory, the mongod is likely performing well.
      • Correlated metrics: page faults, B-Tree misses
  • 15. Page Faults counter
  • 16. [Diagram: Collection 1, Index 1, Virtual Address Space, Disk, Physical RAM; access times 100 ns vs. 10,000,000 ns]
  • 17. Metrics in detail: page faults
      • This measures reads or writes to pages of the data files that aren't resident in memory
      • If this is persistently non-zero, your data doesn't fit in memory.
      • Correlated metrics: resident memory, B-Tree misses, iostats
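To see why even a small page fault ratio matters, the two latencies from the diagram slide can be combined into an average access time. This is a rough sketch that assumes 100 ns is the cost of a RAM access and 10,000,000 ns the cost of a disk access (the slide's labels did not survive); the function name is hypothetical:

```javascript
// Assumed latencies taken from the diagram slide; which figure belongs to
// which medium is an assumption of this sketch.
const RAM_NS = 100;
const DISK_NS = 10000000;

// Average cost of one access when a fraction `faultRatio` of accesses
// page-faults and has to go to disk.
function averageAccessNs(faultRatio) {
  if (faultRatio < 0 || faultRatio > 1) {
    throw new RangeError("faultRatio must be between 0 and 1");
  }
  return (1 - faultRatio) * RAM_NS + faultRatio * DISK_NS;
}

for (const ratio of [0, 0.001, 0.01, 0.1]) {
  console.log(ratio, Math.round(averageAccessNs(ratio)), "ns");
}
```

Under these assumptions, a fault on one access in a hundred already makes the average access roughly a thousand times slower than RAM alone.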
  • 18. Working Set
        > db.blogs.stats()
        {
          "ns" : "test.blogs",
          "count" : 1338330,
          "size" : 46915928,                  // Size of data
          "avgObjSize" : 35.05557523181876,   // Average document size
          "storageSize" : 86092032,           // Size on disk (and in memory!)
          "numExtents" : 12,
          "nindexes" : 2,
          "lastExtentSize" : 20872960,
          "paddingFactor" : 1,
          "flags" : 0,
          "totalIndexSize" : 99860480,        // Size of all indexes
          "indexSizes" : {                    // Size of each index
            "_id_" : 55877632,
            "name_1" : 43982848
          },
          "ok" : 1
        }
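The figures in this output can be turned into a quick capacity check. The sketch below adds `storageSize` and `totalIndexSize` as an upper bound on what the collection can occupy in memory; the real working set is only the frequently accessed part, so this is a conservative estimate. The function name `fitsInRam` and the RAM sizes are hypothetical:

```javascript
// Upper-bound memory footprint of one collection, from its stats() output.
// Assumption: data and indexes would both be fully resident.
function fitsInRam(stats, availableRamBytes) {
  const footprint = stats.storageSize + stats.totalIndexSize;
  return { footprint, fits: footprint <= availableRamBytes };
}

// Values copied from the db.blogs.stats() output on the slide.
const blogsStats = { storageSize: 86092032, totalIndexSize: 99860480 };

const GIB = 1024 * 1024 * 1024;
const MIB = 1024 * 1024;
console.log(fitsInRam(blogsStats, 1 * GIB));
console.log(fitsInRam(blogsStats, 128 * MIB));
```

Note that the indexes here (about 95 MiB) are larger than the stored data (about 82 MiB), so sizing on data alone would underestimate the footprint.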
  • 19. Lock % counter
  • 20. Metrics in detail: lock percentage and queues
      • By itself, lock % can be misleading: a high lock percentage just means that writing is happening.
      • But when lock % is high and the number of queued readers or writers is non-zero, the mongod is probably at its write capacity.
      • Correlated metrics: iostats
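The rule on this slide combines two signals, which can be expressed as a small predicate. This is only a sketch: the input shape, the function name, and the default threshold of 50% for a "high" lock percentage are assumptions, not values from the talk:

```javascript
// Returns true when both conditions from the slide hold: lock % is high
// AND operations are queueing behind the lock.
// Assumption: `highLockPercent` defaults to 50; tune it for your workload.
function atWriteCapacity(sample, highLockPercent = 50) {
  const queued = sample.queuedReaders + sample.queuedWriters;
  return sample.lockPercent >= highLockPercent && queued > 0;
}

// Hypothetical samples.
console.log(atWriteCapacity({ lockPercent: 80, queuedReaders: 0, queuedWriters: 0 }));
console.log(atWriteCapacity({ lockPercent: 80, queuedReaders: 2, queuedWriters: 5 }));
```

A high lock percentage with empty queues is deliberately not flagged, since that only shows that writes are happening.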
  • 21. Log file
        Mon Dec 3 15:05:37 [conn81] getmore scaleout.nodes query: { ts: { $lte: new Date(1354547123142) } } cursorid:8607875337747748011 ntoreturn:0 keyUpdates:0 numYields: 216 locks(micros) r:615830 nreturned:27055 reslen:4194349 551ms
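A slow-operation line like this one can be picked apart with a few regular expressions. The sketch below uses the slide's log line with whitespace restored between fields; the parser and its field names are hypothetical, and the patterns are only known to fit this one line, not every mongod log format:

```javascript
const line =
  "Mon Dec 3 15:05:37 [conn81] getmore scaleout.nodes query: " +
  "{ ts: { $lte: new Date(1354547123142) } } cursorid:8607875337747748011 " +
  "ntoreturn:0 keyUpdates:0 numYields: 216 locks(micros) r:615830 " +
  "nreturned:27055 reslen:4194349 551ms";

// Extract the most useful fields from one slow-operation log line.
function parseSlowOp(text) {
  // "[conn81] getmore scaleout.nodes" -> connection, operation, namespace
  const header = text.match(/\[(\w+)\]\s+(\w+)\s+(\S+)/);
  // The duration is the trailing "<digits>ms" token.
  const duration = text.match(/(\d+)ms\s*$/);
  // Read a numeric "name:value" or "name: value" field, or null if absent.
  const num = (name) => {
    const m = text.match(new RegExp("\\b" + name + ":\\s*(\\d+)"));
    return m ? Number(m[1]) : null;
  };
  return {
    connection: header ? header[1] : null,
    operation: header ? header[2] : null,
    namespace: header ? header[3] : null,
    numYields: num("numYields"),
    readLockMicros: num("r"),
    nreturned: num("nreturned"),
    reslen: num("reslen"),
    durationMs: duration ? Number(duration[1]) : null,
  };
}

const op = parseSlowOp(line);
console.log(op);
```

The cursor id is left unparsed on purpose: it is larger than JavaScript's safe integer range and would lose precision as a `Number`.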
  • 22. explain, hint
        // explain() shows the plan used by the operation
        > db.c.find(<query>).explain()
        // hint() forces a query to use a specific index
        // x_1 is the name of the index from db.c.getIndexes()
        > db.c.find( {x:1} ).hint("x_1")
  • 23. B-Tree Counter
  • 24. Metrics in detail: B-Tree
      • Indicates B-Tree accesses, including page fault service during an index lookup
      • If misses are persistently non-zero, your indexes don't fit in RAM. (You might need to change or drop indexes, or shard your data.)
      • Correlated metrics: resident memory, page faults, iostats
  • 25. B-Trees strengths
      • B-Tree indexes are designed for range queries over a single dimension
      • Think of a compound index on { A, B } as being an index on the concatenation of the A and B values in documents
      • MongoDB can use its indexes for sorting as well
  • 26. B-Trees weaknesses
      • Range queries on the first field of a compound index are suboptimal
      • Range queries over multiple dimensions are suboptimal
      • In both these cases, a suboptimal index might be better than nothing, but it is best to see whether you can change the problem
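The cost of a range on the first field can be illustrated with a toy model of a compound index. The sketch below treats an index as a sorted array of two-field key tuples and a query as a walk over one contiguous interval of it. This is a simplification for illustration, not how mongod scans indexes internally, and all names and data are hypothetical:

```javascript
// Simplified model: an index is an array of [first, second] key tuples kept
// in lexicographic order, and a query walks one contiguous interval of it.
function compareKeys(x, y) {
  return x[0] - y[0] || x[1] - y[1];
}

function buildIndex(docs, fields) {
  return docs.map((d) => [d[fields[0]], d[fields[1]]]).sort(compareKeys);
}

// Count the keys walked between `lower` and `upper` (inclusive), and how
// many of them satisfy the bounds on both fields.
function rangeScan(index, lower, upper) {
  let examined = 0;
  let matched = 0;
  for (const key of index) {
    if (compareKeys(key, lower) < 0 || compareKeys(key, upper) > 0) continue;
    examined++;
    const inFirst = key[0] >= lower[0] && key[0] <= upper[0];
    const inSecond = key[1] >= lower[1] && key[1] <= upper[1];
    if (inFirst && inSecond) matched++;
  }
  return { examined, matched };
}

// 1,000 documents: a in 0..99, b in 0..9.
const docs = [];
for (let a = 0; a < 100; a++) {
  for (let b = 0; b < 10; b++) docs.push({ a, b });
}

// Query: 10 <= a <= 19 and b == 3.
const viaAB = rangeScan(buildIndex(docs, ["a", "b"]), [10, 3], [19, 3]);
const viaBA = rangeScan(buildIndex(docs, ["b", "a"]), [3, 10], [3, 19]);
console.log(viaAB, viaBA);
```

In this model the { a, b } index walks 91 keys to return 10 matches, while the { b, a } index walks exactly the 10 matching keys, because putting the equality field first makes the matches contiguous.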
  • 27. Indexing dark corners
      • Some functionality can't currently always use indexes:
          – $where JavaScript clauses
          – $mod, $not, $ne
          – regex
      • Negation may be transformed into a range query
          – Index can be used
      • Complicated regular expressions scan a whole index
  • 28. Other tricks
  • 29. Warming the Cache
        > db.c.find( {unused_key: 1} ).explain()
        > db.c.find( {unused_key: 1} ).hint( {random_index:1} ).explain()
        # cat /data/db/* > /dev/null
        // New in 2.2
        > db.runCommand( { touch: "blogs", data: true, index: true } )
  • 30. Journal on another disk
      • The journal's write load is very different from that of the data files
          – journal = append-only
          – data files = randomly accessed
      • Putting the journal on a separate disk or RAID (e.g., with a symlink) will minimize any seek-time-related journaling overhead
  • 31. --directoryperdb
      • Allows storage tiering
          – Different access patterns
          – Different Disk Types / Speeds
      • use --directoryperdb
      • add symlink into database directory
  • 32. Dynamically change log level
        // Change logging level to get more info
        > db.adminCommand({ setParameter: 1, logLevel: 1 })
        > db.adminCommand({ setParameter: 1, logLevel: 0 })
  • 33. Because you now have a Plan B