MongoDB Administration
Michael DelNegro
Principal Database Administrator
AOL



Presentation Overview
• Introduction
• My Applications
• Tips
• Tools
• Resources
• Upcoming
About Me
• DBA at AOL (Dulles) for six years
• Background in Sybase
• Now MySQL, PostgreSQL and NoSQL
• Was: Blogsmith, Uncut Video, Travel, Autos,
  Journals, Real Estate, Ficlets, Shopping
• Currently: Patch, MapQuest, HSS,
  Datalayer, Demand
• I Heart Big Data
About MongoDB
• “Scalable, high-performance, open source,
  document-oriented database”
• Databases (Databases)
 • Collections (Tables)
  • Documents (Rows)
   • Fields (Columns) - Key/Value Pairs
• Indexes
• No Joins
 • Favors Embedding Data instead of Foreign Keys (FKs)
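As a rough illustration of embedding versus foreign keys, the sketch below models the same data both ways in plain JavaScript. The names (users, savedMaps, userDoc) and the sample values are hypothetical and are not taken from any AOL schema.

```javascript
// Relational-style layout: two "tables" linked by a foreign key (user_id).
const users = [{ _id: 1, name: "alice" }];
const savedMaps = [
  { _id: 10, user_id: 1, title: "Commute" },
  { _id: 11, user_id: 1, title: "Road trip" },
];

// Document-style layout: the same data embedded in a single document.
const userDoc = {
  _id: 1,
  name: "alice",
  savedMaps: [{ title: "Commute" }, { title: "Road trip" }],
};

// With foreign keys, the application has to do the join itself.
function mapTitlesByJoin(userId) {
  return savedMaps.filter((m) => m.user_id === userId).map((m) => m.title);
}

// With embedding, one document already holds everything needed.
function mapTitlesEmbedded(doc) {
  return doc.savedMaps.map((m) => m.title);
}
```

Both functions return the same titles. The embedded layout gets them from a single document, which is the trade the "No Joins" bullet points at.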
More About MongoDB
• JSON-style Documents
• JavaScript Shell
• Capped Collections
• Flexible Schemas
• Replication with Automatic Failover
• Sharding
• GridFS: File Storage
• Map-Reduce
MongoDB Support
• Operating Systems
 • Linux, Windows, Mac OS X, Solaris
 • 32-bit, 64-bit
• Drivers
 • Java, JavaScript, Perl, Ruby, Scala, Erlang, C,
   C#, C++, Haskell, PHP, Python
  • R, Smalltalk, Node.js, ColdFusion
MongoDB Use Cases
• Website Data Store
• Caching Tier
• Document and Content Management Systems
• Event Logging
• Real-time Stats/Analytics
• Archiving
• High Volume Problems
MongoDB Misuse

• Complex Transactional Systems
• Traditional Business Intelligence
• Applications Where SQL Is a Must

MongoDB at AOL

• In use since Summer 2010
• I currently administer two applications for
  MapQuest and Patch
• There are other MongoDB applications in
  use throughout the company and more on
  the way


MapQuest App
• Tracks User Profile Information
• V. 1.6.5.latest (just upgraded from 1.6.3)
• 26 Total Hosts, CentOS 5, 16GB RAM
• 300 million users, 130 million saved maps
• Replica Sets
• 3 Sharded Collections
 • lookup, east coast, west coast
• Java
Patch App
• Tracks User Activity
• Master, 2 Slaves
• V. 1.6.3
• About 100GB of data
• Throwaway Data (for now)
• Ruby on Rails
About Patch
• “HyperLocal” news sites across the
  country
• Fills gap in coverage left by local
  newspapers
• Currently 800 sites are live
• 1000+ by end of 2011
Nearby Patch Sites
• Vienna (e.g., vienna.patch.com)
• Ashburn
• Reston
• McLean
• College Park
• Greater Annapolis
• 50+ in DC Area
Upcoming Ops Plans

• Upgrade to 1.8
• Migrate Patch to Replica Sets
• Move MapQuest to bigger hardware (16GB
  -> 64GB memory)
• Add additional slaves

Admin Tips
• Slaves are a MUST pre-1.8
• Use the 64-bit version
 • The 32-bit version has a storage limit of about 2.5 GB
• Use xfs or ext4
• Keep an eye on oplog size
• Turn off atime & diratime (mount with noatime, nodiratime)
• Consider using getLastError()
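One way to reason about oplog size is as a time window. The oplog is a capped collection, so a slave that stays offline longer than the span between the oldest and newest oplog entries can no longer catch up from it. The sketch below is plain JavaScript with hypothetical function names. The timestamps are assumed to be read by hand from the first and last oplog entries (db.printReplicationInfo() reports a similar span).

```javascript
// Span of time covered by the oplog, in hours.
// Both arguments are Unix timestamps in seconds.
function oplogWindowHours(firstEntrySecs, lastEntrySecs) {
  if (lastEntrySecs < firstEntrySecs) {
    throw new RangeError("last entry must not be older than first entry");
  }
  return (lastEntrySecs - firstEntrySecs) / 3600;
}

// A slave can resume from the oplog only if its downtime fits in the window.
function slaveCanCatchUp(windowHours, downtimeHours) {
  return downtimeHours < windowHours;
}

// Example: an oplog that spans two hours of writes.
const windowHours = oplogWindowHours(0, 7200);
const okAfterOneHour = slaveCanCatchUp(windowHours, 1);
const okAfterThreeHours = slaveCanCatchUp(windowHours, 3);
```

In this example the window is two hours, so a slave that was down for one hour can resume, while one that was down for three hours would need a full resync.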
More Admin Tips
• Increase File Descriptor Limits
• Do not use kill -9 (pre-1.8)
• Consider having a slave on replication delay
 • --slavedelay <seconds>
• db.runCommand("logRotate") (run against the admin database)
• Keep db.<collection>.totalIndexSize() less
  than RAM
• Tune Linux vm.dirty_background_ratio and
  vm.dirty_ratio (kernels before 2.6.22)
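The index-size rule of thumb can be checked with a little arithmetic. The sketch below is plain JavaScript. The function name indexesFitInRam and the sample sizes are hypothetical. In practice the per-collection byte counts would come from db.<collection>.totalIndexSize(), and the RAM figure from the host.

```javascript
const GIB = 1024 * 1024 * 1024;

// Sum per-collection index sizes (bytes) and compare against physical RAM (bytes).
function indexesFitInRam(indexSizesBytes, ramBytes) {
  const totalBytes = indexSizesBytes.reduce((sum, n) => sum + n, 0);
  return { totalBytes: totalBytes, fits: totalBytes < ramBytes };
}

// Example: two collections with 4 GiB and 6 GiB of indexes on a 16 GiB host.
const report = indexesFitInRam([4 * GIB, 6 * GIB], 16 * GIB);
```

Here report.fits is true because 10 GiB of indexes is below 16 GiB of RAM. The check ignores the working set of documents, so it is a lower bound on memory needs and not a sizing guarantee.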
Even More Admin Tips
• Omit parentheses to see a command's source
• 5 Primitives of Mongo
 • insert, remove, update, find, getMore
• Replication is a process of slaves polling the master
• Master and slaves each have their own
  oplog
• Choose shard key carefully (e.g., timestamp)
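The slide names timestamp as an example without saying whether it is a good or bad key. One commonly cited concern is that, with range-based chunks, an ever-increasing key sends every new insert to the last chunk. The sketch below is a simplified model in plain JavaScript and not MongoDB's actual balancer. The chunk boundaries and key values are made up for illustration.

```javascript
// Chunk i holds keys from lowerBounds[i] up to (not including) lowerBounds[i + 1];
// the last chunk is open-ended. Keys below the first bound fall into chunk 0.
function chunkFor(key, lowerBounds) {
  let idx = 0;
  for (let i = 0; i < lowerBounds.length; i++) {
    if (key >= lowerBounds[i]) idx = i;
  }
  return idx;
}

// Count how many inserts land in each chunk.
function countPerChunk(keys, lowerBounds) {
  const counts = new Array(lowerBounds.length).fill(0);
  for (const k of keys) counts[chunkFor(k, lowerBounds)] += 1;
  return counts;
}

const lowerBounds = [0, 1000, 2000, 3000]; // four chunks split on older data

// New inserts keyed by an increasing timestamp-like value.
const timestampKeys = [3001, 3002, 3003, 3004, 3005, 3006, 3007, 3008];
// New inserts keyed by a value that is scattered across the key space.
const scatteredKeys = [120, 2750, 1490, 3300, 860, 1999, 2100, 3900];

const timestampCounts = countPerChunk(timestampKeys, lowerBounds); // [0, 0, 0, 8]
const scatteredCounts = countPerChunk(scatteredKeys, lowerBounds); // [2, 2, 2, 2]
```

In this model all eight timestamp-keyed inserts hit the last chunk, while the scattered keys spread evenly. Whether that matters depends on the write volume and on the queries the key has to serve.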
Admin Tools
• mongodump / mongorestore
 • use fsync and lock database to ensure
    consistent backup
• fsync and lock are a must for file system
  backups (e.g., LVM snapshots)
• HTTP console: http://localhost:28017 (server port + 1000)
• db.currentOp()
More Admin Tools
• mongostat
• db.printReplicationInfo()
• db.serverStatus()
• db.<collection>.stats()
• Database Profiler
• explain()
Admin Resources
• mongodb.org
 • Events
 • Forums
• Wordnik Mongo Admin Tools (GitHub)
• Mongo Snippets (GitHub)
• IRC (freenode #mongodb)
• Little MongoDB Book
More Admin Resources
• SlideShare (Use Time-Based Search)
• GUI Admin Tools
 • MongoVUE
 • Others
• Kristina Chodorow's Blog
• Boxed Ice
Even More Resources
• Follow @MongoQuestion (StackOverflow)
• MongoDB on Quora (@q_mongodb)
• 10gen Deployment Strategies Slides
• Books
• Training
• 10gen Support
• Office Hours in NYC and Redwood City
New MongoDB Release
• 1.8 (Released March 16, 2011)
 • Single Server Durability (Journaling)
 • Enhancements to Sharding & Replica Sets
 • Covered and Sparse Indexes
 • Tab Completion
 • Maximum BSON Document Size: 16MB
 • 1.8 Features Presentation
Future Releases
• 2.0 (May/June?)
 • Better Map-Reduce and Aggregation
 • Improved Concurrency
 • Online Compaction
 • TTL Time-Out Collections
• Beyond
 • Full-Text Search?
Thank You!

• www.slideshare.net/radiocats
• @radiocats on Twitter
• www.linkedin.com/in/mdelnegro


Mongo db admin_20110329
