Introduction to MongoDB
Silicon Valley Code Camp
Oct 8th, 2011

My Talk today at Silicon Valley Code Camp 2011 on Introduction to MongoDB.

  1. Introduction to MongoDB
     Silicon Valley Code Camp
     Oct 8th, 2011
  2. Before we start…
     • NoSQL is a movement; it's not anti-SQL
     • Relational databases have their place, but they are not the only solution
     • Diversify: use the best tool for the job
     • The footers contain quotes from the video "Mongo DB is Web Scale"
  3. Agenda
     • Relational Databases vs. NoSQL
     • CAP Theorem
     • MongoDB at a high level
     • Collections, Documents
     • Inserting, Querying and Updating
     • Other MongoDB Commands
     • Replication Topologies
     • Using MongoDB via a driver
     • A few internals
     • Administration
  4. Relational Databases
     • Have been around for years
     • De facto standard for persistence
     • ACID compliant
     • Rigid schema
     • Usually hard to scale over a distributed network
     • Normalization is almost always a requirement
     • ORMs tend to limit the optimizations you can make to the queries
     Footer quote: "Relational Databases weren't built for Web Scale. They have impotence mismatch."
  5. NoSQL
     Why?
     • Not everything can be modeled in a relational construct
     • Cluster-aware out of the box: replication, sharding, etc. are built into the core
     • Schemaless
     • (Mostly) open source, community supported
     • High performance by design, and not ball-and-chained to ACID
  6. CAP Theorem: Pick Two
     • Consistency: each client sees the same data
     • Availability: the system is always available for reads and writes
     • Partition Tolerance: the system can tolerate any communication failure across the network (except someone pulling the plug across the datacenters)
     • At any given point in time, a distributed datastore can guarantee only two of the above
     Footer quote: "If that's what they need to do to get those kick ass benchmarks, then it's a great design."
  7. Visual Guide to NoSQL Databases
     Source: http://blog.nahurst.com/visual-guide-to-nosql-systems
  8. How do they make up for it?
     • NoSQL databases are usually AP or CP
     • Giving up some consistency: eventually consistent, tuned via write concerns
     • Giving up some availability: read-only mode, stale data
  9. MongoDB: High Level
     • Document-based database
     • Schemaless
     • Cluster-aware
     • Easy querying, JavaScript support
     • Memory mapped
     • Drivers in all the popular languages
     • Excellent developer velocity (supported by 10gen)
     • Durable via journaling
     • A CP system in CAP theorem terms
     Footer quote: "MongoDB handles WebScale. You turn it on and it scales right up."
  10. Collections
      • The closest comparison to a MongoDB collection in the relational world is a table
      • A collection is not bound by a schema
      • A collection has a namespace
      • Can be a capped collection
      • It contains BSON documents
  11. Documents
      • The closest comparison in the relational world is a row in a table
      • Must reside within a collection
      • Looks like (structured) JSON, stored as BSON within a collection
      • Limited to 16MB (as of 2.0)
      • Larger sizes supported via GridFS
      • Reference: http://www.bsonspec.org, which defines BSON as a binary-encoded serialization format for JSON-like documents
  12. Inserting Documents
      • Console defaults to localhost, port 27017
      • show databases
      • show collections
      • Insert a document in a collection
      • Bulk inserts via JavaScript
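
The insert steps on this slide can be sketched roughly as follows. This is a minimal sketch that assumes the legacy mongo shell of the time, where `db.<collection>.insert()` is the insert call (newer shells prefer `insertOne`/`insertMany`). The `users` collection, its field names and the `seedUsers` helper are made up for illustration; `db` is the shell's database handle, passed in as a parameter so the function does not rely on a global.

```javascript
// Hypothetical helper: insert one document, then bulk-insert with a plain JavaScript loop.
// `db` is the mongo shell's database handle; `users` is an assumed collection name.
function seedUsers(db, count) {
  // A single document: nested fields and arrays are fine, no schema is declared first.
  db.users.insert({
    name: "alice",
    tags: ["admin"],
    address: { city: "San Jose" }
  });

  // "Bulk" insert the way the shell allows it: an ordinary for loop.
  for (var i = 0; i < count; i++) {
    db.users.insert({ name: "user" + i, visits: 0 });
  }

  // Number of insert calls issued.
  return count + 1;
}
```

In the shell, `seedUsers(db, 100)` would issue 101 inserts against `users`; MongoDB creates the collection implicitly on the first insert.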
  13. Querying and Updating Documents
      • Query a document
      • Select certain fields
      • Using limit, skip, sort and count
      • Using explain
      • In-place updates
      • Update operators: $inc, $push, $pull, $pop
      • Query and projection operators: $in, $nin, $slice
      • Indexing on fields
      Footer quote: "MongoDB is a Document Database, that does not need joins. It uses Map Reduce."
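
The query and update calls listed on this slide could look roughly like this in the shell. It is a sketch under assumptions, not the talk's own demo code: the `users` collection, its fields and the `queryAndUpdate` helper are invented for illustration, and `update`/`ensureIndex` are the legacy shell method names of that era (later releases use `updateOne` and `createIndex`).

```javascript
// Hypothetical helper showing the query and update calls named on the slide.
// `db` is the mongo shell's database handle; `users` and its fields are assumed names.
function queryAndUpdate(db) {
  // Query documents, selecting only certain fields (the second argument is the projection).
  var adults = db.users.find({ age: { $gte: 21 } }, { name: 1, age: 1 });

  // sort, skip and limit chain on the cursor; count() returns the number of matches.
  var page = db.users.find({ age: { $gte: 21 } }).sort({ age: -1 }).skip(10).limit(5);
  var total = db.users.find({ age: { $gte: 21 } }).count();

  // explain() reports how the query is executed, for example whether an index is used.
  var plan = db.users.find({ name: "alice" }).explain();

  // In-place update: the server bumps a counter and appends to an array;
  // the client does not resend the whole document.
  db.users.update({ name: "alice" }, { $inc: { visits: 1 }, $push: { tags: "speaker" } });

  // Index the field used by the lookup above.
  db.users.ensureIndex({ name: 1 });

  return { adults: adults, page: page, total: total, plan: plan };
}
```

Calling `queryAndUpdate(db)` in the shell returns the cursors and the explain output so they can be inspected interactively.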
  14. Other console commands
      • db.stats()
      • db.collection.stats()
      • db.isMaster()
      • rs.status()
      • db.currentOp()
      • db.serverStatus()
  15. Replication: Master-Slave
      • Achieved by "declaring" one node as the master and "declaring" many nodes as its slaves
      • Single point of failure, no failover
      • Can add any number of slaves easily
      • May need to put slaves behind a load balancer
  16. Replication: Replica Sets
      • Achieved by creating a cluster, called a replSet, and adding "members" to it
      • The "primary" and "secondary" roles are decided among the nodes; there is no permanent "master" or "slave"
      • Automatic failover via voting
      • An arbiter may be needed to break a tie if there is an even number of nodes
      • Easy to add new members
      • Adding load-balancing will void failover
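
The role of the arbiter can be illustrated with a toy majority check. This is not MongoDB's election code, only the arithmetic behind the rule that a strict majority of voting members must be reachable before a primary can be elected; `canElectPrimary` and its parameters are hypothetical names.

```javascript
// Toy model, not MongoDB's election code: a primary can be elected only when
// a strict majority of all voting members is reachable.
function canElectPrimary(totalVoters, reachableVoters) {
  return reachableVoters > totalVoters / 2;
}

// Two data nodes split by a network partition: each side sees 1 of 2 voters, no majority.
var evenSplit = canElectPrimary(2, 1);

// Add an arbiter (a voting member that holds no data): the side with 2 of 3 voters wins.
var withArbiter = canElectPrimary(3, 2);
```

With an even member count, a clean split leaves neither side with a majority, which is the tie the arbiter exists to break.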
  17. Accessing MongoDB Programmatically
      • Scala, using Casbah
      • Code to insert a document
      • Code to find/query
      • Code to update
  18. Object-Document Mappers
      • Mongo drivers understand hashes, or DBObjects. A DBObject is essentially a Map
      • The class needs to be converted to a DBObject, either by the developer or by the driver
      • Some such mappers also provide a DAO, which makes it easy to perform CRUD operations
      • MongoMapper for Ruby
      • Salat for Scala
      • Morphia for Java
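
To make the class-to-map conversion concrete, here is a hand-written version of what a mapper automates. It is only a sketch: `User`, `toDocument`, `fromDocument` and `UserDao` are invented names and do not reflect the API of MongoMapper, Salat or Morphia. The collection handle is assumed to expose `insert` and `findOne`, as the legacy mongo shell does.

```javascript
// Hypothetical domain class.
function User(name, visits) {
  this.name = name;
  this.visits = visits;
}

// Class instance -> plain map (the shape a driver can serialize to BSON).
function toDocument(user) {
  return { name: user.name, visits: user.visits };
}

// Plain map read back from the database -> class instance.
function fromDocument(doc) {
  return new User(doc.name, doc.visits);
}

// Minimal DAO sketch: wraps a collection handle (assumed to expose insert and findOne)
// so callers work with User objects instead of raw maps.
function UserDao(collection) {
  this.save = function (user) {
    collection.insert(toDocument(user));
  };
  this.findByName = function (name) {
    var doc = collection.findOne({ name: name });
    return doc ? fromDocument(doc) : null;
  };
}
```

A mapper library generates the equivalent of `toDocument` and `fromDocument` from the class definition, so this boilerplate does not have to be written per class.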
  19. Internals
      • Data is memory mapped, so writes scale because no disk I/O is performed on every write
      • Writes to disk are delayed (default 60 seconds)
      • Keep the indices and the working set of the data in memory to avoid swapping
      • Data files are pre-allocated in increments
      • Smart algorithm to add padding to the storage when document sizes are inconsistent
      • Durability is achieved by journaling, introduced in 1.7
  20. Replication Internals
      • The almighty oplog: a capped collection
      • Acts like a transaction log, which the slaves or secondaries read from and apply
      • getmore issued against the primary/master every 4s
      • Failover and voting
      • Delayed sync
      • Using rs.slaveOk() to query the secondaries in a replSet
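
The oplog idea above can be modeled with a small fixed-size log. This is a toy model, not MongoDB's implementation: `Oplog`, `append` and `readAfter` are hypothetical names, and sequence numbers stand in for the real oplog timestamps. It shows the two properties the slide relies on: a capped collection overwrites its oldest entries, and a secondary catches up by reading everything after the last entry it applied.

```javascript
// Toy model of an oplog: a fixed-size (capped) log where the oldest entries are overwritten.
function Oplog(capacity) {
  this.capacity = capacity;
  this.entries = []; // each entry: { seq: number, op: object }
  this.nextSeq = 1;
}

// The primary appends every write; once the log is full, the oldest entry is dropped.
Oplog.prototype.append = function (op) {
  this.entries.push({ seq: this.nextSeq++, op: op });
  if (this.entries.length > this.capacity) {
    this.entries.shift();
  }
};

// A secondary asks for everything after the last entry it applied.
// Returns null when that position has already been overwritten, meaning the
// secondary is too stale to catch up from the log alone.
Oplog.prototype.readAfter = function (lastAppliedSeq) {
  var oldest = this.entries.length ? this.entries[0].seq : this.nextSeq;
  if (lastAppliedSeq + 1 < oldest) {
    return null;
  }
  return this.entries.filter(function (entry) {
    return entry.seq > lastAppliedSeq;
  });
};
```

The `null` case is why oplog size matters: a secondary that falls further behind than the log can hold has to be resynced from scratch.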
  21. Scaling MongoDB
      • Be smart with your schema design
      • Know ahead of time if the system will be read-heavy or write-heavy
      • Use explain(), use indices
      • Do not fetch the entire document; select fields
      • Keep an eye on index misses and page faults via mongostat
      • Denormalize: avoid links, use embeds
      • You can never replicate enough
      • Horizontal scaling via sharding
      Footer quote: "If /dev/null is faster than WebScale, I'll use it. Does /dev/null support sharding?"
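
The "avoid links, use embeds" advice is easiest to see by comparing the two document shapes side by side. The blog-post data and the `resolveLinks` helper below are invented for illustration; plain arrays stand in for collections.

```javascript
// Linked (normalized) shape: the post stores only comment ids, so reading a post
// with its comments takes one lookup for the post plus another for the comments.
var linkedPost = { _id: 1, title: "Intro to MongoDB", commentIds: [10, 11] };
var comments = [
  { _id: 10, text: "Nice talk" },
  { _id: 11, text: "Slides?" },
  { _id: 12, text: "Comment on another post" }
];

// Embedded (denormalized) shape: the comments live inside the post document,
// so a single read returns everything.
var embeddedPost = {
  _id: 1,
  title: "Intro to MongoDB",
  comments: [{ text: "Nice talk" }, { text: "Slides?" }]
};

// Hypothetical helper: resolve the links by hand, which is the work embedding avoids.
function resolveLinks(post, allComments) {
  return allComments.filter(function (comment) {
    return post.commentIds.indexOf(comment._id) !== -1;
  });
}
```

The trade-off is that embedded data is duplicated if it is shared between documents, and a document that keeps growing is still subject to the document size limit.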
  22. Backups
      • Lock the database for a cold backup
      • Use filer snapshots
      • Use mongodump -> BSON, mongorestore to restore
      • Use mongoexport -> JSON, mongoimport to restore
      • Spare slaves always help
  23. Monitoring
      • MMS: developed by 10gen
      • Munin: plugins available to monitor the MongoDB server
      • Nagios: for machine health checks
  24. Comparison of NoSQL Solutions
      Source: http://perfectmarket.com/blog/not_only_nosql_review_solution_evaluation_guide_chart
  25. We’re hiring! corp.ign.com/careers and @ignjobs
      Scala, Java, PHP/Zend, Rails, ElasticSearch, MongoDB, MySQL, HTML5, jQuery Mobile, Sencha Touch, PhoneGap, WordPress, ActionScript/Flash, Redis/Memcached, CI/CD
  26. About
      Manish Pandit
      Sr. Engineering Manager, IGN Entertainment
      http://linkedin.com/in/mpandit
      @lobster1234