
An Introduction To NoSQL & MongoDB



My talk on NoSQL & MongoDB from Refresh Cambridge, July 2011

Published in: Business, Technology


  1. An Introduction To NoSQL & MongoDB<br />Lee Theobald<br />Twitter: @Leesy<br />Email:<br />
  2. NoSQL<br />A form of database management system that is non-relational.<br />Systems are often schema-less, avoid joins & are easy to scale.<br />The term NoSQL was coined in 1998 by Carlo Strozzi and revived in 2009, when events such as the no:sql(east) conference adopted it<br />A better term would have been “NoREL” but NoSQL caught on. Think of it more as meaning “Not Only SQL”<br />
  3. But Why Choose NoSQL?<br />Amount of data stored is on the up & up.<br />Facebook is rumoured to hold over 50TB of data in their NoSQL system for their inbox search<br />The data we store is more complex than 15 years ago.<br />Easy Distribution<br />With all this data, it needs to be easy to add or remove servers without disrupting service.<br />
  4. Choose Your Flavour<br />Key-Value Store<br />Graph<br />BigTable<br />Document Store<br />
  5. Key-Value Store<br />Data is stored in (unsurprisingly) key/value pairs.<br />Designed to handle lots of data and heavy load<br />Based on Amazon’s Dynamo paper<br />Example: Voldemort, developed by the guys at LinkedIn<br />
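To make the key/value model concrete, here is a minimal in-memory sketch in plain JavaScript. The `KeyValueStore` class and its `put`/`get`/`remove` methods are hypothetical names chosen for illustration. They are not Voldemort's API, and a real store would add persistence and distribution across servers.

```javascript
// Minimal in-memory key/value store (illustrative only, not Voldemort's API).
class KeyValueStore {
  constructor() {
    this.data = new Map();
  }

  // Store a value under a key, overwriting any previous value.
  put(key, value) {
    this.data.set(key, value);
  }

  // Return the stored value, or undefined if the key is absent.
  get(key) {
    return this.data.get(key);
  }

  // Delete a key; returns true if the key existed.
  remove(key) {
    return this.data.delete(key);
  }
}

const store = new KeyValueStore();
store.put("user:123", { name: "Oliver Clothesoff" });
const user = store.get("user:123");
```

The only way to reach a value is through its key, which is what keeps this model simple to spread across many servers.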
  6. Graph<br />Focuses on modeling data & associated connections<br />Inspired by mathematical Graph Theory.<br />Example: FlockDB, developed by Twitter<br />
  7. BigTable / Column Families<br />Based on the BigTable paper from Google<br />Data is grouped by columns, not rows.<br />Example: Cassandra, originally developed by Facebook, now an Apache project.<br />
  8. Document Store<br />Data stored as whole documents.<br />JSON & XML are popular formats<br />Maps well to an Object Orientated programming model<br />Example: CouchDB or …<br />{<br /> "id": "123",<br /> "name": "Oliver Clothesoff",<br /> "dob": {<br /> "year": 1985,<br /> "month": 5,<br /> "day": 12<br /> }<br />}<br />
  9. MongoDB!<br />Name comes from “humongous”<br />Open source with development led by 10gen<br />Document Based<br />Schema-less<br />Highly Scalable<br />MapReduce<br />Easy Replication & Sharding<br />
  10. Familiar Structure<br />A MongoDB instance is made up of a number of databases.<br />Each database contains a number of collections, and collection names can be dot-namespaced (e.g. blog.posts) to group related collections.<br />Compare it to MySQL, which has databases and tables.<br />
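  11. Inserts – As Easy As Pie<br />use cookbook;<br />// "recipes" is an assumed collection name; the original call was lost in extraction<br />db.recipes.insert({<br /> "name": "Cherry Pie",<br /> "ingredients": ["cherries", "pie"],<br /> "cooking_time": 30<br />});<br />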
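  12. Searching – A Piece Of Cake!<br />// "recipes" is an assumed collection name; the original call was lost in extraction<br />db.recipes.find({<br /> "cooking_time": { "$gte": 10, "$lt": 30 }<br />});<br />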
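The range query on this slide reads as "cooking time of at least 10 and less than 30". As a rough sketch of those semantics, the plain JavaScript below evaluates the same query object against an in-memory array. The `matches` helper and the sample recipes are hypothetical, and the helper handles only `$gte` and `$lt`; it is not how MongoDB evaluates queries internally.

```javascript
// Toy matcher for a small subset of MongoDB-style query operators.
// Supports only $gte and $lt on top-level numeric fields.
function matches(doc, query) {
  return Object.entries(query).every(([field, condition]) => {
    const value = doc[field];
    return Object.entries(condition).every(([op, bound]) => {
      if (op === "$gte") return value >= bound;
      if (op === "$lt") return value < bound;
      throw new Error("Unsupported operator: " + op);
    });
  });
}

// Hypothetical sample data.
const recipes = [
  { name: "Cherry Pie", cooking_time: 30 },
  { name: "Pancakes", cooking_time: 10 },
  { name: "Omelette", cooking_time: 5 },
  { name: "Flapjacks", cooking_time: 25 },
];

const query = { cooking_time: { $gte: 10, $lt: 30 } };
const found = recipes.filter((r) => matches(r, query)).map((r) => r.name);
console.log(found); // [ 'Pancakes', 'Flapjacks' ]
```

Because `$lt` is exclusive, a recipe with a cooking time of exactly 30 (such as the Cherry Pie from the insert slide) does not match; `$lte` would include it.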
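  13. Some More Advanced Syntax<br />// "recipes" is an assumed collection name; find() must be called on a collection<br />Limiting Results<br />db.recipes.find().limit(10);<br />Skipping results<br />db.recipes.find().skip(5);<br />Sorting<br />db.recipes.find().sort({cooking_time: -1});<br />Cursors:<br />// find() already returns a cursor<br />var cur = db.recipes.find();<br />cur.forEach( function(x) { print(tojson(x)); });<br />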
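The three modifiers above combine naturally for paging. As a sketch of what they do, here is an in-memory equivalent in plain JavaScript of sorting by `cooking_time` descending, skipping one result and keeping the next two. The `sortSkipLimit` helper and the sample data are hypothetical.

```javascript
// In-memory equivalent of sort({cooking_time: -1}).skip(1).limit(2).
const recipes = [
  { name: "Omelette", cooking_time: 5 },
  { name: "Cherry Pie", cooking_time: 30 },
  { name: "Pancakes", cooking_time: 10 },
  { name: "Flapjacks", cooking_time: 25 },
];

// direction is 1 for ascending or -1 for descending, as in MongoDB's sort().
function sortSkipLimit(docs, field, direction, skip, limit) {
  return docs
    .slice() // copy so the input array is not mutated
    .sort((a, b) => direction * (a[field] - b[field]))
    .slice(skip, skip + limit);
}

const page = sortSkipLimit(recipes, "cooking_time", -1, 1, 2);
const names = page.map((r) => r.name);
console.log(names); // [ 'Flapjacks', 'Pancakes' ]
```

The helper applies the steps in the order sort, skip, limit, which matches how MongoDB treats these cursor modifiers regardless of the order in which they are chained.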
  14. MapReduce<br />Great way of doing bulk manipulation or aggregation.<br />Two or three functions (map, reduce and an optional finalize) written in JavaScript that execute on the server.<br />An example use could be generating a list of top queries from some search logs.<br />
  15. Map Function<br />Takes some input in the form of key/value pairs, performs some calculations and returns 0 or more key/value pairs<br />map = function() {<br /> if (!this.query) {<br /> return;<br /> }<br /> emit(this.query, {count: 1});<br />}<br />
  16. Reduce Function<br />Takes the results from the map function, does something (normally combining the results) and produces output in key/value pairs<br />reduce = function(key, values) {<br /> var count = 0;<br /> values.forEach(function(v) {<br /> count += v['count'];<br /> });<br /> return {count: count};<br />}<br />
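To see how the map and reduce functions fit together, the sketch below runs the same logic over a small in-memory array in plain JavaScript. The `mapReduce` driver and the sample `searchLogs` are hypothetical, and `emit` is passed to `map` as an argument here rather than being a global as it is on the server. It illustrates the data flow only, not MongoDB's implementation.

```javascript
// Toy single-process MapReduce driver (not how MongoDB runs it internally).
function mapReduce(docs, map, reduce) {
  const groups = new Map();

  // emit() collects values by key during the map phase.
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };

  // Map phase: run map once per document, with the document bound to `this`.
  docs.forEach((doc) => map.call(doc, emit));

  // Reduce phase: combine all values emitted for each key.
  const results = {};
  groups.forEach((values, key) => {
    results[key] = reduce(key, values);
  });
  return results;
}

// Same logic as the slides' map, except emit is passed in rather than global.
const map = function (emit) {
  if (!this.query) {
    return;
  }
  emit(this.query, { count: 1 });
};

const reduce = function (key, values) {
  let count = 0;
  values.forEach((v) => {
    count += v.count;
  });
  return { count: count };
};

// Hypothetical search log entries.
const searchLogs = [
  { query: "mongodb" },
  { query: "nosql" },
  { query: "mongodb" },
  { query: "" }, // skipped by map: empty query
  { query: "mongodb" },
];

const counts = mapReduce(searchLogs, map, reduce);
console.log(counts); // { mongodb: { count: 3 }, nosql: { count: 1 } }
```

Note that reduce returns a value with the same shape as the values it receives (`{count: n}`). That matters on a real server, where reduce may be called more than once for the same key and fed its own earlier output.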
  17. Replica Sets<br />Primary/secondary configuration (the successor to plain master/slave replication)<br />If your primary server goes down, one of the secondary ones takes over automatically<br />Extremely easy to set up<br />
  18. Auto Sharding – Horizontal Scaling<br />
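The slide gives no detail here, so the following is only an assumed illustration of the general idea behind sharding: documents are split across servers according to ranges of a shard key, and a router sends each request to the shard that owns the matching range. The `chunks` table, the shard names and the `routeToShard` function are all hypothetical.

```javascript
// Toy range-based shard router. Shard names and key ranges are hypothetical.
const chunks = [
  { min: "a", max: "h", shard: "shard0" }, // keys from "a" up to (not including) "h"
  { min: "h", max: "p", shard: "shard1" },
  { min: "p", max: "{", shard: "shard2" }, // "{" is the character after "z"
];

// Return the shard whose key range contains the given shard key.
function routeToShard(key) {
  const chunk = chunks.find((c) => key >= c.min && key < c.max);
  if (!chunk) throw new Error("No chunk covers key: " + key);
  return chunk.shard;
}

const target1 = routeToShard("cherry pie");
const target2 = routeToShard("pancakes");
```

Scaling horizontally then means adding a server and moving some key ranges onto it, rather than buying a bigger machine.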
  19. Other Features<br />GridFS support – Distributed file storage<br />Geospatial indexing<br />It’s constantly in development so new features are being worked on all the time!<br />
  20. Why Not Try It Yourself<br />Download it at:<br />Online tutorial at:<br />Some handy MongoDB sites:<br />MongoDB Cookbook:<br />Kyle Banker’s blog:<br />There’s also a load of handy reference cards, stickers and other MongoDB freebies up front!<br />
  21. Thanks For Listening<br />Any questions?<br />