  1. 1. By Sai Kiran Kotha
  2. 2. Contents: Intro to MongoDB; Why use it?; Performance analysis; Documents and Collections; Querying; Schema Design; Sharding; Security; Applications; Conclusion
  3. 3. History MongoDB's name comes from the middle five letters of the word "humongous", meaning extremely large. MongoDB was created by Dwight Merriman and Eliot Horowitz, who had previously worked together at DoubleClick (Merriman was one of its co-founders). 10gen began developing MongoDB in October 2007. In 2009, MongoDB was open sourced as a stand-alone product under the AGPL license. Since version 1.4, released in March 2010, MongoDB has been considered production ready.
  4. 4. What is MongoDB? A scalable, high-performance, open-source, NoSQL, document-oriented database designed with both scalability and developer agility in mind. It is written in C++ and built for speed. Features: Rich document-based queries for easy readability. Full index support for high performance. Replication and failover for high availability. Auto-sharding for easy scalability. Map/Reduce for aggregation.
  5. 5. Why use MongoDB? SQL was invented in the 1970s to store data. MongoDB stores documents (objects). Nowadays everyone works with objects (Python/Ruby/Java/etc.), and we need databases to persist those objects. So why not store objects directly? Embedded documents and arrays reduce the need for joins. There are no joins and no multi-document transactions.
  6. 6. Trends in Popularity of Big Data Connectors
  7. 7. Performance Analysis Reported to be anywhere from 2 to 10 times faster than SQL databases
  8. 8. MongoDB over SQL Server (or MySQL or Oracle)
  9. 9. MongoDB over SQL Server (or MySQL or Oracle)
  10. 10. RDBMS vs MongoDB
  11. 11. Collection Schema-less (or, more accurately, "dynamic schema"). Contains documents. Indexable by one or more keys. Created on the fly when referenced for the first time. Capped collections: fixed size; the oldest records are dropped once the limit is reached.
  12. 12. Document Stored in a collection. Has an _id key, which works like a primary key in MySQL. Supported relationships: embedded or referenced. Documents are stored as BSON (a binary form of JSON); large files (images, videos, anything...) can be stored via GridFS.
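The capped-collection behaviour described on this slide can be pictured with a small plain-JavaScript model. This is only a sketch: `CappedList` is a hypothetical class invented for illustration, and it caps by document count, whereas a real capped collection is capped by size in bytes (optionally also by document count).

```javascript
// Plain-JavaScript model of a capped collection (hypothetical class, not MongoDB code).
// Assumption: the cap is a document count; real capped collections cap by bytes.
class CappedList {
  constructor(maxDocs) {
    this.maxDocs = maxDocs;
    this.docs = [];
  }

  insert(doc) {
    this.docs.push(doc);
    // Once the limit is exceeded, the oldest document is dropped.
    if (this.docs.length > this.maxDocs) {
      this.docs.shift();
    }
  }

  find() {
    // Return a copy in insertion order, oldest first.
    return this.docs.slice();
  }
}

const cappedLog = new CappedList(3);
for (let n = 1; n <= 5; n++) {
  cappedLog.insert({ n: n });
}
console.log(cappedLog.find()); // the documents with n = 3, 4 and 5 remain
```

Inserting five documents into a list capped at three leaves only the three newest, which is the "older records get dropped" rule in miniature.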
  13. 13. Embedded Objects Documents can embed other documents For example: { name: 'Brad Majors', address: { street: 'Oak Terrace', city: 'Denton' } }
  14. 14. Querying Query Expression Objects: MongoDB supports a number of query objects for fetching data. Simple query: db.users.find({}) More selective: db.users.find({'last_name': 'Smith'}) Query Options: Field Selection: // retrieve ssn field for documents where last_name == 'Smith': db.users.find({last_name: 'Smith'}, {'ssn': 1}); // retrieve all fields *except* the thumbnail field, for all documents: db.users.find({}, {thumbnail:0});
  15. 15. Sorting: db.users.find({}).sort({last_name: 1}); // return all documents, sorted by last name in ascending order Skip and Limit: db.users.find().skip(20).limit(10); // skip the first 20 documents and limit the result set to 10 db.users.find({}, {}, 10, 20); // same as above, but less clear Cursors: Used to iteratively retrieve all the documents returned by the query. > var cur = db.example.find(); > cur.forEach( function(x) { print(tojson(x))}); {"n" : 1 , "_id" : "497ce96f395f2f052a494fd4"} {"n" : 2 , "_id" : "497ce971395f2f052a494fd5"}
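To make field selection, sorting, skip and limit concrete without a running server, here is a rough plain-JavaScript imitation over an in-memory array. The sample data and the helper names `includeFields` and `excludeFields` are made up for this sketch; it mirrors what the shell calls above return, not how MongoDB executes them.

```javascript
// In-memory stand-in for a users collection (sample data is made up).
const users = [
  { _id: 1, last_name: "Smith", ssn: "111", thumbnail: "a.png" },
  { _id: 2, last_name: "Adams", ssn: "222", thumbnail: "b.png" },
  { _id: 3, last_name: "Jones", ssn: "333", thumbnail: "c.png" },
  { _id: 4, last_name: "Brown", ssn: "444", thumbnail: "d.png" },
];

// Inclusion projection such as {ssn: 1}: keep _id plus the listed fields.
function includeFields(doc, fields) {
  const out = { _id: doc._id };
  for (const f of fields) {
    if (f in doc) out[f] = doc[f];
  }
  return out;
}

// Exclusion projection such as {thumbnail: 0}: keep everything except the listed fields.
function excludeFields(doc, fields) {
  const out = { ...doc };
  for (const f of fields) delete out[f];
  return out;
}

// Equivalent in spirit to find({}, {last_name: 1}).sort({last_name: 1}).skip(1).limit(2).
const page = users
  .slice() // copy so the original order is untouched
  .sort((a, b) => a.last_name.localeCompare(b.last_name))
  .slice(1, 1 + 2) // skip 1 document, then take 2
  .map((doc) => includeFields(doc, ["last_name"]));

const withoutThumb = excludeFields(users[0], ["thumbnail"]);

console.log(page);
console.log(withoutThumb);
```

Note that the inclusion projection still returns `_id`, which matches the shell's default behaviour.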
  16. 16. Queries Insert db.myCollection.insert({key1: "value1", key2: "value2"}) db.myCollection.insert({firstname: "Foo", lastname: "Bar", address: {Street: "Foo", City: "Bar"}}) Read db.myCollection.find({lastname: "Meier"}, {firstname: true}).limit(10).skip(20) Update db.myCollection.update({id: 123}, {$set : {a : 4}}) Delete db.myCollection.remove({firstname: "Hans"});
  17. 17. Advanced Queries Conditional operators • db.things.find({j : {$lt: 3}}); More operators: $lte, $gt, $gte, $all, $in, $or, $and, $size & more Nested query • db.persons.insert({firstname: "Meier", loves: ['apple', 'orange', 'tomato']}) • db.persons.find({$or: [{loves: 'apple'}, {loves: 'orange'}]}) JavaScript expression • db.myCollection.find( { $where: "this.a > 3" } );
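How a query object with operators selects documents can be sketched with a tiny matcher in plain JavaScript. The function `matches` is hypothetical and models only equality, `$lt`, `$gt` and `$or`; it illustrates the semantics and is not MongoDB's query engine.

```javascript
// Hypothetical mini-matcher covering only equality, $lt, $gt and $or.
function matches(doc, query) {
  return Object.entries(query).every(([key, cond]) => {
    if (key === "$or") {
      // $or succeeds when at least one sub-query matches.
      return cond.some((sub) => matches(doc, sub));
    }
    const value = doc[key];
    if (cond !== null && typeof cond === "object" && !Array.isArray(cond)) {
      // Operator object such as {$lt: 3}.
      return Object.entries(cond).every(([op, operand]) => {
        if (op === "$lt") return value < operand;
        if (op === "$gt") return value > operand;
        throw new Error("operator not modelled in this sketch: " + op);
      });
    }
    // Equality against an array field matches if any element equals the operand.
    return Array.isArray(value) ? value.includes(cond) : value === cond;
  });
}

// Sample data, made up for illustration.
const things = [{ j: 1 }, { j: 3 }, { j: 5 }];
const persons = [
  { firstname: "Meier", loves: ["apple", "orange", "tomato"] },
  { firstname: "Hans", loves: ["tomato"] },
];

const smallThings = things.filter((t) => matches(t, { j: { $lt: 3 } }));
const fruitLovers = persons.filter((p) =>
  matches(p, { $or: [{ loves: "apple" }, { loves: "orange" }] })
);

console.log(smallThings);
console.log(fruitLovers.map((p) => p.firstname));
```

The array case is the part worth noticing: `{loves: 'apple'}` matches a document whose `loves` array merely contains `'apple'`.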
  18. 18. Query examples - using embedded documents & referenced documents Exact match on an entire embedded object: db.users.find( {address: {street: 'Oak Terrace', city: 'Denton'}} ) Dot notation for a partial match: db.users.find( {"address.city": 'Denton'} ) This allows us to run deep, nested queries: db.order.find( { shipping: { carrier: "usps" } } ); here shipping is an embedded document (object)
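The difference between an exact match on an embedded object and a dot-notation partial match is easy to miss, so here is a plain-JavaScript sketch of both. The helper `getPath`, the second sample person and the `zip` field are assumptions made for illustration; the `JSON.stringify` comparison stands in for MongoDB's exact-match rule, which also requires the same fields in the same order.

```javascript
// Resolve a dotted path such as "address.city" against a nested object.
function getPath(doc, path) {
  return path
    .split(".")
    .reduce((current, key) => (current == null ? undefined : current[key]), doc);
}

// Sample data: the first document is from the slides, the second is made up.
const people = [
  { name: "Brad Majors", address: { street: "Oak Terrace", city: "Denton" } },
  { name: "Janet Weiss", address: { street: "Elm Street", city: "Denton", zip: "00000" } },
];

// Exact match: the whole embedded object must be identical.
const wanted = { street: "Oak Terrace", city: "Denton" };
const exactMatches = people.filter(
  (p) => JSON.stringify(p.address) === JSON.stringify(wanted)
);

// Partial match: only the field reached through dot notation is compared.
const partialMatches = people.filter((p) => getPath(p, "address.city") === "Denton");

console.log(exactMatches.map((p) => p.name));
console.log(partialMatches.map((p) => p.name));
```

The exact match finds only the document whose address has exactly those two fields, while the dot-notation query finds every document whose city is Denton regardless of the other address fields.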
  19. 19. Schema Design There is no predefined schema; the schema is dynamic. The application creates an ad hoc schema through the objects it stores. The schema is implicit in the queries. Collections represent the top-level classes. Less normalization, more embedding.
  20. 20. Traditional RDBMS schema:
  21. 21. MongoDB: The figure depicts the Shapes object stored as a JSON-style document. This avoids the storage overhead of creating columns and rows.
  22. 22. Better Schema Design: Embedding Collection for posts Embed comments, author name post = { author: 'Michael Arrington', text: 'This is a pretty awesome post.', comments: [ 'Whatever this post.', 'I agree, lame!' ] }
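The trade-off behind "embed comments" can be sketched by comparing the two designs in plain JavaScript. The arrays below stand in for collections, and the `post_id` field and both function names are assumptions made for this illustration.

```javascript
// Embedded design: the post document carries its comments.
const embeddedPosts = [
  {
    _id: 1,
    author: "Michael Arrington",
    text: "This is a pretty awesome post.",
    comments: ["Whatever this post.", "I agree, lame!"],
  },
];

// Referenced design: comments live in their own collection and point back at the post.
const posts = [
  { _id: 1, author: "Michael Arrington", text: "This is a pretty awesome post." },
];
const comments = [
  { _id: 10, post_id: 1, text: "Whatever this post." },
  { _id: 11, post_id: 1, text: "I agree, lame!" },
];

// One lookup: the comments arrive together with the post.
function commentsEmbedded(postId) {
  const post = embeddedPosts.find((p) => p._id === postId);
  return post ? post.comments : [];
}

// Two lookups: the application performs the "join" itself.
function commentsReferenced(postId) {
  const post = posts.find((p) => p._id === postId);
  if (!post) return [];
  return comments.filter((c) => c.post_id === post._id).map((c) => c.text);
}

console.log(commentsEmbedded(1));
console.log(commentsReferenced(1));
```

Both functions return the same comments; the embedded version reads one document, while the referenced version needs a second pass over a separate collection.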
  23. 23. Schema Design Limitations No referential integrity. A high degree of denormalization means updating something in many places instead of one. Lack of a predefined schema is a double-edged sword: the model should live in the application, because objects within a collection can be completely inconsistent in their fields.
  24. 24. MongoDB Admin UIs Some UIs are available as separate community projects and are listed below. Some focus on administration, while others focus on data viewing. Tools: MongoExplorer, MongoVUE, PHPMoAdmin, Meclipse. Commercial: Database Master. Data viewers: mongs.
  25. 25. MongoVUE: MongoVUE is a .NET GUI for MongoDB. It is an elegant and highly usable interface for working with MongoDB. It helps in managing web-scale data.
  26. 26. Database Master Database Master from Nucleon Software. Features: tree view for databases and collections; create/drop indexes; server/DB stats.
  27. 27. Sharding Data is split into chunks, and each chunk is assigned to a shard. Shard: a single server or a replica set. Config servers: store metadata about chunks and data location. Mongos: routes requests transparently.
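The routing idea on this slide can be sketched in a few lines of plain JavaScript. The chunk table, shard names and `routeToShard` function are hypothetical; they only model the lookup a router performs against chunk metadata, assuming a numeric shard key and chunks that cover the range from their lower bound (inclusive) to their upper bound (exclusive).

```javascript
// Hypothetical chunk table, playing the role of config-server metadata.
// Each chunk covers shard-key values in [min, max) and lives on one shard.
const chunks = [
  { min: -Infinity, max: 100, shard: "shardA" },
  { min: 100, max: 200, shard: "shardB" },
  { min: 200, max: Infinity, shard: "shardA" },
];

// Router step (the job mongos does): find the chunk that owns the key
// and return the shard holding that chunk.
function routeToShard(shardKeyValue) {
  const chunk = chunks.find(
    (c) => shardKeyValue >= c.min && shardKeyValue < c.max
  );
  if (!chunk) throw new Error("no chunk covers key " + shardKeyValue);
  return chunk.shard;
}

console.log(routeToShard(42));
console.log(routeToShard(100));
console.log(routeToShard(250));
```

Because the client only ever calls the router, chunks can move between shards by updating the table without the client noticing, which is what "transparent" means here.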
  28. 28. Replica Sets
  29. 29. Some Cool features Geospatial indexes for geospatial queries: $near, $within, bounded queries (circle, box). GridFS: stores large binary files. Map/Reduce: what GROUP BY is in SQL, map/reduce is in MongoDB.
  30. 30. Map/Reduce Used for data processing. It provides some basic aggregation capabilities and is parallelized for working with large data sets. mapReduce takes a map function, a reduce function and an output directive. Map function: A master node takes an input, splits it into smaller sections and sends them to the associated nodes. These nodes may in turn split their sections further and send them to other nodes. Each node processes its section and sends the result back to the master node.
  31. 31. Map/Reduce contd... Reduce function: The master node aggregates those results to produce the output. We can then run the mapReduce command against a hits collection: > db.hits.mapReduce(map, reduce, {out: {inline: 1}}); We could instead specify {out: 'hit_stats'} and have the results stored in the hit_stats collection: > db.hits.mapReduce(map, reduce, {out: 'hit_stats'}); > db.hit_stats.find();
  32. 32. MongoDB Security A trusted environment is the default. The current version of MongoDB supports only basic security. We can authenticate a username and password in the context of a particular database. Once authenticated, a normal user has full read and write access to the database. $ ./mongo > use admin > db.auth("someAdminUser", password)
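The slides call mapReduce with `map` and `reduce` but never show them, so here is one possible pair together with a tiny in-memory driver, all in plain JavaScript. Everything below is an assumption for illustration: the sample hits documents, the `resource` field, and the `emit` and `runMapReduce` helpers that imitate what the server does with the two functions.

```javascript
// Sample hits documents (made up for illustration).
const hits = [
  { resource: "index", date: "2010-01-20" },
  { resource: "index", date: "2010-01-20" },
  { resource: "about", date: "2010-01-20" },
  { resource: "index", date: "2010-01-21" },
];

// Emitted values, grouped by key. In MongoDB the server does this grouping.
const groups = new Map();
function emit(key, value) {
  if (!groups.has(key)) groups.set(key, []);
  groups.get(key).push(value);
}

// map: called once per document with `this` bound to it; emits (key, value) pairs.
function map() {
  emit(this.resource, { count: 1 });
}

// reduce: collapses all values emitted for one key into a single value
// of the same shape as the emitted values.
function reduce(key, values) {
  let total = 0;
  for (const v of values) total += v.count;
  return { count: total };
}

// In-memory driver standing in for db.hits.mapReduce(map, reduce, {out: {inline: 1}}).
function runMapReduce(docs) {
  groups.clear();
  for (const doc of docs) map.call(doc);
  const results = [];
  for (const [key, values] of groups) {
    // As in MongoDB, reduce is skipped for keys with a single emitted value.
    const value = values.length === 1 ? values[0] : reduce(key, values);
    results.push({ _id: key, value: value });
  }
  return results;
}

const hitStats = runMapReduce(hits);
console.log(hitStats);
```

The result has one document per key with the shape `{_id: key, value: ...}`, which is also the shape stored in an output collection such as hit_stats.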
  33. 33. MongoDB Security contd... If there are no admin users, we should first create an administrator user for the entire database server process. This user is stored under the special admin database. One may access the database from the localhost interface without authenticating. Thus, from the server running the database, configure an administrative user: $ ./mongo > use admin > db.addUser("theadmin", "anadminpassword")
  34. 34. MongoDB Security contd... Now, let's configure a "regular" user for another database. > use projectx > db.addUser("joe", "passwordForJoe") Finally, let's add a read-only user. > use projectx > db.addUser("guest", "passwordForGuest", true)
  35. 35. Cool uses Data warehouse. Mongo understands JSON natively. Very powerful for analysis. Query a bunch of data from a web service. Import into Mongo (mongoimport --file filename.json). Large Rails app for building websites (kind of a CMS). Hardcore debugging. Spit out large amounts of data.
  36. 36. Applications RDBMS replacement for web applications. Semi-structured content management. Real-time analytics & high-speed logging. Caching and high scalability. Web 2.0, Media, SaaS, Gaming, Healthcare, Finance, Telecom, Government
  37. 37. Some Companies using MongoDB in Production
  38. 38. Conclusion MongoDB is fast and achieves high performance. It bridges the gap between traditional RDBMSs at one end and key-value stores at the other. The document model is simple but powerful. Advanced features such as map/reduce and geospatial indexing are very compelling. Very rapid development, open source, and surprisingly great drivers for most languages.
  39. 39. Bibliography The required information is extracted from the following websites:- mongodb.php server-2008-performance-showdown/ alternative-to-rdbms-databases-like-oracle-and-mysql/