No SQL - MongoDB



In my presentation I covered a few things about NoSQL:

What is NoSQL
NoSQL Features
Types of NoSQL
Advantages of NoSQL

and then I moved on to MongoDB. This presentation deals with some basic questions, such as:

When do we embed data versus linking?
How many collections do we have, and what are they?
When do we need atomic operations?
What indexes will we create to make queries and updates fast?
What is a shard?

Published in: News & Politics, Technology


  1. NoSQL: MongoDB (Mirza Asif)
  2. 2. What is NoSQL In the past few years, the”one size fits all“-thinking concerning data stores has been questioned by both, Science and web companies, which has lead to the emergence of a great variety of alternative databases. The movement as well as the new datastores is commonly subsumed under the term NoSQL. The basic quality of NoSQL is that, it may not require fixed table schemas, usually avoid join operations, and typically scale horizontally. Academic researchers typically refer to these databases as structured storage, a term that includes classic relational databases as a subset. NoSQL database also trades off “ACID” (atomicity, consistency, isolation and durability). NoSQL databases, to varying degrees, even allow for the schema of data to differ from record to record. If there doesn’t exist schema or a table in NoSQL, then how do you visualize the database structure? Well here is the answer
  3. 3. NoSQL Features No schema required: Data can be inserted in a NoSQL database without first defining a rigid database schema. As a corollary, the format of the data being inserted can be changed at any time, without application disruption. This provides immense application flexibility, which ultimately delivers substantial business flexibility. Auto elasticity: NoSQL automatically spreads your data onto multiple servers without requiring application assistance. Servers can be added or removed from the data layer without application downtime. Integrated caching: In order to increase data through and increase the performance advance NoSQL techniques cache data in system memory. This is in contrast to SQL database where this has to be done using separate infrastructure.
  4. Types of NoSQL. In terms of how data storage is architected, there are three popular types of NoSQL databases.
     • Key-value stores. As the name implies, a key-value store is a system that stores values indexed for retrieval by keys. These systems can hold structured or unstructured data.
     • Column-oriented databases. Rather than storing sets of information in a heavily structured table of columns and rows with uniformly sized fields for each record, as relational databases do, column-oriented databases contain one extendable column of closely related data.
     • Document-based stores. These databases store and organize data as collections of documents, rather than as structured tables with uniformly sized fields for each record. With these databases, users can add any number of fields of any length to a document.
  5. 5. Advantages of NoSQL NoSQL databases generally process data faster than relational databases. NoSQL databases are also often faster because their data models are simpler. Major NoSQL systems are flexible enough to better enable developers to use the applications in ways that meet their needs.
  6. 6. MongoDB MongoDB (from "humongous") is a scalable, high- performance, open source, document-oriented database. Written in C++. It stores data as BSON format (Binary JSON)
  7. Some basic terms
       MySQL term     Mongo term
       database       database
       table          collection
       index          index
       row            BSON document
       column         BSON field
       join           embedding and linking
       primary key    _id field
  8. 8. Some Question When do we embed data versus linking? How many collections do we have, and what are they? When do we need atomic operations? What indexes will we create to make query and updates fast? What is shard?
  9. Best Practices
     • "First class" objects, that are at top level, typically have their own collection.
     • Line item detail objects typically are embedded.
     • Objects which follow an object modeling "contains" relationship should generally be embedded.
     • Many-to-many relationships are generally done by linking.
  10. 10. Best Practices Collections with only a few objects may safely exist as separate collections, as the whole collection is quickly cached in application server memory. Embedded objects are a bit harder to link to than "top level" objects in collections. If the amount of data to embed is huge (many megabytes), you may reach the limit on size of a single object, which is 16 MB per document. If you need more than that see GridFS. If performance is an issue, embed
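
The embed-versus-link guidance on the two Best Practices slides can be pictured with plain Python dictionaries standing in for BSON documents. This is only an illustrative sketch: the order, student, and course documents, their field names, and both helper functions are made up for the example and do not come from the presentation.

```python
# Embedded: line items live inside the order document ("contains" relationship),
# so one read returns the order together with its details.
order_embedded = {
    "_id": 1,
    "customer": "jane",
    "line_items": [
        {"sku": "A-100", "qty": 2},
        {"sku": "B-200", "qty": 1},
    ],
}

# Linked: students and courses are many-to-many, so the student document
# stores course ids and the courses live in their own collection.
students = {
    "s1": {"_id": "s1", "name": "jane", "course_ids": ["c1", "c2"]},
    "s2": {"_id": "s2", "name": "joe", "course_ids": ["c2"]},
}
courses = {
    "c1": {"_id": "c1", "title": "Databases"},
    "c2": {"_id": "c2", "title": "Networks"},
}


def total_quantity(order):
    """Embedded data needs no second lookup."""
    return sum(item["qty"] for item in order["line_items"])


def course_titles(student_id):
    """Linked data is resolved with a second lookup done by the application."""
    student = students[student_id]
    return [courses[cid]["title"] for cid in student["course_ids"]]
```

The embedded order answers its question from a single document, while the linked version needs one extra lookup per course, which is the trade-off behind "if performance is an issue, embed".
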
  11. How to Index. A second aspect of schema design is index selection. As a general rule, where you want an index in a relational database, you want an index in Mongo.
     • The _id field is automatically indexed.
     • Fields upon which keys are looked up should be indexed.
     • Sort fields generally should be indexed.
  12. How to Index
     • The MongoDB profiling facility provides useful information about where a missing index should be added.
     • Note that adding an index slows writes to a collection, but not reads.
     • Use lots of indexes for collections with a high read-to-write ratio (assuming one does not mind the extra storage).
     • For collections with more writes than reads, indexes are expensive, as keys must be added to each index for each insert.
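
The read/write trade-off described on the indexing slides can be shown with a toy in-memory collection. This is a sketch under stated assumptions, not how MongoDB implements indexes: `TinyCollection`, its methods, and the `index_writes` counter are hypothetical names invented for the illustration. In MongoDB itself an index is created on the server, for example with PyMongo's `create_index`.

```python
from collections import defaultdict


class TinyCollection:
    """Toy collection: a dict of documents plus optional secondary indexes."""

    def __init__(self, indexed_fields=()):
        self.docs = {}
        # One index per field: field value -> set of matching _id values.
        self.indexes = {field: defaultdict(set) for field in indexed_fields}
        self.index_writes = 0  # extra work caused by index maintenance

    def insert(self, doc):
        self.docs[doc["_id"]] = doc
        # Every index must be updated on every insert: this is the write cost.
        for field, index in self.indexes.items():
            if field in doc:
                index[doc[field]].add(doc["_id"])
                self.index_writes += 1

    def find(self, field, value):
        if field in self.indexes:
            # Indexed read: jump straight to the matching ids.
            ids = self.indexes[field].get(value, set())
            return [self.docs[i] for i in sorted(ids)]
        # Unindexed read: scan every document.
        return [d for d in self.docs.values() if d.get(field) == value]
```

With two indexed fields, each insert performs two extra index writes, while a lookup on an indexed field avoids the full scan. That is why many indexes suit read-heavy collections and hurt write-heavy ones.
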
  13. 13. Atomic Operations Some problems require the ability to perform atomic operations. For example, simply incrementing a counter is often a case where one wants atomicity. MongoDB can also perform more complex operations such as that shown in the pseudocode below: atomically { if( doc.credits > 5 ) { doc.credits -= 5; doc.debits += 5; } }
  14. 14. Atomic Operations Another example would be a user registration scenario. We would never want to users to register the same username simultaneously: atomically { if( exists a document with username=jane ) { print "username already in use please choose another"; } else { insert a document with username=jane in the users collection; print("thanks you have registered as user jane."); } }
  15. What is Sharding? MongoDB scales horizontally via an auto-sharding (partitioning) architecture. Horizontal partitioning splits one or more tables by row, usually within a single instance of a schema and a database server. Sharding goes beyond this: it partitions the problematic table(s) in the same way, but it does this across potentially multiple instances of the schema.
  16. Sharding. Sharding offers:
     • Automatic balancing for changes in load and data distribution
     • Easy addition of new machines
     • Scaling out to one thousand nodes
     • No single points of failure
     • Automatic failover
  17. 17. Sharding Another consideration for schema design is sharding. A BSON document (which may have significant amounts of embedding) resides on one and only one shard. A collection may be sharded. When sharded, the collection has a shard key, which determines how the collection is partitioned among shards. Typically (but not always) queries on a sharded collection involve the shard key as part of the query expression. The key here is that changing shard keys is difficult. You will want to choose the right key from the start(which is not covered in this presentation).
  18. Questions?