NOSQL Databases types and Uses

The four categories of NoSQL databases
When to Use NoSQL
When NOT to use NoSQL
Use cases for NoSQL (each category)

Published in: Technology

  1. NOSQL DATABASES: TYPES AND USES (VIEWPOINT). Suvradeep Rudra, April 2014
  2. Agenda
     • The four categories of NoSQL databases
     • When to use NoSQL
     • When NOT to use NoSQL
     • Use cases for NoSQL (each category)
  3. Executive Summary
     • A NoSQL database provides a mechanism for storing and retrieving data that is modeled by means other than the tabular relations used in relational databases. Motivations for this approach include simplicity of design, horizontal scaling and finer control over availability. The data structure (e.g., tree, graph, key-value) differs from that of an RDBMS, so some operations are faster in NoSQL and others are faster in an RDBMS.
  4. Four categories of NoSQL DB
     • Key-Value Stores
     • Column Family Stores
     • Document Databases
     • Graph Databases
  5. Key-Value Stores
     • Key-value stores are schema-free NoSQL databases in which every value is stored under a key: for example, the key “Name” with the value “Zack”. Nothing requires the next row to hold a “Name” as well. Different rows can store different kinds of data in the same column, and one row can have more or fewer columns than another. This is the most common kind of NoSQL database on the market, and the other kinds build on its principles and add features on top.
     • The key-value model is a very simple structure, and many of these stores are modeled on Amazon’s Dynamo. Data is indexed and queried by its key. Dynamo-style stores use consistent hashing, so they can scale incrementally as your data grows, and they share cluster membership through a gossip-based protocol that keeps all the nodes synchronized. If you need to scale very large sets of low-complexity data, key-value stores are the best option.
     • Examples: Tokyo Cabinet/Tyrant, Redis, Voldemort, Oracle BDB, Amazon SimpleDB, Riak
     • Strengths: Fast lookups
     • Weaknesses: Stored data has no schema
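As a rough illustration of the consistent hashing mentioned on this slide, here is a minimal Python sketch of a hash ring. It is a toy model, not the implementation used by Dynamo or any store listed above: the `HashRing` class, the node names, the SHA-256 hash and the choice of 100 virtual points per node are all assumptions made for the example.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring (illustrative only; names are hypothetical)."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas   # virtual points placed on the ring per node
        self._points = []          # sorted hash positions on the ring
        self._owner = {}           # hash position -> node name
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(text):
        # Map a string to a large integer position on the ring.
        return int(hashlib.sha256(text.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node):
        for i in range(self.replicas):
            point = self._hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def node_for(self, key):
        # The first ring position clockwise from the key's hash owns the key;
        # the modulo wraps around past the end of the ring.
        # (An empty ring is not handled in this sketch.)
        idx = bisect.bisect(self._points, self._hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}

# Scale out by one node and see which keys change owner.
ring.add_node("node-d")
after = {k: ring.node_for(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

Only a fraction of the keys end up in `moved`, and each of them now lives on the new node. That is the property that lets such a store grow one machine at a time without reshuffling all of its data.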
  6. Column Family Stores
     • These were created to store and process very large amounts of data distributed over many machines. There are still keys, but each key points to multiple columns. The columns are grouped into column families.
     • These data stores are based on Google’s BigTable design. They may look similar to relational databases on the surface, but a lot has changed under the hood. A column family database can have different columns on each row, so it is not relational and has nothing that would qualify as a table in an RDBMS. The key concepts in a column family database are columns, column families and super columns. All you really need to start with is a column family. Column families define how the data is structured on disk.
     • A column by itself is just a key-value pair that lives in a column family. A super column is like a catalogue: a collection of other columns, though it cannot contain other super columns.
     • Column family databases are still extremely scalable, but less so than key-value stores. However, they work better with more complex data sets.
     • Examples: Cassandra, HBase
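The row key, column family and column layout described on this slide can be pictured as nested maps. The sketch below is a plain in-memory Python model under that assumption. It does not use the Cassandra or HBase APIs, and the `put` and `get_family` helpers, row keys and column names are invented for illustration.

```python
from collections import defaultdict

# row key -> column family -> column name -> value
table = defaultdict(lambda: defaultdict(dict))


def put(row_key, family, column, value):
    """Write one column (a key-value pair) into a column family of a row."""
    table[row_key][family][column] = value


def get_family(row_key, family):
    """Return a copy of one row's columns in one family ({} if there are none)."""
    row = table.get(row_key, {})
    return dict(row.get(family, {}))


put("user:1", "profile", "name", "Zack")
put("user:1", "profile", "email", "zack@example.com")
put("user:1", "activity", "last_login", "2014-04-01")
put("user:2", "profile", "name", "Rusty")  # no email and no activity columns

columns_1 = sorted(get_family("user:1", "profile"))
columns_2 = sorted(get_family("user:2", "profile"))
```

The two rows carry different columns within the same family, which is the sense in which a column family is not a relational table.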
  7. Document Databases
     • These were inspired by Lotus Notes and are similar to key-value stores. The model is basically versioned documents, each a collection of other key-value collections. The semi-structured documents are stored in formats like JSON.
     • A document database is not a new idea. It powered one of the more prominent communication platforms of the 1990s, which is still in service today: Lotus Notes (whose server is now called Lotus Domino). APIs for document databases typically use RESTful web services with JSON as the message format, which makes it easy to move data in and out.
     • A document database has a fairly simple data model based on collections of key-value pairs. A typical record in a document database would look like this:
       {
         "Subject": "I like Plankton",
         "Author": "Rusty",
         "PostedDate": "5/23/2006",
         "Tags": ["plankton", "baseball", "decisions"],
         "Body": "I decided today that I don't like baseball. I like plankton."
       }
     • Examples: CouchDB, MongoDB
     • Strengths: Tolerant of incomplete data
     • Weaknesses: Query performance, no standard query syntax
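To make “tolerant of incomplete data” concrete, here is a small Python sketch that stores the record from this slide next to a second, hypothetical document that is missing most of its fields, then queries both. It uses only the standard `json` module, not the CouchDB or MongoDB APIs. The `store` dictionary, the `post:N` ids and the `find_by_tag` helper are assumptions made for the example.

```python
import json

posts = [
    {
        "Subject": "I like Plankton",
        "Author": "Rusty",
        "PostedDate": "5/23/2006",
        "Tags": ["plankton", "baseball", "decisions"],
        "Body": "I decided today that I don't like baseball. I like plankton.",
    },
    # Hypothetical second document: no Tags, PostedDate or Body fields.
    {"Subject": "Untitled draft", "Author": "Rusty"},
]

# Store each document as a JSON string under a made-up document id.
store = {f"post:{i}": json.dumps(doc) for i, doc in enumerate(posts, start=1)}


def find_by_tag(tag):
    """Return the ids of documents whose Tags list contains `tag`."""
    matches = []
    for doc_id, raw in store.items():
        doc = json.loads(raw)
        if tag in doc.get("Tags", []):  # a missing field is simply not a match
            matches.append(doc_id)
    return matches


plankton_posts = find_by_tag("plankton")
by_rusty = [doc_id for doc_id, raw in store.items()
            if json.loads(raw).get("Author") == "Rusty"]
```

The incomplete document is stored and queried without any schema change. The scan over every document in `find_by_tag` also hints at the query-performance weakness listed above, which real document databases address with indexes.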
  8. Graph Databases
     • Instead of tables of rows and columns and the rigid structure of SQL, a flexible graph model is used which, again, can scale across multiple machines. Many NoSQL databases do not provide a high-level declarative query language like SQL, in order to avoid processing overhead. Rather, querying these databases is data-model specific. Many of the NoSQL platforms offer RESTful interfaces to the data, while others offer query APIs.
     • Graph databases take document databases to the extreme by introducing the concept of typed relationships between documents or nodes. The most common example is the relationship between people on a social network such as Facebook.
     • A graph database is a big, dense network structure. While it could take an RDBMS hours to sift through a huge linked list of people, a graph database uses graph algorithms such as shortest path to make these queries more efficient. Although slower than its NoSQL counterparts, a graph database can have the most complex structure of them all and still traverse billions of nodes and relationships very quickly.
     • Examples: Neo4j, InfoGrid, InfiniteGraph
     • Strengths: Graph algorithms, e.g. shortest path, n-degree relationships, etc.
     • Weaknesses: May have to traverse the entire graph to reach a definitive answer. Not easy to cluster
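The shortest-path and n-degree queries named under Strengths can be sketched in a few lines of Python. This is a plain breadth-first search over an in-memory adjacency map, offered as an illustration of the idea rather than of how Neo4j or any other product works internally. The people, the friendships and both helper functions are made up for the example.

```python
from collections import deque

# Undirected "friend" relationships between people (all names are made up).
friendships = [
    ("alice", "bob"), ("bob", "carol"), ("carol", "dave"),
    ("alice", "erin"), ("erin", "dave"), ("frank", "grace"),
]

graph = {}
for a, b in friendships:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)


def shortest_path(start, goal):
    """Breadth-first search: a shortest path as a list of people, or None."""
    if start not in graph or goal not in graph:
        return None
    previous = {start: None}  # person -> who we reached them from
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if person == goal:
            path = []
            while person is not None:  # walk back to the start
                path.append(person)
                person = previous[person]
            return path[::-1]
        for friend in sorted(graph[person]):
            if friend not in previous:
                previous[friend] = person
                queue.append(friend)
    return None


def within_degrees(start, n):
    """Everyone reachable from `start` in at most n hops, excluding `start`.

    Assumes `start` is a person in the graph.
    """
    seen = {start}
    frontier = {start}
    for _ in range(n):
        frontier = {f for p in frontier for f in graph[p]} - seen
        seen |= frontier
    return seen - {start}


path = shortest_path("alice", "dave")
friends_of_friends = within_degrees("alice", 2)
```

Here `path` goes through erin rather than the longer route through bob and carol, and a search for a path from alice to grace returns `None` only after every person reachable from alice has been visited, which mirrors the traversal weakness noted above.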
  9. When is NoSQL a poor choice?
     After spending so long extolling the benefits of the various NoSQL solutions, I would like to point out at least one scenario where I haven’t seen a good NoSQL replacement for the RDBMS: reporting. One of the great things about an RDBMS is that, given the information it already has, it is very easy to massage the data into a lot of interesting forms. That is especially important when you want to let users analyze the data on their own, for example by giving them a report tool that lets them query, aggregate and manipulate the data to their heart’s content. While it is certainly possible to produce reports on top of a NoSQL store, you won’t come close to the level of flexibility that an RDBMS offers. That flexibility is one of the major benefits of the RDBMS. The NoSQL solutions will tend to outperform the RDBMS (as long as you stay in the appropriate niche for each NoSQL solution), and they certainly have a better scalability story, but for user-driven reports the RDBMS is still my tool of choice.
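As a small illustration of the ad hoc reporting described on this slide, the sketch below uses Python’s built-in `sqlite3` module as a stand-in for an RDBMS. The `orders` table, its columns and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("east", "widget", 10.0),
        ("east", "gadget", 25.0),
        ("west", "widget", 40.0),
        ("west", "widget", 5.0),
    ],
)

# One report: total sales per region.
by_region = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()

# A different report over the same rows: order count and average per product.
by_product = conn.execute(
    "SELECT product, COUNT(*), AVG(amount) FROM orders "
    "GROUP BY product ORDER BY product"
).fetchall()

conn.close()
```

Both reports come from the same table with no change to how the data is stored. Only the query changes, which is the kind of flexibility a user-facing report tool depends on.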
  10. Suvradeep Rudra is a Sr. Data Architect with more than 10 years of experience in data management. He has held a number of roles at Caritor Inc. (now NTT DATA), Oracle, and Deloitte Consulting. He is experienced in building overall data strategy, tapping value from data assets and capabilities, and driving value to the business. He has worked on various projects establishing and building data management solutions for customers in industries such as High Tech, Health Insurance, Oil and Gas, Payment Services and Banking. His experience ranges across Data Strategy, Product Strategy, MDM, Business Intelligence and Analytics, Data Architecture (Data Warehouse) and Data Governance. Suvradeep writes and speaks about monetizing company data and about technology trends. He holds a Master’s in Computer Applications from the University of Madras, Chennai, India. He can be reached via his LinkedIn profile.
