NoSQL Database Types and Uses

VIEW POINT
Suvradeep Rudra
April 2014

Agenda
• The four categories of NoSQL databases
• When to Use NoSQL
• When NOT to use NoSQL
• NoSQL use cases (each category)

Executive Summary
• A NoSQL database provides a mechanism for
storing and retrieving data that is modeled by
means other than the tabular relations used in
relational databases. Motivations for this
approach include simplicity of design, horizontal
scaling, and finer control over availability.
Because the data structures (e.g., tree, graph,
key-value) differ from those of an RDBMS, some
operations are faster in NoSQL and others are
faster in an RDBMS.

Four Categories of NoSQL Databases
• Key-Value Stores
• Column Family Stores
• Document Databases
• Graph Databases

Key-Value Stores
• Key-value stores are schema-free NoSQL databases in which every item is stored as a key
together with its value. For example, one record might hold the key “Name” with the value
“Zack”, while the next record is not required to hold a Name at all: different rows can
store different kinds of data, and one row can have more or fewer fields than another.
This is the most common kind of NoSQL database currently on the market, and the other
kinds of NoSQL database build on the same principle and add features on top of it.
• The key-value database is a very simple structure, and many implementations follow the
design of Amazon’s Dynamo. Data is indexed and queried by its key. Dynamo-style key-value
stores use consistent hashing so that they can scale incrementally as your data grows.
They share node membership through a gossip-based protocol to keep all the nodes
synchronized. If you are looking to scale very large sets of low-complexity data,
key-value stores are the best option.
• Examples: Tokyo Cabinet/Tyrant, Redis, Voldemort, Oracle BDB, Amazon SimpleDB, Riak
• Strengths: Fast lookups
• Weaknesses: Stored data has no schema
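
The consistent hashing mentioned above can be sketched in a few lines. This is a minimal illustration rather than the implementation of any listed product: the `HashRing` class, the node names and the choice of MD5 with 100 virtual points per node are all assumptions made for the example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a position on the ring (MD5 is an arbitrary choice here)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent-hash ring with virtual points per node."""

    def __init__(self, nodes, replicas=100):
        self.replicas = replicas
        self._ring = []  # sorted list of (position, node) pairs
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each node owns several points so that keys spread more evenly.
        for i in range(self.replicas):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def node_for(self, key):
        if not self._ring:
            raise ValueError("ring is empty")
        # First point at or after the key's position, wrapping around at the end.
        idx = bisect.bisect_left(self._ring, (_hash(key), ""))
        if idx == len(self._ring):
            idx = 0
        return self._ring[idx][1]
```

When a node is added, only the keys that fall just before one of its new points change owner, and they all move to the new node. That property is what allows a cluster to grow incrementally without reshuffling most of the data.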

Column Family Stores
• These were created to store and process very large amounts of data distributed over many
machines. There are still keys, but each key points to multiple columns, and the columns
are grouped into column families.
• These data stores are modeled on Google’s BigTable. They may look similar to relational
databases on the surface, but under the hood a lot has changed. Each row in a column
family database can have a different set of columns, so the model is not relational and
has nothing that would qualify as a table in an RDBMS. The key concepts in a column family
database are columns, column families and super columns. All you really need to start
with is a column family. Column families define how the data is structured on disk.
• A column by itself is just a key-value pair that exists in a column family. A super column
is like a catalogue: a collection of other columns that cannot itself contain super columns.
• Column family databases are still extremely scalable, but less so than key-value stores.
However, they work better with more complex data sets.
• Examples: Cassandra, HBase
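
To make the row/column relationship concrete, here is a toy in-memory model, assuming a hypothetical `ColumnFamily` class. It is not the Cassandra or HBase API; it only shows that each row key points to its own set of columns, and that a column is simply a name-value pair inside the family.

```python
from collections import defaultdict


class ColumnFamily:
    """Toy model: row key -> {column name: value}; rows need not share columns."""

    def __init__(self, name):
        self.name = name
        self._rows = defaultdict(dict)

    def put(self, row_key, column, value):
        # A column is just a key-value pair stored under a row key.
        self._rows[row_key][column] = value

    def get(self, row_key, column, default=None):
        return self._rows.get(row_key, {}).get(column, default)

    def columns(self, row_key):
        # Column names present for this particular row, in sorted order.
        return sorted(self._rows.get(row_key, {}))


users = ColumnFamily("users")
users.put("zack", "name", "Zack")
users.put("zack", "email", "zack@example.com")
users.put("rusty", "name", "Rusty")  # this row has no "email" column at all
```

Here `users.columns("zack")` and `users.columns("rusty")` return different lists, which is the sense in which the structure is not a relational table.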

Document Databases
• These were inspired by Lotus Notes and are similar to key-value stores. The model is
essentially versioned documents, each of which is a collection of other key-value
collections. The semi-structured documents are stored in formats such as JSON.
• A document database is not a new idea. It powered Lotus Notes, one of the more
prominent communication platforms of the 1990s, which is still in service today together
with its server, Lotus Domino. APIs for document databases often use RESTful web services
with JSON as the message format, which makes it easy to move data in and out.
• A document database has a fairly simple data model based on collections of key-value
pairs. A typical record in a document database would look like this:
    {
      "Subject": "I like Plankton",
      "Author": "Rusty",
      "PostedDate": "5/23/2006",
      "Tags": ["plankton", "baseball", "decisions"],
      "Body": "I decided today that I don't like baseball. I like plankton."
    }
• Examples: CouchDB, MongoDB
• Strengths: Tolerant of incomplete data
• Weaknesses: Query performance, no standard query syntax
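
As a rough sketch of how such documents behave, the record above can be loaded and filtered with nothing more than Python's standard `json` module. The second document and the `find_by_tag` helper are made up for illustration; real document databases expose their own query interfaces.

```python
import json

RECORD_JSON = """
{
  "Subject": "I like Plankton",
  "Author": "Rusty",
  "PostedDate": "5/23/2006",
  "Tags": ["plankton", "baseball", "decisions"],
  "Body": "I decided today that I don't like baseball. I like plankton."
}
"""

# A second, hypothetical document with missing fields: no schema forbids it.
INCOMPLETE_JSON = '{"Subject": "Untitled draft", "Author": "Rusty"}'

posts = [json.loads(RECORD_JSON), json.loads(INCOMPLETE_JSON)]


def find_by_tag(documents, tag):
    """Return documents whose optional "Tags" list contains the given tag."""
    return [doc for doc in documents if tag in doc.get("Tags", [])]
```

The incomplete document is stored and skipped without error, which illustrates the "tolerant of incomplete data" strength. The full scan inside `find_by_tag` hints at why query performance is listed as a weakness.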

Graph Databases
• Instead of tables of rows and columns and the rigid structure of SQL, a flexible graph
model is used which, again, can scale across multiple machines. Many NoSQL databases do
not provide a high-level declarative query language like SQL, which avoids the overhead
of processing such queries. Rather, querying these databases is data-model specific. Many
NoSQL platforms allow for RESTful interfaces to the data, while others offer query APIs.
• Graph databases take document databases to the extreme by introducing the concept of
typed relationships between documents, or nodes. The most common example is the
relationship between people on a social network such as Facebook.
• A graph database is a big, dense network structure. While it could take an RDBMS hours
to sift through a huge linked list of people, a graph database uses graph algorithms such
as shortest path to make these queries more efficient. Although slower than its other
NoSQL counterparts, a graph database can have the most complex structure of them all
and still traverse billions of nodes and relationships very quickly.
• Examples: Neo4j, InfoGrid, InfiniteGraph
• Strengths: Graph algorithms, e.g. shortest path, n-degree relationships, etc.
• Weaknesses: Some queries must traverse the entire graph to reach a definitive answer. Not easy to cluster
When is NoSQL a poor choice?
After spending so long extolling the benefits of the various NoSQL solutions, I would like to
point out at least one scenario where I haven’t seen a good NoSQL replacement for the RDBMS:
reporting. One of the great things about an RDBMS is that, given the information it already
has, it is very easy to massage the data into a lot of interesting forms. That is especially
important when you want to let users analyze the data on their own, for example by
providing a report tool that allows them to query, aggregate and manipulate the data to
their heart’s content. While it is certainly possible to produce reports on top of a NoSQL
store, you will not come close to the level of flexibility that an RDBMS offers. That
flexibility is one of the major benefits of the RDBMS. The NoSQL solutions will tend to
outperform the RDBMS (as long as you stay in the appropriate niche for each NoSQL
solution), and they certainly have a better scalability story, but for user-driven reports
the RDBMS is still my tool of choice.
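
As a small illustration of that flexibility, the sketch below uses Python's built-in `sqlite3` module. The `orders` table, its rows and the `report` helper are invented for the example; the point is only that the same stored data can be regrouped along a different dimension by changing one clause of the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("East", "widget", 10.0),
        ("East", "gadget", 20.0),
        ("West", "widget", 5.0),
    ],
)


def report(connection, group_by):
    """Count and total the orders, grouped by a user-chosen column."""
    if group_by not in {"region", "product"}:
        raise ValueError(f"unsupported grouping column: {group_by}")
    # The column name is checked against a whitelist before being interpolated.
    query = (
        f"SELECT {group_by}, COUNT(*), SUM(amount) "
        f"FROM orders GROUP BY {group_by} ORDER BY {group_by}"
    )
    return connection.execute(query).fetchall()
```

Calling `report(conn, "region")` and then `report(conn, "product")` yields two different summaries of the same rows without any change to how the data is stored.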

Suvradeep Rudra is a Sr. Data Architect with more than 10
years of experience in data management. He has held a number
of roles at Caritor Inc. (now NTT DATA), Oracle, and Deloitte
Consulting. He is experienced in building overall data strategy,
tapping value from data assets and capabilities, and driving
value to the business. He has worked on various projects,
establishing and building data management solutions for
customers in industries such as High Tech, Health
Insurance, Oil and Gas, Payments Services and Banking. His
experience ranges across Data Strategy, Product Strategy,
MDM, Business Intelligence and Analytics, Data Architecture
(Data Warehouse) and Data Governance.
Suvradeep writes and speaks about monetizing a company’s
data and about technology trends.
He holds a Master’s in Computer Applications from the University
of Madras, Chennai, India.
He can be reached via his LinkedIn profile.
