Introduction to NoSQL
Agenda
RDBMS & its Limitations
ACID vs. BASE
CAP Theorem
Introduction to NoSQL & its Characteristics
Types of NoSQL Databases
Choosing the right fit
Disadvantages
Relational DBMS
Since 1970
Use SQL to manipulate data
Easy to use
Easy to integrate with other systems
Fits most of our legacy application demands
What is the problem with RDBMS?
BASE
Basically Available: Each request is guaranteed a response, whether the execution succeeded or failed
Soft state: The state of the system may change over time, at times without any input (because of eventual consistency)
Eventual consistency: The database may be momentarily inconsistent but will become consistent eventually
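The soft-state and eventual-consistency points above can be sketched in a few lines. This is a minimal illustration, not how any particular database does it: the `Replica` class, the integer version numbers, and the `anti_entropy` function are all hypothetical names chosen for the example, and real systems use more elaborate conflict resolution.

```python
class Replica:
    """One copy of the data; stores a (version, value) pair per key."""

    def __init__(self):
        self.data = {}

    def write(self, key, value, version):
        # Keep the write only if it is newer than what this replica holds.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


def anti_entropy(replicas):
    """Background sync: push every replica's entries to all the others."""
    for source in replicas:
        for key, (version, value) in list(source.data.items()):
            for target in replicas:
                target.write(key, value, version)


a, b = Replica(), Replica()
a.write("cart", ["book"], version=1)  # accepted by one replica only
stale = b.read("cart")                # b has not seen the write yet
anti_entropy([a, b])                  # state changes without client input
fresh = b.read("cart")                # replicas have now converged
```

Between the write and the sync, a read from `b` is answered (basic availability) but returns old data; after the sync both replicas agree.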
CAP Theorem
Of consistency, availability, and partition tolerance, a distributed system can guarantee only two at the same time.
In almost all cases, you would choose availability over consistency.
NoSQL (Not Only SQL)
A NoSQL database provides a mechanism for storage and retrieval of data that employs less constrained consistency models than traditional relational databases.
Motivations: simplicity of design; horizontal scaling; availability.
NoSQL databases are often highly optimized key-value stores intended for simple retrieval and appending operations, with the goal being significant performance benefits in terms of latency and throughput.
Used for: Big Data and real-time web applications.
Why now?
Characteristics of NoSQL
Large data volumes.
Scalable replication and distribution (Horizontal scaling).
Queries need to return answers quickly.
Asynchronous Inserts & Updates.
Schema-less.
BASE / CAP Theorem.
No JOIN statements.
No complicated relationships.
Less administration time (less cost).
Types of NoSQL Databases
NoSQL DB family includes several DB types:
Column: HBase, Accumulo, Cassandra
Document: MongoDB, Couchbase
Key-value: Dynamo, Riak, Redis, Cache, Project Voldemort
Graph: Neo4j, AllegroGraph, Virtuoso
Key/Value Databases
Data Model: Collection of key/value pairs
Keys and values can be complex compounds
Designed to handle massive load
No complex query filters
All joins must be done in application code
Advantages
Very fast
Very scalable
Simple model
Able to distribute horizontally
Very predictable performance: lookups by key are O(1)
Disadvantages
Many data structures (objects) can't be easily modeled as key/value pairs
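To make "all joins must be in the code" concrete, here is a minimal sketch that uses a plain Python dict as a stand-in for a key/value store. The `put`/`get` helpers, the key naming scheme (`user:1`, `order:10`), and the sample records are assumptions made for the example, not the API of any real product.

```python
# A dict stands in for the key/value store: O(1) average lookup by key.
store = {}


def put(key, value):
    store[key] = value


def get(key, default=None):
    return store.get(key, default)


# Values can be compound structures; the store does not interpret them.
put("user:1", {"name": "Ada", "order_ids": ["order:10", "order:11"]})
put("order:10", {"item": "keyboard", "total": 45})
put("order:11", {"item": "monitor", "total": 180})


def orders_for_user(user_key):
    """The 'join': follow the keys kept in the user record, one get per order."""
    user = get(user_key)
    if user is None:
        return []
    return [get(order_key) for order_key in user["order_ids"]]


orders = orders_for_user("user:1")
total_spent = sum(order["total"] for order in orders)
```

The store never filters or joins; the application decides which keys to fetch and how to combine the results.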
Column Databases
Tables are similar to RDBMS, but semi-structured
Based on Google's Bigtable
Rows can have arbitrary columns
Distributed and decentralized
High availability & fault tolerance
Tunable consistency
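Two of the points above, rows with arbitrary columns and tunable consistency, can be illustrated with a small sketch. The nested-dict layout, the `put_cell`/`get_row` helpers, and the `quorum_overlaps` check are illustrative assumptions; the check encodes the commonly used rule that reads see the latest write when read quorum plus write quorum exceeds the replica count.

```python
from collections import defaultdict

# row key -> column family -> column name -> value
table = defaultdict(lambda: defaultdict(dict))


def put_cell(row_key, family, column, value):
    table[row_key][family][column] = value


def get_row(row_key, family):
    """Return only the columns this row actually has."""
    return dict(table[row_key][family])


# Rows in the same family need not share the same columns.
put_cell("user1", "profile", "name", "Ada")
put_cell("user1", "profile", "email", "ada@example.com")
put_cell("user2", "profile", "name", "Lin")
put_cell("user2", "profile", "twitter", "@lin")


def quorum_overlaps(n_replicas, read_quorum, write_quorum):
    """True when every read quorum must intersect every write quorum."""
    return read_quorum + write_quorum > n_replicas
```

With three replicas, reading and writing at quorum 2 overlaps, while reading and writing at 1 trades consistency for lower latency.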
Document Databases
Inspired by Lotus Notes
Central concept: the document
Documents encapsulate and encode data in a standard format such as XML, YAML, JSON, or BSON
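As a rough illustration of the document model, the sketch below keeps JSON-encodable documents in a list and queries them by field. The `insert` and `find` helpers and the sample documents are hypothetical; real document databases add indexing, nested-field queries, and persistence.

```python
import json

collection = []


def insert(document):
    # Round-trip through JSON so only JSON-encodable documents are accepted.
    collection.append(json.loads(json.dumps(document)))


def find(**criteria):
    """Return documents whose top-level fields equal every given criterion."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


# Documents in one collection can have different fields and nested values.
insert({"_id": 1, "title": "Intro to NoSQL", "tags": ["nosql", "db"],
        "author": {"name": "Ada"}})
insert({"_id": 2, "title": "SQL basics", "pages": 120})

matches = find(title="SQL basics")
encoded = json.dumps(collection[0], sort_keys=True)
```

Each document is self-describing, so adding a new field to one document requires no schema change for the others.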
Graph Database
Based on graph theory: G = (V, E)
Designed for data that is well represented as a graph
Social networks, public transport links, network topologies, road maps
Nodes, edges, and properties are used to represent and store data
Graph relationships are queryable
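A small sketch of G = (V, E) with properties and a relationship query follows. The node names, the `FOLLOWS` relation, and the `neighbors`/`reachable` helpers are made up for illustration; graph databases expose this kind of traversal through their own query languages.

```python
from collections import deque

# V: nodes with properties. E: (source, relation, destination) triples.
nodes = {
    "alice": {"city": "Pune"},
    "bob": {"city": "Oslo"},
    "carol": {"city": "Lima"},
    "dave": {"city": "Kyoto"},
}
edges = [
    ("alice", "FOLLOWS", "bob"),
    ("bob", "FOLLOWS", "carol"),
    ("carol", "FOLLOWS", "dave"),
]


def neighbors(node, relation):
    return [dst for src, rel, dst in edges if src == node and rel == relation]


def reachable(start, relation, max_hops):
    """Breadth-first traversal: nodes within max_hops edges of start."""
    seen = {start}
    frontier = deque([(start, 0)])
    found = []
    while frontier:
        node, hops = frontier.popleft()
        if hops == max_hops:
            continue
        for nxt in neighbors(node, relation):
            if nxt not in seen:
                seen.add(nxt)
                found.append(nxt)
                frontier.append((nxt, hops + 1))
    return found


within_two = reachable("alice", "FOLLOWS", 2)
```

A "friends of friends" question like this is a traversal over edges rather than a chain of table joins, which is the case graph stores are designed for.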
Which one should I choose?
What’s best depends on your data
Priorities
What types of queries do you need to support?
How much data?
Optimized for reads, writes, or updates?
Versioning
How separate is the data from the app? Will other applications need to access it in the future?
And how you want to interact with it
RESTful interface
Query API
Non-SQL query languages
Via indexed values, keys, nodes
File access
It too has disadvantages…
Performance and scalability achieved at the expense of feature support
No joins
Grouping and ordering become more problematic
No SQL
No transactions (in many systems)
Eventual consistency instead of strict consistency
Tools are often lacking
Summary
NoSQL:
Handles huge data volumes.
High availability at low cost.
More data redundancy.
High performance.
Less administration time.
Fewer standards.
SQL:
Well suited to problems that need ACID guarantees.
Expensive.
Less data redundancy.
Increasing availability means increasing cost.
More standards.
More administration.
Thank You !


Editor's Notes

  • #6 Soft state: Data may be time-dependent on user interaction with possible expiration after a period of time. The data must be updated or accessed to remain relevant in the system.