 NoSQL

05-March-2014
Wednesday

Jainul A. Musani
(MCA,MPhil,MTech)

What does it mean?

NOT SQL?

Introduction
• For the past few decades, database professionals have depended on
RDBMS (Relational Database Management Systems) and a single standard
supported by virtually all of them: SQL – Structured Query Language.
• Relational Model – introduced by E. F. Codd in 1970.

Introduction
• RDBMS - Table Oriented

A relational database is used for:
• Storage of data
• Retrieval of data

Staff No | Staff Name  | Post       | Salary | Branch No | Branch Address
---------|-------------|------------|--------|-----------|-----------------------
SL21     | John White  | Manager    | 30,000 | B005      | 22 Deer Rd, London
SG37     | Ann Beech   | Assistant  | 12,000 | B003      | 163 Main St, Glasgow
SG14     | David Ford  | Supervisor | 18,000 | B003      | 163 Main St, Glasgow
SA9      | Mary Howe   | Assistant  |  9,000 | B007      | 16 Argyll St, Aberdeen
SG5      | Susan Brand | Manager    | 24,000 | B003      | 163 Main St, Glasgow
SL41     | Julie Lee   | Assistant  |  9,000 | B005      | 22 Deer Rd, London

Introduction
• The relational model is well suited to client-server programming.
• It is easier to maintain data and write programs against the relational model.
• It is the predominant technology for storing structured data in web and
business applications.

Introduction
• The relational model relies on hard-and-fast, structured rules: the ACID
rules for database transactions.

RDBMS ACID Rules (classical relational database)
• Atomic
• Consistent
• Isolated
• Durable

A.C.I.D. Properties
Atomic
• A transaction's data modifications are either all completed or not
performed at all.

Consistent
• At the end of a transaction, all data is in a consistent state.

A.C.I.D. Properties
Isolated
• Modifications made by one transaction must be independent of any other
transaction [otherwise the outcome will be erroneous].

Durable
• Once a transaction has completed, the modifications it performed must be
permanent in the system.
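
The atomicity rule above can be illustrated with a small sketch. It uses Python's built-in sqlite3 module; the `account` table, the `transfer` and `balances` helpers, and the sample balances are all hypothetical and exist only for this illustration.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; either both updates apply or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            row = conn.execute(
                "SELECT balance FROM account WHERE id = ?", (src,)
            ).fetchone()
            if row is None or row[0] < 0:
                # Abort mid-transaction: the debit above is undone.
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


def balances(conn):
    """Return {account id: balance} for every row."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))


# Hypothetical sample data, held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()
```

A transfer that would overdraw account A returns False and leaves both balances untouched, which is the "either completed or not performed" behaviour described on the slide.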

What is NoSQL?
• Non-relational database management systems,
• different from traditional RDBMS in some significant ways.

What is NoSQL?
• At the core of many NoSQL databases:
• Hash function – a mathematical algorithm that takes variable-length input
and produces a consistent, fixed-length output.
• A key/value pair is stored for later retrieval of the record.
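
As a rough sketch of the idea above, the following hashes a key of any length to a fixed-length digest and uses that digest to store and retrieve a value. SHA-256 is chosen here only for illustration; `key_hash` and `TinyKeyValueStore` are hypothetical names, not part of any particular NoSQL product.

```python
import hashlib


def key_hash(key: str) -> str:
    """Map a key of any length to a fixed-length (64 hex characters) digest."""
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


class TinyKeyValueStore:
    """In-memory key/value store addressed by the hash of the key."""

    def __init__(self):
        self._records = {}

    def put(self, key, value):
        self._records[key_hash(key)] = value

    def get(self, key, default=None):
        return self._records.get(key_hash(key), default)


store = TinyKeyValueStore()
store.put("staff:SL21", {"name": "John White", "post": "Manager"})
```

The same key always produces the same digest, so `store.get("staff:SL21")` finds the record again without scanning anything.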

What is NoSQL?
• Designed for
• distributed data stores with
• very large-scale data storage needs
(for example Google or Facebook, which collect terabits of data every
day for their users).

What is NoSQL?
• These types of data stores may not require a fixed schema, avoid join
operations and typically scale horizontally.

Scaling
• The ability of a system to expand to meet business needs.
Ex. A web application – allowing more people to use it.
• Vertical Scaling
• Horizontal Scaling

Vertical Scaling
• Scale up - add more resources within the same logical unit to increase
capacity.
Ex. Add more CPUs / increase memory / add more hard drives.

Horizontal Scaling
• Scale out - add more nodes to the system.
Ex. Add a new computer to a distributed software application.
In a NoSQL system, the data store can handle much higher load because it
takes advantage of "scaling out".
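
One simple way to picture scaling out is to spread keys across several nodes by hashing the key. The sketch below assumes plain hash-modulo placement and in-memory dictionaries standing in for nodes; `ShardedStore` and the node names are hypothetical. Real systems often use more elaborate schemes such as consistent hashing, so that adding a node moves fewer keys.

```python
import hashlib


class ShardedStore:
    """Spread key/value pairs over several nodes (plain dicts stand in for servers)."""

    def __init__(self, node_names):
        if not node_names:
            raise ValueError("at least one node is required")
        self.nodes = {name: {} for name in node_names}
        self._order = sorted(self.nodes)

    def node_for(self, key):
        """Pick a node deterministically from the hash of the key."""
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._order)
        return self._order[index]

    def put(self, key, value):
        self.nodes[self.node_for(key)][key] = value

    def get(self, key, default=None):
        return self.nodes[self.node_for(key)].get(key, default)


store = ShardedStore(["node-a", "node-b", "node-c"])
for i in range(100):
    store.put(f"user:{i}", i)
```

Each node holds only part of the data, so capacity grows by adding nodes rather than by enlarging a single machine.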

The term "NoSQL"
• First used by Carlo Strozzi in 1998.

Why NoSQL?
• Today, data is becoming easier to access and capture through third
parties such as Facebook, Google+ and others.

Why NoSQL?
• Personal user information,
• social graphs,
• geolocation data,
• user-generated content and
• machine logging data
are just a few examples where the volume of data has been increasing
exponentially.

Why NoSQL?
• To provide these services properly, huge amounts of data must be
processed,
• which SQL databases were never designed for. NoSQL databases evolved to
handle data at this scale.

What's in NoSQL?
• Instead of using structured tables to store multiple related attributes
in a row, NoSQL databases use the concept of a key/value store.

What's in NoSQL?
• No fixed schema for the database.
• It stores a value for each provided key, distributes the values across
the database and then allows their efficient retrieval.

What's in NoSQL?
• The lack of a schema makes complex queries difficult and limits the use
of many NoSQL stores as a transactional database environment.

RDBMS v/s NoSQL
RDBMS
• Structured and organized data
• Structured Query Language (SQL)

NoSQL
• Stands for Not Only SQL
• No declarative query language
• No predefined schema

RDBMS v/s NoSQL
RDBMS
• Data and its relationships are stored in separate tables.
• Data Manipulation Language, Data Definition Language

NoSQL
• Key-value pair storage, column store, document store, graph databases
• Eventual consistency rather than ACID properties

RDBMS v/s NoSQL
RDBMS
• Tight consistency
• ACID transactions

NoSQL
• BASE transactions
• Unstructured and unpredictable data
• CAP theorem
• Prioritizes high performance, high availability and scalability

CAP Theorem (Brewer's Theorem)

CAP Theorem
• The CAP theorem states that, when designing applications for a
distributed architecture, there are three basic requirements that exist
in a special relation to one another.

CAP Theorem
• Consistency - the data in the database remains consistent after the
execution of an operation. For example, after an update operation all
clients see the same data.

CAP Theorem
• Availability - the system is always on (the service guarantees
availability), with no downtime.

CAP Theorem
• Partition Tolerance - the system continues to function even when
communication among the servers is unreliable, i.e. the servers may be
partitioned into multiple groups that cannot communicate with one another.

CAP Theorem
• In theory, it is impossible to fulfill all 3 requirements at the same
time.
• A distributed system can therefore satisfy only 2 of the 3 requirements
at once.

CAP Theorem
• CA - Single-site cluster, so all nodes are always in contact. When a
partition occurs, the system blocks.
• CP - Some data may not be accessible, but the rest is still
consistent/accurate.
• AP - The system is still available under partitioning, but some of the
data returned may be inaccurate.
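
The CP and AP trade-offs above can be mimicked with a toy two-node register. This is only an illustrative model, not how any real database is implemented; `TwoNodeRegister`, `Unavailable` and the node names `n1`/`n2` are hypothetical.

```python
class Unavailable(Exception):
    """Raised when a CP system refuses a request during a partition."""


class TwoNodeRegister:
    """One value replicated on two nodes, in either "CP" or "AP" mode."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.values = {"n1": 0, "n2": 0}
        self.partitioned = False  # True when n1 and n2 cannot communicate

    def write(self, node, value):
        if not self.partitioned:
            # Healthy network: every replica receives the update.
            self.values = {name: value for name in self.values}
        elif self.mode == "CP":
            # Stay consistent by refusing writes that cannot be replicated.
            raise Unavailable("cannot replicate during a partition")
        else:
            # AP: accept the write locally; the replicas now disagree.
            self.values[node] = value

    def read(self, node):
        return self.values[node]
```

During a partition the AP register keeps accepting writes but a read from the other node can return stale data, while the CP register rejects the write and both nodes keep returning the same value.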

The BASE
by Eric Brewer

The BASE
• The CAP theorem states that a distributed computer system cannot
guarantee all of the following three properties at the same time:
    Consistency
    Availability
    Partition tolerance

The BASE
• A BASE system gives up on strong consistency.
• Basically Available indicates that the system does guarantee
availability, in terms of the CAP theorem.

The BASE
• Soft state indicates that the state of the system may change over time,
even without input. This is because of the eventual consistency model.

The BASE
• Eventual consistency indicates that the system will become consistent
over time, provided that it receives no new input during that time.
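
A minimal sketch of eventual consistency, assuming a "last write wins" rule based on timestamps supplied by the caller. This is one of several possible conflict-resolution strategies; `LwwReplica` and `sync` are hypothetical names for this illustration.

```python
class LwwReplica:
    """A replica that keeps, for each key, the value with the newest timestamp."""

    def __init__(self):
        self.data = {}  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


def sync(a, b):
    """Exchange all entries in both directions; newer timestamps win."""
    for key, (timestamp, value) in list(a.data.items()):
        b.write(key, value, timestamp)
    for key, (timestamp, value) in list(b.data.items()):
        a.write(key, value, timestamp)


replica_a, replica_b = LwwReplica(), LwwReplica()
replica_a.write("x", "old", timestamp=1)
replica_b.write("x", "new", timestamp=2)
```

Before `sync(replica_a, replica_b)` runs, the two replicas answer differently for key "x"; once it has run and no new writes arrive, both return the same value.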

ACID v/s BASE
ACID
• Atomicity
• Consistency
• Isolation
• Durability

BASE
• Basically Available
• Soft state
• Eventual consistency

Pros/Cons - NoSQL
Advantages
• High scalability
• Distributed computing
• Lower cost
• Schema flexibility
• Semi-structured data
• No complicated relationships

Pros/Cons - NoSQL
Disadvantages
• No standardization
• Limited query capabilities
• Eventual consistency is not intuitive to program for

Types of NoSQL

Categories of NoSQL database
1) Document Oriented:
Data is stored as documents. An example document might look like:
FirstName="Arun",
Address="St. Xavier's Road",
Spouse=[{Name:"Kiran"}],
Children=[{Name:"Rihit", Age:8}]

Examples: CouchDB, Jackrabbit, MongoDB, OrientDB, SimpleDB, Terrastore
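
The example document above can be written as a nested Python dictionary and serialized to JSON, a format commonly used by document stores. The second document and the `find` helper are hypothetical additions, included only to show that documents in one collection need not share the same fields.

```python
import json

# The document from the slide, expressed as nested Python data.
person = {
    "FirstName": "Arun",
    "Address": "St. Xavier's Road",
    "Spouse": [{"Name": "Kiran"}],
    "Children": [{"Name": "Rihit", "Age": 8}],
}

# Hypothetical second document with a different set of fields.
collection = [person, {"FirstName": "Meera", "Email": "meera@example.com"}]


def find(documents, **criteria):
    """Return the documents whose top-level fields match every criterion."""
    return [
        doc
        for doc in documents
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


as_json = json.dumps(person)  # text form that could be stored or sent
```

`find(collection, FirstName="Arun")` returns the first document only, and the nested children remain reachable as ordinary lists and dictionaries.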

Categories of NoSQL database
2) XML database:
Data is stored in XML format.
Examples: BaseX, eXist, MarkLogic Server, etc.

Categories of NoSQL database
3) Graph databases:
Data is stored as a collection of nodes, where nodes are analogous to
objects in a programming language. Nodes are connected using edges.

Examples: AllegroGraph, DEX, Neo4j, FlockDB, Sones GraphDB
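
A small in-memory sketch of the node-and-edge model described above. The `Graph` class, the `FOLLOWS` label and the sample people are hypothetical; real graph databases add persistence, indexing and a query language on top of this basic idea.

```python
from collections import deque


class Graph:
    """Nodes carry properties; directed, labelled edges connect them."""

    def __init__(self):
        self.nodes = {}  # node id -> properties
        self.edges = {}  # node id -> list of (label, target id)

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties
        self.edges.setdefault(node_id, [])

    def add_edge(self, source, label, target):
        self.edges.setdefault(source, []).append((label, target))

    def reachable(self, start):
        """Breadth-first traversal: every node reachable from `start`."""
        seen = {start}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for _label, target in self.edges.get(current, []):
                if target not in seen:
                    seen.add(target)
                    queue.append(target)
        return seen - {start}


graph = Graph()
for name in ("alice", "bob", "carol"):
    graph.add_node(name, kind="person")
graph.add_edge("alice", "FOLLOWS", "bob")
graph.add_edge("bob", "FOLLOWS", "carol")
```

Following edges from "alice" reaches both "bob" and "carol" without any join operation, which is the kind of query this data model is designed for.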

Categories of NoSQL database
4) Key-value store:
In the key-value store category of NoSQL database, a user can store data
in a schema-less way. Keys are typically strings, and the values stored
against them may be strings, hashes, lists, sets or sorted sets.

Examples: Cassandra, Riak, Redis, memcached, BigTable, etc.

Production deployment
• A large number of companies use NoSQL, including:
• Google, Facebook, Mozilla, Adobe, Foursquare, LinkedIn, Digg,
McGraw-Hill Education, Vermont Public Radio

Market & Business Roadmap of NoSQL

That's all for NoSQL.
Thank you!

NoSQL - 05March2014 Seminar
