Source: NoSQL Distilled
Prepared by Dr. Dipali Meher
Introduction to NoSQL
Not Only SQL
Dr. Dipali Meher
Assistant Professor
Modern College of Arts, Science and Commerce, Ganeshkhind, Pune 411016
mailtomeher@gmail.com/dipalimeher@moderncollegegk.org
MCS, M.Phil,NET,Ph.D
Agenda
• Introduction
• Why NoSQL?
• Aggregate data models
• Data modeling details
• Distribution models
• Consistency
• Version stamps
• Map-reduce
Introduction
• A NoSQL (originally "non-SQL" or "non-relational") database
provides a mechanism for storage and retrieval of data that is
modeled by means other than the tabular relations used in
relational databases.
• Such databases have existed since the late 1960s, although the
name "NoSQL" only came into use in the early twenty-first century.
• Used in real-time web applications and big data.
• Sometimes called "Not only SQL" to emphasize the fact that they
may support SQL-like query languages.
• ACID transactions are not universal in NoSQL, but some
databases, such as MarkLogic, Aerospike, FairCom c-treeACE, Google
Spanner (though technically a NewSQL database), Symas LMDB, and
OrientDB, have made them central to their designs.
JSON Format
• JSON stands for JavaScript Object Notation.
• JSON objects are used for transferring data between server and client.
XML serves the same purpose, but JSON has several advantages over
XML, which are discussed below along with JSON concepts and usage.
• Example JSON object (assigned to a JavaScript variable):
var chaitanya =
{ "firstName" : "Chaitanya",
"lastName" : "Singh",
"age" : "28" };
Features of JSON
• It is light-weight
• It is language independent
• Easy to read and write
• Text based, human readable data exchange format
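As a small illustration of language independence, the object from the JSON Format slide can be read and written from Python with the standard `json` module. This is only a sketch; the variable names are chosen for this example.

```python
import json

# The object from the slide, written as JSON text.
text = '{"firstName": "Chaitanya", "lastName": "Singh", "age": "28"}'

person = json.loads(text)  # JSON text -> Python dict
full_name = person["firstName"] + " " + person["lastName"]

# "age" is stored as a string in the slide's example,
# so it must be converted before doing arithmetic.
age_next_year = int(person["age"]) + 1

round_trip = json.dumps(person)  # Python dict -> JSON text
```

The same text could be parsed unchanged by JavaScript, Java or any other language with a JSON library.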
Why use JSON?
• Standard structure: JSON objects have a standard structure, which
makes code easier for developers to read and write because they
know what to expect from JSON.
• Lightweight: When working with AJAX, it is important to load
data quickly and asynchronously without reloading the page.
Because JSON is lightweight, the requested data is quicker to
fetch and load.
• Scalable: JSON is language independent, so it works well with
most modern programming languages. If the server-side language
has to change, the change is easier to make because the JSON
structure is the same for all languages.
Example: difference between JSON and XML style DB
JSON style:
{"students":
[ {"name":"John", "age":"23", "city":"Agra"},
{"name":"Steve", "age":"28", "city":"Delhi"},
{"name":"Peter", "age":"32", "city":"Chennai"},
{"name":"Chaitanya", "age":"28", "city":"Bangalore"}
]}
XML style:
<students>
<student> <name>John</name> <age>23</age> <city>Agra</city> </student>
<student> <name>Steve</name> <age>28</age> <city>Delhi</city> </student>
<student> <name>Peter</name> <age>32</age> <city>Chennai</city> </student>
<student> <name>Chaitanya</name> <age>28</age> <city>Bangalore</city> </student>
</students>
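To make the comparison concrete, the sketch below loads a shortened version of both documents (two students only) in Python and checks that they carry the same records. It uses only the standard `json` and `xml.etree.ElementTree` modules; the variable names are assumptions of this example.

```python
import json
import xml.etree.ElementTree as ET

json_text = (
    '{"students": ['
    '{"name": "John", "age": "23", "city": "Agra"},'
    '{"name": "Steve", "age": "28", "city": "Delhi"}]}'
)
xml_text = (
    "<students>"
    "<student><name>John</name><age>23</age><city>Agra</city></student>"
    "<student><name>Steve</name><age>28</age><city>Delhi</city></student>"
    "</students>"
)

# JSON maps directly onto lists and dicts.
from_json = json.loads(json_text)["students"]

# XML needs an explicit walk over the element tree
# to reach the same list-of-records shape.
from_xml = [
    {field.tag: field.text for field in student}
    for student in ET.fromstring(xml_text)
]

same_records = from_json == from_xml
```

Both forms hold the same data; the JSON form simply needs no extra mapping step.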
Limitations of Relational DB
• In a relational database, the structure and schema of the data
must be defined first; only then can the data be processed.
• Relational database systems provide consistency and integrity
of data by enforcing ACID properties (Atomicity, Consistency,
Isolation and Durability). In some scenarios, such as banking
systems, this is essential. In many other cases, however, these
guarantees add performance overhead and can slow down database
responses.
• Many applications handle their data in JSON format, and an
RDBMS does not offer a natural way to perform operations such as
create, insert, update and delete on such data.
• Many NoSQL databases, on the other hand, store their data in a
JSON-like format, which matches the way most of today's
applications work.
RDBMS vs NoSQL
• RDBMS: Handles structured data; provides more functionality
but generally gives lower performance.
• NoSQL: Handles structured or semi-structured data; provides less
functionality but higher performance.
So what is missing when we say NoSQL has less functionality?
• Constraints are typically not enforced by NoSQL databases.
• Joins are typically not supported in NoSQL.
• Supporting these features hinders the scalability of a
database, so with a NoSQL database like MongoDB you implement
them at the application level.
When to go for NoSQL:
Choose NoSQL over a relational database when:
• You want to store and retrieve a huge amount of data.
• The relationships between the data you store are not that
important.
• The data is unstructured and changes over time.
• Support for constraints and joins is not required at the
database level.
• The data is growing continuously and you need to scale the
database regularly to handle it.
Why NoSQL?
• NoSQL databases are different from relational databases like MySQL.
• In a relational database you need to create the table, define the schema,
set the data types of fields and so on before you can actually insert data.
• In NoSQL you don't have to worry about that; you can insert and update
data on the fly.
• One advantage of NoSQL databases is that they are easy to scale and
are often faster for the simple operations we perform on a database.
• There are certain situations where you would prefer a relational
database over NoSQL; however, when you are dealing with a huge
amount of data, a NoSQL database is often the better choice.
Introduction Continued….
• Motivations for NoSQL include simplicity of design,
• simpler horizontal scaling to clusters of machines, and
• finer control over availability.
• The data structures used by NoSQL databases are different
from those used by default in relational databases, which makes
some operations faster in NoSQL.
• The data structures used by NoSQL databases are also flexible.
Differentiate between SQL and NOSQL
Barriers to NoSQL
• Low-level query languages
• Lack of standardized interfaces
• Huge previous investments in existing relational databases
• Many NoSQL stores lack true ACID (Atomicity, Consistency, Isolation,
Durability) transactions
Types of NoSQL DB
• Key-value store: Memcached, Redis, Coherence
• Tabular (column-family): HBase, Bigtable, Accumulo
• Document based: MongoDB, CouchDB, Cloudant
• MongoDB, for example, falls in the category of NoSQL document-based
databases.
Other problems faced by NoSQL
• Stale reads: Most NoSQL databases offer eventual consistency,
in which database changes are propagated to all nodes
"eventually". Until that happens, a query may not return the
latest data, or may read data that is no longer accurate. This
problem is known as stale reads.
• Some NoSQL systems may exhibit lost writes and other forms of data loss.
• Data consistency is a bigger challenge.
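The stale-read problem can be sketched with a toy model in Python. This is not how any particular database replicates data: the classes `Replica` and `EventuallyConsistentStore`, and the explicit `sync()` step that stands in for background propagation, are assumptions made for illustration.

```python
class Replica:
    """One node holding a copy of the data."""
    def __init__(self):
        self.data = {}


class EventuallyConsistentStore:
    """Toy model: writes go to the primary; the secondary only
    catches up when sync() is called."""
    def __init__(self):
        self.primary = Replica()
        self.secondary = Replica()
        self.pending = []  # changes not yet propagated

    def write(self, key, value):
        self.primary.data[key] = value
        self.pending.append((key, value))

    def read_from_secondary(self, key):
        return self.secondary.data.get(key)

    def sync(self):
        # Apply pending changes in order, then clear the queue.
        for key, value in self.pending:
            self.secondary.data[key] = value
        self.pending.clear()


store = EventuallyConsistentStore()
store.write("stock", 10)
store.sync()
store.write("stock", 9)                     # not yet propagated
stale = store.read_from_secondary("stock")  # reads the old value
store.sync()
fresh = store.read_from_secondary("stock")  # now up to date
```

Between the second write and the second `sync()`, a reader on the secondary sees the old value; that window is the stale read.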
Advantages
• High scalability: NoSQL databases use sharding for horizontal
scaling. Sharding means partitioning the data and placing it on multiple
machines in such a way that the order of the data is preserved.
• Vertical scaling means adding more resources to the existing
machine.
• Horizontal scaling means adding more machines to handle the
data. Vertical scaling is not that easy to implement, whereas
horizontal scaling is comparatively easy.
• Examples of horizontally scaling databases are MongoDB,
Cassandra etc.
• NoSQL can handle huge amounts of data because of this scalability:
as the data grows, the database scales out to handle it
efficiently.
• High availability: The replication feature in NoSQL databases makes
them highly available because, in case of a failure, the data can be
restored from a replica to the last consistent state.
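One way to partition data while keeping its order is range-based sharding. The sketch below is a minimal in-memory illustration, not a real database's algorithm: the split points `"h"` and `"p"`, the three dictionaries standing in for machines, and the helper names are all assumptions.

```python
import bisect

# Keys below "h" go to shard 0, keys from "h" up to (not including) "p"
# go to shard 1, everything else goes to shard 2.
SPLIT_POINTS = ["h", "p"]
shards = [{} for _ in range(len(SPLIT_POINTS) + 1)]


def shard_for(key):
    """Return the index of the shard responsible for this key."""
    return bisect.bisect_right(SPLIT_POINTS, key)


def put(key, value):
    shards[shard_for(key)][key] = value


def get(key):
    return shards[shard_for(key)].get(key)


for name in ["alice", "zoe", "martin", "bob"]:
    put(name, {"name": name})

# Reading the shards one after another yields keys in sorted order.
ordered_keys = [key for shard in shards for key in sorted(shard)]
```

Adding a machine then means adding a split point and moving one range of keys. Hash-based sharding is a common alternative, but it does not preserve key order.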
Disadvantages of NoSQL
• Narrow focus: NoSQL databases have a narrow focus; they are mainly designed
for storage and provide little other functionality. Relational databases are a better
choice than NoSQL for transaction management.
• Open source: Most NoSQL databases are open source, and there is no reliable standard
for NoSQL yet. In other words, two database systems are likely to differ.
• Management challenge: The purpose of big data tools is to make management of a
large amount of data as simple as possible, but it is not so easy. Data management
in NoSQL is much more complex than in a relational database. NoSQL, in particular,
has a reputation for being challenging to install and even more demanding to manage on
a daily basis.
• GUI is not available: Flexible GUI tools for accessing the database are not widely
available in the market.
• Backup: Backup is a weak point for some NoSQL databases. MongoDB, for example,
has been criticized for lacking a simple way to back up data in a consistent manner.
• Large document size: Some database systems like MongoDB and CouchDB store
data in JSON format. Documents can therefore be quite large (which matters for big data,
network bandwidth and speed), and descriptive key names actually hurt, since
they increase the document size.
When should NoSQL be used
• When a huge amount of data needs to be stored and
retrieved.
• When the relationships between the data you store are not that
important.
• When the data is unstructured and changes over time.
• When support for constraints and joins is not required at the
database level.
• When the data is growing continuously and you need to scale
the database regularly to handle it.
RDBMS
• Relational databases have been a successful technology for twenty
years, providing persistence, concurrency control, and an integration
mechanism.
• Application developers have been frustrated with the
impedance mismatch between the relational model and
the in-memory data structures.
• There is a movement away from using databases as
integration points towards encapsulating databases
within applications and integrating through services.
• The vital factor for a change in data storage was the need
to support large volumes of data by running on clusters.
Relational databases are not designed to run efficiently
on clusters.
Impedance mismatch
Impedance mismatch is the term used for the problems that occur
because of differences between the database model and the
programming language model.
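A small Python sketch can make the mismatch visible. The nested `order` object and the two hypothetical tables (`orders` and `order_lines`, represented here as plain tuples) are assumptions for illustration; the point is the translation code needed in each direction.

```python
# In memory, an order is one nested object.
order = {
    "id": 1,
    "customer": "Ann",
    "lines": [
        {"product": "pen", "qty": 2},
        {"product": "ink", "qty": 1},
    ],
}


def to_rows(order):
    """Split one nested object into rows for two hypothetical tables."""
    order_row = (order["id"], order["customer"])
    line_rows = [
        (order["id"], line["product"], line["qty"])
        for line in order["lines"]
    ]
    return order_row, line_rows


def from_rows(order_row, line_rows):
    """Reassemble the nested object from the flat rows (a join in miniature)."""
    order_id, customer = order_row
    return {
        "id": order_id,
        "customer": customer,
        "lines": [
            {"product": product, "qty": qty}
            for (oid, product, qty) in line_rows
            if oid == order_id
        ],
    }


order_row, line_rows = to_rows(order)
rebuilt = from_rows(order_row, line_rows)
```

Every read and write has to pass through mapping code like this, which is the overhead the term describes.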
NoSQL
• NoSQL is an accidental neologism. There is no prescriptive
definition; all you can make is an observation of common
characteristics.
• The common characteristics of NoSQL databases are:
  • Not using the relational model
  • Running well on clusters
  • Open source
  • Built for the 21st century web estates
  • Schemaless
• The most important result of the rise of NoSQL is polyglot
persistence.
Aggregate Data Models
• An aggregate is a collection of data that we interact with as
a unit.
• These units of data, or aggregates, form the boundaries for
ACID operations with the database.
• Key-value, document, and column-family databases can all be
seen as forms of aggregate-oriented database.
Data Model
• A data model is the model through which we perceive
and manipulate our data
• The data model describes how we interact with the data
in the database
• A data model (or datamodel) is an abstract model that
organizes elements of data and standardizes how they
relate to one another and to the properties of real-world
entities. For instance, a data model may specify that the
data element representing a car be composed of a
number of other elements which, in turn, represent the
color and size of the car and define its owner.
• A data model is described using concepts such as entities,
attributes, relations, or tables.
• Data models are distinct from storage models.
• A storage model describes how the database stores and
manipulates the data internally.
• A storage model captures the key physical aspects of the
data structure in a data store.
Storage model
• Ideally we should be ignorant of the storage model, but
in practice we need at least some inkling (a rough idea) of it,
primarily to achieve decent (acceptably good) performance.
"Data model" often means the model of the specific data in
an application. A developer might point to an entity-relationship
diagram of their database and refer to that as
their data model, containing customers, orders and products.
Metamodel: the model by which the database organizes data.
Aggregates
• Aggregate orientation recognizes that you often want to
operate on data in units that have a more
complex structure than a set of tuples. It
can be handy to think in terms of a
complex record that allows lists and other
record structures to be nested inside it.
• Complex record = aggregate.
• Programmers manipulate data through
aggregate structures.
• The term "aggregate" comes from Domain-Driven Design.
• An aggregate is a collection of related objects treated as a unit.
• It is the unit for data manipulation and management of consistency.
• Aggregates are updated with atomic operations.
• Key-value, document, and column-family databases all work this way.
• When a database runs on a cluster, aggregates make it easier to operate.
• Why easier? An aggregate makes a natural unit for replication and sharding.
Sharding
Relations and Aggregates: example
• E-commerce website: relational database
Sample data for this model.
Aggregate oriented structure
JSON format
• In this model, we have two main aggregates: customer
and order.
Embed all the objects for customer and the customer’s orders
• There's no universal answer for how to draw your aggregate
boundaries.
• It depends entirely on how you tend to manipulate your data.
• If you tend to access a customer together with all of that
customer's orders at once, then you would prefer a single
aggregate.
• However, if you tend to focus on accessing a single order at a
time, then you should prefer having separate aggregates for
each order.
• Naturally, this is very context-specific; some applications will
prefer one or the other, even within a single system, which is
exactly why many people prefer aggregate ignorance.
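The two choices of aggregate boundary can be sketched with plain Python dictionaries standing in for stored aggregates. The sample customer, orders and helper functions below are hypothetical and exist only for illustration.

```python
# Option 1: one aggregate per customer, with the orders embedded.
customer_with_orders = {
    "id": 1,
    "name": "Ann",
    "orders": [
        {"id": 99, "items": [{"product": "book", "price": 30}]},
        {"id": 100, "items": [{"product": "pen", "price": 2}]},
    ],
}

# Option 2: separate aggregates; each order refers to its customer by ID.
customers = {1: {"id": 1, "name": "Ann"}}
orders = {
    99: {"id": 99, "customer_id": 1,
         "items": [{"product": "book", "price": 30}]},
    100: {"id": 100, "customer_id": 1,
          "items": [{"product": "pen", "price": 2}]},
}


def order_ids_embedded(customer):
    """One lookup returns the customer and all orders together."""
    return [order["id"] for order in customer["orders"]]


def order_ids_separate(customer_id):
    """Orders must be searched (or indexed) by customer ID."""
    return sorted(
        order["id"]
        for order in orders.values()
        if order["customer_id"] == customer_id
    )


single_order = orders[99]  # cheap with option 2: one order, one lookup
```

Option 1 favours "customer with full history" reads; option 2 favours processing one order at a time.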
Summary of aggregate data models
• An aggregate is a collection of data that we interact with
as a unit. Aggregates form the boundaries for ACID
operations with the database.
• Key-value, document, and column-family databases can
all be seen as forms of aggregate oriented database.
• Aggregates make it easier for the database to manage
data storage over clusters.
• Aggregate-oriented databases work best when most data
interaction is done with the same aggregate; aggregate-
ignorant databases are better when interactions use data
organized in many different formations.
Aggregate Data Models Continued…
• Aggregates make it easier for the database to manage data storage over
clusters, since the unit of data now could reside on any machine and
when retrieved from the database gets all the related data along with it.
• Aggregate-oriented databases work best when most data interaction is
done with the same aggregate.
• For example, when there is a need to get an order and all its details, it is
better to store the order as an aggregate object; but working across these
aggregates to get item details on all the orders is not elegant.
• Aggregate-oriented databases make inter-aggregate relationships more
difficult to handle than intra-aggregate relationships.
• Aggregate-ignorant databases are better when interactions use data
organized in many different formations.
• Aggregate-oriented databases often compute materialized views to
provide data organized differently from their primary aggregates. This
is often done with map-reduce computations, such as a map-reduce job
to get items sold per day.
Details of Data Models
• Relationships
• Graph Databases
• Schemaless databases
• Materialized Views
• Modeling for Data Access
Aggregates are a central part of the NoSQL story
Relationships
• Create aggregates for data that is commonly accessed together.
• In real life, the same data may be accessed differently by
different applications.
• Example: one customer has many orders.
• Some applications will want to access the order history whenever they
access the customer; this fits in well with combining the customer with
their order history into a single aggregate.
• Other applications, however, want to process orders individually and
thus model orders as independent aggregates. In this situation the
customer and order aggregates are separate but keep the same
relationship (one customer, many orders).
• Many databases, even key-value stores, provide ways to make these
relationships visible to the database. Document stores make the
content of the aggregate available to the database to form indexes and
queries. Riak, a key-value store, allows you to put link information in
metadata, supporting partial retrieval and link-walking capability.
Important aspect about
relationship and aggregates
• How are updates handled?
• Aggregate-oriented databases treat the aggregate as the unit of
data retrieval.
• Consequently, atomicity is only supported within the contents of a
single aggregate.
• If you update multiple aggregates at once, you have to deal with
a failure partway through yourself.
• Relational databases help you with this by allowing you to modify
multiple records in a single transaction, providing ACID guarantees
while altering many rows.
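What "dealing with a failure partway through" can look like is sketched below. The in-memory `store`, the `save` function with its simulated failure flag, and the two account aggregates are assumptions for illustration; a real application would need a more robust compensation strategy.

```python
# Two aggregates in a toy key-value store.
store = {
    "account:a": {"balance": 100},
    "account:b": {"balance": 50},
}


class StoreDown(Exception):
    """Simulated failure while saving an aggregate."""


def save(key, aggregate, fail=False):
    """Replace one aggregate atomically; nothing spans two aggregates."""
    if fail:
        raise StoreDown(key)
    store[key] = aggregate


def transfer(amount, fail_second_write=False):
    a = dict(store["account:a"])
    b = dict(store["account:b"])
    a["balance"] -= amount
    b["balance"] += amount
    save("account:a", a)  # first aggregate is now updated
    try:
        save("account:b", b, fail=fail_second_write)
    except StoreDown:
        # No transaction to roll back: the application compensates
        # by restoring the first aggregate itself.
        a["balance"] += amount
        save("account:a", a)
        return False
    return True


failed = transfer(30, fail_second_write=True)
after_failure = (store["account:a"]["balance"], store["account:b"]["balance"])
succeeded = transfer(30)
after_success = (store["account:a"]["balance"], store["account:b"]["balance"])
```

In a relational database both writes would sit inside one transaction and the rollback would be automatic.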
• So when the data contains lots of relationships, an RDBMS
may be the better choice.
Graph Databases
• Graph databases are an odd fish in the NoSQL pond
• Most NoSQL databases run on clusters and are
aggregate oriented.
• Aggregate data models consist of large records with
simple connections.
• Graph databases, in contrast, have small records with
complex interconnections. See the example on the next slide.
Here a graph isn't a bar chart or histogram; instead, it refers to a graph data
structure of nodes connected by edges.
• Queries in graph databases differ from queries in relational
databases. In a graph database we keep the network structure of
the graph in mind when asking a query. In an RDBMS we keep the
schema in mind (foreign keys, joins).
• In a graph query language, the answer to a query is found
by navigating through the network of edges.
• The emphasis on relationships makes graph databases very different
from aggregate-oriented databases: most of the query work is
navigating (following) relationships.
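Navigation over edges can be sketched in a few lines of Python. This is not a graph database or a graph query language; the edge list, the labels `friend` and `likes`, and the people and book names are illustrative assumptions.

```python
# Each edge is (source node, relationship label, destination node).
edges = [
    ("Anna", "friend", "Barbara"),
    ("Barbara", "friend", "Carol"),
    ("Anna", "likes", "NoSQL Distilled"),
    ("Carol", "likes", "Refactoring"),
]


def neighbours(node, label):
    """Follow all outgoing edges with the given label."""
    return [dst for (src, lbl, dst) in edges if src == node and lbl == label]


def friends_of_friends(node):
    """Navigate two 'friend' hops away from the starting node."""
    found = set()
    for friend in neighbours(node, "friend"):
        for fof in neighbours(friend, "friend"):
            if fof != node:
                found.add(fof)
    return sorted(found)


def books_liked_by(people):
    return sorted({book for p in people for book in neighbours(p, "likes")})


fof = friends_of_friends("Anna")
books = books_liked_by(fof)
```

The query is expressed as a path to walk ("friend, friend, likes") rather than as joins over foreign keys.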
Navigation in graph databases
• The emphasis on relationships makes graph databases
very different from aggregate-oriented databases.
Schemaless Databases
• A common theme across all the forms of NoSQL
databases is that they are schemaless.
• NoSQL: storing data is much more casual.
• A key-value store allows you to store any data you like under
a key.
• A document database effectively does the same thing, since
it makes no restrictions on the structure of the documents
you store.
• Column-family databases allow you to store any data under
any column you like.
• Graph databases allow you to freely add new edges and
freely add properties to nodes and edges as you wish.
Schemaless databases
• Freedom and flexibility.
• With a schema: you have to figure out in advance what you need to
store, document it and diagram it, which is hard to do.
• Without a schema there is nothing binding you: you can easily change
your data storage as you learn more about your project.
• You can easily add new things as you discover them.
• If you no longer need certain attributes, you can simply stop
storing them; this is allowed in NoSQL.
• A schemaless store also makes it easier to deal with nonuniform data.
Schemaless databases: Nonuniform data
• Nonuniform data is data where each record has a different set of fields.
• A schema puts all rows of a table into a straitjacket, which
becomes awkward if you have different kinds of data in
different rows. You either end up with lots of columns that are
usually null (a sparse table), or you end up with meaningless
columns like "custom column".
• Schemalessness avoids this, allowing each record to
contain just what it needs: no more, no less.
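The contrast between a sparse table and schemaless records can be sketched in Python. The three sample records and their field names are hypothetical.

```python
# Schemaless: each record carries only the fields it needs.
records = [
    {"id": 1, "name": "Ann", "email": "ann@example.com"},
    {"id": 2, "name": "Bob", "phone": "555-0100", "twitter": "@bob"},
    {"id": 3, "name": "Cy"},
]

# A fixed schema would need one column for every field seen anywhere.
all_fields = sorted({field for record in records for field in record})

# The same data forced into that table: missing values become None (null).
sparse_rows = [[record.get(field) for field in all_fields] for record in records]

null_cells = sum(cell is None for row in sparse_rows for cell in row)
total_cells = len(sparse_rows) * len(all_fields)
```

Here 6 of the 15 table cells are null, while the schemaless records store no placeholders at all.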
In a schemaless database, implicit
schemas are still present.
• An implicit schema is a set of assumptions about the data's
structure in the code that manipulates the data.
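A short Python sketch shows where an implicit schema lives. The function `order_total` and the field names `items`, `price`, `qty` and `cost` are assumptions for illustration.

```python
def order_total(order):
    # Implicit schema: this code assumes "items" is a list of records,
    # each with numeric "price" and "qty" fields.
    return sum(item["price"] * item["qty"] for item in order["items"])


good = {"items": [{"price": 5, "qty": 2}, {"price": 1, "qty": 3}]}

# A schemaless store accepts this record without complaint,
# but the field was renamed, so the reading code breaks.
renamed = {"items": [{"cost": 5, "qty": 2}]}

total = order_total(good)

try:
    order_total(renamed)
    broke = False
except KeyError:
    broke = True
```

The database enforces nothing, so the assumptions in the application code act as the schema.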
So, finally, what does schemaless mean?
Materialized Views
• Views in an RDBMS
• Views provide a mechanism to hide from the client whether data is derived data or base data, but
they can't avoid the fact that some views are expensive to compute.
• To cope with this, materialized views were invented, which are views that are computed in
advance and cached on disk.
• Materialized views are effective for data that is read heavily but can stand being somewhat
stale (that is, slightly out of date; the view is meant for reading, and no DDL or DML is run
against the view itself).
• NoSQL databases don't have views, but they may have precomputed and cached
queries, and they reuse the term
"materialized view" to describe them.
• The map-reduce technique is used.
• Map-reduce is a data processing paradigm for condensing large volumes of data into useful
aggregated results.
• Materialized views can be used within the same aggregate.
Two main ways to build a materialized view
• Eager approach: update the materialized view at the
same time you update the base data for it. This approach is
good when you have more frequent reads of the materialized
view than you have writes and you want the materialized views
to be as fresh as possible.
• The application database approach is valuable here, because it makes
it easier to ensure that any update to base data also updates the
materialized views.
• Batch approach: if you don't want to pay that overhead on each update,
you can run batch jobs to update the materialized views at regular intervals.
• Views can also be built outside of the database by reading the data,
computing the view, and saving it back to the database.
• Views are updated with the map-reduce technique.
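The difference between the two approaches can be sketched with a toy in-memory store. The class `SalesStore`, its `eager` flag and the `rebuild_view` batch job are hypothetical names invented for this illustration.

```python
class SalesStore:
    """Base data (orders) plus a materialized view (quantity per product)."""

    def __init__(self, eager):
        self.eager = eager
        self.orders = []            # base data
        self.total_by_product = {}  # materialized view

    def add_order(self, product, qty):
        self.orders.append((product, qty))
        if self.eager:
            # Eager: pay a little on every write, view is always fresh.
            self.total_by_product[product] = (
                self.total_by_product.get(product, 0) + qty
            )

    def rebuild_view(self):
        """Batch job: recompute the whole view from the base data."""
        view = {}
        for product, qty in self.orders:
            view[product] = view.get(product, 0) + qty
        self.total_by_product = view


eager_store = SalesStore(eager=True)
batch_store = SalesStore(eager=False)
for store in (eager_store, batch_store):
    store.add_order("pen", 2)
    store.add_order("pen", 3)

eager_view = dict(eager_store.total_by_product)          # already fresh
stale_batch_view = dict(batch_store.total_by_product)    # empty until the job runs
batch_store.rebuild_view()
fresh_batch_view = dict(batch_store.total_by_product)
```

The eager store answers reads immediately; the batch store trades freshness for cheaper writes.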
MAP REDUCE
• A MapReduce job usually splits the input data-set into
independent chunks which are processed by
the map tasks in a completely parallel manner.
• The framework sorts the outputs of the maps, which are
then input to the reduce tasks.
• Typically both the input and the output of the job are
stored in a file-system.
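The map, sort and reduce stages described above can be imitated in a single Python process. This is only a sketch of the idea, not a distributed framework: the sample orders and the function names `map_order`, `shuffle` and `reduce_group` are assumptions. It computes items sold per day.

```python
from collections import defaultdict

# Hypothetical input: each order is an independent chunk of work.
orders = [
    {"date": "2024-01-01",
     "items": [{"product": "pen", "qty": 2}, {"product": "ink", "qty": 1}]},
    {"date": "2024-01-01", "items": [{"product": "pen", "qty": 3}]},
    {"date": "2024-01-02", "items": [{"product": "ink", "qty": 4}]},
]


def map_order(order):
    """Map: emit ((date, product), quantity) pairs for one order."""
    for item in order["items"]:
        yield (order["date"], item["product"]), item["qty"]


def shuffle(pairs):
    """Group the map output by key and sort it, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_group(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


mapped = [pair for order in orders for pair in map_order(order)]
items_sold_per_day = dict(reduce_group(k, v) for k, v in shuffle(mapped))
```

Because each call to `map_order` looks at one order only, the map stage could run in parallel on different machines.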
Key points
• Aggregate-oriented databases make inter-aggregate
relationships more difficult to handle than intra-aggregate
relationships.
• Graph databases organize data into node and edge graphs;
they work best for data that has complex relationship
structures.
• Schemaless databases allow you to freely add fields to
records, but there is usually an implicit schema expected by
users of the data.
• Aggregate-oriented databases often compute materialized
views to provide data organized differently from their
primary aggregates. This is often done with map-reduce
computations.
Thank You
72

More Related Content

What's hot

What's hot (20)

Consistency in NoSQL
Consistency in NoSQLConsistency in NoSQL
Consistency in NoSQL
 
6 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/26 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/2
 
Nosql seminar
Nosql seminarNosql seminar
Nosql seminar
 
NoSQL databases
NoSQL databasesNoSQL databases
NoSQL databases
 
An Intro to NoSQL Databases
An Intro to NoSQL DatabasesAn Intro to NoSQL Databases
An Intro to NoSQL Databases
 
NOSQL Databases types and Uses
NOSQL Databases types and UsesNOSQL Databases types and Uses
NOSQL Databases types and Uses
 
A Seminar on NoSQL Databases.
A Seminar on NoSQL Databases.A Seminar on NoSQL Databases.
A Seminar on NoSQL Databases.
 
Non relational databases-no sql
Non relational databases-no sqlNon relational databases-no sql
Non relational databases-no sql
 
Hadoop Overview & Architecture
Hadoop Overview & Architecture  Hadoop Overview & Architecture
Hadoop Overview & Architecture
 
Introduction to NoSQL
Introduction to NoSQLIntroduction to NoSQL
Introduction to NoSQL
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
 
Big Data and Hadoop
Big Data and HadoopBig Data and Hadoop
Big Data and Hadoop
 
NoSQL databases
NoSQL databasesNoSQL databases
NoSQL databases
 
NoSql
NoSqlNoSql
NoSql
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
Intro to HBase
Intro to HBaseIntro to HBase
Intro to HBase
 
RDBMS vs NoSQL
RDBMS vs NoSQLRDBMS vs NoSQL
RDBMS vs NoSQL
 
9. Document Oriented Databases
9. Document Oriented Databases9. Document Oriented Databases
9. Document Oriented Databases
 
Mongo db intro.pptx
Mongo db intro.pptxMongo db intro.pptx
Mongo db intro.pptx
 
An Introduction To NoSQL & MongoDB
An Introduction To NoSQL & MongoDBAn Introduction To NoSQL & MongoDB
An Introduction To NoSQL & MongoDB
 

Similar to Introduction to NoSQL

Sql vs NO-SQL database differences explained
Sql vs NO-SQL database differences explainedSql vs NO-SQL database differences explained
Sql vs NO-SQL database differences explainedSatya Pal
 
Chapter1: NoSQL: It’s about making intelligent choices
Chapter1: NoSQL: It’s about making intelligent choicesChapter1: NoSQL: It’s about making intelligent choices
Chapter1: NoSQL: It’s about making intelligent choicesMaynooth University
 
1. introduction to no sql
1. introduction to no sql1. introduction to no sql
1. introduction to no sqlAnuja Gunale
 
NoSQLDatabases
NoSQLDatabasesNoSQLDatabases
NoSQLDatabasesAdi Challa
 
Relational and non relational database 7
Relational and non relational database 7Relational and non relational database 7
Relational and non relational database 7abdulrahmanhelan
 
Introduction to NoSQL database technology
Introduction to NoSQL database technologyIntroduction to NoSQL database technology
Introduction to NoSQL database technologynicolausalex722
 
Presentation On NoSQL Databases
Presentation On NoSQL DatabasesPresentation On NoSQL Databases
Presentation On NoSQL DatabasesAbiral Gautam
 
MongoDB Lab Manual (1).pdf used in data science
MongoDB Lab Manual (1).pdf used in data scienceMongoDB Lab Manual (1).pdf used in data science
MongoDB Lab Manual (1).pdf used in data sciencebitragowthamkumar1
 
NoSQL - Not Only SQL
NoSQL - Not Only SQLNoSQL - Not Only SQL
NoSQL - Not Only SQLEasyData
 
Unit II -BIG DATA ANALYTICS.docx
Unit II -BIG DATA ANALYTICS.docxUnit II -BIG DATA ANALYTICS.docx
Unit II -BIG DATA ANALYTICS.docxvvpadhu
 
Ccs334 Big data analytics UNIT II ppt notes
Ccs334   Big data analytics UNIT II ppt notesCcs334   Big data analytics UNIT II ppt notes
Ccs334 Big data analytics UNIT II ppt notesVasanthB27
 
Introduction to NoSQL
Introduction to NoSQLIntroduction to NoSQL
Introduction to NoSQLbalwinders
 
Why no sql ? Why Couchbase ?
Why no sql ? Why Couchbase ?Why no sql ? Why Couchbase ?
Why no sql ? Why Couchbase ?Ahmed Rashwan
 

Introduction to NoSQL

  • 1. Introduction to NoSQL (Not Only SQL)
      Source: NoSQL Distilled
      Prepared by Dr. Dipali Meher (MCS, M.Phil, NET, Ph.D), Assistant Professor
      Modern College of Arts, Science and Commerce, Ganeshkhind, Pune 411016
      mailtomeher@gmail.com / dipalimeher@moderncollegegk.org
  • 2. Agenda
      • Introduction
      • Why NoSQL?
      • Aggregate data models
      • Data modeling details
      • Distribution models
      • Consistency
      • Version stamps
      • Map-reduce
  • 4. Introduction
      • NoSQL, originally meaning "non-SQL" or "non-relational", refers to a database that provides a mechanism for storing and retrieving data modeled in ways other than the tabular relations used in relational databases.
      • Such databases have existed since the late 1960s, although the name "NoSQL" came later.
      • They are used in real-time web applications and big data.
      • They are sometimes called "Not only SQL" to emphasize that they may support SQL-like query languages.
      • Most NoSQL stores lack true ACID transactions, but a few databases, such as MarkLogic, Aerospike, FairCom c-treeACE, Google Spanner (though technically a NewSQL database), Symas LMDB, and OrientDB, have made them central to their designs.
  • 7. JSON Format
      • JSON stands for JavaScript Object Notation.
      • JSON objects are used to transfer data between server and client. XML serves the same purpose, but JSON has several advantages over XML, which are discussed in the following slides along with JSON concepts and usage.
      • Example JSON object:
          var chaitanya = {
            "firstName" : "Chaitanya",
            "lastName" : "Singh",
            "age" : "28"
          };
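As a minimal sketch of the server-to-client transfer described on this slide, the same object can be serialized to JSON text and parsed back. `JSON.stringify` and `JSON.parse` are standard JavaScript; the variable names `wireText` and `received` are only illustrative.

```javascript
// The object from the slide (age is kept as a string, as in the original).
const chaitanya = { firstName: "Chaitanya", lastName: "Singh", age: "28" };

// Serialize to JSON text, as a server would before sending the data.
const wireText = JSON.stringify(chaitanya);

// Parse the text back into an object, as a client would on receipt.
const received = JSON.parse(wireText);

console.log(wireText);
console.log(received.firstName, received.lastName);
```

Because the wire format is plain text, any language with a JSON parser can read it, which is the language independence the deck refers to.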
  • 8. Features of JSON
      • It is lightweight.
      • It is language independent.
      • It is easy to read and write.
      • It is a text-based, human-readable data exchange format.
  • 9. Why use JSON?
      • Standard structure: JSON objects have a standard structure, which makes code easier to read and write because developers know what to expect from JSON.
      • Lightweight: when working with AJAX, it is important to load data quickly and asynchronously without reloading the page. Because JSON is lightweight, the requested data can be fetched and loaded quickly.
      • Scalable: JSON is language independent, so it works with most modern programming languages. If the server-side language has to change, the change is easier because the JSON structure is the same in every language.
  • 10. JSON style versus XML style: an example
      JSON style:
          {"students": [
            {"name":"John", "age":"23", "city":"Agra"},
            {"name":"Steve", "age":"28", "city":"Delhi"},
            {"name":"Peter", "age":"32", "city":"Chennai"},
            {"name":"Chaitanya", "age":"28", "city":"Bangalore"}
          ]}
      XML style:
          <students>
            <student> <name>John</name> <age>23</age> <city>Agra</city> </student>
            <student> <name>Steve</name> <age>28</age> <city>Delhi</city> </student>
            <student> <name>Peter</name> <age>32</age> <city>Chennai</city> </student>
            <student> <name>Chaitanya</name> <age>28</age> <city>Bangalore</city> </student>
          </students>
  • 11. Limitations of Relational DB
      • In a relational database, the structure and schema of the data must be defined before any data can be processed.
      • Relational database systems provide consistency and integrity of data by enforcing the ACID properties (Atomicity, Consistency, Isolation, and Durability). This is useful in some scenarios, such as banking systems. In many other cases, however, these properties add significant performance overhead and can slow down database responses.
      • Many applications hold their data in JSON format, and an RDBMS does not offer a convenient way to perform operations such as create, insert, update, and delete on such data.
      • Many NoSQL databases, on the other hand, store data in a JSON-like format that is compatible with most of today's applications.
  • 12. RDBMS vs NoSQL
      • RDBMS: structured data; more functionality, but lower performance.
      • NoSQL: structured or semi-structured data; less functionality, but higher performance.
  • 15. What "less functionality" means in NoSQL
      • Constraints are generally not available in NoSQL databases.
      • Joins are generally not supported.
      • Supporting these features hinders the scalability of a database, so with a NoSQL database such as MongoDB you implement them at the application level.
    When to choose NoSQL over a relational database:
      • You want to store and retrieve a huge amount of data.
      • The relationships between the data you store are not that important.
      • The data is unstructured and changes over time.
      • Support for constraints and joins is not required at the database level.
      • The data grows continuously and you need to scale the database regularly to handle it.
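What "implementing joins and constraints at the application level" can look like is sketched below. The collections, field names, and helper functions are hypothetical and stand in for data fetched from a NoSQL store; this is not the API of MongoDB or any other product.

```javascript
// Hypothetical collections, held here as plain arrays of documents.
const customers = [
  { id: 1, name: "John" },
  { id: 2, name: "Steve" },
];
const orders = [
  { orderId: 99, customerId: 1, total: 250 },
  { orderId: 100, customerId: 2, total: 80 },
  { orderId: 101, customerId: 1, total: 40 },
];

// Application-level join: index customers by id, then attach the name to each order.
function joinOrdersWithCustomers(orderDocs, customerDocs) {
  const byId = new Map(customerDocs.map((c) => [c.id, c]));
  return orderDocs.map((o) => ({
    ...o,
    customerName: byId.get(o.customerId)?.name ?? null,
  }));
}

// Application-level constraint: reject an order that references a missing customer.
function assertCustomerExists(order, customerDocs) {
  if (!customerDocs.some((c) => c.id === order.customerId)) {
    throw new Error(`unknown customer ${order.customerId}`);
  }
}

const joined = joinOrdersWithCustomers(orders, customers);
console.log(joined);
```

The trade-off is that every application touching the data has to repeat these checks, since the database itself no longer enforces them.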
  • 16. Why NoSQL?
      • NoSQL databases differ from relational databases such as MySQL.
      • In a relational database you need to create the table, define the schema, and set the data types of the fields before you can insert any data.
      • In NoSQL you do not have to do this up front; you can insert and update data on the fly.
      • One advantage of NoSQL databases is that they are easy to scale and are often faster for the kinds of operations commonly performed on a database.
      • There are situations where a relational database is preferable, but when dealing with a huge amount of data a NoSQL database is often the better choice.
  • 17. Introduction, continued
      • Motivations for the NoSQL approach include:
          • simplicity of design
          • simpler horizontal scaling to clusters of machines
          • finer control over availability
      • The data structures used by NoSQL databases differ from those used by default in relational databases, which makes some operations faster in NoSQL.
      • The data structures used in NoSQL databases are flexible.
  • 18. Differences between SQL and NoSQL
  • 19. Barriers to NoSQL
      • Low-level query languages
      • Lack of standardized interfaces
      • Huge previous investments in existing relational databases
      • Lack of true ACID (Atomicity, Consistency, Isolation, Durability) properties
  • 20. Types of NoSQL DB
      • Key-value store: Memcached, Redis, Coherence
      • Tabular (column-family): HBase, Bigtable, Accumulo
      • Document based: MongoDB, CouchDB, Cloudant
      • MongoDB, for example, falls into the document-based category.
  • 21. Other problems faced by NoSQL
      • Stale reads: most NoSQL databases offer eventual consistency, in which database changes are propagated to all nodes over time. A query issued right after a write might therefore not return the updated data, or might read data that is no longer accurate. This problem is known as a stale read.
      • NoSQL databases may exhibit lost writes and other forms of data loss.
      • Data consistency is a bigger challenge.
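The stale-read problem can be illustrated with a toy simulation. The `Primary` and `Replica` classes below are hypothetical and only mimic delayed propagation in memory; real databases replicate over a network with their own protocols.

```javascript
// A replica only knows what has been propagated to it so far.
class Replica {
  constructor() {
    this.data = new Map();
  }
  read(key) {
    return this.data.get(key);
  }
}

// The primary accepts writes immediately and propagates them later.
class Primary {
  constructor(replica) {
    this.data = new Map();
    this.replica = replica;
    this.pending = [];
  }
  write(key, value) {
    this.data.set(key, value);
    this.pending.push([key, value]);
  }
  // Stands in for asynchronous replication catching up.
  propagate() {
    for (const [key, value] of this.pending) {
      this.replica.data.set(key, value);
    }
    this.pending = [];
  }
}

const replica = new Replica();
const primary = new Primary(replica);

primary.write("stock", 10);
primary.propagate();

primary.write("stock", 9); // accepted by the primary, not yet replicated
const staleRead = replica.read("stock"); // still the old value

primary.propagate(); // replication catches up
const freshRead = replica.read("stock"); // now the new value

console.log({ staleRead, freshRead });
```

The window between the second write and the second `propagate()` call is where a client reading from the replica sees outdated data.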
  • 22. Advantages
      • High scalability: NoSQL databases use sharding for horizontal scaling. Sharding means partitioning the data and placing it on multiple machines in such a way that the order of the data is preserved.
      • Vertical scaling means adding more resources to the existing machine.
      • Horizontal scaling means adding more machines to handle the data. Vertical scaling is limited by the capacity of a single machine, whereas horizontal scaling can continue by adding machines.
      • Examples of horizontally scaling databases are MongoDB and Cassandra.
      • Because of this scalability, NoSQL can handle huge amounts of data: as the data grows, the database scales out to handle it efficiently.
      • High availability: replication makes NoSQL databases highly available, because if a node fails the data can still be served from a replica that holds a consistent copy.
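A minimal sketch of order-preserving (range-based) sharding follows. The shard names and key ranges are made up for illustration; production systems choose and rebalance ranges automatically.

```javascript
// Hypothetical shard map: each shard owns keys whose first letter is
// less than or equal to `upTo`, so neighbouring keys stay together.
const shards = [
  { name: "shard-a", upTo: "h" },
  { name: "shard-b", upTo: "p" },
  { name: "shard-c", upTo: "\uffff" }, // catch-all for the remaining keys
];

// Route a key to the first shard whose upper bound covers it.
function shardFor(key) {
  const first = key.toLowerCase()[0];
  return shards.find((s) => first <= s.upTo).name;
}

for (const city of ["Agra", "Delhi", "Pune", "Zurich"]) {
  console.log(city, "->", shardFor(city));
}
```

Scaling horizontally in this scheme means splitting one range in two and moving half of its keys to a new machine.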
  • 23. Disadvantages of NoSQL
      • Narrow focus: NoSQL databases are designed mainly for storage and provide comparatively little other functionality. Relational databases are a better choice than NoSQL for transaction management.
      • Open source: many NoSQL databases are open source, and there is no reliable standard for NoSQL yet. In other words, two database systems are unlikely to be equivalent.
      • Management challenge: the purpose of big data tools is to make managing large amounts of data as simple as possible, but this is not easy. Data management in NoSQL is much more complex than in a relational database. NoSQL in particular has a reputation for being challenging to install and even harder to manage on a daily basis.
      • Limited GUI tools: GUI tools for accessing the database are not widely available.
      • Backup: backup is a weak point for some NoSQL databases. MongoDB is cited as an example that lacked an approach for backing up data in a consistent manner.
      • Large document size: some database systems, such as MongoDB and CouchDB, store data in JSON format. Documents can therefore be quite large (which matters for big data, network bandwidth, and speed), and descriptive key names actually hurt because they increase the document size.
  • 24. When should NoSQL be used?
      • When a huge amount of data needs to be stored and retrieved.
      • When the relationships between the data you store are not that important.
      • When the data changes over time and is not structured.
      • When support for constraints and joins is not required at the database level.
      • When the data grows continuously and you need to scale the database regularly to handle it.
  • 25. RDBMS
      • Relational databases have been a successful technology for twenty years, providing persistence, concurrency control, and an integration mechanism.
      • Application developers have been frustrated by the impedance mismatch between the relational model and in-memory data structures.
      • There is a movement away from using databases as integration points towards encapsulating databases within applications and integrating through services.
      • The vital factor for a change in data storage was the need to support large volumes of data by running on clusters. Relational databases are not designed to run efficiently on clusters.
  • 26. Impedance mismatch
      • Impedance mismatch is the term for the problems that occur because of differences between the database model and the programming-language model.
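To make the mismatch concrete, the sketch below takes one nested in-memory object and flattens it into the rows a relational schema would need. The `order` object, the table names, and the `toRows` helper are assumptions made for illustration.

```javascript
// In memory, an order is one nested structure.
const order = {
  id: 99,
  customer: "Ann",
  lines: [
    { product: "book", qty: 2 },
    { product: "pen", qty: 5 },
  ],
};

// In a relational database, the same order is spread over two tables,
// so the application has to translate in both directions.
function toRows(o) {
  return {
    orders: [{ id: o.id, customer: o.customer }],
    orderLines: o.lines.map((line, index) => ({
      orderId: o.id,
      lineNo: index + 1,
      product: line.product,
      qty: line.qty,
    })),
  };
}

const rows = toRows(order);
console.log(rows);
```

Reading the order back requires the reverse step: querying both tables and reassembling the nested structure.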
  • 27. NoSQL
      • NoSQL is an accidental neologism. There is no prescriptive definition; all you can make is an observation of common characteristics.
      • The common characteristics of NoSQL databases are:
          • Not using the relational model
          • Running well on clusters
          • Open source
          • Built for 21st-century web estates
          • Schemaless
      • The most important result of the rise of NoSQL is polyglot persistence.
  • 28. Aggregate Data Models
      • An aggregate is a collection of data that we interact with as a unit.
      • These units of data, or aggregates, form the boundaries for ACID operations with the database.
      • Key-value, document, and column-family databases can all be seen as forms of aggregate-oriented database.
  • 29. Data Model
      • A data model is the model through which we perceive and manipulate our data.
      • The data model describes how we interact with the data in the database.
      • A data model (or datamodel) is an abstract model that organizes elements of data and standardizes how they relate to one another and to the properties of real-world entities. For instance, a data model may specify that the data element representing a car is composed of a number of other elements which, in turn, represent the color and size of the car and define its owner.
      • Data models are expressed with concepts such as entities, attributes, relations, or tables.
  • 30.
      • Data models are distinct from storage models.
      • A storage model describes how the database stores and manipulates the data internally.
      • A storage model captures key physical aspects of the data structure in a data store.
  • 31. Storage model
  • 33.
      • Ideally we should be ignorant of the storage model, but in practice we need at least some inkling (a slight idea) of it, primarily to achieve decent (acceptable) performance.
      • "Data model" often means the model of the specific data in an application. A developer might point to an entity-relationship diagram of their database, containing customers, orders, and products, and refer to that as their data model.
      • Metamodel: the model by which the database organizes data.
  • 34. Aggregates
      • The aggregate approach recognizes that you often want to operate on data in units that have a more complex structure than a set of tuples. It can be handy to think in terms of a complex record that allows lists and other record structures to be nested inside it.
      • Complex record = aggregate.
      • Programmers manipulate data through aggregate structures.
  • 35.
      • The term "aggregate" comes from Domain-Driven Design.
      • An aggregate is a collection of related objects treated as a unit.
      • It is the unit for data manipulation and for management of consistency.
      • Aggregates are updated with atomic operations.
      • Key-value, document, and column-family databases work this way.
      • Aggregates make it easier for a database to operate on a cluster, because the aggregate is a natural unit for replication and sharding.
  • 38. Sharding
  • 40. Relations and Aggregates: example
      • E-commerce website: relational database
  • 41. Sample data for this model
  • 42. Aggregate-oriented structure
  • 43. JSON format
  • 44. In this model, we have two main aggregates: customer and order.
  • 45. Embed all the objects for the customer and the customer's orders
  • 47.
      • There is no universal answer for how to draw your aggregate boundaries.
      • It depends entirely on how you tend to manipulate your data.
      • If you tend to access a customer together with all of that customer's orders at once, you would prefer a single aggregate.
      • If you tend to access a single order at a time, you should prefer a separate aggregate for each order.
      • Naturally, this is very context-specific; some applications will prefer one or the other, even within a single system, which is exactly why many people prefer aggregate ignorance.
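The two choices of aggregate boundary can be sketched side by side. The customer and order data below are invented, and plain `Map` objects stand in for a key-value or document store.

```javascript
// Option 1: a single aggregate. The customer embeds all of its orders.
const customerWithOrders = {
  id: 1,
  name: "Ann",
  orders: [
    { id: 99, items: [{ product: "book", qty: 2 }] },
    { id: 100, items: [{ product: "pen", qty: 5 }] },
  ],
};

// Option 2: separate aggregates. Each order is stored under its own key
// and refers back to the customer by id.
const customerStore = new Map([[1, { id: 1, name: "Ann" }]]);
const orderStore = new Map([
  [99, { id: 99, customerId: 1, items: [{ product: "book", qty: 2 }] }],
  [100, { id: 100, customerId: 1, items: [{ product: "pen", qty: 5 }] }],
]);

// With option 1, one lookup returns the customer and the whole order history.
const history = customerWithOrders.orders.map((o) => o.id);

// With option 2, one lookup returns a single order without loading the rest,
const singleOrder = orderStore.get(100);

// but the order history needs a scan (or an index) across order aggregates.
const historyByScan = [...orderStore.values()]
  .filter((o) => o.customerId === 1)
  .map((o) => o.id);

console.log(history, singleOrder.id, historyByScan);
```

Neither layout is better in general; the access pattern of the application decides which lookup should be the cheap one.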
  • 48. Summary of aggregate data models
      • An aggregate is a collection of data that we interact with as a unit. Aggregates form the boundaries for ACID operations with the database.
      • Key-value, document, and column-family databases can all be seen as forms of aggregate-oriented database.
      • Aggregates make it easier for the database to manage data storage over clusters.
      • Aggregate-oriented databases work best when most data interaction is done with the same aggregate; aggregate-ignorant databases are better when interactions use data organized in many different formations.
  • 49. Aggregate Data Models, continued
      • Aggregates make it easier for the database to manage data storage over clusters, since the unit of data can reside on any machine and, when retrieved, brings all of its related data along with it.
      • Aggregate-oriented databases work best when most data interaction is done with the same aggregate.
      • For example, when you need to get an order and all its details, it is better to store the order as an aggregate. Working through these aggregates to get item details across all orders, however, is not elegant.
      • Aggregate-oriented databases make inter-aggregate relationships more difficult to handle than intra-aggregate relationships.
      • Aggregate-ignorant databases are better when interactions use data organized in many different formations.
      • Aggregate-oriented databases often compute materialized views to provide data organized differently from their primary aggregates. This is often done with map-reduce computations, such as a map-reduce job to get items sold per day.
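The "items sold per day" job mentioned above can be sketched as a small in-memory map-reduce. The order data and the function names `mapOrder`, `reduceQuantities`, and `itemsSoldPerDay` are hypothetical; a real map-reduce framework would run the same two steps in parallel across the cluster.

```javascript
// Hypothetical order aggregates, each carrying its own line items.
const orders = [
  { date: "2024-01-01", items: [{ product: "book", qty: 2 }, { product: "pen", qty: 1 }] },
  { date: "2024-01-01", items: [{ product: "book", qty: 1 }] },
  { date: "2024-01-02", items: [{ product: "pen", qty: 4 }] },
];

// Map: turn one order into (key, value) pairs keyed by day and product.
function mapOrder(order) {
  return order.items.map((item) => [`${order.date}|${item.product}`, item.qty]);
}

// Reduce: combine all values that share a key into one total.
function reduceQuantities(quantities) {
  return quantities.reduce((sum, qty) => sum + qty, 0);
}

// Driver: group the mapped pairs by key, then reduce each group.
function itemsSoldPerDay(orderList) {
  const groups = new Map();
  for (const order of orderList) {
    for (const [key, qty] of mapOrder(order)) {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(qty);
    }
  }
  const view = {};
  for (const [key, quantities] of groups) {
    view[key] = reduceQuantities(quantities);
  }
  return view;
}

const soldPerDay = itemsSoldPerDay(orders);
console.log(soldPerDay);
```

The resulting object plays the role of the materialized view: it is organized by day and product rather than by order.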
  • 50. Details of Data Models
      • Relationships
      • Graph databases
      • Schemaless databases
      • Materialized views
      • Modeling for data access
  • 51. Aggregates are a central part of the NoSQL story
  • 52. Relationships
      • Create aggregates for data that is commonly accessed together.
      • In practice, different applications may access the same data in different ways.
      • Example: one customer has many orders. Some applications will want to access the order history whenever they access the customer; this fits well with combining the customer and the order history into a single aggregate. Other applications want to process orders individually and thus model orders as independent aggregates.
      • In the second case the customer and order aggregates are separate, but the relationship (one customer, many orders) still holds. Many databases, even key-value stores, provide ways to make these relationships visible to the database.
      • Document stores make the content of the aggregate available to the database to form indexes and queries.
      • Riak, a key-value store, allows you to put link information in metadata, supporting partial retrieval and link-walking capability.
  • 53. Important aspect of relationships and aggregates
      • How are updates handled?
      • Aggregate-oriented databases treat the aggregate as the unit of data retrieval.
      • Consequently, atomicity is only supported within the contents of a single aggregate.
      • If you update multiple aggregates at once, you have to deal with a failure partway through yourself.
      • Relational databases help with this by allowing you to modify multiple records in a single transaction, providing ACID guarantees while altering many rows.
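What "dealing with a failure partway through yourself" can involve is sketched below. The in-memory `store`, the `put` helper with its `failOn` switch, and the account data are all hypothetical and exist only to simulate a failed second write.

```javascript
// Each key holds one aggregate; a single put replaces it atomically,
// but nothing in this store spans two keys.
const store = new Map([
  ["acct:a", { balance: 100 }],
  ["acct:b", { balance: 50 }],
]);

// `failOn` simulates a write failure for one key (illustration only).
function put(key, value, failOn) {
  if (key === failOn) throw new Error(`write to ${key} failed`);
  store.set(key, value);
}

// Updating two aggregates: if the second write fails, the application
// must undo the first write by hand, because no transaction covers both.
function transfer(from, to, amount, failOn = null) {
  const before = store.get(from);
  put(from, { balance: before.balance - amount }, failOn);
  try {
    const target = store.get(to);
    put(to, { balance: target.balance + amount }, failOn);
  } catch (err) {
    store.set(from, before); // compensating write
    throw err;
  }
}

transfer("acct:a", "acct:b", 30);

let failed = false;
try {
  transfer("acct:a", "acct:b", 20, "acct:b");
} catch (err) {
  failed = true;
}
console.log(failed, store.get("acct:a"), store.get("acct:b"));
```

In a relational database both writes would sit inside one transaction, and the rollback would be done by the database rather than by this hand-written compensation.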
  • 54. So when the data contains many relationships, prefer an RDBMS.
  • 55. Graph Databases
      • Graph databases are an odd fish in the NoSQL pond.
  • 56. • Most NoSQL databases run on clusters and are aggregate-oriented. • Aggregate data models consist of large records with simple connections. • Graph databases, in contrast, hold small records with complex interconnections. See the example on the next slide.
  • 57. Here a graph isn’t a bar chart or histogram; instead, we refer to a graph data structure of nodes connected by edges.
  • 58. • Querying a graph database differs from querying a relational database. With a graph database you keep the network structure of nodes and edges in mind when writing a query; with an RDBMS you keep the schema in mind (foreign keys, joins). • In a graph query language, the user finds the answer by navigating through the network of edges. • The emphasis on relationships makes graph databases very different from aggregate-oriented databases: most of the query work consists of navigating relationships.
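Navigating a network of edges can be sketched with plain Python data structures. The tiny graph, the relationship names, and the functions below are assumptions for illustration only; a real graph database would offer its own query language for this.

```python
from collections import deque

# Hypothetical graph: (node, relationship) -> list of neighbour nodes.
edges = {
    ("Anna", "friend"): ["Barbara", "Carol"],
    ("Barbara", "friend"): ["Carol", "Elizabeth"],
    ("Carol", "friend"): ["Dawn"],
    ("Elizabeth", "likes"): ["NoSQL Distilled"],
    ("Dawn", "likes"): ["NoSQL Distilled"],
}


def reachable(start, relationship):
    """Breadth-first walk along edges of one relationship type."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()
        for neighbour in edges.get((node, relationship), []):
            if neighbour not in seen:
                seen.add(neighbour)
                order.append(neighbour)
                queue.append(neighbour)
    return order


def friends_who_like(start, item):
    """Navigate 'friend' edges, then check a 'likes' edge on each node found."""
    return [
        person
        for person in reachable(start, "friend")
        if item in edges.get((person, "likes"), [])
    ]
```

The query is expressed as a walk over relationships rather than as joins over foreign keys, which is the contrast the slide draws.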
  • 59. Navigation in graph databases
  • 60. • The emphasis on relationships makes graph databases very different from aggregate-oriented databases.
  • 61. Schemaless Databases • A common theme across all the forms of NoSQL databases is that they are schemaless. • With NoSQL, storing data is much more casual. • A key-value store allows you to store any data you like under a key. • A document database effectively does the same thing, since it makes no restrictions on the structure of the documents you store. • Column-family databases allow you to store any data under any column you like. • Graph databases allow you to freely add new edges and freely add properties to nodes and edges as you wish.
  • 62. Schemaless databases • Schemalessness offers freedom and flexibility. • With a schema, you have to figure out in advance what you need to store, document it, and diagram it, which is hard to do. • Without a schema there is no such binding: you can easily change your data storage as you learn more about your project. • You can easily add new things as you discover them. • If you no longer want to store certain attributes or records, NoSQL lets you simply stop storing them. • A schemaless store also makes it easier to deal with nonuniform data.
  • 63. Schemaless databases: Nonuniform data • Nonuniform data is data where each record has a different set of fields. • A schema puts all rows of a table into a straitjacket, which becomes awkward if you have different kinds of data in different rows. You either end up with lots of columns that are usually null (a sparse table), or you end up with meaningless columns like custom column. • Schemalessness avoids this, allowing each record to contain just what it needs, no more and no less.
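The sparse-table problem can be made concrete with a short Python sketch. The product records and the fixed column list are invented for this illustration.

```python
# Hypothetical product records, each with a different set of fields.
FIELDS = ["name", "isbn", "screen_size", "wattage"]

records = [
    {"name": "Book", "isbn": "000-EXAMPLE"},
    {"name": "Monitor", "screen_size": 27},
    {"name": "Kettle", "wattage": 2000},
]


def to_sparse_row(record):
    """Fit a record into a fixed schema, padding missing columns with None."""
    return tuple(record.get(field) for field in FIELDS)


# Under a fixed schema every row carries every column.
sparse_table = [to_sparse_row(record) for record in records]

# Count the cells that exist only because the schema demands them.
null_cells = sum(cell is None for row in sparse_table for cell in row)
```

Here half of the cells in `sparse_table` are `None`, while the schemaless `records` list stores only the fields each record actually has.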
  • 64. • A schemaless database still has an implicit schema. • An implicit schema is a set of assumptions about the data’s structure in the code that manipulates the data.
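As a rough illustration of an implicit schema, the Python function below works only if the stored order has a particular shape, even though no database schema says so. The field names `items`, `price`, and `qty` are assumptions made for this sketch.

```python
def order_total(order):
    """Relies on an implicit schema: 'items' is a list of dicts
    that each carry numeric 'price' and 'qty' fields."""
    return sum(item["price"] * item["qty"] for item in order["items"])


def missing_fields(order):
    """Make the assumptions explicit by checking them before use."""
    problems = []
    for index, item in enumerate(order.get("items", [])):
        for field in ("price", "qty"):
            if field not in item:
                problems.append(f"items[{index}].{field}")
    return problems
```

The database will happily store an order without a `qty` field; the failure only shows up when `order_total` runs, which is why the assumptions in the code act as the schema.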
  • 65. So, finally, what does schemaless mean?
  • 66. Materialized Views • Views in an RDBMS • Views provide a mechanism to hide from the client whether data is derived data or base data, but they can’t avoid the fact that some views are expensive to compute. • To cope with this, materialized views were invented, which are views that are computed in advance and cached on disk. • Materialized views are effective for data that is read heavily but can stand being somewhat stale (in everyday use, "stale" means no longer fresh; in a database, the view is for reading only, with no DDL or DML applied to it). • Although NoSQL databases don’t have views, they may have precomputed and cached queries, and they reuse the term “materialized view” to describe them. • The map-reduce technique is used to compute them. • Map-reduce is a data processing paradigm for condensing large volumes of data into useful aggregated results. • Materialized views can be used within the same aggregate.
  • 67. Two main ways to build a materialized view • Eager approach: update the materialized view at the same time you update the base data for it. This approach is good when reads of the materialized view are more frequent than writes and you want the materialized views to be as fresh as possible. • Application database: when a single application owns the database, it is easier to ensure that any update to base data also updates the materialized views. • Batch approach: if you don’t want to pay that overhead on each update, you can run batch jobs to update the materialized views at regular intervals. • Views can also be built outside of the database by reading the data, computing the view, and saving it back to the database. • Views are often updated with the map-reduce technique.
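The eager and batch strategies can be contrasted in a minimal Python sketch. The in-memory `orders` list stands in for the base data and `sales_by_product` for the materialized view; both names, and the functions, are assumptions for this illustration.

```python
from collections import defaultdict

orders = []                          # base data
sales_by_product = defaultdict(int)  # materialized view


def add_order_eager(product, qty):
    """Eager: update the view in the same step as the base data."""
    orders.append({"product": product, "qty": qty})
    sales_by_product[product] += qty


def rebuild_view_batch(base):
    """Batch: recompute the whole view from the base data."""
    view = defaultdict(int)
    for order in base:
        view[order["product"]] += order["qty"]
    return view


add_order_eager("tea", 2)
add_order_eager("tea", 3)
add_order_eager("coffee", 1)
```

The eager version pays a small cost on every write and keeps the view fresh; the batch version leaves writes cheap but the view is only as fresh as the last run.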
  • 68. MAP REDUCE • A MapReduce job usually splits the input data set into independent chunks, which are processed by the map tasks in a completely parallel manner. • The framework sorts the outputs of the maps, which are then input to the reduce tasks. • Typically both the input and the output of the job are stored in a file system.
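The three steps described on this slide (map over independent chunks, sort, reduce) can be sketched in plain Python. This is a single-process illustration with made-up order data, not the API of Hadoop or any other framework.

```python
from itertools import groupby
from operator import itemgetter


def map_chunk(chunk):
    """Map task: emit (product, quantity) pairs for one chunk of orders."""
    return [(order["product"], order["qty"]) for order in chunk]


def reduce_pairs(key, values):
    """Reduce task: combine all values that share one key."""
    return key, sum(values)


def map_reduce(chunks):
    # Map phase: each chunk is independent, so this could run in parallel.
    pairs = [pair for chunk in chunks for pair in map_chunk(chunk)]
    # Sort phase: bring equal keys together, as the framework would.
    pairs.sort(key=itemgetter(0))
    # Reduce phase: one call per distinct key.
    return dict(
        reduce_pairs(key, [value for _, value in group])
        for key, group in groupby(pairs, key=itemgetter(0))
    )
```

Calling `map_reduce` on a list of chunks, each a list of order dictionaries with `product` and `qty` fields, returns the total quantity per product.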
  • 69. MAP REDUCE
  • 70. MAP REDUCE
  • 71. Key points • Aggregate-oriented databases make inter-aggregate relationships more difficult to handle than intra-aggregate relationships. • Graph databases organize data into node and edge graphs; they work best for data that has complex relationship structures. • Schemaless databases allow you to freely add fields to records, but there is usually an implicit schema expected by users of the data. • Aggregate-oriented databases often compute materialized views to provide data organized differently from their primary aggregates. This is often done with map-reduce computations.
  • 72. Thank You

Editor's Notes

  1. The dominant data model of the last couple of decades is the relational data model, which is best visualized as a set of tables, rather like a page of a spreadsheet. Each table has rows, with each row representing some entity of interest. We describe this entity through columns, each having a single value. A column may refer to another row in the same or different table, which constitutes a relationship between those entities. (We’re using informal but common terminology when we speak of tables and rows; the more formal terms would be relations and tuples.) NoSQL is a move away from the relational model. Each NoSQL solution has a different model that it uses, which we put into four categories widely used in the NoSQL ecosystem: key-value, document, column-family, and graph. Of these, the first three share a common characteristic of their data models which we will call aggregate orientation.
  2. The relational model takes the information that we want to store and divides it into tuples (rows). A tuple is a limited data structure: It captures a set of values, so you cannot nest one tuple within another to get nested records, nor can you put a list of values or tuples within another. This simplicity underpins the relational model—it allows us to think of all operations as operating on and returning tuples.
  3. A database shard, or simply a shard, is a horizontal partition of data in a database or search engine. Each shard is held on a separate database server instance to spread load. Sharding is related to horizontal partitioning, the practice of separating one table's rows into multiple different tables, known as partitions. Both break a large data set into smaller subsets, and sharding is sometimes loosely referred to as horizontal partitioning; the difference is that sharding implies the data is spread across multiple computers, with each shard on a separate network node, while partitioning groups subsets of data within a single database instance. Sharding is necessary if a dataset is too large to be stored in a single database. Moreover, many sharding strategies allow additional machines to be added, so a database cluster can scale along with its data and traffic growth. The benefit of sharding is that each node holds less data, so the data is more likely to be held in cache and the indexes are smaller. Sharding is not so useful for graph databases: the highly connected nature of nodes and edges in a typical graph database can make it difficult to partition the data effectively, and many graph databases do not provide facilities for edges to reference nodes in different databases. For these databases, scaling up rather than scaling out may be a better option.
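One simple way to decide which shard holds a record is to hash its key. The Python sketch below assumes a fixed number of shards and invented key names; it illustrates the idea only and is not how any specific database routes data.

```python
import hashlib


def shard_for(key, shard_count):
    """Pick a shard by hashing the key (stable across runs, unlike hash())."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


def distribute(keys, shard_count):
    """Group keys into per-shard lists, as separate server instances would hold them."""
    shards = [[] for _ in range(shard_count)]
    for key in keys:
        shards[shard_for(key, shard_count)].append(key)
    return shards


shards = distribute([f"customer:{i}" for i in range(100)], 4)
```

With plain modulo hashing, changing the shard count moves most keys to a different shard, which is one reason real systems tend to use schemes such as consistent hashing when machines are added.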
  4. We are going to be selling items directly to customers over the web, and we will have to store information about users, our product catalog, orders, shipping addresses, billing addresses, and payment data. We can use this scenario to model the data using a relational data store as well as NoSQL data stores and talk about their pros and cons. For a relational database, we might start with a data model like the one shown in the figure above.
  5. The customer contains a list of billing addresses; the order contains a list of order items, a shipping address, and payments. The payment itself contains a billing address for that payment. A single logical address record appears three times in the example data, but instead of using IDs it’s treated as a value and copied each time. This fits the domain where we would not want the shipping address, nor the payment’s billing address, to change. In a relational database, we would ensure that the address rows aren’t updated for this case, making a new row instead. With aggregates, we can copy the whole address structure into the aggregate as we need to. The link between the customer and the order isn’t within either aggregate—it’s a relationship between aggregates. Similarly, the link from an order item would cross into a separate aggregate structure for products, which we haven’t gone into. We’ve shown the product name as part of the order item here—this kind of denormalization is similar to the tradeoffs with relational databases, but is more common with aggregates because we want to minimize the number of aggregates we access during a data interaction. The important thing to notice here isn’t the particular way we’ve drawn the aggregate boundary so much as the fact that you have to think about accessing that data—and make that part of your thinking when developing the application data model. Indeed we could draw our aggregate boundaries differently, putting all the orders for a customer into the customer aggregate
  6. When you want to store data in a relational database, you first have to define a schema—a defined structure for the database which says what tables exist, which columns exist, and what data types each column can hold. Before you store some data, you have to have the schema defined for it.
  7. Having the implicit schema in the application code results in some problems. It means that in order to understand what data is present you have to dig into the application code. If that code is well structured you should be able to find a clear place from which to deduce the schema. But there are no guarantees; it all depends on how clear the application code is
  8. MapReduce is a programming model or pattern within the Hadoop framework that is used to process big data stored in the Hadoop Distributed File System (HDFS).