NoSQL, known as the "Not only SQL" database, provides a mechanism for storage and retrieval of data.
This section discusses two kinds of data models:
Aggregate data models
Distribution data models
The key-value data model, document data model, column-family stores and graph databases come under aggregate data models.
The distribution data models are sharding, master-slave replication and peer-to-peer replication.
2. INTRODUCTION TO NoSQL
•NoSQL, known as the "Not only SQL" database, provides a mechanism for storage and retrieval of data
•NoSQL databases are used in real-time web applications and big data
•Most NoSQL databases are open source and scale horizontally, which means that capacity grows by adding commodity machines
•They are schema-free: there is no requirement to design tables before pushing data into them
3. AGGREGATE DATA MODELS
•Aggregate is a term that comes from DDD (Domain-Driven Design)
•In DDD, an aggregate is a collection of data that we interact with as a unit
•Aggregates make it easier for the database to manage data storage over clusters
•4 aggregate data models: key-value, document, graph and column-family
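To make the idea of "a collection of data that we interact with as a unit" concrete, here is a minimal Python sketch. The order structure, its field names and the `order_total` helper are assumptions chosen for illustration; they do not come from any particular database.

```python
import json

# Hypothetical order aggregate: the order, its shipping address and its
# line items are stored and retrieved together as one unit.
order_aggregate = {
    "order_id": 99,
    "customer_id": 1,
    "shipping_address": {"city": "Chicago"},
    "line_items": [
        {"product": "NoSQL book", "price": 32.45, "quantity": 1},
        {"product": "Notebook", "price": 4.00, "quantity": 2},
    ],
}


def order_total(order):
    """Compute the total from data that lives inside the aggregate."""
    return sum(item["price"] * item["quantity"] for item in order["line_items"])


# The whole aggregate is serialized and restored in one step, which is
# what makes it a convenient unit to place on a single node of a cluster.
serialized = json.dumps(order_aggregate)
restored = json.loads(serialized)
```

Because everything needed to work with the order is inside the aggregate, nothing has to be joined from other records to compute the total.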
4. KEY-VALUE DATA MODEL
•The aggregate is opaque; that is, we can store whatever we like in the aggregate
•We can only access an aggregate by lookup based on its key
•The key-value
database is a very
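The two properties above (opaque values, access only by key) can be sketched with a toy in-memory store. The `KeyValueStore` class and the key naming are hypothetical and only illustrate the access pattern; real key-value databases expose their own client APIs.

```python
import json


class KeyValueStore:
    """Toy in-memory key-value store; values are opaque bytes to the store."""

    def __init__(self):
        self._data = {}

    def put(self, key, value: bytes):
        # The store never inspects the value, it only files it under the key.
        self._data[key] = value

    def get(self, key):
        # Lookup by key is the only way to reach an aggregate.
        return self._data.get(key)

    def delete(self, key):
        self._data.pop(key, None)


store = KeyValueStore()
store.put("user:1", json.dumps({"name": "Ann", "city": "Chicago"}).encode("utf-8"))

# The application, not the store, knows how to interpret the bytes.
user = json.loads(store.get("user:1").decode("utf-8"))
```

Note that there is no way to ask this store for "all users in Chicago"; the application would have to fetch each value by key and decode it itself.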
DOCUMENT DATA MODEL
•The database is able to see a structure in the aggregate, but it imposes limits on what we can place in it. In return, we get more flexibility when accessing data
•We can submit queries to the database based on the fields in the aggregate and retrieve part of the
5. COLUMN-FAMILY STORES
•These are created to store and process very large amounts of data distributed over many machines
•Column-family stores are modeled on Google’s Bigtable
•Data is addressed through two levels of keys; the first key is often described as a row identifier
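A column-family layout can be pictured as a nested map: a row identifier picks the row, and a column family groups related columns inside it. The sketch below assumes that simplified picture; the `ColumnFamilyStore` class and the family and column names are hypothetical.

```python
from collections import defaultdict


class ColumnFamilyStore:
    """Toy store: row id -> column family -> column name -> value."""

    def __init__(self):
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_id, family, column, value):
        self._rows[row_id][family][column] = value

    def get_family(self, row_id, family):
        # Return a copy of all columns of one family for one row.
        if row_id not in self._rows:
            return {}
        return dict(self._rows[row_id].get(family, {}))


store = ColumnFamilyStore()
store.put("customer:1", "profile", "name", "Ann")
store.put("customer:1", "profile", "city", "Chicago")
store.put("customer:1", "orders", "order:99", "NoSQL book")

profile = store.get_family("customer:1", "profile")
```

Reading the `profile` family does not touch the `orders` family, which is the point of grouping columns that are usually accessed together.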
6. GRAPH DATABASE
•A graph database stores data as a large, densely connected network structure
•It uses sophisticated shortest-path algorithms to make data queries more efficient
•Graph databases take document databases to the extreme by introducing the concept of typed relationships between documents or nodes. The most common example is the relationship between people on a social network such as
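As a rough illustration of typed relationships and shortest-path queries, the sketch below runs a breadth-first search over a tiny social graph. The people, the relationship types `FRIEND` and `WORKS_WITH`, and the choice of breadth-first search are all assumptions; real graph databases use their own storage and query engines.

```python
from collections import deque

# Hypothetical typed relationships: (node, relationship type, node).
edges = [
    ("alice", "FRIEND", "bob"),
    ("bob", "FRIEND", "carol"),
    ("carol", "FRIEND", "dave"),
    ("alice", "WORKS_WITH", "dave"),
]


def neighbors(node, rel_type):
    """Yield nodes connected to node by rel_type, in either direction."""
    for src, rel, dst in edges:
        if rel != rel_type:
            continue
        if src == node:
            yield dst
        elif dst == node:
            yield src


def shortest_path(start, goal, rel_type):
    """Breadth-first search following only relationships of rel_type."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in neighbors(path[-1], rel_type):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


friend_path = shortest_path("alice", "dave", "FRIEND")
work_path = shortest_path("alice", "dave", "WORKS_WITH")
```

The same two people are three hops apart through `FRIEND` relationships but one hop apart through `WORKS_WITH`, which shows why the relationship type matters to the query.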
7. NoSQL databases are schemaless:
•A key-value store allows you to store any data you like under a key
•A document database effectively does the same thing, since it makes no restrictions on the structure of the documents you store
•Column-family databases allow you to store any data
8. SCHEMALESS DATABASE
•Easily store whatever you need and add new things as you discover them
•Makes it easier to deal with nonuniform data: data where each record has a different set of fields
•Having the implicit schema in the application means that in order to understand data, you have to dig into the
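The following sketch shows nonuniform records and the implicit schema that ends up in application code. The records, the `contact_line` function and the `field_usage` helper are hypothetical examples, not part of any database API.

```python
from collections import Counter

# Nonuniform data: each record has a different set of fields.
records = [
    {"name": "Ann", "email": "ann@example.com"},
    {"name": "Bob", "phone": "555-0100"},
    {"name": "Cy", "email": "cy@example.com", "twitter": "@cy"},
]


def contact_line(record):
    """Format a contact line for one record.

    The implicit schema lives here: the code assumes every record has a
    "name" and treats "email" and "phone" as optional.
    """
    name = record["name"]
    contact = record.get("email") or record.get("phone") or "no contact"
    return f"{name}: {contact}"


def field_usage(recs):
    """Count how many records carry each field."""
    return Counter(field for rec in recs for field in rec)


lines = [contact_line(rec) for rec in records]
usage = field_usage(records)
```

Nothing in the stored data says that `name` is required; that rule can only be discovered by reading `contact_line`.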
9. MATERIALIZED VIEW
•NoSQL databases don’t have views as relational databases do, but they may have precomputed and cached queries: materialized views
•2 strategies to manage materialized views:
o Update the materialized view at the same time you update the base data for it
o Run batch jobs to update the materialized views at regular intervals
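The two strategies above can be compared in a small sketch. The `SalesStore` class, its per-product totals and the method names are assumptions made for illustration; the batch job is represented by an ordinary method call rather than a scheduler.

```python
from collections import defaultdict


class SalesStore:
    """Toy base data (orders) with two materialized views of sales totals."""

    def __init__(self):
        self.orders = []                         # base data
        self.eager_totals = defaultdict(float)   # updated on every write
        self.batch_totals = {}                   # rebuilt by a batch job

    def add_order(self, product, amount):
        self.orders.append((product, amount))
        # Strategy 1: update the view at the same time as the base data.
        self.eager_totals[product] += amount

    def rebuild_batch_view(self):
        # Strategy 2: recompute the view from the base data in one pass,
        # as a periodic batch job would.
        totals = defaultdict(float)
        for product, amount in self.orders:
            totals[product] += amount
        self.batch_totals = dict(totals)


store = SalesStore()
store.add_order("book", 10.0)
store.add_order("book", 5.0)
stale_view = dict(store.batch_totals)   # batch view has not run yet
store.rebuild_batch_view()
```

The eager view is always current but makes every write more expensive; the batch view keeps writes cheap but can be stale between runs.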
10. DISTRIBUTION DATA MODELS
•The primary driver of interest in NoSQL has been its ability to run databases on a large cluster
•Aggregate orientation fits well with scaling out because the aggregate is a natural unit to use for distribution
•Various models for data distribution:
•Sharding
•Master-slave replication
•Peer-to-peer replication
11. SHARDING
•Sharding is the technique of putting different parts of the data onto different servers
•It is valuable for performance because it can improve both read and write performance
•2 main issues in sharding:
•How to clump the data so that one user mostly gets her data from a single server
•How to arrange the data clumps on the nodes to provide the best data access
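One possible way to keep a user's data on a single server is to pick the shard from the user identifier. The sketch below assumes that approach, using a hash of the user id modulo the number of shards; the `ShardedStore` class is hypothetical, and real systems often use other placement schemes.

```python
import hashlib


class ShardedStore:
    """Toy sharded store: each shard is a dict standing in for a server."""

    def __init__(self, shard_count):
        self.shards = [dict() for _ in range(shard_count)]

    def shard_for(self, user_id):
        # Assumption: placement is decided by a stable hash of the user id,
        # so all data for one user lands on the same shard.
        digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.shards)

    def put(self, user_id, key, value):
        self.shards[self.shard_for(user_id)][(user_id, key)] = value

    def get(self, user_id, key):
        return self.shards[self.shard_for(user_id)].get((user_id, key))


store = ShardedStore(shard_count=4)
store.put("ann", "profile", {"city": "Chicago"})
store.put("ann", "cart", ["NoSQL book"])
```

Both of Ann's records are read from one shard, so a request for her data touches a single server. A plain modulo scheme does force most data to move when the number of shards changes, which is one reason the second issue in the list (arranging clumps on nodes) is hard.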
12. MASTER-SLAVE REPLICATION
•It is most helpful for scaling when you have a read-intensive dataset
•One node is designated as the master and is responsible for processing any updates to that data
•Other nodes are slaves. A replication process synchronizes the slaves with the master
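The division of work described above can be sketched as follows. The `MasterSlaveCluster` class is hypothetical, replication is modeled as an explicit `replicate()` call that copies the master's data, and reads are spread round-robin over the slaves; the sketch assumes at least one slave.

```python
import itertools


class Node:
    def __init__(self, name):
        self.name = name
        self.data = {}


class MasterSlaveCluster:
    """Toy cluster: writes go to the master, reads are served by slaves."""

    def __init__(self, slave_count):
        self.master = Node("master")
        self.slaves = [Node(f"slave-{i}") for i in range(slave_count)]
        # Assumes slave_count >= 1; reads rotate over the slaves.
        self._next_slave = itertools.cycle(self.slaves)

    def write(self, key, value):
        # Only the master processes updates.
        self.master.data[key] = value

    def replicate(self):
        # Stand-in for the replication process: copy master state to slaves.
        for slave in self.slaves:
            slave.data = dict(self.master.data)

    def read(self, key):
        return next(self._next_slave).data.get(key)


cluster = MasterSlaveCluster(slave_count=2)
cluster.write("greeting", "hello")
before_replication = cluster.read("greeting")
cluster.replicate()
after_replication = cluster.read("greeting")
```

Adding slaves increases read capacity, but a read that arrives before replication has run can return stale data, as `before_replication` shows.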
13. PEER-TO-PEER REPLICATION
•With a peer-to-peer replication cluster, you can ride over node failures without losing access to data. You can easily add nodes to improve performance
•The biggest complication is consistency. When you write to two different places, you run the risk that two people will attempt to update the same record at the same time: a