NoSQL Talk at eBuddy

Agenda
What is NoSQL
Databases Overview
Aggregate Data Models
Distribution Models
Consistency
NWR

Purpose of this talk
Just to share some information
To spend the time pleasantly
To facilitate discussion
(questions are welcome)

Rise of NoSQL
Inspired by 2 papers:
Amazon Dynamo
Google BigTable

What is NoSQL
Not a well defined term
(originally just the name of a single meetup
in San Francisco in 2009)

So, what does it stand for?
It is better to pay attention to what it
means rather than what it stands for

Common characteristics of
NoSQL
● Don't use SQL as a query language
(each provides its own query mechanism)
● Non relational
● Open-source projects
● Run on clusters
● Developed in the 21st century
● Schemaless

Schemaless
Although the database is schemaless, there is
still an implicit schema in the application code
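
A minimal sketch of what "implicit schema" means in practice. The documents and field names below are hypothetical, not taken from any particular database: the store accepts both shapes, so the code that reads them has to know about each one.

```python
# Two hypothetical user documents with different shapes; a schemaless
# store would accept both without complaint.
user_v1 = {"name": "Alice", "email": "alice@example.com"}
user_v2 = {"full_name": "Bob", "contacts": {"email": "bob@example.com"}}


def email_of(user: dict) -> str:
    """Return the e-mail address, whichever shape the document has.

    The knowledge of both shapes lives here, in application code:
    this is the implicit schema.
    """
    if "email" in user:
        return user["email"]
    return user["contacts"]["email"]
```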

Why do you use NoSQL
To operate on big data on multiple
machines running across the cluster
Increase developer productivity
(even if there is no demand for big data)

What is wrong with traditional
RDBMS
● Nothing really, they will not disappear
(who knows ;)
● Well defined tools
(there is even a whole profession behind
them: the DBA)
● It is not a black-or-white choice: NoSQL
and RDBMS will continue to work closely
together, hence the rise of Polyglot
Persistence

But, RDBMS is not perfect
Impedance mismatch
Running on a cluster is a challenge

NoSQL World (major ones)
Document Oriented
Key-Value
Column-Family
Graph Databases

Data Model
Aggregate Oriented VS Relational
- Access by key
- Make it easier to manage data storage over
clusters
- Usually you adapt your aggregate/data model to
the query patterns your application has
An aggregate is a collection of related objects that we wish to treat as a unit
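
A small sketch of the difference, assuming a hypothetical order with two order lines and a plain dict standing in for a key-value store. In the relational shape the order is spread over two tables; as an aggregate it is stored and fetched as one unit under one key.

```python
import json

# Relational shape: the order is split across two "tables" joined by order_id.
orders = [{"order_id": 1, "customer": "alice"}]
order_lines = [
    {"order_id": 1, "product": "keyboard", "qty": 1},
    {"order_id": 1, "product": "mouse", "qty": 2},
]


def to_aggregate(order_id: int) -> dict:
    """Collect everything that belongs to one order into a single nested object."""
    order = next(o for o in orders if o["order_id"] == order_id)
    lines = [
        {"product": line["product"], "qty": line["qty"]}
        for line in order_lines
        if line["order_id"] == order_id
    ]
    return {**order, "lines": lines}


# Aggregate shape: one key, one value (a dict stands in for the store).
store = {}
store["order:1"] = json.dumps(to_aggregate(1))

# Access by key returns the whole unit; no join is needed.
aggregate = json.loads(store["order:1"])
```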

ACID
NoSQL has ACID, but only within the scope of
one aggregate
(we can atomically manipulate a single
aggregate at a time)
Graph databases actually have full support of ACID

Distribution Models
● Single Server (no distribution at all)
● Sharding (can be combined with replication)
(shard key – range based or hash based)
● Master-Slave Replication (“read” scalability)
(writes to M, reads can be done from S)
(M – single point of failure)
● Peer-to-Peer Replication (common in column-family stores)
(consistency issue)
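
The two kinds of shard key mentioned above can be sketched as follows. The shard names, the range boundaries and the choice of MD5 are assumptions made for illustration, not the scheme of any specific database.

```python
import bisect
import hashlib

SHARDS = ["shard-0", "shard-1", "shard-2"]  # hypothetical shard names

# Range based: keys starting with a-g go to shard-0, h-p to shard-1,
# q-z to shard-2. The boundaries are arbitrary for this example.
RANGE_BOUNDS = ["h", "q"]


def range_shard(key: str) -> str:
    """Pick a shard by where the key's first letter falls in the ranges."""
    return SHARDS[bisect.bisect_right(RANGE_BOUNDS, key[:1].lower())]


def hash_shard(key: str) -> str:
    """Pick a shard from a stable hash of the key, modulo the shard count."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]
```

Range-based keys keep neighbouring keys on the same shard, which helps range scans but can create hot spots; hash-based keys spread load more evenly at the cost of scattering neighbouring keys.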

(Eventual)Consistency
The actual trade-off is between latency and consistency

NWR
● N – number of nodes to replicate to
(replication factor, number of copies in
the cluster)
● W – number of nodes that must acknowledge a
write before it is considered successful
● R – number of nodes that must respond to a
read before it is considered successful

NWR
● W+R <= N – eventual consistency
(eventually all the nodes in the cluster will get
the data)
● W = N, R = 1 – consistency by writes
(what an RDBMS does)
● W = 1, R = N – consistency by reads
(conflicts must be resolved somehow)
● W + R > N – consistency by quorum
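
The four cases above can be written as one small function. This is only a sketch of the rules on this slide; the function name and the returned labels are chosen for illustration.

```python
def classify(n: int, w: int, r: int) -> str:
    """Name the consistency regime for a given N, W, R, following the slide."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("W and R must be between 1 and N")
    if w + r <= n:
        # A read set and a write set may not overlap, so reads can be stale.
        return "eventual consistency"
    if w == n and r == 1:
        return "consistency by writes"
    if w == 1 and r == n:
        return "consistency by reads"
    return "consistency by quorum"
```

For example, with N = 3, `classify(3, 1, 1)` gives "eventual consistency" and `classify(3, 2, 2)` gives "consistency by quorum".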

Quorum (W+R > N)
Read from more than half and
write to more than half
(QUORUM = N/2 + 1, using integer division)
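
A short sketch of why W + R > N is enough: every possible set of W written nodes then shares at least one node with every possible set of R read nodes, so a read always touches a node holding the latest write. The brute-force check below is for illustration only and is practical just for small N.

```python
from itertools import combinations


def quorum(n: int) -> int:
    """Smallest number of nodes that is more than half of n."""
    return n // 2 + 1


def always_overlap(n: int, w: int, r: int) -> bool:
    """True if every choice of w written nodes intersects every choice of r read nodes."""
    nodes = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )
```

With N = 5 the quorum is 3, and `always_overlap(5, 3, 3)` holds, while `always_overlap(5, 2, 3)` does not, because W + R = N leaves room for a read that misses every written node.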
