NoSQL Databases: Why, what and when

NoSQL databases get a lot of press coverage, but there seems to be a lot of confusion surrounding them: in which situations they work better than a relational database, and how to choose one product over another. This talk gives an overview of the NoSQL landscape and a classification of the different architectural categories, clarifying the basic concepts and the terminology, and provides a comparison of the features, strengths and drawbacks of the most popular projects (CouchDB, MongoDB, Riak, Redis, Membase, Neo4j, Cassandra, HBase, Hypertable).


    1. Lorenzo Alberton @lorenzoalberton NoSQL Databases: Why, what and when. NoSQL Databases Demystified. PHP UK Conference, 25th February 2011 1
    2. NoSQL: Why. Scalability, Concurrency, New trends 2
    3-8. New Trends 2002 2004 2006 2008 2010 2012: Big data, Concurrency, Connectivity, Diversity, P2P, Knowledge, Cloud-Grid 3
    9-10. What’s the problem with RDBMS’s? Caching, Master/Slave, Master/Master, Cluster, Table Partitioning, Federated Tables, Sharding, Distributed DBs. http://www.codefutures.com/database-sharding 4
    11-12. What’s the problem with RDBMS’s? http://www.flickr.com/photos/dimi3/3096166092 5
    13-14. Quick Comparison: overview from 10,000 feet (random impressions from the interwebs). http://www.flickr.com/photos/42433826@N00/4914337851 6
    15. MongoDB is web-scale ...but /dev/null is even better! 7
    16. Cassandra is teh schnitz ...but /dev/null is even better! Love, /dev/null 8
    17. CouchDB: Relax! ...but /dev/null is way better! You have a LOT of free space anyway, right? Love, /dev/null 9
    18. No, seriously...* (*) Not another “Mine is bigger” comparison, please 10
    19. A little theory Fundamental Principles of (Distributed) Databases http://www.timbarcz.com/blog/PassionInProgrammers.aspx 11
    20. ACID. ATOMICITY: All or nothing. CONSISTENCY: Any transaction will take the db from one consistent state to another, with no broken constraints (referential integrity). ISOLATION: Other operations cannot access data that has been modified during a transaction that has not yet completed. DURABILITY: Ability to recover the committed transaction updates against any kind of system failure (transaction log). 12
    21-25. Isolation Levels, Locking & MVCC. Isolation (noun): property that defines how/when the changes made by one operation become visible to other concurrent operations. SERIALIZABLE: all transactions occur in a completely isolated fashion, as if they were executed serially. REPEATABLE READ: multiple SELECT statements issued in the same transaction will always yield the same result. READ COMMITTED: a lock is acquired only on the rows currently read/updated. READ UNCOMMITTED: a transaction can access uncommitted changes made by other transactions. 13
    26. Isolation Levels, Locking & MVCC ( - = anomaly not possible)

        Isolation Level     Dirty Reads   Non-repeatable Reads   Phantoms
        Serializable        -             -                      -
        Repeatable Read     -             -                      possible
        Read Committed      -             possible               possible
        Read Uncommitted    possible      possible               possible

        http://www.adayinthelifeof.nl/2010/12/20/innodb-isolation-levels/ 14
    27. Isolation Levels, Locking & MVCC ( - = lock not acquired)

        Isolation Level     Range Lock   Read Lock   Write Lock
        Serializable        held         held        held
        Repeatable Read     -            held        held
        Read Committed      -            -           held
        Read Uncommitted    -            -           -            15
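The lock tables map directly onto SQL: a client picks one of these levels per transaction. A minimal sketch, assuming PostgreSQL and the psycopg2 driver (neither is named in the deck; the `accounts` table is hypothetical):

```python
import psycopg2

# Assumptions: a local PostgreSQL and an 'accounts' table; both are
# illustrative, not from the talk.
conn = psycopg2.connect("dbname=demo user=demo")
cur = conn.cursor()

# First statement of the transaction: choose the isolation level.
cur.execute("SET TRANSACTION ISOLATION LEVEL REPEATABLE READ")
cur.execute("SELECT balance FROM accounts WHERE id = %s", (42,))
first = cur.fetchone()
# Concurrent commits are invisible to this transaction's snapshot:
cur.execute("SELECT balance FROM accounts WHERE id = %s", (42,))
assert cur.fetchone() == first    # repeatable read, per the tables above
conn.commit()
```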
    28-32. Multi-Version Concurrency Control. A tree of index and data blocks (Root → Index → Data). A writer creates a new version of the blocks it touches and publishes it with an atomic pointer update at the root; the old version becomes obsolete and is marked for compaction. Reads: never blocked. 16
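A toy copy-on-write store in the spirit of these MVCC diagrams (a single-node "tree" for brevity; a sketch, not any particular engine's implementation): readers dereference the current root and see an immutable snapshot, while a writer copies what it changes and publishes the new version with one atomic pointer update.

```python
import threading

class Snapshot:
    def __init__(self, data):
        self.data = data                      # treated as immutable

class MVCCStore:
    def __init__(self):
        self.root = Snapshot({})              # the published version
        self._writer = threading.Lock()       # one writer at a time

    def get(self, key):
        return self.root.data.get(key)        # reads: never blocked

    def put(self, key, value):
        with self._writer:
            new_data = dict(self.root.data)   # copy-on-write
            new_data[key] = value
            self.root = Snapshot(new_data)    # atomic pointer update
            # the old Snapshot is now obsolete; reclaiming it is the
            # "marked for compaction" step from the slides

store = MVCCStore()
store.put("a", 1)
print(store.get("a"))
```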
    33-36. Distributed Transactions - 2PC. 1) COMMIT REQUEST PHASE (voting phase): the Coordinator sends a query to commit to all Participants; each Participant 1) executes the transaction up to the COMMIT request and 2) writes an entry to the undo and redo logs, then answers Agree or Abort. 17
    37-41. Distributed Transactions - 2PC. 2) COMMIT PHASE (completion phase), a) SUCCESS (agreement from all): the Coordinator sends Commit to all Participants; each Participant 1) completes the operation and 2) releases its locks, then acknowledges; the Coordinator completes the transaction. 18
    42-46. Distributed Transactions - 2PC. 2) COMMIT PHASE (completion phase), b) FAILURE (abort from any): the Coordinator sends Rollback to all Participants; each Participant 1) undoes the operation and 2) releases its locks, then acknowledges; the Coordinator undoes the transaction. 19
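To make the two phases concrete, a minimal sketch of the coordinator loop; the Participant class and its method names are illustrative stand-ins, not a real transaction-manager API:

```python
class Participant:
    def prepare(self, tx):
        # Phase 1: execute the transaction up to the COMMIT request and
        # write undo/redo log entries, then vote.
        self.tx = tx
        return True                      # Agree (False would mean Abort)

    def commit(self):                    # complete operation, release locks
        print("committed", self.tx)

    def rollback(self):                  # undo operation, release locks
        print("rolled back", self.tx)

def two_phase_commit(tx, participants):
    # 1) COMMIT REQUEST (voting) phase: query all participants to commit.
    votes = [p.prepare(tx) for p in participants]
    # 2) COMMIT (completion) phase.
    if all(votes):                       # a) SUCCESS: agreement from all
        for p in participants:
            p.commit()
        return True
    for p in participants:               # b) FAILURE: abort from any
        p.rollback()
    return False

two_phase_commit("tx-1", [Participant(), Participant()])
```

Note what the sketch leaves out: if the coordinator dies between the phases, participants sit on their locks indefinitely, which is exactly the weakness the next slide calls out.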
    47. Problems with 2PC: it is a blocking protocol, with a risk of indefinite cohort blocks if the coordinator fails, and conservative behaviour biased to the abort case. 20
    48. Paxos Algorithm (Consensus). Family of fault-tolerant, distributed implementations. Spectrum of trade-offs: number of processors, number of message delays, activity level of participants, number of messages sent, types of failures. http://www.usenix.org/event/nsdi09/tech/full_papers/yabandeh/yabandeh_html/ http://en.wikipedia.org/wiki/Paxos_algorithm 21
    49. [PSE image alert] 22
    50-51. ACID & Distributed Systems. ACID properties are always desirable. But what about: Latency, Partition Tolerance, High Availability? http://images.tribe.net/tribe/upload/photo/deb/074/deb074db-81fc-4b8a-bfbd-b18b922885cb 23
    52-53. CAP Theorem (Brewer’s conjecture). 2000: Prof. Eric Brewer, PoDC Conference Keynote. 2002: Seth Gilbert and Nancy Lynch, ACM SIGACT News 33(2). Of three properties of shared-data systems - data Consistency, system Availability and tolerance to network Partitions - only two can be achieved at any given moment in time. http://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf http://lpd.epfl.ch/sgilbert/pubs/BrewersConjecture-SigAct.pdf 24
    54-57. Partition Tolerance - Availability. “The network will be allowed to lose arbitrarily many messages sent from one node to another” [...] “For a distributed system to be continuously available, every request received by a non-failing node in the system must result in a response” - Gilbert and Lynch, SIGACT 2002. CP: requests can complete at nodes that have quorum. AP: requests can complete at any live node, possibly violating strong consistency. HIGH LATENCY ≈ NETWORK PARTITION. http://codahale.com/you-cant-sacrifice-partition-tolerance http://pl.atyp.us/wordpress/?p=2521 http://dbmsmusings.blogspot.com/2010/04/problems-with-cap-and-yahoos-little.html 25
    58-59. Consistency: Client-side view. A service that is consistent operates fully or not at all. Strong consistency (as in ACID). Weak consistency (no guarantee), with an inconsistency window. Eventual* consistency (e.g. DNS), in variants: causal consistency, read-your-writes consistency (the least surprise), session consistency, monotonic read consistency, monotonic write consistency. (*) Temporary inconsistencies (e.g. in data constraints or replica versions) are accepted, but they’re resolved at the earliest opportunity. http://www.allthingsdistributed.com/2008/12/eventually_consistent.html 26
    60. Consistency: Server-side (Quorum). N = number of nodes with a replica of the data (*). W = number of replicas that must acknowledge the update. R = minimum number of replicas that must participate in a successful read operation. (*) but the data will be written to N nodes no matter what. W + R > N: strong consistency (usually N=3, W=R=2). W = N, R = 1: optimised for reads. W = 1, R = N: optimised for writes (durability not guaranteed in presence of failures). W + R <= N: weak consistency. 27
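The quorum rules above reduce to one inequality: a read and a write set must overlap in at least one replica. A minimal sketch (function name and values are illustrative):

```python
# N replicas, W write acks required, R read participants required.
def classify(n, w, r):
    if w + r > n:
        return "strong consistency"      # read and write sets must overlap
    return "weak consistency"            # a read may miss the latest write

assert classify(n=3, w=2, r=2) == "strong consistency"   # usual N=3, W=R=2
assert classify(n=3, w=3, r=1) == "strong consistency"   # W=N: optimised for reads
assert classify(n=3, w=1, r=3) == "strong consistency"   # W=1: optimised for writes
assert classify(n=3, w=1, r=1) == "weak consistency"     # W + R <= N
```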
    61. Amazon Dynamo Paper Consistent Hashing Vector Clocks Gossip Protocol Hinted Handoffs Read Repairhttp://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf 28
    62-66. Modulo-based Hashing. N1 N2 N3 N4. partition = key % n_servers; after removing a node, partition = key % (n_servers - 1). Recalculate the hashes for all the entries if n_servers changes (i.e. full data redistribution when adding/removing a node). 29
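A quick demonstration of that full-redistribution problem, counting how many keys change partition when a node is added (numbers are illustrative):

```python
# Modulo placement: count how many keys move when going from 4 to 5 nodes.
keys = range(10_000)

def placement(n_servers):
    return {k: k % n_servers for k in keys}

before, after = placement(4), placement(5)
moved = sum(before[k] != after[k] for k in keys)
print(f"{moved / len(keys):.0%} of keys move")   # ~80% relocate for 4 -> 5 nodes
```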
    67-72. Consistent Hashing. Ring (key space) from 0 to 2^160, nodes A-F. Same hash function for data and nodes; idx = hash(key). Coordinator: next available clockwise node, e.g. B is the canonical home (coordinator node) for key range A-B. If B leaves, C becomes the canonical home for key range A-C, and only the keys in that range change location. http://en.wikipedia.org/wiki/Consistent_hashing 30
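A minimal hash ring in the spirit of the diagram: the same hash function places both nodes and keys on the ring, and the coordinator for a key is the next node clockwise (the 160-bit SHA-1 space mirrors the slides; class and names are illustrative):

```python
import bisect
import hashlib

def h(value: str) -> int:
    return int(hashlib.sha1(value.encode()).hexdigest(), 16)  # 0 .. 2**160 - 1

class Ring:
    def __init__(self, nodes):
        self.ring = sorted((h(n), n) for n in nodes)   # node positions on the ring

    def coordinator(self, key: str) -> str:
        idx = bisect.bisect(self.ring, (h(key), ""))   # next position clockwise
        return self.ring[idx % len(self.ring)][1]      # wrap past 2**160 back to 0

ring = Ring(["A", "B", "C", "D", "E", "F"])
print(ring.coordinator("user:1001"))
# Removing a node only reassigns the keys it owned to its clockwise successor:
ring.ring = [(pos, n) for pos, n in ring.ring if n != "B"]
print(ring.coordinator("user:1001"))
```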
    73-74. Consistent Hashing - Replication. Key range AB is hosted in B and replicated in C, D: data is replicated in the N-1 clockwise successor nodes, so (in the diagram) node C hosts the keys of ranges FA, AB and BC. http://horicky.blogspot.com/2009/11/nosql-patterns.html 31
    75-76. Consistent Hashing - Node Changes. Key membership and replicas are updated when a node joins or leaves the network (copy key range AB, copy key range FA, copy key range EF); the number of replicas for all data is kept consistent. 32
    77-78. Consistent Hashing - Load Distribution. Different strategies. Virtual Nodes: random tokens per each physical node, partition by token value (e.g. Node 1: tokens A, E, G; Node 2: tokens C, F, H; Node 3: tokens B, D, I). Or: Q equal-sized partitions, S nodes, Q/S tokens per node (with Q >> S). 33-34
    79-83. Vector Clocks & Conflict Detection. Causality-based partial order over events that happen in the system. Document version history: a counter for each node that updated the document. If all update counters in V1 are smaller or equal to all update counters in V2, then V1 precedes V2. Example: write handled by A → D1 ([A,1]); write handled by A → D2 ([A,2]); write handled by B → D3 ([A,2],[B,1]) and write handled by C → D4 ([A,2],[C,1]); conflict detected, reconciliation handled by A → D5 ([A,3],[B,1],[C,1]). http://en.wikipedia.org/wiki/Vector_clock http://pl.atyp.us/wordpress/?p=2601 35
    84-87. Vector Clocks & Conflict Detection. Vector Clocks can detect a conflict; the conflict resolution is left to the application or the user. The application might resolve conflicts by checking relative timestamps, or with other strategies (like merging the changes). Vector clocks can grow quite large (!). Example with an un-modified replica: write handled by B → D3 ([A,2],[B,1]) while D4 stays at ([A,2]): version mismatch, but D3 ⊇ D4, so no real conflict and it is resolved automatically → D5 ([A,3],[B,1]). http://en.wikipedia.org/wiki/Vector_clock http://pl.atyp.us/wordpress/?p=2601 36
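The partial order from these slides fits in a few lines. A sketch of the comparison rule, replaying the D2/D3/D4/D5 example (function names are illustrative):

```python
# V1 precedes V2 iff every per-node counter in V1 is <= the one in V2.
def precedes(v1: dict, v2: dict) -> bool:
    return all(v1.get(node, 0) <= v2.get(node, 0) for node in v1)

def conflict(v1: dict, v2: dict) -> bool:
    return not precedes(v1, v2) and not precedes(v2, v1)

d2 = {"A": 2}
d3 = {"A": 2, "B": 1}          # write handled by B, descends from D2
d4 = {"A": 2, "C": 1}          # write handled by C, also descends from D2

assert precedes(d2, d3) and precedes(d2, d4)
assert conflict(d3, d4)        # concurrent: the application must reconcile
d5 = {"A": 3, "B": 1, "C": 1}  # reconciliation handled by A
assert precedes(d3, d5) and precedes(d4, d5)
```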
    88-90. Gossip Protocol + Hinted Handoff. Gossip: periodic, pairwise, inter-process interactions of bounded size among randomly-chosen peers. E.g.: “I can’t see B, it might be down but I need some ACK. My Merkle Tree root for range XY is ‘ab031dab4a385afda’” / “I can’t see B either. My Merkle Tree root for range XY is different!” / “B must be down then. Let’s disable it.” Hinted Handoff: “My canonical node is supposed to be B.” / “I see. Well, I’ll take care of it for now, and let B know when B is available again.” 37
    91. Merkle Trees (Hash Trees). Leaves: hashes of data blocks. Nodes: hashes of their children (ROOT = hash(A, B); A = hash(C, D), B = hash(E, F); C..F = hash(data blocks 001..004)). Used to detect inconsistencies between replicas (anti-entropy) and to minimise the amount of transferred data. http://en.wikipedia.org/wiki/Hash_tree 38
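A toy Merkle tree over data blocks, as described above: comparing root hashes tells two replicas whether a key range diverges at all, and walking mismatching subtrees pinpoints the blocks to transfer (a sketch; SHA-256 and the odd-leaf duplication rule are my choices, not the talk's):

```python
import hashlib

def sha(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(blocks):
    level = [sha(b) for b in blocks]           # leaves: hashes of data blocks
    while len(level) > 1:
        if len(level) % 2:                     # duplicate last hash if odd
            level.append(level[-1])
        level = [sha(level[i] + level[i + 1])  # nodes: hashes of their children
                 for i in range(0, len(level), 2)]
    return level[0]

replica_a = [b"001", b"002", b"003", b"004"]
replica_b = [b"001", b"002", b"XXX", b"004"]
print(merkle_root(replica_a) == merkle_root(replica_b))  # False: out of sync
```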
    92-95. Read Repair. GET(k, R=2) on the ring: two replicas answer k=XYZ (v.2) while another holds a stale k=ABC (v.1); the newest version wins, and the stale replica receives UPDATE(k = XYZ). 39
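A minimal sketch of that flow: read from R replicas, return the highest version, and push it back to any replica that answered with a stale one (the Replica class and version layout are illustrative):

```python
class Replica:
    def __init__(self, version, value):
        self.store = {"k": (version, value)}     # key -> (version, value)

def get_with_read_repair(key, replicas, r=2):
    answers = [(node, node.store[key]) for node in replicas[:r]]
    newest = max(version_value for _, version_value in answers)
    for node, (version, _) in answers:
        if version < newest[0]:
            node.store[key] = newest             # repair: UPDATE(k = XYZ)
    return newest[1]

replicas = [Replica(2, "XYZ"), Replica(1, "ABC"), Replica(2, "XYZ")]
print(get_with_read_repair("k", replicas))       # "XYZ"; stale replica fixed
```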
    96. NoSQL Break-down Key-value stores, Column Families, Document-oriented dbs, Graph databases 40
    97. Focus Of Different Data Models. Along a Size vs Complexity axis: Key-Value Stores → Column Families → Document Databases → Graph Databases (key-value stores target the largest data sizes, graph databases the most complexity). http://www.slideshare.net/emileifrem/nosql-east-a-nosql-overview-and-the-benefits-of-graph-databases 41
    98. 1) Key-value stores Amazon Dynamo Paper Data model: collection of key-value pairs 42
    99-104. Voldemort. AP. Dynamo DHT implementation: consistent hashing, vector clocks. Conflicts resolved at read and write time. LICENSE: Apache 2. LANGUAGE: Java. API/PROTOCOL: HTTP / Sockets; Java, Thrift, Avro, Protobuf (data: JSON, Java String, byte[], Thrift, Avro, ProtoBuf). PERSISTENCE: pluggable, BDB/MySQL. CONCURRENCY: MVCC; simple optimistic locking for multi-row updates, pluggable storage engine. 43
    105-107. Membase. CP. DHT (K-V), no SPoF. “VBuckets”: unit of consistency and replication, owner of a subset of the cluster key space (hash function + table lookup). membase = memcached + persistence + distributed replication (fail-over HA) + rebalancing. All metadata kept in memory (high throughput / low latency). Manual/programmatic failover via the Management REST API. LICENSE: Apache 2. LANGUAGE: C/C++, Erlang. API/PROTOCOL: REST/JSON, memcached. http://dustin.github.com/2010/06/29/memcached-vbuckets.html 44
    108. Riak. AP. LICENSE: Apache 2. LANGUAGE: C, Erlang. API/PROTOCOL: REST HTTP, ProtoBuf. Buckets → K-V. “Links” (~relations). Targeted JS Map/Reduce. Tune-able consistency (one-quorum-all). 45
    109. Redis. CP. K-V store, “Data Structures Server”: Map, Set, Sorted Set, Linked List. Set/Queue operations, counters, Pub-Sub, volatile keys. 10-100K op/s (whole dataset in RAM + VM). Persistence via background snapshotting (tunable fsync freq.). REPLICATION: master-slave. Distributed if the client supports consistent hashing. LICENSE: BSD. LANGUAGE: ANSI C. API/PROTOCOL: Telnet-like. http://redis.io/presentation/Redis_Cluster.pdf 46
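As a concrete taste of the “data structures server” idea, a minimal sketch assuming a local Redis server and the redis-py client (the deck itself shows no client code; keys are illustrative):

```python
import redis

r = redis.Redis(host="localhost", port=6379)   # assumes a local Redis server

r.set("page:home:hits", 0)
r.incr("page:home:hits")                        # atomic counter
r.lpush("queue:jobs", "job-1", "job-2")         # linked list used as a queue
r.sadd("online:users", "alice", "bob")          # set operations
r.expire("online:users", 60)                    # volatile key: 60s TTL
print(r.rpop("queue:jobs"))                     # b"job-1"
```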
    110. 2) Column Families Google BigTable paper Data model: big table, column families 47
    111-118. Google BigTable Paper. Sparse, distributed, persistent multi-dimensional sorted map indexed by (row_key, column_key, timestamp). Example row “com.cnn.www”: column “contents:html” holds <html>... at t3, t5, t6; columns “anchor:cnnsi.com” = “CNN” (t9) and “anchor:my.look.ca” = “CNN.com” (t8) belong to the “anchor” column family. Column families carry ACLs; updates are atomic per row; old versions are garbage-collected automatically. http://labs.google.com/papers/bigtable-osdi06.pdf 48
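A toy model of that data shape, showing only the indexing (real BigTable keeps the map sorted by key and adds the per-family ACLs, atomic row updates and version GC noted above):

```python
# Map indexed by the triple (row_key, column_key, timestamp).
table = {
    ("com.cnn.www", "contents:html",     6): "<html>...",
    ("com.cnn.www", "contents:html",     5): "<html>...",
    ("com.cnn.www", "anchor:cnnsi.com",  9): "CNN",
    ("com.cnn.www", "anchor:my.look.ca", 8): "CNN.com",
}

def latest(row_key, column_key):
    """Return the most recent version of a cell (highest timestamp)."""
    versions = [ts for (r, c, ts) in table if r == row_key and c == column_key]
    return table[(row_key, column_key, max(versions))]

print(latest("com.cnn.www", "contents:html"))   # the value written at t6
```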
    119-121. Google BigTable: Data Structure. SSTable: smallest building block; persistent immutable Map[k,v] made of 64KB blocks plus a lookup index; operations: lookup by key / key range scan. Tablet: dynamically partitioned range of rows (e.g. Aaa → Bar), built from multiple SSTables; unit of distribution and load balancing. Table: multiple tablets (table segments) make up a table. 49
    122-125. Google BigTable: I/O. Writes go to a tablet log (on GFS) and to an in-memory memtable; reads merge the memtable and the SSTables. Minor compaction: the memtable is frozen and written out as a new SSTable. Merging / major compaction: SSTables are merged and garbage-collected. Compression with BMDiff and Zippy. 50
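A minimal sketch of that write/read path (class name, flush threshold and the lack of real durability are all simplifications of mine): writes append to a log and land in a memtable, a “minor compaction” freezes the memtable into an immutable SSTable, and reads consult the memtable first, then SSTables newest-first.

```python
class TinyTablet:
    def __init__(self, flush_at=3):
        self.log, self.memtable, self.sstables = [], {}, []
        self.flush_at = flush_at

    def write(self, key, value):
        self.log.append((key, value))           # tablet log first (durability)
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_at:
            self.sstables.append(dict(sorted(self.memtable.items())))
            self.memtable, self.log = {}, []    # minor compaction
        # (a major compaction would merge self.sstables into one and GC)

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables): # newest SSTable first
            if key in sstable:
                return sstable[key]
        return None

t = TinyTablet()
for i in range(4):
    t.write(f"k{i}", i)
print(t.read("k0"), t.read("k3"))   # 0 (from an SSTable), 3 (from the memtable)
```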
    126. Google BigTable: Location Dereferencing. Chubby master file → Root Tablet (root of the metadata tree) → Metadata Tablets → User Tables: up to 3 levels in the metadata hierarchy. Chubby: replicated, persisted lock service; maintains tablet server locations; 5 replicas, one elected master (via quorum); Paxos algorithm used to keep consistency. 51
    127. Google BigTable: Architecture. The BigTable client talks to the master for metadata operations and to tablet servers for data R/W operations. Master: fs metadata, ACL, GC, load balancing; exchanges heartbeat messages, GC and chunk migration with the Tablet Servers. Chubby: tracks the master lock and the log of live servers. 52
    128-134. HBase. CP. OSS implementation of BigTable. ZooKeeper as coordinator (instead of Chubby); support for multiple masters. Data sorted by key but evenly distributed across the cluster. Batch streaming, Map/Reduce. LICENSE: Apache 2. LANGUAGE: Java. API/PROTOCOL: REST HTTP, Thrift. PERSISTENCE: memtable/SSTable on HDFS, S3, S3N, EBS (with GZip/LZO CF compression). 53
    135-138. Hypertable. CP. OSS BigTable implementation, faster than HBase (10-30K/s). Hyperspace (Paxos) used instead of ZooKeeper. Dynamically adapts to changes in workload. HQL (~SQL). LICENSE: GPLv2. LANGUAGE: C++. API/PROTOCOL: C++, Thrift. PERSISTENCE: memtable/SSTable. CONCURRENCY: MVCC. 54
    139-141. Cassandra. AP. Data model of BigTable, infrastructure of Dynamo. Column: (col_name, col_value, timestamp). SuperColumn: super_column_name → list of columns. Column Family: row_key → columns. LICENSE: Apache 2. LANGUAGE: Java. PROTOCOL: Thrift, Avro. PERSISTENCE: memtable/SSTable. CONSISTENCY: tunable R/W/N. http://www.javageneration.com/?p=70 @cassandralondon http://www.meetup.com/Cassandra-London/ 55
