SlideShare a Scribd company logo
Dynamo: Amazon’s Highly Available Key-value Store
Amir H. Payberah
amir@sics.se
Amirkabir University of Technology
(Tehran Polytechnic)
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 1 / 64
Database and Database Management System
Database: an organized collection of data.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 2 / 64
Database and Database Management System
Database: an organized collection of data.
Database Management System (DBMS): a software that interacts
with users, other applications, and the database itself to capture
and analyze data.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 2 / 64
History of Databases
1960s
• Navigational data model: hierarchical model (IMS) and network
model (CODASYL).
• Disk-aware
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 3 / 64
History of Databases
1960s
• Navigational data model: hierarchical model (IMS) and network
model (CODASYL).
• Disk-aware
1970s
• Relational data model: Edgar F. Codd paper.
• Logical data is disconnected from physical information storage.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 3 / 64
History of Databases
1960s
• Navigational data model: hierarchical model (IMS) and network
model (CODASYL).
• Disk-aware
1970s
• Relational data model: Edgar F. Codd paper.
• Logical data is disconnected from physical information storage.
1980s
• Object databases: information is represented in the form of objects.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 3 / 64
History of Databases
1960s
• Navigational data model: hierarchical model (IMS) and network
model (CODASYL).
• Disk-aware
1970s
• Relational data model: Edgar F. Codd paper.
• Logical data is disconnected from physical information storage.
1980s
• Object databases: information is represented in the form of objects.
2000s
• NoSQL databases: BASE instead of ACID.
• NewSQL databases: scalable performance of NoSQL + ACID.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 3 / 64
Relational Databases Management Systems (RDMBSs)
RDMBSs: the dominant technology for storing structured data in
web and business applications.
SQL is good
• Rich language
• Easy to use and integrate
• Rich toolset
• Many vendors
They promise: ACID
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 4 / 64
ACID Properties
Atomicity
• All included statements in a transaction are either executed or the
whole transaction is aborted without affecting the database.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 5 / 64
ACID Properties
Atomicity
• All included statements in a transaction are either executed or the
whole transaction is aborted without affecting the database.
Consistency
• A database is in a consistent state before and after a transaction.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 5 / 64
ACID Properties
Atomicity
• All included statements in a transaction are either executed or the
whole transaction is aborted without affecting the database.
Consistency
• A database is in a consistent state before and after a transaction.
Isolation
• Transactions can not see uncommitted changes in the database.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 5 / 64
ACID Properties
Atomicity
• All included statements in a transaction are either executed or the
whole transaction is aborted without affecting the database.
Consistency
• A database is in a consistent state before and after a transaction.
Isolation
• Transactions can not see uncommitted changes in the database.
Durability
• Changes are written to a disk before a database commits a transaction
so that committed data cannot be lost through a power failure.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 5 / 64
RDBMS is Good ...
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 6 / 64
RDBMS Challenges
Web-based applications caused spikes.
• Internet-scale data size
• High read-write rates
• Frequent schema changes
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 7 / 64
Let’s Scale RDBMSs
RDBMS were not designed to be distributed.
Possible solutions:
• Replication
• Sharding
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 8 / 64
Let’s Scale RDBMSs - Replication
Master/Slave architecture
Scales read operations
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 9 / 64
Let’s Scale RDBMSs - Sharding
Dividing the database across many machines.
It scales read and write operations.
Cannot execute transactions across shards (partitions).
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 10 / 64
Scaling RDBMSs is Expensive and Inefficient
[http://www.couchbase.com/sites/default/files/uploads/all/whitepapers/NoSQLWhitepaper.pdf]
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 11 / 64
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 12 / 64
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 13 / 64
NoSQL
Avoidance of unneeded complexity
High throughput
Horizontal scalability and running on commodity hardware
Compromising reliability for better performance
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 14 / 64
The NoSQL Movement
It was first used in 1998 by Carlo Strozzi to name his relational
database that did not expose the standard SQL interface.
The term was picked up again in 2009 when a Last.fm developer, Jo-
han Oskarsson, wanted to organize an event to discuss open source
distributed databases.
The name attempted to label the emergence of a growing number
of non-relational, distributed data stores that often did not attempt
to provide ACID.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 15 / 64
NoSQL Cost and Performance
[http://www.couchbase.com/sites/default/files/uploads/all/whitepapers/NoSQLWhitepaper.pdf]
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 16 / 64
RDBMS vs. NoSQL
[http://www.couchbase.com/sites/default/files/uploads/all/whitepapers/NoSQLWhitepaper.pdf]
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 17 / 64
NoSQL Data Models
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 18 / 64
NoSQL Data Models
[http://highlyscalable.wordpress.com/2012/03/01/nosql-data-modeling-techniques]
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 19 / 64
Key-Value Data Model
Collection of key/value pairs.
Ordered Key-Value: processing over key ranges.
Dynamo, Scalaris, Voldemort, Riak, ...
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 20 / 64
Column-Oriented Data Model
Similar to a key/value store, but the value can have multiple at-
tributes (Columns).
Column: a set of data values of a particular type.
Store and process data by column instead of row.
BigTable, Hbase, Cassandra, ...
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 21 / 64
Document Data Model
Similar to a column-oriented store, but values can have complex
documents, instead of fixed format.
Flexible schema.
XML, YAML, JSON, and BSON.
CouchDB, MongoDB, ...
{
FirstName: "Bob",
Address: "5 Oak St.",
Hobby: "sailing"
}
{
FirstName: "Jonathan",
Address: "15 Wanamassa Point Road",
Children: [
{Name: "Michael", Age: 10},
{Name: "Jennifer", Age: 8},
]
}
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 22 / 64
Graph Data Model
Uses graph structures with nodes, edges, and properties to represent
and store data.
Neo4J, InfoGrid, ...
[http://en.wikipedia.org/wiki/Graph database]
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 23 / 64
Consistency
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 24 / 64
Consistency
Strong consistency
• After an update completes, any subsequent access will return the
updated value.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 25 / 64
Consistency
Strong consistency
• After an update completes, any subsequent access will return the
updated value.
Eventual consistency
• Does not guarantee that subsequent accesses will return the
updated value.
• Inconsistency window.
• If no new updates are made to the object, eventually all accesses
will return the last updated value.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 25 / 64
Quorum Model
N: the number of nodes to which a data item is replicated.
R: the number of nodes a value has to be read from to be accepted.
W: the number of nodes a new value has to be written to before
the write operation is finished.
To enforce strong consistency: R + W > N
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 26 / 64
Quorum Model
N: the number of nodes to which a data item is replicated.
R: the number of nodes a value has to be read from to be accepted.
W: the number of nodes a new value has to be written to before
the write operation is finished.
To enforce strong consistency: R + W > N
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 26 / 64
Consistency vs. Availability
The large-scale applications have to be reliable: availability + re-
dundancy
These properties are difficult to achieve with ACID properties.
The BASE approach forfeits the ACID properties of consistency and
isolation in favor of availability, graceful degradation, and perfor-
mance.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 27 / 64
BASE Properties
Basic Availability
• Possibilities of faults but not a fault of the whole system.
Soft-state
• Copies of a data item may be inconsistent
Eventually consistent
• Copies becomes consistent at some later time if there are no more
updates to that data item
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 28 / 64
CAP Theorem
Consistency
• Consistent state of data after the execution of an operation.
Availability
• Clients can always read and write data.
Partition Tolerance
• Continue the operation in the presence of network partitions.
You can choose only two!
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 29 / 64
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 30 / 64
Dyanmo
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 31 / 64
Dynamo
Distributed key/value storage system
Scalable
Highly available
CAP
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 32 / 64
Design Consideration
It sacrifices strong consistency for availability: always writable
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 33 / 64
Design Consideration
It sacrifices strong consistency for availability: always writable
Incremental scalability
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 33 / 64
Design Consideration
It sacrifices strong consistency for availability: always writable
Incremental scalability
Symmetry: every node should have the same set of responsibilities
as its peers
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 33 / 64
Design Consideration
It sacrifices strong consistency for availability: always writable
Incremental scalability
Symmetry: every node should have the same set of responsibilities
as its peers
Decentralization
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 33 / 64
Design Consideration
It sacrifices strong consistency for availability: always writable
Incremental scalability
Symmetry: every node should have the same set of responsibilities
as its peers
Decentralization
Heterogeneity
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 33 / 64
System Architecture
Data partitioning
Replication
Data versioning
Dynamo API
Membership management
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 34 / 64
Data Partitioning
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 35 / 64
Partitioning
If size of data exceeds the capacity of a single machine: partitioning
Sharding data (horizontal partitioning).
Consistent hashing is one form of automatic sharding.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 36 / 64
Consistent Hashing
Hash both data and nodes using the same hash function in a same
id space.
partition = hash(d) mod n, d: data, n: number of nodes
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 37 / 64
Consistent Hashing
Hash both data and nodes using the same hash function in a same
id space.
partition = hash(d) mod n, d: data, n: number of nodes
hash("Fatemeh") = 12
hash("Ahmad") = 2
hash("Seif") = 9
hash("Jim") = 14
hash("Sverker") = 4
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 37 / 64
Load Imbalance (1/4)
Consistent hashing may lead to imbalance.
Node identifiers may not be balanced.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 38 / 64
Load Imbalance (2/4)
Consistent hashing may lead to imbalance.
Data identifiers may not be balanced.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 39 / 64
Load Imbalance (3/4)
Consistent hashing may lead to imbalance.
Hot spots.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 40 / 64
Load Imbalance (4/4)
Consistent hashing may lead to imbalance.
Heterogeneous nodes.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 41 / 64
Load Balancing via Virtual Nodes
Each physical node picks multiple random identifiers.
Each identifier represents a virtual node.
Each node runs multiple virtual nodes.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 42 / 64
Replication
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 43 / 64
Replication
To achieve high availability and durability, data should be replicates
on multiple nodes.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 44 / 64
Data Versioning
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 45 / 64
Data Versioning (1/3)
Updates are propagated asynchronously.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 46 / 64
Data Versioning (1/3)
Updates are propagated asynchronously.
Each update/modification of an item results in a new and immutable
version of the data.
• Multiple versions of an object may exist.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 46 / 64
Data Versioning (1/3)
Updates are propagated asynchronously.
Each update/modification of an item results in a new and immutable
version of the data.
• Multiple versions of an object may exist.
Replicas eventually become consistent.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 46 / 64
Data Versioning (2/3)
Version branching can happen due to node/network failures.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 47 / 64
Data Versioning (2/3)
Version branching can happen due to node/network failures.
Use vector clocks for capturing causality, in the form of (node,
counter)
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 47 / 64
Data Versioning (2/3)
Version branching can happen due to node/network failures.
Use vector clocks for capturing causality, in the form of (node,
counter)
• If causal: older version can be forgotten
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 47 / 64
Data Versioning (2/3)
Version branching can happen due to node/network failures.
Use vector clocks for capturing causality, in the form of (node,
counter)
• If causal: older version can be forgotten
• If concurrent: conflict exists, requiring reconciliation
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 47 / 64
Data Versioning (3/3)
Client C1 writes new object via Sx.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 48 / 64
Data Versioning (3/3)
Client C1 writes new object via Sx.
C1 updates the object via Sx.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 48 / 64
Data Versioning (3/3)
Client C1 writes new object via Sx.
C1 updates the object via Sx.
C1 updates the object via Sy.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 48 / 64
Data Versioning (3/3)
Client C1 writes new object via Sx.
C1 updates the object via Sx.
C1 updates the object via Sy.
C2 reads D2 and updates the
object via Sz.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 48 / 64
Data Versioning (3/3)
Client C1 writes new object via Sx.
C1 updates the object via Sx.
C1 updates the object via Sy.
C2 reads D2 and updates the
object via Sz.
C3 reads D3 and D4 via Sx.
• The read context is a
summary of the clocks of D3
and D4: [(Sx, 2), (Sy, 1), (Sz, 1)].
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 48 / 64
Data Versioning (3/3)
Client C1 writes new object via Sx.
C1 updates the object via Sx.
C1 updates the object via Sy.
C2 reads D2 and updates the
object via Sz.
C3 reads D3 and D4 via Sx.
• The read context is a
summary of the clocks of D3
and D4: [(Sx, 2), (Sy, 1), (Sz, 1)].
Reconciliation
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 48 / 64
Dynamo API
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 49 / 64
Dynamo API
get(key)
• Return single object or list of objects with conflicting version and
context.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 50 / 64
Dynamo API
get(key)
• Return single object or list of objects with conflicting version and
context.
put(key, context, object)
• Store object and context under key.
• Context encodes system metadata, e.g., version number.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 50 / 64
Execution of Operations
Client can send the request:
• To the node responsible for the data (coordinator): save on latency,
code on client
• To a generic load balancer: extra hope
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 51 / 64
put Operation
Coordinator generates new vector clock and writes the new version
locally.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 52 / 64
put Operation
Coordinator generates new vector clock and writes the new version
locally.
Send to N nodes.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 52 / 64
put Operation
Coordinator generates new vector clock and writes the new version
locally.
Send to N nodes.
Wait for response from W nodes.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 52 / 64
put Operation
Coordinator generates new vector clock and writes the new version
locally.
Send to N nodes.
Wait for response from W nodes.
Using W=1
• High availability for writes
• Low durability
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 52 / 64
get Operation
Coordinator requests existing versions from N.
• Wait for response from R nodes.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 53 / 64
get Operation
Coordinator requests existing versions from N.
• Wait for response from R nodes.
If multiple versions, return all versions that are causally unrelated.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 53 / 64
get Operation
Coordinator requests existing versions from N.
• Wait for response from R nodes.
If multiple versions, return all versions that are causally unrelated.
Divergent versions are then reconciled.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 53 / 64
get Operation
Coordinator requests existing versions from N.
• Wait for response from R nodes.
If multiple versions, return all versions that are causally unrelated.
Divergent versions are then reconciled.
Reconciled version written back.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 53 / 64
get Operation
Coordinator requests existing versions from N.
• Wait for response from R nodes.
If multiple versions, return all versions that are causally unrelated.
Divergent versions are then reconciled.
Reconciled version written back.
Using R=1
• High performance read engine.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 53 / 64
Membership Management
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 54 / 64
Membership Management
Administrator explicitly adds and removes nodes.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 55 / 64
Membership Management
Administrator explicitly adds and removes nodes.
Gossiping to propagate membership changes.
• Eventually consistent view.
• O(1) hop overlay.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 55 / 64
Adding Nodes
A new node X added to system.
• X is assigned key ranges w.r.t. its virtual servers.
• For each key range, it transfers the data items.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 56 / 64
Removing Nodes
Reallocation of keys is a reverse process of adding nodes.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 57 / 64
Failure Detection
Passive failure detection.
• Use pings only for detection from failed to alive.
In the absence of client requests, node A doesn’t need to know if
node B is alive.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 58 / 64
Handling Permanent Failure
Anti-entropy for replica synchronization.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 59 / 64
Handling Permanent Failure
Anti-entropy for replica synchronization.
Use Merkle trees for fast inconsistency detection and minimum
transfer of data.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 59 / 64
Handling Permanent Failure
Anti-entropy for replica synchronization.
Use Merkle trees for fast inconsistency detection and minimum
transfer of data.
• Nodes maintain Merkle tree of each key range.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 59 / 64
Handling Permanent Failure
Anti-entropy for replica synchronization.
Use Merkle trees for fast inconsistency detection and minimum
transfer of data.
• Nodes maintain Merkle tree of each key range.
• Exchange root of Merkle tree to check if the key ranges are updated.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 59 / 64
Failure Transient Detection
Due to partitions, quorums might not exist.
• Sloppy quorum.
• Create transient replicas: N healthy
nodes from the preference list.
• Reconcile after partition heals.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 60 / 64
Failure Transient Detection
Due to partitions, quorums might not exist.
• Sloppy quorum.
• Create transient replicas: N healthy
nodes from the preference list.
• Reconcile after partition heals.
Say A is unreachable.
put will use D.
Later, D detects A is alive.
• Sends the replica to A
• Removes the replica.
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 60 / 64
Summary
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 61 / 64
Summary
NoSQL data models: key-value, column-oriented, document-
oriented, graph-based
Sharding and consistent hashing
ACID vs. BASE
CAP (Consistency vs. Availability)
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 62 / 64
Summary
Dynamo: key/value storage: put and get
Data partitioning: consistent hashing
Load balancing: virtual server
Replication: several nodes, preference list
Data versioning: vector clock, resolve conflict at read time by the
application
Membership management: join/leave by admin, gossip-based to up-
date the nodes’ views, ping to detect failure
Handling transient failure: sloppy quorum
Handling permanent failure: Merkle tree
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 63 / 64
Questions?
Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 64 / 64

More Related Content

Viewers also liked

OpenPaas Collaboration Platform. OW2con'15, November 17, Paris.
OpenPaas Collaboration Platform. OW2con'15, November 17, Paris. OpenPaas Collaboration Platform. OW2con'15, November 17, Paris.
OpenPaas Collaboration Platform. OW2con'15, November 17, Paris.
OW2
 
II república y guerra civil
II república y guerra civilII república y guerra civil
II república y guerra civil
María Belén García Llamas
 
6 october 09.20_am_hejnowski_ver pl
6 october 09.20_am_hejnowski_ver pl6 october 09.20_am_hejnowski_ver pl
6 october 09.20_am_hejnowski_ver pl
Ciszewski MSL
 
Open Source Market Overview OW2con11, Nov 24-25, Paris
Open Source Market Overview OW2con11, Nov 24-25, ParisOpen Source Market Overview OW2con11, Nov 24-25, Paris
Open Source Market Overview OW2con11, Nov 24-25, ParisOW2
 
ACCEDE WEB, LES GUIDES D’ACCESSIBILITE POUR PROJETS WEB
ACCEDE WEB, LES GUIDES D’ACCESSIBILITE POUR PROJETS WEB ACCEDE WEB, LES GUIDES D’ACCESSIBILITE POUR PROJETS WEB
ACCEDE WEB, LES GUIDES D’ACCESSIBILITE POUR PROJETS WEB
OW2
 
Itf ipp ch03_2012_final
Itf ipp ch03_2012_finalItf ipp ch03_2012_final
Itf ipp ch03_2012_finaldphil002
 
Palacio Gobierno del Ecuador
Palacio Gobierno del EcuadorPalacio Gobierno del Ecuador
Palacio Gobierno del Ecuador
Pablo Guaña
 
OW2con'14 - OpenCloudware: The vApp Lifecycle Management Solution for Multi-C...
OW2con'14 - OpenCloudware: The vApp Lifecycle Management Solution for Multi-C...OW2con'14 - OpenCloudware: The vApp Lifecycle Management Solution for Multi-C...
OW2con'14 - OpenCloudware: The vApp Lifecycle Management Solution for Multi-C...
OW2
 
Project NGX - Proposal
Project NGX - ProposalProject NGX - Proposal
Project NGX - Proposal
Matthew Chang
 
Los 88 pelda+os del +ëxitov 02
Los 88 pelda+os del +ëxitov 02Los 88 pelda+os del +ëxitov 02
Los 88 pelda+os del +ëxitov 02
María Belén García Llamas
 
2013 cch basic principles ch02
2013 cch basic principles ch022013 cch basic principles ch02
2013 cch basic principles ch02dphil002
 
Mobile integration
Mobile integrationMobile integration
Mobile integration
wall530
 
Action Apps Business Intelligence, OW2con11, Nov 24-25, 2011, Paris
Action Apps Business Intelligence, OW2con11, Nov 24-25, 2011, ParisAction Apps Business Intelligence, OW2con11, Nov 24-25, 2011, Paris
Action Apps Business Intelligence, OW2con11, Nov 24-25, 2011, ParisOW2
 
Ow2stack, the OW2 Community Cloud Testbed, Xiaolong Kong, OW2
Ow2stack, the OW2 Community Cloud Testbed, Xiaolong Kong, OW2Ow2stack, the OW2 Community Cloud Testbed, Xiaolong Kong, OW2
Ow2stack, the OW2 Community Cloud Testbed, Xiaolong Kong, OW2OW2
 
Mystery Salamanca
Mystery SalamancaMystery Salamanca
Mystery Salamancaalfcoltrane
 
2013 cch basic principles ch12
2013 cch basic principles ch122013 cch basic principles ch12
2013 cch basic principles ch12dphil002
 
OW2con'14 - How to best manage your community of users? The example of Bonit...
OW2con'14 -  How to best manage your community of users? The example of Bonit...OW2con'14 -  How to best manage your community of users? The example of Bonit...
OW2con'14 - How to best manage your community of users? The example of Bonit...
OW2
 
Self Service BI with SpagoBI 4, Virginie Pasquon, Engineering Group.
 Self Service BI with SpagoBI 4, Virginie Pasquon, Engineering Group. Self Service BI with SpagoBI 4, Virginie Pasquon, Engineering Group.
Self Service BI with SpagoBI 4, Virginie Pasquon, Engineering Group.
OW2
 

Viewers also liked (20)

Tumbas Rumania
Tumbas RumaniaTumbas Rumania
Tumbas Rumania
 
OpenPaas Collaboration Platform. OW2con'15, November 17, Paris.
OpenPaas Collaboration Platform. OW2con'15, November 17, Paris. OpenPaas Collaboration Platform. OW2con'15, November 17, Paris.
OpenPaas Collaboration Platform. OW2con'15, November 17, Paris.
 
II república y guerra civil
II república y guerra civilII república y guerra civil
II república y guerra civil
 
6 october 09.20_am_hejnowski_ver pl
6 october 09.20_am_hejnowski_ver pl6 october 09.20_am_hejnowski_ver pl
6 october 09.20_am_hejnowski_ver pl
 
Open Source Market Overview OW2con11, Nov 24-25, Paris
Open Source Market Overview OW2con11, Nov 24-25, ParisOpen Source Market Overview OW2con11, Nov 24-25, Paris
Open Source Market Overview OW2con11, Nov 24-25, Paris
 
ACCEDE WEB, LES GUIDES D’ACCESSIBILITE POUR PROJETS WEB
ACCEDE WEB, LES GUIDES D’ACCESSIBILITE POUR PROJETS WEB ACCEDE WEB, LES GUIDES D’ACCESSIBILITE POUR PROJETS WEB
ACCEDE WEB, LES GUIDES D’ACCESSIBILITE POUR PROJETS WEB
 
Itf ipp ch03_2012_final
Itf ipp ch03_2012_finalItf ipp ch03_2012_final
Itf ipp ch03_2012_final
 
Palacio Gobierno del Ecuador
Palacio Gobierno del EcuadorPalacio Gobierno del Ecuador
Palacio Gobierno del Ecuador
 
OW2con'14 - OpenCloudware: The vApp Lifecycle Management Solution for Multi-C...
OW2con'14 - OpenCloudware: The vApp Lifecycle Management Solution for Multi-C...OW2con'14 - OpenCloudware: The vApp Lifecycle Management Solution for Multi-C...
OW2con'14 - OpenCloudware: The vApp Lifecycle Management Solution for Multi-C...
 
Project NGX - Proposal
Project NGX - ProposalProject NGX - Proposal
Project NGX - Proposal
 
Chapter 4
Chapter 4Chapter 4
Chapter 4
 
Los 88 pelda+os del +ëxitov 02
Los 88 pelda+os del +ëxitov 02Los 88 pelda+os del +ëxitov 02
Los 88 pelda+os del +ëxitov 02
 
2013 cch basic principles ch02
2013 cch basic principles ch022013 cch basic principles ch02
2013 cch basic principles ch02
 
Mobile integration
Mobile integrationMobile integration
Mobile integration
 
Action Apps Business Intelligence, OW2con11, Nov 24-25, 2011, Paris
Action Apps Business Intelligence, OW2con11, Nov 24-25, 2011, ParisAction Apps Business Intelligence, OW2con11, Nov 24-25, 2011, Paris
Action Apps Business Intelligence, OW2con11, Nov 24-25, 2011, Paris
 
Ow2stack, the OW2 Community Cloud Testbed, Xiaolong Kong, OW2
Ow2stack, the OW2 Community Cloud Testbed, Xiaolong Kong, OW2Ow2stack, the OW2 Community Cloud Testbed, Xiaolong Kong, OW2
Ow2stack, the OW2 Community Cloud Testbed, Xiaolong Kong, OW2
 
Mystery Salamanca
Mystery SalamancaMystery Salamanca
Mystery Salamanca
 
2013 cch basic principles ch12
2013 cch basic principles ch122013 cch basic principles ch12
2013 cch basic principles ch12
 
OW2con'14 - How to best manage your community of users? The example of Bonit...
OW2con'14 -  How to best manage your community of users? The example of Bonit...OW2con'14 -  How to best manage your community of users? The example of Bonit...
OW2con'14 - How to best manage your community of users? The example of Bonit...
 
Self Service BI with SpagoBI 4, Virginie Pasquon, Engineering Group.
 Self Service BI with SpagoBI 4, Virginie Pasquon, Engineering Group. Self Service BI with SpagoBI 4, Virginie Pasquon, Engineering Group.
Self Service BI with SpagoBI 4, Virginie Pasquon, Engineering Group.
 

Similar to Dynamo and NoSQL Databases

NoSQL Application Development with JSON and MapR-DB
NoSQL Application Development with JSON and MapR-DBNoSQL Application Development with JSON and MapR-DB
NoSQL Application Development with JSON and MapR-DB
MapR Technologies
 
OrientDB the database for the web 1.1
OrientDB the database for the web 1.1OrientDB the database for the web 1.1
OrientDB the database for the web 1.1
Luca Garulli
 
Introduction to cloud computing and big data - part1
Introduction to cloud computing and big data - part1Introduction to cloud computing and big data - part1
Introduction to cloud computing and big data - part1
Amir Payberah
 
Navigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skiesNavigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skies
shnkr_rmchndrn
 
Putting Apache Drill into Production
Putting Apache Drill into ProductionPutting Apache Drill into Production
Putting Apache Drill into Production
MapR Technologies
 
Apache Drill - Why, What, How
Apache Drill - Why, What, HowApache Drill - Why, What, How
Apache Drill - Why, What, How
mcsrivas
 
TriHUG - Beyond Batch
TriHUG - Beyond BatchTriHUG - Beyond Batch
TriHUG - Beyond Batchboorad
 
Cloudera Operational DB (Apache HBase & Apache Phoenix)
Cloudera Operational DB (Apache HBase & Apache Phoenix)Cloudera Operational DB (Apache HBase & Apache Phoenix)
Cloudera Operational DB (Apache HBase & Apache Phoenix)
Timothy Spann
 
Hive and Shark
Hive and SharkHive and Shark
Hive and Shark
Amir Payberah
 
NoSQL Architecture Overview
NoSQL Architecture OverviewNoSQL Architecture Overview
NoSQL Architecture Overview
Christopher Foot
 
Spark
SparkSpark
High-Performance Big Data Analytics with RDMA over NVM and NVMe-SSD
High-Performance Big Data Analytics with RDMA over NVM and NVMe-SSDHigh-Performance Big Data Analytics with RDMA over NVM and NVMe-SSD
High-Performance Big Data Analytics with RDMA over NVM and NVMe-SSD
inside-BigData.com
 
Bigtable
BigtableBigtable
Bigtable
Amir Payberah
 
OUG Scotland 2014 - NoSQL and MySQL - The best of both worlds
OUG Scotland 2014 - NoSQL and MySQL - The best of both worldsOUG Scotland 2014 - NoSQL and MySQL - The best of both worlds
OUG Scotland 2014 - NoSQL and MySQL - The best of both worlds
Andrew Morgan
 
Storage cassandra
Storage   cassandraStorage   cassandra
Storage cassandraPL dream
 
Ultimate journey towards realtime data platform with 2.5M events per sec
Ultimate journey towards realtime data platform with 2.5M events per secUltimate journey towards realtime data platform with 2.5M events per sec
Ultimate journey towards realtime data platform with 2.5M events per sec
b0ris_1
 
Analytics Beyond RAM Capacity using R
Analytics Beyond RAM Capacity using RAnalytics Beyond RAM Capacity using R
Analytics Beyond RAM Capacity using R
Alex Palamides
 
Microservices with Terraform, Docker and the Cloud. DevOps Wet 2018
Microservices with Terraform, Docker and the Cloud. DevOps Wet 2018Microservices with Terraform, Docker and the Cloud. DevOps Wet 2018
Microservices with Terraform, Docker and the Cloud. DevOps Wet 2018
Derek Ashmore
 
Sql vs nosql
Sql vs nosqlSql vs nosql
Sql vs nosql
Nick Verschueren
 
Threads
ThreadsThreads
Threads
Amir Payberah
 

Similar to Dynamo and NoSQL Databases (20)

NoSQL Application Development with JSON and MapR-DB
NoSQL Application Development with JSON and MapR-DBNoSQL Application Development with JSON and MapR-DB
NoSQL Application Development with JSON and MapR-DB
 
OrientDB the database for the web 1.1
OrientDB the database for the web 1.1OrientDB the database for the web 1.1
OrientDB the database for the web 1.1
 
Introduction to cloud computing and big data - part1
Introduction to cloud computing and big data - part1Introduction to cloud computing and big data - part1
Introduction to cloud computing and big data - part1
 
Navigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skiesNavigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skies
 
Putting Apache Drill into Production
Putting Apache Drill into ProductionPutting Apache Drill into Production
Putting Apache Drill into Production
 
Apache Drill - Why, What, How
Apache Drill - Why, What, HowApache Drill - Why, What, How
Apache Drill - Why, What, How
 
TriHUG - Beyond Batch
TriHUG - Beyond BatchTriHUG - Beyond Batch
TriHUG - Beyond Batch
 
Cloudera Operational DB (Apache HBase & Apache Phoenix)
Cloudera Operational DB (Apache HBase & Apache Phoenix)Cloudera Operational DB (Apache HBase & Apache Phoenix)
Cloudera Operational DB (Apache HBase & Apache Phoenix)
 
Hive and Shark
Hive and SharkHive and Shark
Hive and Shark
 
NoSQL Architecture Overview
NoSQL Architecture OverviewNoSQL Architecture Overview
NoSQL Architecture Overview
 
Spark
SparkSpark
Spark
 
High-Performance Big Data Analytics with RDMA over NVM and NVMe-SSD
High-Performance Big Data Analytics with RDMA over NVM and NVMe-SSDHigh-Performance Big Data Analytics with RDMA over NVM and NVMe-SSD
High-Performance Big Data Analytics with RDMA over NVM and NVMe-SSD
 
Bigtable
BigtableBigtable
Bigtable
 
OUG Scotland 2014 - NoSQL and MySQL - The best of both worlds
OUG Scotland 2014 - NoSQL and MySQL - The best of both worldsOUG Scotland 2014 - NoSQL and MySQL - The best of both worlds
OUG Scotland 2014 - NoSQL and MySQL - The best of both worlds
 
Storage cassandra
Storage   cassandraStorage   cassandra
Storage cassandra
 
Ultimate journey towards realtime data platform with 2.5M events per sec
Ultimate journey towards realtime data platform with 2.5M events per secUltimate journey towards realtime data platform with 2.5M events per sec
Ultimate journey towards realtime data platform with 2.5M events per sec
 
Analytics Beyond RAM Capacity using R
Analytics Beyond RAM Capacity using RAnalytics Beyond RAM Capacity using R
Analytics Beyond RAM Capacity using R
 
Microservices with Terraform, Docker and the Cloud. DevOps Wet 2018
Microservices with Terraform, Docker and the Cloud. DevOps Wet 2018Microservices with Terraform, Docker and the Cloud. DevOps Wet 2018
Microservices with Terraform, Docker and the Cloud. DevOps Wet 2018
 
Sql vs nosql
Sql vs nosqlSql vs nosql
Sql vs nosql
 
Threads
ThreadsThreads
Threads
 

More from Amir Payberah

P2P Content Distribution Network
P2P Content Distribution NetworkP2P Content Distribution Network
P2P Content Distribution Network
Amir Payberah
 
Cloud Computing
Cloud ComputingCloud Computing
Cloud Computing
Amir Payberah
 
Data Intensive Computing Frameworks
Data Intensive Computing FrameworksData Intensive Computing Frameworks
Data Intensive Computing Frameworks
Amir Payberah
 
The Stratosphere Big Data Analytics Platform
The Stratosphere Big Data Analytics PlatformThe Stratosphere Big Data Analytics Platform
The Stratosphere Big Data Analytics Platform
Amir Payberah
 
The Spark Big Data Analytics Platform
The Spark Big Data Analytics PlatformThe Spark Big Data Analytics Platform
The Spark Big Data Analytics Platform
Amir Payberah
 
Security
SecuritySecurity
Security
Amir Payberah
 
Protection
ProtectionProtection
Protection
Amir Payberah
 
Linux Module Programming
Linux Module ProgrammingLinux Module Programming
Linux Module Programming
Amir Payberah
 
IO Systems
IO SystemsIO Systems
IO Systems
Amir Payberah
 
File System Implementation - Part2
File System Implementation - Part2File System Implementation - Part2
File System Implementation - Part2
Amir Payberah
 
File System Implementation - Part1
File System Implementation - Part1File System Implementation - Part1
File System Implementation - Part1
Amir Payberah
 
File System Interface
File System InterfaceFile System Interface
File System Interface
Amir Payberah
 
Storage
StorageStorage
Storage
Amir Payberah
 
Virtual Memory - Part2
Virtual Memory - Part2Virtual Memory - Part2
Virtual Memory - Part2
Amir Payberah
 
Virtual Memory - Part1
Virtual Memory - Part1Virtual Memory - Part1
Virtual Memory - Part1
Amir Payberah
 
Main Memory - Part2
Main Memory - Part2Main Memory - Part2
Main Memory - Part2
Amir Payberah
 
Main Memory - Part1
Main Memory - Part1Main Memory - Part1
Main Memory - Part1
Amir Payberah
 
Deadlocks
DeadlocksDeadlocks
Deadlocks
Amir Payberah
 
CPU Scheduling - Part2
CPU Scheduling - Part2CPU Scheduling - Part2
CPU Scheduling - Part2
Amir Payberah
 
CPU Scheduling - Part1
CPU Scheduling - Part1CPU Scheduling - Part1
CPU Scheduling - Part1
Amir Payberah
 

More from Amir Payberah (20)

P2P Content Distribution Network
P2P Content Distribution NetworkP2P Content Distribution Network
P2P Content Distribution Network
 
Cloud Computing
Cloud ComputingCloud Computing
Cloud Computing
 
Data Intensive Computing Frameworks
Data Intensive Computing FrameworksData Intensive Computing Frameworks
Data Intensive Computing Frameworks
 
The Stratosphere Big Data Analytics Platform
The Stratosphere Big Data Analytics PlatformThe Stratosphere Big Data Analytics Platform
The Stratosphere Big Data Analytics Platform
 
The Spark Big Data Analytics Platform
The Spark Big Data Analytics PlatformThe Spark Big Data Analytics Platform
The Spark Big Data Analytics Platform
 
Security
SecuritySecurity
Security
 
Protection
ProtectionProtection
Protection
 
Linux Module Programming
Linux Module ProgrammingLinux Module Programming
Linux Module Programming
 
IO Systems
IO SystemsIO Systems
IO Systems
 
File System Implementation - Part2
File System Implementation - Part2File System Implementation - Part2
File System Implementation - Part2
 
File System Implementation - Part1
File System Implementation - Part1File System Implementation - Part1
File System Implementation - Part1
 
File System Interface
File System InterfaceFile System Interface
File System Interface
 
Storage
StorageStorage
Storage
 
Virtual Memory - Part2
Virtual Memory - Part2Virtual Memory - Part2
Virtual Memory - Part2
 
Virtual Memory - Part1
Virtual Memory - Part1Virtual Memory - Part1
Virtual Memory - Part1
 
Main Memory - Part2
Main Memory - Part2Main Memory - Part2
Main Memory - Part2
 
Main Memory - Part1
Main Memory - Part1Main Memory - Part1
Main Memory - Part1
 
Deadlocks
DeadlocksDeadlocks
Deadlocks
 
CPU Scheduling - Part2
CPU Scheduling - Part2CPU Scheduling - Part2
CPU Scheduling - Part2
 
CPU Scheduling - Part1
CPU Scheduling - Part1CPU Scheduling - Part1
CPU Scheduling - Part1
 

Recently uploaded

Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
Alex Pruden
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
KAMESHS29
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
DianaGray10
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
DianaGray10
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
Kumud Singh
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
名前 です男
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website
Pixlogix Infotech
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
Kari Kakkonen
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
Neo4j
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
Pierluigi Pugliese
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
DianaGray10
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Malak Abu Hammad
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
Zilliz
 
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIEnchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Vladimir Iglovikov, Ph.D.
 

Recently uploaded (20)

Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
 
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIEnchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
 

Dynamo and NoSQL Databases

  • 1. Dynamo: Amazon’s Highly Available Key-value Store Amir H. Payberah amir@sics.se Amirkabir University of Technology (Tehran Polytechnic) Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 1 / 64
  • 2. Database and Database Management System Database: an organized collection of data. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 2 / 64
  • 3. Database and Database Management System Database: an organized collection of data. Database Management System (DBMS): a software that interacts with users, other applications, and the database itself to capture and analyze data. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 2 / 64
  • 4. History of Databases 1960s • Navigational data model: hierarchical model (IMS) and network model (CODASYL). • Disk-aware Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 3 / 64
  • 5. History of Databases 1960s • Navigational data model: hierarchical model (IMS) and network model (CODASYL). • Disk-aware 1970s • Relational data model: Edgar F. Codd paper. • Logical data is disconnected from physical information storage. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 3 / 64
  • 6. History of Databases 1960s • Navigational data model: hierarchical model (IMS) and network model (CODASYL). • Disk-aware 1970s • Relational data model: Edgar F. Codd paper. • Logical data is disconnected from physical information storage. 1980s • Object databases: information is represented in the form of objects. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 3 / 64
  • 7. History of Databases 1960s • Navigational data model: hierarchical model (IMS) and network model (CODASYL). • Disk-aware 1970s • Relational data model: Edgar F. Codd paper. • Logical data is disconnected from physical information storage. 1980s • Object databases: information is represented in the form of objects. 2000s • NoSQL databases: BASE instead of ACID. • NewSQL databases: scalable performance of NoSQL + ACID. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 3 / 64
  • 8. Relational Databases Management Systems (RDMBSs) RDMBSs: the dominant technology for storing structured data in web and business applications. SQL is good • Rich language • Easy to use and integrate • Rich toolset • Many vendors They promise: ACID Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 4 / 64
  • 9. ACID Properties Atomicity • All included statements in a transaction are either executed or the whole transaction is aborted without affecting the database. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 5 / 64
  • 10. ACID Properties Atomicity • All included statements in a transaction are either executed or the whole transaction is aborted without affecting the database. Consistency • A database is in a consistent state before and after a transaction. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 5 / 64
  • 11. ACID Properties Atomicity • All included statements in a transaction are either executed or the whole transaction is aborted without affecting the database. Consistency • A database is in a consistent state before and after a transaction. Isolation • Transactions can not see uncommitted changes in the database. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 5 / 64
  • 12. ACID Properties Atomicity • All included statements in a transaction are either executed or the whole transaction is aborted without affecting the database. Consistency • A database is in a consistent state before and after a transaction. Isolation • Transactions can not see uncommitted changes in the database. Durability • Changes are written to a disk before a database commits a transaction so that committed data cannot be lost through a power failure. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 5 / 64
  • 13. RDBMS is Good ... Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 6 / 64
  • 14. RDBMS Challenges Web-based applications caused spikes. • Internet-scale data size • High read-write rates • Frequent schema changes Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 7 / 64
  • 15. Let’s Scale RDBMSs RDBMS were not designed to be distributed. Possible solutions: • Replication • Sharding Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 8 / 64
  • 16. Let’s Scale RDBMSs - Replication Master/Slave architecture Scales read operations Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 9 / 64
  • 17. Let’s Scale RDBMSs - Sharding Dividing the database across many machines. It scales read and write operations. Cannot execute transactions across shards (partitions). Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 10 / 64
  • 18. Scaling RDBMSs is Expensive and Inefficient [http://www.couchbase.com/sites/default/files/uploads/all/whitepapers/NoSQLWhitepaper.pdf] Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 11 / 64
  • 19. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 12 / 64
  • 20. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 13 / 64
  • 21. NoSQL Avoidance of unneeded complexity High throughput Horizontal scalability and running on commodity hardware Compromising reliability for better performance Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 14 / 64
  • 22. The NoSQL Movement It was first used in 1998 by Carlo Strozzi to name his relational database that did not expose the standard SQL interface. The term was picked up again in 2009 when a Last.fm developer, Jo- han Oskarsson, wanted to organize an event to discuss open source distributed databases. The name attempted to label the emergence of a growing number of non-relational, distributed data stores that often did not attempt to provide ACID. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 15 / 64
  • 23. NoSQL Cost and Performance [http://www.couchbase.com/sites/default/files/uploads/all/whitepapers/NoSQLWhitepaper.pdf] Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 16 / 64
  • 25. NoSQL Data Models Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 18 / 64
  • 27. Key-Value Data Model Collection of key/value pairs. Ordered Key-Value: processing over key ranges. Dynamo, Scalaris, Voldemort, Riak, ... Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 20 / 64
  • 28. Column-Oriented Data Model Similar to a key/value store, but the value can have multiple at- tributes (Columns). Column: a set of data values of a particular type. Store and process data by column instead of row. BigTable, Hbase, Cassandra, ... Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 21 / 64
  • 29. Document Data Model Similar to a column-oriented store, but values can have complex documents, instead of fixed format. Flexible schema. XML, YAML, JSON, and BSON. CouchDB, MongoDB, ... { FirstName: "Bob", Address: "5 Oak St.", Hobby: "sailing" } { FirstName: "Jonathan", Address: "15 Wanamassa Point Road", Children: [ {Name: "Michael", Age: 10}, {Name: "Jennifer", Age: 8}, ] } Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 22 / 64
  • 30. Graph Data Model Uses graph structures with nodes, edges, and properties to represent and store data. Neo4J, InfoGrid, ... [http://en.wikipedia.org/wiki/Graph database] Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 23 / 64
  • 31. Consistency Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 24 / 64
  • 32. Consistency Strong consistency • After an update completes, any subsequent access will return the updated value. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 25 / 64
  • 33. Consistency Strong consistency • After an update completes, any subsequent access will return the updated value. Eventual consistency • Does not guarantee that subsequent accesses will return the updated value. • Inconsistency window. • If no new updates are made to the object, eventually all accesses will return the last updated value. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 25 / 64
  • 34. Quorum Model N: the number of nodes to which a data item is replicated. R: the number of nodes a value has to be read from to be accepted. W: the number of nodes a new value has to be written to before the write operation is finished. To enforce strong consistency: R + W > N Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 26 / 64
  • 35. Quorum Model N: the number of nodes to which a data item is replicated. R: the number of nodes a value has to be read from to be accepted. W: the number of nodes a new value has to be written to before the write operation is finished. To enforce strong consistency: R + W > N Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 26 / 64
  • 36. Consistency vs. Availability The large-scale applications have to be reliable: availability + re- dundancy These properties are difficult to achieve with ACID properties. The BASE approach forfeits the ACID properties of consistency and isolation in favor of availability, graceful degradation, and perfor- mance. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 27 / 64
  • 37. BASE Properties Basic Availability • Possibilities of faults but not a fault of the whole system. Soft-state • Copies of a data item may be inconsistent Eventually consistent • Copies becomes consistent at some later time if there are no more updates to that data item Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 28 / 64
  • 38. CAP Theorem Consistency • Consistent state of data after the execution of an operation. Availability • Clients can always read and write data. Partition Tolerance • Continue the operation in the presence of network partitions. You can choose only two! Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 29 / 64
  • 39. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 30 / 64
  • 40. Dyanmo Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 31 / 64
  • 41. Dynamo Distributed key/value storage system Scalable Highly available CAP Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 32 / 64
  • 42. Design Consideration It sacrifices strong consistency for availability: always writable Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 33 / 64
  • 43. Design Consideration It sacrifices strong consistency for availability: always writable Incremental scalability Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 33 / 64
  • 44. Design Consideration It sacrifices strong consistency for availability: always writable Incremental scalability Symmetry: every node should have the same set of responsibilities as its peers Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 33 / 64
  • 45. Design Consideration It sacrifices strong consistency for availability: always writable Incremental scalability Symmetry: every node should have the same set of responsibilities as its peers Decentralization Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 33 / 64
  • 46. Design Consideration It sacrifices strong consistency for availability: always writable Incremental scalability Symmetry: every node should have the same set of responsibilities as its peers Decentralization Heterogeneity Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 33 / 64
  • 47. System Architecture Data partitioning Replication Data versioning Dynamo API Membership management Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 34 / 64
  • 48. Data Partitioning Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 35 / 64
  • 49. Partitioning If size of data exceeds the capacity of a single machine: partitioning Sharding data (horizontal partitioning). Consistent hashing is one form of automatic sharding. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 36 / 64
  • 50. Consistent Hashing Hash both data and nodes using the same hash function in a same id space. partition = hash(d) mod n, d: data, n: number of nodes Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 37 / 64
  • 51. Consistent Hashing Hash both data and nodes using the same hash function in a same id space. partition = hash(d) mod n, d: data, n: number of nodes hash("Fatemeh") = 12 hash("Ahmad") = 2 hash("Seif") = 9 hash("Jim") = 14 hash("Sverker") = 4 Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 37 / 64
  • 52. Load Imbalance (1/4) Consistent hashing may lead to imbalance. Node identifiers may not be balanced. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 38 / 64
  • 53. Load Imbalance (2/4) Consistent hashing may lead to imbalance. Data identifiers may not be balanced. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 39 / 64
  • 54. Load Imbalance (3/4) Consistent hashing may lead to imbalance. Hot spots. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 40 / 64
  • 55. Load Imbalance (4/4) Consistent hashing may lead to imbalance. Heterogeneous nodes. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 41 / 64
  • 56. Load Balancing via Virtual Nodes Each physical node picks multiple random identifiers. Each identifier represents a virtual node. Each node runs multiple virtual nodes. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 42 / 64
  • 57. Replication Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 43 / 64
  • 58. Replication To achieve high availability and durability, data should be replicates on multiple nodes. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 44 / 64
  • 59. Data Versioning Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 45 / 64
  • 60. Data Versioning (1/3) Updates are propagated asynchronously. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 46 / 64
  • 61. Data Versioning (1/3) Updates are propagated asynchronously. Each update/modification of an item results in a new and immutable version of the data. • Multiple versions of an object may exist. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 46 / 64
  • 62. Data Versioning (1/3) Updates are propagated asynchronously. Each update/modification of an item results in a new and immutable version of the data. • Multiple versions of an object may exist. Replicas eventually become consistent. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 46 / 64
  • 63. Data Versioning (2/3) Version branching can happen due to node/network failures. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 47 / 64
  • 64. Data Versioning (2/3) Version branching can happen due to node/network failures. Use vector clocks for capturing causality, in the form of (node, counter) Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 47 / 64
  • 65. Data Versioning (2/3) Version branching can happen due to node/network failures. Use vector clocks for capturing causality, in the form of (node, counter) • If causal: older version can be forgotten Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 47 / 64
  • 66. Data Versioning (2/3) Version branching can happen due to node/network failures. Use vector clocks for capturing causality, in the form of (node, counter) • If causal: older version can be forgotten • If concurrent: conflict exists, requiring reconciliation Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 47 / 64
  • 67. Data Versioning (3/3) Client C1 writes new object via Sx. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 48 / 64
  • 68. Data Versioning (3/3) Client C1 writes new object via Sx. C1 updates the object via Sx. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 48 / 64
  • 69. Data Versioning (3/3) Client C1 writes new object via Sx. C1 updates the object via Sx. C1 updates the object via Sy. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 48 / 64
  • 70. Data Versioning (3/3) Client C1 writes new object via Sx. C1 updates the object via Sx. C1 updates the object via Sy. C2 reads D2 and updates the object via Sz. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 48 / 64
  • 71. Data Versioning (3/3) Client C1 writes new object via Sx. C1 updates the object via Sx. C1 updates the object via Sy. C2 reads D2 and updates the object via Sz. C3 reads D3 and D4 via Sx. • The read context is a summary of the clocks of D3 and D4: [(Sx, 2), (Sy, 1), (Sz, 1)]. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 48 / 64
  • 72. Data Versioning (3/3) Client C1 writes new object via Sx. C1 updates the object via Sx. C1 updates the object via Sy. C2 reads D2 and updates the object via Sz. C3 reads D3 and D4 via Sx. • The read context is a summary of the clocks of D3 and D4: [(Sx, 2), (Sy, 1), (Sz, 1)]. Reconciliation Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 48 / 64
  • 73. Dynamo API Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 49 / 64
  • 74. Dynamo API get(key) • Return single object or list of objects with conflicting version and context. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 50 / 64
  • 75. Dynamo API get(key) • Return single object or list of objects with conflicting version and context. put(key, context, object) • Store object and context under key. • Context encodes system metadata, e.g., version number. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 50 / 64
  • 76. Execution of Operations Client can send the request: • To the node responsible for the data (coordinator): save on latency, code on client • To a generic load balancer: extra hope Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 51 / 64
  • 77. put Operation Coordinator generates new vector clock and writes the new version locally. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 52 / 64
  • 78. put Operation Coordinator generates new vector clock and writes the new version locally. Send to N nodes. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 52 / 64
  • 79. put Operation Coordinator generates new vector clock and writes the new version locally. Send to N nodes. Wait for response from W nodes. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 52 / 64
  • 80. put Operation Coordinator generates new vector clock and writes the new version locally. Send to N nodes. Wait for response from W nodes. Using W=1 • High availability for writes • Low durability Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 52 / 64
  • 81. get Operation Coordinator requests existing versions from N. • Wait for response from R nodes. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 53 / 64
  • 82. get Operation Coordinator requests existing versions from N. • Wait for response from R nodes. If multiple versions, return all versions that are causally unrelated. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 53 / 64
  • 83. get Operation Coordinator requests existing versions from N. • Wait for response from R nodes. If multiple versions, return all versions that are causally unrelated. Divergent versions are then reconciled. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 53 / 64
  • 84. get Operation Coordinator requests existing versions from N. • Wait for response from R nodes. If multiple versions, return all versions that are causally unrelated. Divergent versions are then reconciled. Reconciled version written back. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 53 / 64
  • 85. get Operation Coordinator requests existing versions from N. • Wait for response from R nodes. If multiple versions, return all versions that are causally unrelated. Divergent versions are then reconciled. Reconciled version written back. Using R=1 • High performance read engine. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 53 / 64
  • 86. Membership Management Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 54 / 64
  • 87. Membership Management Administrator explicitly adds and removes nodes. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 55 / 64
  • 88. Membership Management Administrator explicitly adds and removes nodes. Gossiping to propagate membership changes. • Eventually consistent view. • O(1) hop overlay. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 55 / 64
  • 89. Adding Nodes A new node X added to system. • X is assigned key ranges w.r.t. its virtual servers. • For each key range, it transfers the data items. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 56 / 64
  • 90. Removing Nodes Reallocation of keys is a reverse process of adding nodes. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 57 / 64
  • 91. Failure Detection Passive failure detection. • Use pings only for detection from failed to alive. In the absence of client requests, node A doesn’t need to know if node B is alive. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 58 / 64
  • 92. Handling Permanent Failure Anti-entropy for replica synchronization. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 59 / 64
  • 93. Handling Permanent Failure Anti-entropy for replica synchronization. Use Merkle trees for fast inconsistency detection and minimum transfer of data. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 59 / 64
  • 94. Handling Permanent Failure Anti-entropy for replica synchronization. Use Merkle trees for fast inconsistency detection and minimum transfer of data. • Nodes maintain Merkle tree of each key range. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 59 / 64
  • 95. Handling Permanent Failure Anti-entropy for replica synchronization. Use Merkle trees for fast inconsistency detection and minimum transfer of data. • Nodes maintain Merkle tree of each key range. • Exchange root of Merkle tree to check if the key ranges are updated. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 59 / 64
  • 96. Failure Transient Detection Due to partitions, quorums might not exist. • Sloppy quorum. • Create transient replicas: N healthy nodes from the preference list. • Reconcile after partition heals. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 60 / 64
  • 97. Failure Transient Detection Due to partitions, quorums might not exist. • Sloppy quorum. • Create transient replicas: N healthy nodes from the preference list. • Reconcile after partition heals. Say A is unreachable. put will use D. Later, D detects A is alive. • Sends the replica to A • Removes the replica. Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 60 / 64
  • 98. Summary Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 61 / 64
  • 99. Summary NoSQL data models: key-value, column-oriented, document- oriented, graph-based Sharding and consistent hashing ACID vs. BASE CAP (Consistency vs. Availability) Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 62 / 64
  • 100. Summary Dynamo: key/value storage: put and get Data partitioning: consistent hashing Load balancing: virtual server Replication: several nodes, preference list Data versioning: vector clock, resolve conflict at read time by the application Membership management: join/leave by admin, gossip-based to up- date the nodes’ views, ping to detect failure Handling transient failure: sloppy quorum Handling permanent failure: Merkle tree Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 63 / 64
  • 101. Questions? Amir H. Payberah (Tehran Polytechnic) Dynamo 1393/7/19 64 / 64