The Road 
to Akka Cluster 
and Beyond… 
Jonas Bonér 
CTO Typesafe 
@jboner
What is a Distributed System?
and Why would You Need one?
Distributed Computing is the New Normal
You already have a distributed system, WHETHER you want it or not:
Mobile, NoSQL Databases, SQL Replication, Cloud & REST Services
What is the essence of distributed computing?
It’s to try to overcome:
1. Information travels at the speed of light
2. Independent things fail independently
Why do we need it?
Elasticity: when you outgrow the resources of a single node
Availability: providing resilience if one node fails
Rich stateful clients
So, what’s the problem?
It is still Very Hard
The network is 
Inherently 
Unreliable
You can’t tell the DIFFERENCE 
Between a 
Slow NODE 
and a 
Dead NODE
Fallacies
Peter Deutsch’s 8 Fallacies of Distributed Computing:
1. The network is reliable
2. Latency is zero
3. Bandwidth is infinite
4. The network is secure
5. Topology doesn’t change
6. There is one administrator
7. Transport cost is zero
8. The network is homogeneous
So, oh yes…
It is still Very Hard
Graveyard of distributed systems
1. Guaranteed Delivery
2. Synchronous RPC
3. Distributed Objects
4. Distributed Shared Mutable State
5. Serializable Distributed Transactions
General strategies
Divide & Conquer: Partition for scale, Replicate for resilience
WHICH Requires: Share-Nothing Designs, Asynchronous Message-Passing, Location Transparency, Isolation & Containment
theoretical 
Models
A model for distributed computation should allow explicit reasoning about:
1. Concurrency
2. Distribution
3. Mobility
Carlos Varela 2013
Lambda Calculus
Alonzo Church 1930
state: Immutable state, managed through function application, referentially transparent
order: β-reduction can be performed in any order (normal order, applicative order, call-by-name, call-by-value, call-by-need), even in parallel
Supports Concurrency. No model for Distribution. No model for Mobility.
Von Neumann Machine
John von Neumann 1945
Memory, Control Unit, Arithmetic Logic Unit, Accumulator, Input/Output
state: Mutable state, in-place updates
order: Total order; a list of instructions operating on an array of memory
No model for Concurrency. No model for Distribution. No model for Mobility.
Transactions
Jim Gray 1981
state: Isolation of updates, Atomicity
order: Serializability; disorder across transactions, illusion of order within transactions
Works well for Concurrency. Does not work well for Distribution.
Actors
Carl Hewitt 1973
state: Share nothing, atomicity within the actor
order: Async message passing, non-determinism in message delivery
Great model for Concurrency. Great model for Distribution. Great model for Mobility.
Other interesting models that are suitable for distributed systems:
1. Pi Calculus
2. Ambient Calculus
3. Join Calculus
State of the Art
Impossibility 
Theorems
Impossibility of Distributed Consensus with One Faulty Process
FLP: Fischer, Lynch, Paterson 1985
Consensus is impossible
“The FLP result shows that in an asynchronous setting, where only one processor might crash, there is no distributed algorithm that solves the consensus problem” - The Paper Trail
“These results do not show that such problems cannot be ‘solved’ in practice; rather, they point up the need for more refined models of distributed computing” - FLP paper
CAP Theorem
Linearizability is impossible
Conjecture by Eric Brewer 2000; proof by Lynch & Gilbert 2002:
“Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services”
Linearizability
“Under linearizable consistency, all operations appear to have executed atomically in an order that is consistent with the global real-time ordering of operations.” Herlihy & Wing 1991
Less formally: a read will return the last completed write (made on any replica).
Dissecting CAP
1. Very influential—but very NARROW scope
2. “[CAP] has lead to confusion and misunderstandings regarding replica consistency, transactional isolation and high availability” - Bailis et. al in HAT paper
3. Linearizability is very often NOT required
4. Ignores LATENCY—but in practice latency & partitions are deeply related
5. Partitions are RARE—so why sacrifice C or A ALL the time?
6. NOT black and white—can be fine-grained and dynamic
7. Read ‘CAP Twelve Years Later’ - Eric Brewer
Consensus
“The problem of reaching agreement among remote processes is one of the most fundamental problems in distributed computing and is at the core of many algorithms for distributed data processing, distributed file management, and fault-tolerant distributed applications.” Fischer, Lynch & Paterson 1985
Consistency models:
Strong
Weak
Eventual
Time & 
Order
Last write wins: global clock timestamp
Lamport Clocks: logical clock, causal consistency
Leslie Lamport 1978
1. When a process does work, increment the counter
2. When a process sends a message, include the counter
3. When a message is received, merge the counter (set the counter to max(local, received) + 1)
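A minimal Scala sketch of these three rules (illustrative only; the names are not Akka’s API):

  final case class LamportClock(counter: Long = 0L) {
    // Rule 1: local work bumps the counter
    def tick: LamportClock = copy(counter = counter + 1)
    // Rule 2: include the counter when sending a message
    def send: (LamportClock, Long) = { val next = tick; (next, next.counter) }
    // Rule 3: on receive, merge as max(local, received) + 1
    def receive(remote: Long): LamportClock = LamportClock(math.max(counter, remote) + 1)
  }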
Vector Clocks: extends Lamport Clocks
Colin Fidge 1988
1. Each node owns and increments its own Lamport Clock [node -> lamport clock]
2. Always keep the full history of all increments
3. Merges by calculating the max—monotonic merge
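A minimal sketch of the same idea in Scala, assuming a String node id (illustrative, not Akka’s VectorClock class):

  final case class VClock(entries: Map[String, Long] = Map.empty) {
    // Each node increments only its own entry (rule 1)
    def tick(node: String): VClock =
      VClock(entries.updated(node, entries.getOrElse(node, 0L) + 1))
    // Keep every node's entry and merge with a pointwise max (rules 2 and 3)
    def merge(that: VClock): VClock =
      VClock((entries.keySet ++ that.entries.keySet).iterator.map { n =>
        n -> math.max(entries.getOrElse(n, 0L), that.entries.getOrElse(n, 0L))
      }.toMap)
  }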
Quorum
Strict majority vote or sloppy partial vote
• Most use R + W > N ⇒ R & W overlap
• If N / 2 + 1 is still alive ⇒ all good
• Most use N ⩵ 3
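The arithmetic behind these bullets, as a tiny Scala helper (hypothetical names, just to make the overlap condition concrete):

  // Read quorum R and write quorum W are guaranteed to overlap iff R + W > N
  def quorumOverlap(n: Int, r: Int, w: Int): Boolean = r + w > n

  // Typical setting: N = 3, R = 2, W = 2. Reads and writes overlap, and the
  // system survives one dead replica since N / 2 + 1 = 2 nodes remain alive.
  assert(quorumOverlap(n = 3, r = 2, w = 2))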
failure 
Detection
Failure Detection: formal model
Strong completeness: every crashed process is eventually suspected by every correct process (everyone knows)
Weak completeness: every crashed process is eventually suspected by some correct process (someone knows)
Strong accuracy: no correct process is suspected ever (no false positives)
Weak accuracy: some correct process is never suspected (some false positives)
Accrual Failure Detector
Hayashibara et. al. 2004
Keeps history of heartbeat statistics
Decouples monitoring from interpretation
Calculates a likelihood (phi value) that the process is down, not a YES or NO
Takes network hiccups into account
phi = -log10(1 - F(timeSinceLastHeartbeat))
where F is the cumulative distribution function of a normal distribution with mean and standard deviation estimated from historical heartbeat inter-arrival times
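The formula transcribed into Scala (a sketch, not Akka’s PhiAccrualFailureDetector; the error function is approximated numerically):

  object Phi {
    // Abramowitz & Stegun approximation of the Gauss error function
    private def erf(x: Double): Double = {
      val t = 1.0 / (1.0 + 0.3275911 * math.abs(x))
      val poly = ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t
        - 0.284496736) * t + 0.254829592) * t
      math.signum(x) * (1.0 - poly * math.exp(-x * x))
    }
    // phi = -log10(1 - F(timeSinceLastHeartbeat)), with F the CDF of a normal
    // distribution fitted to the observed heartbeat inter-arrival times
    def phi(timeSinceLastHeartbeat: Double, intervals: Seq[Double]): Double = {
      val mean     = intervals.sum / intervals.size
      val variance = intervals.map(x => (x - mean) * (x - mean)).sum / intervals.size
      val z        = (timeSinceLastHeartbeat - mean) / math.sqrt(variance)
      val cdf      = 0.5 * (1.0 + erf(z / math.sqrt(2.0)))
      -math.log10(1.0 - cdf)
    }
  }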
SWIM Failure Detector
Das et. al. 2002
Separates heartbeats from cluster dissemination
Quarantine: suspected ⇒ time window ⇒ faulty
Delegated heartbeat to bridge network splits
Byzantine Failure Detector
Liskov et. al. 1999
Supports misbehaving processes
Omission failures: crash failures, failing to receive a request, or failing to send a response
Commission failures: processing a request incorrectly, corrupting local state, and/or sending an incorrect or inconsistent response to a request
Very expensive, not practical
replication
Types of replication
Active (Push) vs Passive (Pull)
Asynchronous vs Synchronous
Master/Slave Replication
Tree Replication
Master/Master Replication
Buddy Replication
Analysis of replication consensus strategies: Ryan Barrett 2009
Strong 
Consistency
Distributed 
transactions 
Strikes Back
Highly Available Transactions (HAT, NOT CAP)
Peter Bailis et. al. 2013
Executive Summary:
• Most SQL DBs do not provide Serializability, but weaker guarantees—for performance reasons
• Some weaker transaction guarantees are possible to implement in a HA manner
• What transaction semantics can be provided with HA?
HAT
Unavailable: Serializable, Snapshot Isolation, Repeatable Read, Cursor Stability, etc.
Highly Available: Read Committed, Read Uncommitted, Read Your Writes, Monotonic Atomic View, Monotonic Read/Write, etc.
Other scalable or Highly Available Transactional Research:
Bolt-On Consistency, Bailis et. al. 2013
Calvin, Thompson et. al. 2012
Spanner (Google), Corbett et. al. 2012
consensus 
Protocols
Specification
Events: 1. Request(v) 2. Decide(v)
Properties:
1. Termination: every process eventually decides on a value v
2. Validity: if a process decides v, then v was proposed by some process
3. Integrity: no process decides twice
4. Agreement: no two correct processes decide differently
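The specification written down as a Scala interface, with the properties kept as comments (an illustrative sketch of the contract, not a runnable protocol):

  trait Consensus[V] {
    def request(v: V): Unit                  // Request(v): propose a value
    def onDecide(callback: V => Unit): Unit  // Decide(v): delivered exactly once
    // Termination: every correct process eventually decides on a value v
    // Validity:    if a process decides v, then v was proposed by some process
    // Integrity:   no process decides twice
    // Agreement:   no two correct processes decide differently
  }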
Consensus Algorithms
VR, Oki & Liskov 1988
Paxos, Lamport 1989
ZAB, Reed & Junquiera 2008
Raft, Ongaro & Ousterhout 2013
Event 
Log
Immutability
“Immutability Changes Everything” - Pat Helland
Immutable Data + Share Nothing Architecture is the path towards TRUE Scalability
Think In Facts
“The database is a cache of a subset of the log” - Pat Helland
Never delete data; knowledge only grows
Append-Only Event Log
Use Event Sourcing and/or CQRS
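A toy Scala sketch of such an append-only log (illustrative only; a real system would use something like Akka Persistence rather than this in-memory toy, and the Deposited event is a made-up example):

  final case class Deposited(amount: BigDecimal)        // an immutable fact
  final class EventLog {
    private var log = Vector.empty[Deposited]           // append-only: never delete
    def append(event: Deposited): Unit = log :+= event  // knowledge only grows
    def currentBalance: BigDecimal =                    // a derived view: "a cache
      log.map(_.amount).sum                             //  of a subset of the log"
  }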
Aggregate Roots
Can wrap multiple Entities
Aggregate Root is the Transactional Boundary
Strong Consistency Within Aggregate, Eventual Consistency Between Aggregates
No limit to scalability
eventual 
Consistency
Dynamo
Very influential. Vogels et. al. 2007
Popularized: eventual consistency, epidemic gossip, consistent hashing, hinted handoff, read repair, anti-entropy w/ Merkle trees
Consistent Hashing
Karger et. al. 1997
Supports elasticity—easier to scale up and down. Avoids hotspots. Enables partitioning and replication.
Only K/N keys need to be remapped when adding or removing a node (K = #keys, N = #nodes)
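A toy ring in Scala to make the remapping property concrete (an assumed structure; production rings also place many virtual nodes per physical node to smooth the distribution):

  import scala.collection.immutable.SortedMap

  final case class HashRing(ring: SortedMap[Int, String] = SortedMap.empty) {
    private def hash(s: String): Int = s.hashCode & Int.MaxValue
    def add(node: String): HashRing    = HashRing(ring + (hash(node) -> node))
    def remove(node: String): HashRing = HashRing(ring - hash(node))
    // Walk clockwise from the key's hash; wrap around at the end of the ring.
    // Adding or removing a node only moves the keys between that node and its
    // predecessor, i.e. about K/N of them.
    def nodeFor(key: String): Option[String] =
      (ring.rangeFrom(hash(key)).headOption orElse ring.headOption).map(_._2)
  }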
How eventual is Eventual Consistency? How consistent is Eventual Consistency?
PBS: Probabilistically Bounded Staleness
Peter Bailis et. al 2012
epidemic 
Gossip
Node Ring & Epidemic Gossip
CHORD, Stoica et al 2001
[diagram: a ring of member nodes exchanging gossip]
Benefits of Epidemic Gossip
Decentralized P2P, no SPOF or SPOB
Very scalable, fully elastic
Requires minimal administration
Often used with VECTOR CLOCKS
Some Standard Optimizations to Epidemic Gossip
1. Separation of failure detection heartbeat and dissemination of data - Das et. al. 2002 (SWIM)
2. Push/Pull gossip - Khambatti et. al 2003
   1. Hash and compare data
   2. Use single hash or Merkle Trees
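A sketch of the hash-and-compare idea in Scala (hypothetical message types, not Akka’s wire protocol): exchange a cheap digest first, and ship the actual data only when the digests differ.

  final case class Digest(hash: Int)       // cheap summary: single hash (or Merkle root)
  final case class Delta[A](data: Set[A])  // full payload, sent only when needed

  def digest[A](data: Set[A]): Digest = Digest(data.hashCode)

  def replyTo[A](local: Set[A], remote: Digest): Option[Delta[A]] =
    if (digest(local) == remote) None      // digests match: replicas in sync, no payload
    else Some(Delta(local))                // digests differ: ship the data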
disorderly 
Programming
ACID 2.0
Associative: batch-insensitive (grouping doesn’t matter): a+(b+c)=(a+b)+c
Commutative: order-insensitive (order doesn’t matter): a+b=b+a
Idempotent: retransmission-insensitive (duplication doesn’t matter): a+a=a
Eventually Consistent
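Set union is a handy concrete instance: it satisfies all three laws, so replica states merged with it converge regardless of batching, ordering, or redelivery. A quick check in Scala:

  val (a, b, c) = (Set(1), Set(2), Set(3))
  assert((a union (b union c)) == ((a union b) union c)) // Associative
  assert((a union b) == (b union a))                     // Commutative
  assert((a union a) == a)                               // Idempotent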
Convergent & Commutative Replicated Data Types (CRDT)
Shapiro et. al. 2011
Join semilattice with a monotonic merge function
Data types: counters, registers, sets, maps, graphs
2 TYPES of CRDTs
CvRDT: Convergent, state-based. Self-contained, holds all history.
CmRDT: Commutative, ops-based. Needs a reliable broadcast channel.
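A classic CvRDT example sketched in Scala: a grow-only counter (illustrative, not a production implementation). Each node increments only its own slot, and merge is a pointwise max, which is associative, commutative, and idempotent: a join semilattice.

  final case class GCounter(counts: Map[String, Long] = Map.empty) {
    def increment(node: String): GCounter =  // each node bumps only its own slot
      GCounter(counts.updated(node, counts.getOrElse(node, 0L) + 1))
    def merge(that: GCounter): GCounter =    // pointwise max: the semilattice join
      GCounter((counts.keySet ++ that.counts.keySet).iterator.map { n =>
        n -> math.max(counts.getOrElse(n, 0L), that.counts.getOrElse(n, 0L))
      }.toMap)
    def value: Long = counts.values.sum      // the counter's current reading
  }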
CALM Theorem: Consistency As Logical Monotonicity
Hellerstein et. al. 2011
Bloom Language: distributed logic (Datalog/Dedalus); monotonic functions (just add facts to the system); model state as lattices, similar to CRDTs (without the scope problem); compiler help to detect & encapsulate non-monotonicity
The Akka Way
The Akka stack, bottom up:
Akka Actors
Akka IO
Akka REMOTE
Akka CLUSTER
Akka CLUSTER EXTENSIONS
What is Akka CLUSTER all about?
• Cluster Membership
• Leader & Singleton
• Cluster Sharding
• Clustered Routers (adaptive, consistent hashing, …)
• Clustered Supervision and Deathwatch
• Clustered Pub/Sub
• and more
Cluster membership in Akka
• Dynamo-style master-less decentralized P2P
• Epidemic Gossip—Node Ring
• Vector Clocks for causal consistency
• Fully elastic with no SPOF or SPOB
• Very scalable—2400 nodes (on GCE)
• High throughput—1000 nodes in 4 min (on GCE)
Gossip State
Is a CRDT
case class Gossip(
  members: SortedSet[Member],   // ordered node ring
  seen: Set[Member],            // seen set, for convergence
  unreachable: Set[Member],     // unreachable set
  version: VectorClock)         // version
GOSSIPING:
1. Picks random node with older/newer version
2. Gossips in a request/reply fashion
3. Updates internal state and adds himself to ‘seen’ set
Cluster Convergence
Reached when:
1. All nodes are represented in the seen set
2. No members are unreachable, or
3. A…
BIASED GOSSIP
80% bias to nodes not in seen table
Up to 400 nodes, then reduced
PUSH/PULL GOSSIP variation
case class Status(version: VectorClock)
LEADER ROLE
Any node can be the leader
1. No election, but deterministic
2. Can change after cluster convergence
3. …
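A sketch of the idea in Scala, assuming the leader is simply the first member in the sorted node ring (a simplification; Akka’s actual rule also filters on member status):

  import scala.collection.immutable.SortedSet

  final case class Member(address: String)
  object Member {
    implicit val byAddress: Ordering[Member] = Ordering.by(_.address)
  }
  // With a converged view, every node evaluates this to the same member,
  // so no election protocol is needed.
  def leader(ring: SortedSet[Member]): Option[Member] = ring.headOption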
Node Lifecycle in Akka
Failure Detection
Hashes the node ring, picks 5 nodes, request/reply heartbeat
To increase likelihood of bridging racks and data centers
Used by: Cluster Membership, Remote Death Watch, Remote Supervision
Failure Detection
Is an Accrual Failure Detector
Does not help much in practice; need to add delay to deal with Garbage Collection pauses
[two heartbeat timing diagrams: “Instead of this … it often looks like t…”]
Network Partitions
• Failure Detector can mark an unavailable member Unreachable
• If one node is Unreachable then no cluster Convergence
• Split Brain
…
Potential FUTURE Optimizations
• Vector Clock HISTORY pruning
• Delegated heartbeat
• “Real” push/pull gossip
• More o…
Akka Modules For Distribution
Akka Cluster, Akka Remote, Akka HTTP, Akka IO
Clustered Singleton, Clustered Routers, Clus…
…and 
Beyond
Akka & The Road Ahead
Akka HTTP (Akka 2.4)
Akka Streams (Akka 2.4)
Akka CRDT (?)
Akka Raft (?)
Eager 
for more?
Try AKKA out 
akka.io
Join us at React Conf
San Francisco, Nov 18-21
reactconf.com
Early Registration ends tomorrow
References
• General Distributed Systems
  • Summary of network reliability post-mortems—more terrifying than the most ho…
• Actor Model
  • Great discussion between Erik Meijer & Carl Hewitt on the essence of the Actor Model: htt…
• FLP
  • Impossibility of Distributed Consensus with One Faulty Process: http://cs-www.cs.yale.edu/homes/arv…
• Time & Order
  • Post on the problems with Last Write Wins in Riak: http://aphyr.com/posts/285-call-me-mayb…
• Transactions
  • Jim Gray’s classic book: http://www.amazon.com/Transaction-Processing-Concepts-Techniques-…
• Consensus
  • Paxos Made Simple: http://research.microsoft.com/en-us/um/people/lamport/pubs/paxos-simple.pdf
• Eventual Consistency
  • Dynamo: Amazon’s Highly Available Key-value Store: http://www.read.seas.harvard.edu…
• Epidemic Gossip
  • Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications: http://pdos.c…
• Conflict-Free Replicated Data Types (CRDTs)
  • A comprehensive study of Convergent and Commutative Replicated Data Types: …
• Akka Cluster
  • My Akka Cluster Implementation Notes: https://gist.github.com/jboner/7692270
  • Akka Clust…
any Questions?
The Road 
to Akka Cluster 
and Beyond… 
Jonas Bonér 
CTO Typesafe 
@jboner
Today, the skill of writing distributed applications is both more important and more challenging than ever. With the advent of mobile devices, NoSQL databases, cloud services etc. you most likely already have a distributed system on your hands—whether you like it or not. Distributed computing is the new norm.

In this talk we will take you on a journey across the distributed computing landscape. We will start by walking through some of the early work in computer architecture—setting the stage for what we are doing today. Then we continue through distributed computing, discussing things like important Impossibility Theorems (FLP, CAP), Consensus Protocols (Raft, HAT, Epidemic Gossip etc.), Failure Detection (Accrual, Byzantine etc.), up to today’s very exciting research in the field, like ACID 2.0 and Disorderly Programming (CRDTs, CALM etc.).

Along the way we will discuss the decisions and trade-offs that were made when creating Akka Cluster, its theoretical foundation, why it is designed the way it is and what the future holds.

Published in: Technology
4 Comments
217 Likes
Statistics
Notes
No Downloads
Views
Total Views
38,537
On Slideshare
0
From Embeds
0
Number of Embeds
61
Actions
Shares
0
Downloads
1,070
Comments
4
Likes
217
Embeds 0
No embeds

No notes for slide

The Road to Akka Cluster and Beyond

  1. 1. The Road to Akka Cluster and Beyond… Jonas Bonér CTO Typesafe @jboner
  2. 2. What is a Distributed System?
  3. 3. What is a Distributed System? and Why would You Need one?
  4. 4. Distributed Computing is the New normal
  5. 5. Distributed Computing is the New normal you already have a distributed system, WHETHER you want it or not
  6. 6. Distributed Computing is the New normal you already have a distributed system, WHETHER you want it or not Mobile NOSQL Databases SQL Replication Cloud & REST Services
  7. 7. What is the essence of distributed computing?
  8. 8. What is the It’s to try to essence of distributed computing? overcome 1. Information travels at the speed of light 2. Independent things fail independently
  9. 9. Why do we need it?
  10. 10. Why do we need it? Elasticity When you outgrow the resources of a single node
  11. 11. Why do we need it? Elasticity When you outgrow the resources of a single node Availability Providing resilience if one node fails
  12. 12. Why do we need it? Elasticity When you outgrow the resources of a single node Availability Providing resilience if one node fails Rich stateful clients
  13. 13. So, what’s the problem?
  14. 14. So, what’s the problem? It is still Very Hard
  15. 15. The network is Inherently Unreliable
  16. 16. You can’t tell the DIFFERENCE Between a Slow NODE and a Dead NODE
  17. 17. Fallacies Peter Deutsch’s 8 Fallacies of Distributed Computing
  18. 18. Fallacies 1. The network is reliable 2. Latency is zero 3. Bandwidth is infinite 4. The network is secure 5. Topology doesn't change 6. There is one administrator 7. Transport cost is zero 8. The network is homogeneous Peter Deutsch’s 8 Fallacies of Distributed Computing
  19. 19. So, oh yes…
  20. 20. So, oh yes… It is still Very Hard
  21. 21. Graveyard of distributed systems 1. Guaranteed Delivery 2. Synchronous RPC 3. Distributed Objects 4. Distributed Shared Mutable State 5. Serializable Distributed Transactions
  22. 22. General strategies Divide & Conquer Partition for scale Replicate for resilience
  23. 23. General strategies WHICH Requires SHARE NOTHING Designs Asynchronous Message-Passing
  24. 24. General strategies WHICH Requires SHARE NOTHING Designs Asynchronous Message-Passing Location Transparency ISolation & Containment
  25. 25. theoretical Models
  26. 26. A model for distributed Computation Should Allow explicit reasoning abouT 1. Concurrency 2. Distribution 3. Mobility Carlos Varela 2013
  27. 27. Lambda Calculus Alonzo Church 1930
  28. 28. Lambda Calculus Alonzo Church 1930 state Immutable state Managed through functional application Referential transparent
  29. 29. Lambda Calculus Alonzo Church 1930 order β-reduction—can be performed in any order Normal order Applicative order Call-by-name order Call-by-value order Call-by-need order state Immutable state Managed through functional application Referential transparent
  30. 30. Lambda Calculus Even in parallel Alonzo Church 1930 order β-reduction—can be performed in any order Normal order Applicative order Call-by-name order Call-by-value order Call-by-need order state Immutable state Managed through functional application Referential transparent
  31. 31. Lambda Calculus Even in parallel Alonzo Church 1930 order β-reduction—can be performed in any order Normal order Applicative order Call-by-name order Call-by-value order Call-by-need order state Immutable state Managed through functional application Referential transparent Supports Concurrency
  32. 32. Lambda Calculus Even in parallel Alonzo Church 1930 order β-reduction—can be performed in any order Normal order Applicative order Call-by-name order Call-by-value order Call-by-need order state Immutable state Managed through functional application Referential transparent Supports ConcurrencyNo model for Distribution
  33. 33. Lambda Calculus Even in parallel Alonzo Church 1930 No order model for Mobility β-reduction—can be performed in any order Normal order Applicative order Call-by-name order Call-by-value order Call-by-need order state Immutable state Managed through functional application Referential transparent Supports ConcurrencyNo model for Distribution
  34. 34. Von neumann machine John von Neumann 1945 Memory Control Unit Arithmetic Logic Unit Accumulator Input Output
  35. 35. Von neumann machine John von Neumann 1945
  36. 36. Von neumann machine John von Neumann 1945 state Mutable state In-place updates
  37. 37. Von neumann machine John von Neumann 1945 order Total order List of instructions Array of memory state Mutable state In-place updates
  38. 38. Von neumann machine John von Neumann 1945 order for No Concurrency model Total order List of instructions Array of memory state Mutable state In-place updates
  39. 39. Von neumann machine John von Neumann 1945 order for state No Concurrencymodel No Total order Mutable model state List of instructions In-place updates for Array of memory Distribution
  40. 40. Von neumann machine John von Neumann 1945 No model for Mobility order for state No Concurrencymodel No Total order Mutable model state List of instructions In-place updates for Array of memory Distribution
  41. 41. transactions Jim Gray 1981
  42. 42. transactions Jim Gray 1981 state Isolation of updates Atomicity
  43. 43. transactions Jim Gray 1981 order Serializability Disorder across transactions Illusion of order within transactions state Isolation of updates Atomicity
  44. 44. transactions Jim Gray 1981 Concurrency order Serializability Disorder Works across Work transactions Well Illusion of order within transactions state Isolation of updates Atomicity
  45. 45. transactions Jim Gray 1981 Concurrency Works Work Well Distribution Does Not Work Well order Serializability Disorder across transactions Illusion of order within transactions state Isolation of updates Atomicity
  46. 46. actors Carl HEWITT 1973
  47. 47. actors Carl HEWITT 1973 state Share nothing Atomicity within the actor
  48. 48. actors Carl HEWITT 1973 order Async message passing Non-determinism in message delivery state Share nothing Atomicity within the actor
  49. 49. actors Carl HEWITT 1973 order model for Great Concurrency Async message passing Non-determinism in message delivery state Share nothing Atomicity within the actor
  50. 50. actors Carl HEWITT 1973 order for Great model Great Concurrencymodel for Async message passing Non-determinism in message delivery state Share nothing Atomicity within the actor Distribution
  51. 51. actors Carl HEWITT 1973 Great order for Great model Great model Concurrencymodel for Mobility for Async message passing Non-determinism in message delivery state Share nothing Atomicity within the actor Distribution
  52. 52. other interesting models That are suitable for distributed systems 1. Pi Calculus 2. Ambient Calculus 3. Join Calculus
  53. 53. state of the The Art
  54. 54. Impossibility Theorems
  55. 55. Impossibility of Distributed Consensus with One Faulty Process
  56. 56. Impossibility of Distributed Consensus with One Faulty Process FLP Fischer Lynch Paterson 1985
  57. 57. Impossibility of Distributed Consensus with One Faulty Process FLP Fischer Lynch Paterson Consensus 1985 is impossible
  58. 58. Impossibility of Distributed Consensus with One Faulty Process FLP “The FLP result shows that in an asynchronous setting, where only one processor might crash, there is no distributed algorithm that solves the consensus problem” - The Paper Trail Fischer Lynch Paterson 1985 Consensus is impossible
  59. 59. Impossibility of Distributed Consensus with One Faulty Process FLP Fischer Lynch Paterson 1985
  60. 60. Impossibility of Distributed Consensus with One Faulty Process FLP “These results do not show that such problems cannot be “solved” in practice; rather, they point up the need for more refined models of distributed computing” - FLP paper Fischer Lynch Paterson 1985
  61. 61. CAP Theorem
  62. 62. Linearizability is impossible CAP Theorem
  63. 63. Conjecture by Eric Brewer 2000 Proof by Lynch & Gilbert 2002 Linearizability is impossible CAP Theorem
  64. 64. Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services Conjecture by Eric Brewer 2000 Proof by Lynch & Gilbert 2002 Linearizability is impossible CAP Theorem
  65. 65. linearizability
  66. 66. linearizability “Under linearizable consistency, all operations appear to have executed atomically in an order that is consistent with the global real-time ordering of operations.” Herlihy & Wing 1991
  67. 67. linearizability “Under linearizable consistency, all operations appear to have executed atomically in an order that is consistent with the global real-time ordering of operations.” Herlihy & Wing 1991 Less formally: A read will return the last completed write (made on any replica)
  68. 68. dissecting CAP
  69. 69. dissecting CAP 1. Very influential—but very NARROW scope
  70. 70. dissecting CAP 1. Very influential—but very NARROW scope 2. “[CAP] has lead to confusion and misunderstandings regarding replica consistency, transactional isolation and high availability” - Bailis et.al in HAT paper
  71. 71. dissecting CAP 1. Very influential—but very NARROW scope 2. “[CAP] has lead to confusion and misunderstandings regarding replica consistency, transactional isolation and high availability” - Bailis et.al in HAT paper 3. Linearizability is very often NOT required
  72. 72. dissecting CAP 1. Very influential—but very NARROW scope 2. “[CAP] has lead to confusion and misunderstandings regarding replica consistency, transactional isolation and high availability” - Bailis et.al in HAT paper 3. Linearizability is very often NOT required 4. Ignores LATENCY—but in practice latency & partitions are deeply related
  73. 73. dissecting CAP 1. Very influential—but very NARROW scope 2. “[CAP] has lead to confusion and misunderstandings regarding replica consistency, transactional isolation and high availability” - Bailis et.al in HAT paper 3. Linearizability is very often NOT required 4. Ignores LATENCY—but in practice latency & partitions are deeply related 5. Partitions are RARE—so why sacrifice C or A ALL the time?
  74. 74. dissecting CAP 1. Very influential—but very NARROW scope 2. “[CAP] has lead to confusion and misunderstandings regarding replica consistency, transactional isolation and high availability” - Bailis et.al in HAT paper 3. Linearizability is very often NOT required 4. Ignores LATENCY—but in practice latency & partitions are deeply related 5. Partitions are RARE—so why sacrifice C or A ALL the time? 6. NOT black and white—can be fine-grained and dynamic
  75. 75. dissecting CAP 1. Very influential—but very NARROW scope 2. “[CAP] has lead to confusion and misunderstandings regarding replica consistency, transactional isolation and high availability” - Bailis et.al in HAT paper 3. Linearizability is very often NOT required 4. Ignores LATENCY—but in practice latency & partitions are deeply related 5. Partitions are RARE—so why sacrifice C or A ALL the time? 6. NOT black and white—can be fine-grained and dynamic 7. Read ‘CAP Twelve Years Later’ - Eric Brewer
  76. 76. consensus
  77. 77. consensus “The problem of reaching agreement among remote processes is one of the most fundamental problems in distributed computing and is at the core of many algorithms for distributed data processing, distributed file management, and fault-tolerant distributed applications.” Fischer, Lynch & Paterson 1985
  78. 78. Consistency models
  79. 79. Consistency models Strong
  80. 80. Consistency models Strong Weak
  81. 81. Consistency models Strong Weak Eventual
  82. 82. Time & Order
  83. 83. Last write wins global clock timestamp
  84. 84. Last write wins global clock timestamp
  85. 85. lamport clocks logical clock causal consistency Leslie lamport 1978
  86. 86. lamport clocks logical clock causal consistency Leslie lamport 1978 1. When a process does work, increment the counter
  87. 87. lamport clocks logical clock causal consistency Leslie lamport 1978 1. When a process does work, increment the counter 2. When a process sends a message, include the counter
  88. 88. lamport clocks logical clock causal consistency Leslie lamport 1978 1. When a process does work, increment the counter 2. When a process sends a message, include the counter 3. When a message is received, merge the counter (set the counter to max(local, received) + 1)
  89. 89. vector clocks Extends lamport clocks colin fidge 1988
  90. 90. vector clocks Extends lamport clocks colin fidge 1988 1. Each node owns and increments its own Lamport Clock
  91. 91. vector clocks Extends lamport clocks colin fidge 1988 1. Each node owns and increments its own Lamport Clock [node -> lamport clock]
  92. 92. vector clocks Extends lamport clocks colin fidge 1988 1. Each node owns and increments its own Lamport Clock [node -> lamport clock]
  93. 93. vector clocks Extends lamport clocks colin fidge 1988 1. Each node owns and increments its own Lamport Clock [node -> lamport clock] 2. Alway keep the full history of all increments
  94. 94. vector clocks Extends lamport clocks colin fidge 1988 1. Each node owns and increments its own Lamport Clock [node -> lamport clock] 2. Alway keep the full history of all increments 3. Merges by calculating the max—monotonic merge
  95. 95. Quorum
  96. 96. Quorum Strict majority vote
  97. 97. Quorum Strict majority vote Sloppy partial vote
  98. 98. Quorum Strict majority vote Sloppy partial vote • Most use R + W > N ⇒ R & W overlap
  99. 99. Quorum Strict majority vote Sloppy partial vote • Most use R + W > N ⇒ R & W overlap • If N / 2 + 1 is still alive ⇒ all good
  100. 100. Quorum Strict majority vote Sloppy partial vote • Most use R + W > N ⇒ R & W overlap • If N / 2 + 1 is still alive ⇒ all good • Most use N ⩵ 3
  101. 101. failure Detection
  102. 102. Failure detection Formal model
  103. 103. Failure detection Formal model Strong completeness
  104. 104. Failure detection Formal model Strong completeness Every crashed process is eventually suspected by every correct process
  105. 105. Failure detection Formal model Strong completeness Everyone knows Every crashed process is eventually suspected by every correct process
  106. 106. Failure detection Formal model Strong completeness Everyone knows Every crashed process is eventually suspected by every correct process Weak completeness
  107. 107. Failure detection Formal model Strong completeness Everyone knows Every crashed process is eventually suspected by every correct process Weak completeness Every crashed process is eventually suspected by some correct process
  108. 108. Failure detection Formal model Strong completeness Everyone knows Every crashed process is eventually suspected by every correct process Weak completeness Someone knows Every crashed process is eventually suspected by some correct process
  109. 109. Failure detection Formal model Strong completeness Everyone knows Every crashed process is eventually suspected by every correct process Weak completeness Someone knows Every crashed process is eventually suspected by some correct process Strong accuracy
  110. 110. Failure detection Formal model Strong completeness Everyone knows Every crashed process is eventually suspected by every correct process Weak completeness Someone knows Every crashed process is eventually suspected by some correct process Strong accuracy No correct process is suspected ever
  111. 111. Failure detection Strong completeness Every crashed process is eventually suspected by every correct process Weak completeness Every crashed process is eventually suspected by some correct process Strong accuracy No false positives No correct process is suspected ever Formal model Everyone knows Someone knows
  112. 112. Failure detection Strong completeness Every crashed process is eventually suspected by every correct process Weak completeness Every crashed process is eventually suspected by some correct process Strong accuracy No false positives No correct process is suspected ever Weak accuracy Formal model Everyone knows Someone knows
  113. 113. Failure detection Strong completeness Every crashed process is eventually suspected by every correct process Weak completeness Every crashed process is eventually suspected by some correct process Strong accuracy No false positives No correct process is suspected ever Weak accuracy Some correct process is never suspected Formal model Everyone knows Someone knows
  114. 114. Failure detection Strong completeness Every crashed process is eventually suspected by every correct process Weak completeness Every crashed process is eventually suspected by some correct process Strong accuracy No false positives No correct process is suspected ever Weak accuracy Some false positives Some correct process is never suspected Formal model Everyone knows Someone knows
  115. 115. Accrual Failure detector Hayashibara et. al. 2004
  116. 116. Accrual Failure detector Hayashibara et. al. 2004 Keeps history of heartbeat statistics
  117. 117. Accrual Failure detector Hayashibara et. al. 2004 Keeps history of heartbeat statistics Decouples monitoring from interpretation
  118. 118. Accrual Failure detector Hayashibara et. al. 2004 Keeps history of heartbeat statistics Decouples monitoring from interpretation Calculates a likelihood (phi value) that the process is down
  119. 119. Accrual Failure detector Hayashibara et. al. 2004 Not YES or NO Keeps history of heartbeat statistics Decouples monitoring from interpretation Calculates a likelihood (phi value) that the process is down
  120. 120. Accrual Failure detector Hayashibara et. al. 2004 Not YES or NO Keeps history of heartbeat statistics Decouples monitoring from interpretation Calculates a likelihood (phi value) that the process is down Takes network hiccups into account
  121. 121. Accrual Failure detector Hayashibara et. al. 2004 Not YES or NO Keeps history of heartbeat statistics Decouples monitoring from interpretation Calculates a likelihood (phi value) that the process is down Takes network hiccups into account phi = -log10(1 - F(timeSinceLastHeartbeat)) F is the cumulative distribution function of a normal distribution with mean and standard deviation estimated from historical heartbeat inter-arrival times
  122. 122. SWIM Failure detector das et. al. 2002
  123. 123. SWIM Failure detector das et. al. 2002 Separates heartbeats from cluster dissemination
  124. 124. SWIM Failure detector das et. al. 2002 Separates heartbeats from cluster dissemination Quarantine: suspected ⇒ time window ⇒ faulty
  125. 125. SWIM Failure detector das et. al. 2002 Separates heartbeats from cluster dissemination Quarantine: suspected ⇒ time window ⇒ faulty Delegated heartbeat to bridge network splits
  126. 126. byzantine Failure detector liskov et. al. 1999
  127. 127. byzantine Failure detector liskov et. al. 1999 Supports misbehaving processes
  128. 128. byzantine Failure detector liskov et. al. 1999 Supports misbehaving processes Omission failures
  129. 129. byzantine Failure detector liskov et. al. 1999 Supports misbehaving processes Omission failures Crash failures, failing to receive a request, or failing to send a response
  130. 130. byzantine Failure detector liskov et. al. 1999 Supports misbehaving processes Omission failures Crash failures, failing to receive a request, or failing to send a response Commission failures
  131. 131. byzantine Failure detector liskov et. al. 1999 Supports misbehaving processes Omission failures Crash failures, failing to receive a request, or failing to send a response Commission failures Processing a request incorrectly, corrupting local state, and/or sending an incorrect or inconsistent response to a request
  132. 132. byzantine Failure detector liskov et. al. 1999 Supports misbehaving processes Omission failures Crash failures, failing to receive a request, or failing to send a response Commission failures Processing a request incorrectly, corrupting local state, and/or sending an incorrect or inconsistent response to a request Very expensive, not practical
  133. 133. replication
  134. 134. Types of replication Active (Push) ! Asynchronous Passive (Pull) ! Synchronous VS VS
  135. 135. master/slave Replication
  136. 136. Tree replication
  137. 137. master/master Replication
  138. 138. buddy Replication
  139. 139. buddy Replication
  140. 140. analysis of replication consensus strategies Ryan Barrett 2009
  141. 141. Strong Consistency
  142. 142. Distributed transactions Strikes Back
  143. 143. Highly Available Transactions Peter Bailis et. al. 2013 HAT NOT CAP
  144. 144. Highly Available Transactions Peter Bailis et. al. 2013 Executive Summary HAT NOT CAP
  145. 145. Highly Available Transactions Peter Bailis et. al. 2013 Executive Summary • Most SQL DBs do not provide Serializability, but weaker guarantees— for performance reasons HAT NOT CAP
  146. 146. Highly Available Transactions Peter Bailis et. al. 2013 Executive Summary • Most SQL DBs do not provide Serializability, but weaker guarantees— for performance reasons • Some weaker transaction guarantees are possible to implement in a HA manner HAT NOT CAP
  147. 147. Highly Available Transactions Peter Bailis et. al. 2013 Executive Summary • Most SQL DBs do not provide Serializability, but weaker guarantees— for performance reasons • Some weaker transaction guarantees are possible to implement in a HA manner • What transaction semantics can be provided with HA? HAT NOT CAP
  148. 148. HAT
  149. 149. UnAvailable • Serializable • Snapshot Isolation • Repeatable Read • Cursor Stability • etc. Highly Available • Read Committed • Read Uncommited • Read Your Writes • Monotonic Atomic View • Monotonic Read/Write • etc. HAT
  150. 150. Other scalable or Highly Available Transactional Research
  151. 151. Other scalable or Highly Available Transactional Research Bolt-On Consistency Bailis et. al. 2013
  152. 152. Other scalable or Highly Available Transactional Research Bolt-On Consistency Bailis et. al. 2013 Calvin Thompson et. al. 2012
  153. 153. Other scalable or Highly Available Transactional Research Bolt-On Consistency Bailis et. al. 2013 Calvin Thompson et. al. 2012 Spanner (Google) Corbett et. al. 2012
  154. 154. consensus Protocols
  155. 155. Specification
  156. 156. Specification Properties
  157. 157. Events 1. Request(v) 2. Decide(v) Specification Properties
  158. 158. Events 1. Request(v) 2. Decide(v) Specification Properties 1. Termination: every process eventually decides on a value v
  159. 159. Events 1. Request(v) 2. Decide(v) Specification Properties 1. Termination: every process eventually decides on a value v 2. Validity: if a process decides v, then v was proposed by some process
  160. 160. Events 1. Request(v) 2. Decide(v) Specification Properties 1. Termination: every process eventually decides on a value v 2. Validity: if a process decides v, then v was proposed by some process 3. Integrity: no process decides twice
  161. 161. Events 1. Request(v) 2. Decide(v) Specification Properties 1. Termination: every process eventually decides on a value v 2. Validity: if a process decides v, then v was proposed by some process 3. Integrity: no process decides twice 4. Agreement: no two correct processes decide differently
  162. 162. Consensus Algorithms CAP
  163. 163. Consensus Algorithms CAP
  164. 164. Consensus Algorithms VR Oki & liskov 1988 CAP
  165. 165. Consensus Algorithms VR Oki & liskov 1988 CAP Paxos Lamport 1989
  166. 166. Consensus Algorithms VR Oki & liskov 1988 Paxos Lamport 1989 ZAB reed & junquiera 2008 CAP
  167. 167. Consensus Algorithms VR Oki & liskov 1988 Paxos Lamport 1989 ZAB reed & junquiera 2008 Raft ongaro & ousterhout 2013 CAP
  168. 168. Event Log
  169. 169. Immutability “Immutability Changes Everything” - Pat Helland Immutable Data Share Nothing Architecture
  170. 170. Immutability “Immutability Changes Everything” - Pat Helland Immutable Data Share Nothing Architecture Is the path towards TRUE Scalability
  171. 171. Think In Facts "The database is a cache of a subset of the log” - Pat Helland
  172. 172. Think In Facts "The database is a cache of a subset of the log” - Pat Helland Never delete data Knowledge only grows Append-Only Event Log Use Event Sourcing and/or CQRS
  173. 173. Aggregate Roots Can wrap multiple Entities Aggregate Root is the Transactional Boundary
  174. 174. Aggregate Roots Can wrap multiple Entities Aggregate Root is the Transactional Boundary Strong Consistency Within Aggregate Eventual Consistency Between Aggregates
  175. 175. Aggregate Roots Can wrap multiple Entities Aggregate Root is the Transactional Boundary Strong Consistency Within Aggregate Eventual Consistency Between Aggregates No limit to scalability
  176. 176. eventual Consistency
  177. 177. Dynamo VerY influential CAP Vogels et. al. 2007
  178. 178. Dynamo Popularized • Eventual consistency • Epidemic gossip • Consistent hashing ! • Hinted handoff • Read repair • Anti-Entropy W/ Merkle trees VerY influential CAP Vogels et. al. 2007
  179. 179. Consistent Hashing Karger et. al. 1997
  180. 180. Consistent Hashing Karger et. al. 1997 Support elasticity— easier to scale up and down Avoids hotspots Enables partitioning and replication
  181. 181. Consistent Hashing Karger et. al. 1997 Support elasticity— easier to scale up and down Avoids hotspots Enables partitioning and replication Only K/N nodes needs to be remapped when adding or removing a node (K=#keys, N=#nodes)
  182. 182. How eventual is
  183. 183. How eventual is Eventual consistency?
  184. 184. How eventual is How consistent is Eventual consistency?
  185. 185. How eventual is How consistent is Eventual consistency? PBS Probabilistically Bounded Staleness Peter Bailis et. al 2012
  186. 186. How eventual is How consistent is Eventual consistency? PBS Probabilistically Bounded Staleness Peter Bailis et. al 2012
  187. 187. epidemic Gossip
  188. 188. Node ring & Epidemic Gossip CHORD Stoica et al 2001
  189. 189. Node ring & Epidemic Gossip Member Node Member Node Member Node Member Node Member Node Member Node Member Node Member Node Member Node Member Node CHORD Stoica et al 2001
  190. 190. Node ring & Epidemic Gossip Member Node Member Node Member Node Member Node Member Node Member Node Member Node Member Node Member Node Member Node CHORD Stoica et al 2001
  191. 191. Node ring & Epidemic Gossip Member Node Member Node Member Node Member Node Member Node Member Node Member Node Member Node Member Node Member Node CHORD Stoica et al 2001
  192. 192. Node ring & Epidemic Gossip CHORD Stoica et al 2001 CAP Member Node Member Node Member Node Member Node Member Node Member Node Member Node Member Node Member Node Member Node
  193. 193. Benefits of Epidemic Gossip Decentralized P2P No SPOF or SPOB Very Scalable Fully Elastic ! Requires minimal administration Often used with VECTOR CLOCKS
  194. 194. Some Standard Optimizations to Epidemic Gossip 1. Separation of failure detection heartbeat and dissemination of data - DAS et. al. 2002 (SWIM) 2. Push/Pull gossip - Khambatti et. al 2003 1. Hash and compare data 2. Use single hash or Merkle Trees
  195. 195. disorderly Programming
  196. 196. ACID 2.0
  197. 197. ACID 2.0 Associative Batch-insensitive (grouping doesn't matter) a+(b+c)=(a+b)+c
  198. 198. ACID 2.0 Associative Batch-insensitive (grouping doesn't matter) a+(b+c)=(a+b)+c Commutative Order-insensitive (order doesn't matter) a+b=b+a
  199. 199. ACID 2.0 Associative Batch-insensitive (grouping doesn't matter) a+(b+c)=(a+b)+c Commutative Order-insensitive (order doesn't matter) a+b=b+a Idempotent Retransmission-insensitive (duplication does not matter) a+a=a
  200. 200. ACID 2.0 Associative Batch-insensitive (grouping doesn't matter) a+(b+c)=(a+b)+c Commutative Order-insensitive (order doesn't matter) a+b=b+a Idempotent Retransmission-insensitive (duplication does not matter) a+a=a Eventually Consistent
  201. 201. Convergent & Commutative Replicated Data Types Shapiro et. al. 2011
  202. 202. Convergent & Commutative Replicated Data Types Shapiro et. al. 2011 CRDT
  203. 203. Convergent & Commutative Replicated Data Types Shapiro et. al. 2011 CRDT Join Semilattice Monotonic merge function
  204. 204. Convergent & Commutative Replicated Data Types Shapiro et. al. 2011 Data types Counters Registers Sets Maps Graphs CRDT Join Semilattice Monotonic merge function
  205. 205. Convergent & Commutative Replicated Data Types Shapiro et. al. 2011 Data types Counters Registers Sets Maps Graphs CRDT CAP Join Semilattice Monotonic merge function
  206. 206. 2 TYPES of CRDTs CvRDT Convergent State-based CmRDT Commutative Ops-based
  207. 207. 2 TYPES of CRDTs CvRDT Convergent State-based CmRDT Commutative Ops-based Self contained, holds all history
  208. 208. 2 TYPES of CRDTs CvRDT Convergent State-based CmRDT Commutative Ops-based Self contained, holds all history Needs a reliable broadcast channel
  209. 209. CALM theorem Hellerstein et. al. 2011 Consistency As Logical Monotonicity
  210. 210. CALM theorem Hellerstein et. al. 2011 Consistency As Logical Monotonicity Bloom Language Compiler help to detect & encapsulate non-monotonicity
  211. 211. CALM theorem Hellerstein et. al. 2011 Consistency As Logical Monotonicity Bloom Language Distributed Logic Datalog/Dedalus Monotonic functions Just add facts to the system Model state as Lattices Similar to CRDTs (without the scope problem) Compiler help to detect & encapsulate non-monotonicity
The Akka Way

The Akka stack (top to bottom):
• Akka CLUSTER EXTENSIONS
• Akka CLUSTER
• Akka REMOTE
• Akka IO
• Akka Actors
What is Akka CLUSTER all about?
• Cluster Membership
• Leader & Singleton
• Cluster Sharding
• Clustered Routers (adaptive, consistent hashing, …)
• Clustered Supervision and Deathwatch
• Clustered Pub/Sub
• and more
cluster membership in Akka
• Dynamo-style master-less decentralized P2P
• Epidemic Gossip—Node Ring
• Vector Clocks for causal consistency (a minimal sketch follows this list)
• Fully elastic with no SPOF or SPOB
• Very scalable—2400 nodes (on GCE)
• High throughput—1000 nodes in 4 min (on GCE)
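A minimal vector clock in the spirit of the one Akka gossips with (a sketch, not Akka's actual VectorClock class): comparison tells us whether one gossip causally precedes another, or whether the two are concurrent and must be merged.

  final case class VClock(versions: Map[String, Long] = Map.empty) {
    // Advance this node's entry on every local event.
    def tick(node: String): VClock =
      copy(versions.updated(node, versions.getOrElse(node, 0L) + 1))

    // Pointwise max: the merged clock dominates both inputs.
    def merge(that: VClock): VClock =
      VClock((versions.keySet ++ that.versions.keySet).map { k =>
        k -> math.max(versions.getOrElse(k, 0L), that.versions.getOrElse(k, 0L))
      }.toMap)

    // Causal order: this happened-before (or equals) that.
    def <=(that: VClock): Boolean =
      versions.forall { case (k, v) => v <= that.versions.getOrElse(k, 0L) }

    // Neither dominates: the two histories are concurrent.
    def concurrentWith(that: VClock): Boolean = !(this <= that) && !(that <= this)
  }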
Gossip State

case class Gossip(
  members: SortedSet[Member],   // ordered node ring
  seen: Set[Member],            // seen set, for convergence
  unreachable: Set[Member],     // unreachable set
  version: VectorClock)         // version

GOSSIPING: the Gossip state is itself a CRDT. Each node:
1. Picks a random node with an older/newer version
2. Gossips in a request/reply fashion
3. Updates its internal state and adds itself to the ‘seen’ set
(A simplified receive-side merge is sketched below.)
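A pared-down receive side of such an exchange, reusing the VClock sketch above (types and names are simplified stand-ins, far from the real akka.cluster code):

  final case class Member(address: String)

  // Simplified stand-in for Akka's Gossip (unreachable set elided).
  final case class GossipState(members: Set[Member], seen: Set[Member], version: VClock) {
    // CRDT merge; the seen set is reset because no node has seen the merged state yet.
    def merge(that: GossipState): GossipState =
      GossipState(members ++ that.members, Set.empty, version merge that.version)
    def seenBy(self: Member): GossipState = copy(seen = seen + self)
  }

  def receiveGossip(self: Member, local: GossipState, remote: GossipState): GossipState =
    if (remote.version <= local.version) local.seenBy(self)        // nothing new
    else if (local.version <= remote.version) remote.seenBy(self)  // fast-forward
    else (local merge remote).seenBy(self)                         // concurrent: merge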
Cluster Convergence
Reached when:
1. All nodes are represented in the seen set, and
2. No members are unreachable, or
3. All unreachable members have status down or exiting
(Expressed as a predicate below.)
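Expressed as a predicate over the simplified types above (a sketch; the real check also involves the full member-status lifecycle):

  sealed trait MemberStatus
  case object Up extends MemberStatus
  case object Down extends MemberStatus
  case object Exiting extends MemberStatus

  def converged(members: Set[Member], seen: Set[Member],
                unreachable: Map[Member, MemberStatus]): Boolean =
    members.forall(seen.contains) &&
      // An empty `unreachable` map trivially satisfies this clause,
      // covering the "no members are unreachable" case.
      unreachable.values.forall(s => s == Down || s == Exiting)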
BIASED GOSSIP
80% bias to nodes not in the seen table
Up to 400 nodes, then reduced
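The bias is just a weighted choice during peer selection, roughly like this (a sketch, with the 80/20 split stated above hard-coded):

  import scala.util.Random

  // Prefer peers that have not yet seen the current gossip version.
  def selectPeer(all: Seq[Member], seen: Set[Member]): Option[Member] = {
    val unseen = all.filterNot(seen.contains)
    val pool =
      if (unseen.nonEmpty && Random.nextDouble() < 0.8) unseen // 80% bias
      else all
    Random.shuffle(pool).headOption
  }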
PUSH/PULL GOSSIP
A variation:
case class Status(version: VectorClock)
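A sketch of the variation, reusing the types above: by default only the small Status digest travels; full state is pushed or pulled when the vector clocks differ. (The handling of concurrent versions is simplified here.)

  final case class Status(version: VClock)

  sealed trait Reply
  case object UpToDate extends Reply                            // peer has what we have
  final case class FullState(gossip: GossipState) extends Reply // push: peer is behind
  final case class WantState(myStatus: Status) extends Reply    // pull: we are behind

  def onStatus(local: GossipState, remote: Status): Reply =
    if (remote.version == local.version) UpToDate
    else if (remote.version <= local.version) FullState(local)              // peer is strictly behind
    else if (local.version <= remote.version) WantState(Status(local.version)) // we are behind
    else FullState(local) // concurrent: push ours; the peer merges and pushes back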
LEADER ROLE
Any node can be the leader
1. No election, but deterministic
2. Can change after cluster convergence
3. Leader has special duties
(Deterministic selection is sketched below.)
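Because the member ring is a SortedSet, every node can compute the same leader deterministically, with no election protocol; a minimal sketch:

  import scala.collection.immutable.SortedSet

  object LeaderSelection {
    implicit val memberOrdering: Ordering[Member] = Ordering.by(_.address)

    // Same sorted ring on every node => every node computes the same leader.
    def leader(ring: SortedSet[Member]): Option[Member] = ring.headOption
  }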
Node Lifecycle in Akka

Failure Detection
• Hashes the node ring
• Picks 5 nodes, to increase the likelihood of bridging racks and data centers
• Request/Reply heartbeat
Used by: Cluster Membership, Remote Death Watch, Remote Supervision
Failure Detection
Is an Accrual Failure Detector
On its own it does not help much in practice: you need to add a margin to deal with Garbage Collection pauses
[Two plots: the idealized heartbeat-arrival pattern (“instead of this”) vs. the long-tailed pattern seen in practice under GC pauses (“it often looks like this”)]
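The φ accrual detector turns heartbeat inter-arrival history into a suspicion level instead of a binary verdict. A compact sketch, using the logistic approximation to the normal CDF; the minStdDev floor (an assumed default here) is one way to model the extra margin needed for GC pauses:

  final case class HeartbeatHistory(intervalsMillis: Vector[Double]) {
    def mean: Double = intervalsMillis.sum / intervalsMillis.size
    def stdDev: Double = {
      val m = mean
      math.sqrt(intervalsMillis.map(x => (x - m) * (x - m)).sum / intervalsMillis.size)
    }
  }

  // phi grows with the time elapsed since the last heartbeat, scaled by the
  // observed inter-arrival distribution: higher phi = stronger suspicion.
  def phi(elapsedMillis: Double, history: HeartbeatHistory,
          minStdDev: Double = 100.0): Double = {
    val std = math.max(history.stdDev, minStdDev) // margin for GC pauses etc.
    val y = (elapsedMillis - history.mean) / std
    val e = math.exp(-y * (1.5976 + 0.070566 * y * y)) // logistic approx. of normal CDF
    if (elapsedMillis > history.mean) -math.log10(e / (1.0 + e))
    else -math.log10(1.0 - 1.0 / (1.0 + e))
  }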
Network Partitions
• The Failure Detector can mark an unavailable member Unreachable
• If one node is Unreachable then there is no cluster Convergence
• This means that the Leader can no longer perform its duties: Split Brain
• A member can come back from Unreachable—Else:
• The node needs to be marked as Down, either through:
  1. auto-down
  2. Manual down
Potential FUTURE Optimizations
• Vector Clock HISTORY pruning
• Delegated heartbeat
• “Real” push/pull gossip
• More out-of-the-box auto-down patterns
Akka Modules For Distribution
• Akka Cluster: Clustered Singleton, Clustered Routers, Clustered Pub/Sub, Cluster Client, Consistent Hashing
• Akka Remote
• Akka HTTP
• Akka IO
…and Beyond

Akka & The Road Ahead
• Akka HTTP (Akka 2.4)
• Akka Streams (Akka 2.4)
• Akka CRDT (?)
• Akka Raft (?)
Eager for more?
Try Akka out: akka.io
Join us at React Conf, San Francisco, Nov 18-21: reactconf.com (Early Registration ends tomorrow)
References
• General Distributed Systems
  • Summary of network reliability post-mortems—more terrifying than the most horrifying Stephen King novel: http://aphyr.com/posts/288-the-network-is-reliable
  • A Note on Distributed Computing: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.41.7628
  • On the problems with RPC: http://steve.vinoski.net/pdf/IEEE-Convenience_Over_Correctness.pdf
  • 8 Fallacies of Distributed Computing: https://blogs.oracle.com/jag/resource/Fallacies.html
  • 6 Misconceptions of Distributed Computing: www.dsg.cs.tcd.ie/~vjcahill/sigops98/papers/vogels.ps
  • Distributed Computing Systems—A Foundational Approach: http://www.amazon.com/Programming-Distributed-Computing-Systems-Foundational/dp/0262018985
  • Introduction to Reliable and Secure Distributed Programming: http://www.distributedprogramming.net/
  • Nice short overview on Distributed Systems: http://book.mixu.net/distsys/
  • Meta list of distributed systems readings: https://gist.github.com/macintux/6227368
References
• Actor Model
  • Great discussion between Erik Meijer & Carl Hewitt on the essence of the Actor Model: http://channel9.msdn.com/Shows/Going+Deep/Hewitt-Meijer-and-Szyperski-The-Actor-Model-everything-you-wanted-to-know-but-were-afraid-to-ask
  • Carl Hewitt’s 1973 paper defining the Actor Model: http://worrydream.com/refs/Hewitt-ActorModel.pdf
  • Gul Agha’s Doctoral Dissertation: https://dspace.mit.edu/handle/1721.1/6952
References
• FLP
  • Impossibility of Distributed Consensus with One Faulty Process: http://cs-www.cs.yale.edu/homes/arvind/cs425/doc/fischer.pdf
  • A Brief Tour of FLP: http://the-paper-trail.org/blog/a-brief-tour-of-flp-impossibility/
• CAP
  • Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services: http://lpd.epfl.ch/sgilbert/pubs/BrewersConjecture-SigAct.pdf
  • You Can’t Sacrifice Partition Tolerance: http://codahale.com/you-cant-sacrifice-partition-tolerance/
  • Linearizability: A Correctness Condition for Concurrent Objects: http://courses.cs.vt.edu/~cs5204/fall07-kafura/Papers/TransactionalMemory/Linearizability.pdf
  • CAP Twelve Years Later: How the "Rules" Have Changed: http://www.infoq.com/articles/cap-twelve-years-later-how-the-rules-have-changed
  • Consistency vs. Availability: http://www.infoq.com/news/2008/01/consistency-vs-availability
References
• Time & Order
  • Post on the problems with Last Write Wins in Riak: http://aphyr.com/posts/285-call-me-maybe-riak
  • Time, Clocks, and the Ordering of Events in a Distributed System: http://research.microsoft.com/en-us/um/people/lamport/pubs/time-clocks.pdf
  • Vector Clocks: http://zoo.cs.yale.edu/classes/cs426/2012/lab/bib/fidge88timestamps.pdf
• Failure Detection
  • Unreliable Failure Detectors for Reliable Distributed Systems: http://www.cs.utexas.edu/~lorenzo/corsi/cs380d/papers/p225-chandra.pdf
  • The ϕ Accrual Failure Detector: http://ddg.jaist.ac.jp/pub/HDY+04.pdf
  • SWIM Failure Detector: http://www.cs.cornell.edu/~asdas/research/dsn02-swim.pdf
  • Practical Byzantine Fault Tolerance: http://www.pmg.lcs.mit.edu/papers/osdi99.pdf
References
• Transactions
  • Jim Gray’s classic book: http://www.amazon.com/Transaction-Processing-Concepts-Techniques-Management/dp/1558601902
  • Highly Available Transactions: Virtues and Limitations: http://www.bailis.org/papers/hat-vldb2014.pdf
  • Bolt-on Causal Consistency: http://db.cs.berkeley.edu/papers/sigmod13-bolton.pdf
  • Calvin: Fast Distributed Transactions for Partitioned Database Systems: http://cs.yale.edu/homes/thomson/publications/calvin-sigmod12.pdf
  • Spanner: Google's Globally-Distributed Database: http://research.google.com/archive/spanner.html
  • Life beyond Distributed Transactions: an Apostate’s Opinion: https://cs.brown.edu/courses/cs227/archives/2012/papers/weaker/cidr07p15.pdf
  • Immutability Changes Everything—Pat Helland’s talk at Ricon: http://vimeo.com/52831373
  • Unshackle Your Domain (Event Sourcing): http://www.infoq.com/presentations/greg-young-unshackle-qcon08
  • CQRS: http://martinfowler.com/bliki/CQRS.html
References
• Consensus
  • Paxos Made Simple: http://research.microsoft.com/en-us/um/people/lamport/pubs/paxos-simple.pdf
  • Paxos Made Moderately Complex: http://www.cs.cornell.edu/courses/cs7412/2011sp/paxos.pdf
  • A simple totally ordered broadcast protocol (ZAB): labs.yahoo.com/files/ladis08.pdf
  • In Search of an Understandable Consensus Algorithm (Raft): https://ramcloud.stanford.edu/wiki/download/attachments/11370504/raft.pdf
  • Replication strategy comparison diagram: http://snarfed.org/transactions_across_datacenters_io.html
  • Distributed Snapshots: Determining Global States of Distributed Systems: http://www.cs.swarthmore.edu/~newhall/readings/snapshots.pdf
References
• Eventual Consistency
  • Dynamo: Amazon’s Highly Available Key-value Store: http://www.read.seas.harvard.edu/~kohler/class/cs239-w08/decandia07dynamo.pdf
  • Consistency vs. Availability: http://www.infoq.com/news/2008/01/consistency-vs-availability
  • Consistent Hashing and Random Trees: http://thor.cs.ucsb.edu/~ravenben/papers/coreos/kll+97.pdf
  • PBS: Probabilistically Bounded Staleness: http://pbs.cs.berkeley.edu/
References
• Epidemic Gossip
  • Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications: http://pdos.csail.mit.edu/papers/chord:sigcomm01/chord_sigcomm.pdf
  • Gossip-style Failure Detector: http://www.cs.cornell.edu/home/rvr/papers/GossipFD.pdf
  • GEMS: http://www.hcs.ufl.edu/pubs/GEMS2005.pdf
  • Efficient Reconciliation and Flow Control for Anti-Entropy Protocols: http://www.cs.cornell.edu/home/rvr/papers/flowgossip.pdf
  • 2400 Akka nodes on GCE: http://typesafe.com/blog/running-a-2400-akka-nodes-cluster-on-google-compute-engine
  • Starting 1000 Akka nodes in 4 min: http://typesafe.com/blog/starting-up-a-1000-node-akka-cluster-in-4-minutes-on-google-compute-engine
  • Push-Pull Gossiping: http://khambatti.com/mujtaba/ArticlesAndPapers/pdpta03.pdf
  • SWIM: Scalable Weakly-consistent Infection-style Process Group Membership Protocol: http://www.cs.cornell.edu/~asdas/research/dsn02-swim.pdf
References
• Conflict-Free Replicated Data Types (CRDTs)
  • A comprehensive study of Convergent and Commutative Replicated Data Types: http://hal.upmc.fr/docs/00/55/55/88/PDF/techreport.pdf
  • Marc Shapiro talks about CRDTs at Microsoft: http://research.microsoft.com/apps/video/dl.aspx?id=153540
  • Akka CRDT project: https://github.com/jboner/akka-crdt
• CALM
  • Dedalus: Datalog in Time and Space: http://db.cs.berkeley.edu/papers/datalog2011-dedalus.pdf
  • CALM: http://www.cs.berkeley.edu/~palvaro/cidr11.pdf
  • Logic and Lattices for Distributed Programming: http://db.cs.berkeley.edu/papers/UCB-lattice-tr.pdf
  • Bloom Language website: http://bloom-lang.net
  • Joe Hellerstein talks about CALM: http://vimeo.com/53904989
References
• Akka Cluster
  • My Akka Cluster Implementation Notes: https://gist.github.com/jboner/7692270
  • Akka Cluster Specification: http://doc.akka.io/docs/akka/snapshot/common/cluster.html
  • Akka Cluster Docs: http://doc.akka.io/docs/akka/snapshot/scala/cluster-usage.html
  • Akka Failure Detector Docs: http://doc.akka.io/docs/akka/snapshot/scala/remoting.html#Failure_Detector
  • Akka Roadmap: https://docs.google.com/a/typesafe.com/document/d/18W9-fKs55wiFNjXL9q50PYOnR7-nnsImzJqHOPPbM4E/mobilebasic?pli=1&hl=en_US
  • Where Akka Came From: http://letitcrash.com/post/40599293211/where-akka-came-from
any Questions?

The Road to Akka Cluster and Beyond…
Jonas Bonér
CTO Typesafe
@jboner