Failures vs Errors,
Isolation, Delegation
and Replication
in Reactive Systems
Jonas Bonér
CTO Typesafe
“But it ain’t how hard you’re hit;
it’s about how hard you can get
hit, and keep moving forward.
How much you can take, and
keep moving forward. That’s
how winning is done.”
- Rocky Balboa
This is Fault Tolerance
Resilience is Beyond
Fault Tolerance
“The ability of a substance or
object to spring back into shape.
The capacity to recover quickly
from difficulties.”
- Merriam-Webster
Without Resilience
Nothing Else Matters
“We can model and understand in isolation.
But, when released into competitive nominally
regulated societies, their connections proliferate,
their interactions and interdependencies multiply,
their complexities mushroom.
And we are caught short.”
- Sidney Dekker
Drift into Failure - Sidney Dekker
We Need to Study
Resilience in
Complex Systems
Complicated System
Complex System
Complicated ≠ Complex
“Counterintuitive. That’s [Jay] Forrester’s word
to describe complex systems. Leverage points
are not intuitive. Or if they are, we intuitively
use them backward, systematically worsening
whatever problems we are trying to solve.”
- Donella Meadows
Leverage Points: Places to Intervene in a System - Donella Meadows
Operating at the Edge of Failure
Operating Point
Management Pressure Towards Economic Efficiency
Gradient Towards Least Effort
Counter Gradient For More Resilience
Error Margin
‘‘Going solid’’: a model of system dynamics and consequences for patient safety - R Cook, J Rasmussen
Resilience in complex adaptive systems: Operating at the Edge of Failure - Richard Cook - Talk at Velocity NY 2013
Embrace Failure
Resilience in
Social Systems
Simple Critical Infrastructure Map
Understanding vital services, and how they keep you safe
6 ways to die
3 sets of essential services
7 layers of protection
Dealing in Security - Mike Bennet, Vinay Gupta
7 Principles for Building
Resilience in Social Systems
1. Maintain diversity & redundancy
2. Manage connectivity
3. Manage slow variables & feedback
4. Foster complex adaptive systems thinking
5. Encourage learning
6. Broaden participation
7. Promote polycentric governance
Principles for Building Resilience: Sustaining Ecosystem Services in Social-Ecological Systems - Reinette Biggs et al.
Resilience in
Biological Systems
Meerkats
Puppies! Now that I’ve got your attention, complexity theory - Nicolas Perony, TED talk
What We Can Learn
From Biological Systems
1. Feature diversity and redundancy
2. Inter-connected network structure
3. Wide distribution across all scales
4. Capacity to self-adapt & self-organize
Toward Resilient Architectures 1: Biology Lessons - Michael Mehaffy, Nikos A. Salingaros
“Animals show extraordinary social complexity,
and this allows them to adapt and
respond to changes in their environment.
In three words, in the animal kingdom,
simplicity leads to complexity
which leads to resilience.”
- Nicolas Perony
Puppies! Now that I’ve got your attention, complexity theory - Nicolas Perony, TED talk
Resilience in
Computer Systems
“Complex systems run in degraded mode.”
“Complex systems run as broken systems.”
- Richard Cook
How Complex Systems Fail - Richard Cook
is by
Photo courtesy of FEMA/Joselyne Augustino
We Need to
Manage Failure
“Post-accident attribution to a
‘root cause’ is fundamentally wrong:
Because overt failure requires multiple faults,
there is no isolated ‘cause’ of an accident.”
- Richard Cook
How Complex Systems Fail - Richard Cook
There is No
Root Cause
Crash Only
Crash-Only Software - George Candea, Armando Fox
Stop = Crash Safely
Start = Recover Fast
“To make a system of interconnected components
crash-only, it must be designed so that components
can tolerate the crashes and temporary unavailability
of their peers. This means we require: [1] strong
modularity with relatively impermeable component
boundaries, [2] timeout-based communication and
lease-based resource allocation, and [3] self-
describing requests that carry a time-to-live and
information on whether they are idempotent.”
- George Candea, Armando Fox
Crash-Only Software - George Candea, Armando Fox
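The third requirement in the quote, self-describing requests, can be sketched in plain Scala. This is only an illustration under stated assumptions, not code from the talk or the paper: the `Request` envelope, its field names and the `shouldRetry` policy are hypothetical.

```scala
import java.time.{Duration, Instant}

object SelfDescribingRequests {
  // Hypothetical request envelope: it carries its own time-to-live
  // and states whether it is safe to execute more than once.
  final case class Request(
      payload: String,
      createdAt: Instant,
      timeToLive: Duration,
      idempotent: Boolean) {
    def expired(now: Instant): Boolean =
      now.isAfter(createdAt.plus(timeToLive))
  }

  // After a peer crashed or timed out, resend only requests that are
  // still alive and idempotent; anything else is not retried.
  def shouldRetry(request: Request, now: Instant): Boolean =
    request.idempotent && !request.expired(now)
}
```

Because the request describes itself, the retry decision needs no knowledge of the component that crashed.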
Recursive Restartability
Turning the Crash-Only Sledgehammer into a Scalpel
Recursive Restartability: Turning the Reboot Sledgehammer into a Scalpel - George Candea, Armando Fox
Services need to accept
NO for an answer
“Software components should be designed such
that they can deny service for any request or call.
Then, if an underlying component can say No,
apps must be designed to take No for an answer
and decide how to proceed: give up, wait and
retry, reduce fidelity, etc.”
- George Candea, Armando Fox
Recursive Restartability: Turning the Reboot Sledgehammer into a Scalpel - George Candea, Armando Fox
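A caller that takes No for an answer might look like the sketch below. It is a hypothetical illustration in plain Scala: the `Reply` type and `callWithFallback` are invented names, and the waiting between retries that the quote mentions is left out to keep the example short.

```scala
object TakingNo {
  // A component may either serve a request or deny it.
  sealed trait Reply
  final case class Ok(value: String) extends Reply
  case object No extends Reply

  // Hypothetical caller policy: retry a bounded number of times,
  // then reduce fidelity by returning a fallback value.
  @annotation.tailrec
  def callWithFallback(call: () => Reply, retries: Int, fallback: String): String =
    call() match {
      case Ok(value)         => value
      case No if retries > 0 => callWithFallback(call, retries - 1, fallback)
      case No                => fallback
    }
}
```

The point of the design is that No is an ordinary value in the protocol, so the caller is forced to decide what to do with it.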
Learn to take
NO for an answer
“The explosive growth of software has
added greatly to systems’ interactive
complexity. With software, the possible
states that a system can end up in become
- Sidney Dekker
Drift into Failure - Sidney Dekker
of State
• Static Data
• Scratch Data
• Dynamic Data
  • Recomputable
  • Not recomputable
We Need a
Way Out of the
State Tar Pit
Out of the Tar Pit - Ben Moseley , Peter Marks
State and
State Management
Critical state
that needs protection
Thread boundary
Synchronous dispatch
Utterly broken
Requirements for a
Sane Failure Mode
Failures need to be
1. Contained
2. Reified as messages
3. Signalled asynchronously
4. Observed by 1-N
5. Managed
Akka Actors
import akka.actor.{Actor, ActorLogging, ActorSystem, Props}
// Define the message(s) the Actor should be able to respond to
case class Greeting(who: String)
// Define the Actor class
class GreetingActor extends Actor with ActorLogging {
  def receive = { // Define the Actor’s behavior
    case Greeting(who) => log.info(s"Hello ${who}")
  }
}
val system = ActorSystem("MySystem") // Create an Actor system
// Create the Actor from its configuration (Props) and give it a name; you get an ActorRef back
val greeter = system.actorOf(Props[GreetingActor], "greeter")
greeter ! Greeting("Charlie Parker") // Send the message asynchronously
Enter Supervision
Think Vending Machine
Inserts coins
Add more coins
Gets coffee
Inserts coins
Out of coffee beans error
Validation Error
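One possible reading of the vending machine slides, given the talk's title, is that a validation error belongs to the customer while a machine failure belongs to whoever services the machine. The sketch below illustrates that reading in plain Scala. It is an assumption, not the author's code: `buy`, `supervised`, `ValidationError` and `OutOfCoffeeBeans` are hypothetical names.

```scala
import scala.util.{Failure, Success, Try}

object VendingMachine {
  // An error is an expected outcome that is part of the customer's protocol.
  final case class ValidationError(message: String)

  // A failure is something the customer cannot fix; it is raised, not returned.
  final class OutOfCoffeeBeans extends RuntimeException("Out of coffee beans")

  def buy(coins: Int, price: Int, beansLeft: Int): Either[ValidationError, String] =
    if (coins < price) Left(ValidationError("Add more coins"))
    else if (beansLeft <= 0) throw new OutOfCoffeeBeans
    else Right("coffee")

  // Hypothetical supervisor: failures are routed here instead of to the customer.
  def supervised(attempt: => Either[ValidationError, String]): String =
    Try(attempt) match {
      case Success(Right(product)) => s"customer gets $product"
      case Success(Left(error))    => s"customer is told: ${error.message}"
      case Failure(cause)          => s"service technician is told: ${cause.getMessage}"
    }
}
```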
“Accidents come from relationships
not broken parts.”
- Sidney Dekker
Drift into Failure - Sidney Dekker
Error Kernel
Onion-layered state & Failure management
Making reliable distributed systems in the presence of software errors - Joe Armstrong
On Erlang, State and Crashes - Jesper Louis Andersen
Onion Layered
State Management
Error Kernel
Critical state
that needs protection
Thread boundary
Supervision in Akka
Every actor has a default supervisor strategy.
Which can, and often should, be overridden.
import akka.actor.{Actor, OneForOneStrategy, Props}
import akka.actor.SupervisorStrategy.{Escalate, Restart, Resume}
import scala.concurrent.duration._
// Parent actor (Worker is an actor class defined elsewhere)
class Supervisor extends Actor {
  // All its children have their life-cycle managed through this declarative supervision strategy
  override val supervisorStrategy =
    OneForOneStrategy(maxNrOfRetries = 10, withinTimeRange = 1.minute) {
      case _: ArithmeticException => Resume
      case _: NullPointerException => Restart
      case _: Exception => Escalate
    }
  // Create a supervised child actor
  val worker = context.actorOf(Props[Worker], name = "worker")
  def receive = {
    case number: Int => worker.forward(number)
  }
}
Monitor through
Death Watch
import akka.actor.{Actor, Props, Terminated}
class Watcher extends Actor {
  val child = context.actorOf(Props.empty, "child") // Create a child actor
  context.watch(child) // Watch it
  def receive = {
    case Terminated(`child`) => // Receive termination signal; … handle child termination
  }
}
Maintain Diversity
and Redundancy
Akka Routing
akka {
  actor {
    deployment {
      /service/router {
        router = round-robin-pool
        resizer {
          lower-bound = 12
          upper-bound = 15
        }
      }
    }
  }
}
Akka Cluster
akka {
  actor {
    provider = "akka.cluster.ClusterActorRefProvider"
    deployment {
      /service/router {
        router = round-robin-pool
        resizer {
          lower-bound = 12
          upper-bound = 15
        }
      }
    }
  }
  cluster {
    seed-nodes = [ … ]
  }
}
The Network
is Reliable
We are living in the
CAP: Consistency is impossible
FLP: Consensus is impossible
Is the wrong default
We need
Decoupling in
Time and Space
Resilient Protocols
• Depend on
  • Asynchronous Communication
  • Eventual Consistency
• Are tolerant to
  • Message loss
  • Message reordering
  • Message duplication
• Embrace ACID 2.0
  • Associative
  • Commutative
  • Idempotent
  • Distributed
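The ACID 2.0 properties can be made concrete with a merge function. The sketch below is a minimal illustration in plain Scala, with the hypothetical names `merge` and `replay`: it uses set union, which is associative, commutative and idempotent, so replaying the same updates in a different order or more than once gives the same result. Message loss is not covered here; it would still need retransmission, which idempotence makes safe.

```scala
object Acid20 {
  // Hypothetical replicated value: a grow-only set of elements.
  // Set union is associative, commutative and idempotent.
  def merge(a: Set[String], b: Set[String]): Set[String] = a union b

  // Apply a sequence of updates, in whatever order they arrived.
  def replay(updates: Seq[Set[String]]): Set[String] =
    updates.foldLeft(Set.empty[String])(merge)
}
```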
Akka Distributed Data
import java.util.concurrent.ThreadLocalRandom
import scala.concurrent.duration._
import akka.actor.{Actor, ActorLogging}
import akka.cluster.Cluster
import akka.cluster.ddata.{DistributedData, ORSet, ORSetKey}
import akka.cluster.ddata.Replicator._

case object Tick

class DataBot extends Actor with ActorLogging {
  import context.dispatcher // execution context for the scheduler
  val replicator = DistributedData(context.system).replicator
  implicit val node = Cluster(context.system)
  // Send a Tick to this actor every 5 seconds
  val tickTask = context.system.scheduler.schedule(
    5.seconds, 5.seconds, self, Tick)
  val DataKey = ORSetKey[String]("key")
  replicator ! Subscribe(DataKey, self)
  def receive = {
    case Tick =>
      // Pick a random lowercase letter and randomly add or remove it
      val s = ThreadLocalRandom.current().nextInt(97, 123).toChar.toString
      if (ThreadLocalRandom.current().nextBoolean())
        replicator ! Update(DataKey, ORSet.empty[String], WriteLocal)(_ + s)
      else
        replicator ! Update(DataKey, ORSet.empty[String], WriteLocal)(_ - s)
    case c @ Changed(DataKey) =>
      val data = c.get(DataKey)
      log.info("Current elements: {}", data.elements)
    case _: UpdateResponse[_] => // ignore
  }
}
“An escalator can never break: it can only
become stairs. You should never see an
Escalator Temporarily Out Of Order sign, just
Escalator Temporarily Stairs. Sorry for the
- Mitch Hedberg
Always Rely on
…And If Not Possible
Always use
Circuit Breaker
Circuit Breaker in Akka
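import akka.pattern.CircuitBreaker
import scala.concurrent.Future
import scala.concurrent.duration._
// Assumes an ActorSystem named system and an implicit ExecutionContext are in scope
val breaker =
  new CircuitBreaker(
    system.scheduler,
    maxFailures = 5,
    callTimeout = 10.seconds,
    resetTimeout = 1.minute)
// dangerousCall() stands for any call that may fail or time out
val result = breaker.withCircuitBreaker(Future(dangerousCall()))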
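To show what a circuit breaker does internally, here is a simplified state machine in plain Scala. It is a sketch under assumptions, not Akka's implementation: `SimpleBreaker` is a hypothetical class, it is not thread-safe, it has no call timeout, and the clock is passed in as a function so the behaviour can be checked deterministically.

```scala
import scala.util.{Failure, Success, Try}

object BreakerSketch {
  sealed trait State
  case object Closed extends State   // calls pass through
  case object Open extends State     // calls fail fast
  case object HalfOpen extends State // one trial call is allowed

  final class SimpleBreaker(maxFailures: Int, resetTimeoutMillis: Long, now: () => Long) {
    private var failures = 0
    private var openedAt = 0L
    private var current: State = Closed

    // An open breaker becomes half-open once the reset timeout has elapsed.
    def state: State = {
      if (current == Open && now() - openedAt >= resetTimeoutMillis) current = HalfOpen
      current
    }

    def call[A](body: => A): Try[A] = state match {
      case Open => Failure(new IllegalStateException("circuit breaker is open"))
      case _ =>
        val result = Try(body)
        result match {
          case Success(_) =>
            failures = 0
            current = Closed
          case Failure(_) =>
            failures += 1
            // A failed trial call, or too many failures, opens the breaker.
            if (current == HalfOpen || failures >= maxFailures) {
              current = Open
              openedAt = now()
            }
        }
        result
    }
  }
}
```

While the breaker is open the protected call is never invoked, which gives the failing dependency time to recover.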
Little’s Law
L: Queue Length
W: Response Time
Queue Length = Arrival Rate * Response Time
Response Time = Queue Length / Arrival Rate
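The two forms of Little's Law can be checked with a small calculation. This is a sketch in plain Scala; the object and parameter names, and the units (requests per second, seconds), are assumptions made for the example.

```scala
object LittlesLaw {
  // Queue Length = Arrival Rate * Response Time
  def queueLength(arrivalRatePerSec: Double, responseTimeSec: Double): Double =
    arrivalRatePerSec * responseTimeSec

  // Response Time = Queue Length / Arrival Rate
  def responseTime(queueLength: Double, arrivalRatePerSec: Double): Double =
    queueLength / arrivalRatePerSec
}
```

For example, 100 requests per second with a response time of 0.5 seconds means 50 requests in the system on average. Read the other way, capping the queue at 20 requests at the same arrival rate bounds the response time at about 0.2 seconds, which is why bounding queues matters for latency.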
Flow Control
Always Apply Back Pressure
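A very small way to picture back pressure is a bounded buffer that refuses new elements when it is full. The sketch below uses `java.util.concurrent.ArrayBlockingQueue` from the JDK; `BoundedInbox` is a hypothetical wrapper. It is a simplification: Reactive Streams implementations such as Akka Streams signal demand from the consumer to the producer rather than rejecting elements.

```scala
import java.util.concurrent.ArrayBlockingQueue

object BackPressureSketch {
  // A bounded buffer between a fast producer and a slower consumer.
  final class BoundedInbox[A](capacity: Int) {
    private val queue = new ArrayBlockingQueue[A](capacity)

    // Returns false instead of buffering without limit; the producer
    // has to slow down, retry later, or shed the element.
    def offer(element: A): Boolean = queue.offer(element)

    // Taking an element frees capacity and lets the producer continue.
    def poll(): Option[A] = Option(queue.poll())
  }
}
```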
Akka Streams
import akka.stream.scaladsl._

// Set up the context (FlowGraph is the Akka Streams 1.0 graph DSL)
val g = FlowGraph.closed() { implicit builder: FlowGraph.Builder[Unit] =>
  import FlowGraph.Implicits._
  // Create the Source and Sink
  val in = Source(1 to 10)
  val out = Sink.ignore
  // Create the fan out and fan in stages
  val bcast = builder.add(Broadcast[Int](2))
  val merge = builder.add(Merge[Int](2))
  // Create a set of transformations
  val f1, f2, f3, f4 = Flow[Int].map(_ + 10)
  // Define the graph processing blueprint
  in ~> f1 ~> bcast ~> f2 ~> merge ~> f3 ~> out
              bcast ~> f4 ~> merge
}
What can we learn from Arnold?
Blow things up
Your App
Pull the Plug
…and see what happens
“Complex systems run in degraded mode.”
“Complex systems run as broken systems.”
- Richard Cook
How Complex Systems Fail - Richard Cook
is by
Photo courtesy of FEMA/Joselyne Augustino
References
Antifragile: Things That Gain from Disorder
Drift into Failure
How Complex Systems Fail
Leverage Points: Places to Intervene in a System
Going Solid: A Model of System Dynamics and Consequences for Patient Safety
Resilience in Complex Adaptive Systems: Operating at the Edge of Failure
Dealing in Security
Principles for Building Resilience: Sustaining Ecosystem Services in Social-Ecological Systems
Puppies! Now that I’ve got your attention, Complexity Theory
How Bacteria Becomes Resistant
Towards Resilient Architectures: Biology Lessons
Crash-Only Software
Recursive Restartability: Turning the Reboot Sledgehammer into a Scalpel
Out of the Tar Pit
Bulkhead Pattern
Making Reliable Distributed Systems in the Presence of Software Errors
On Erlang, State and Crashes
Akka Supervision
Release It!: Design and Deploy Production-Ready Software
Hystrix
Akka Circuit Breaker
Reactive Streams
Akka Streams
RxJava
Feedback Control for Computer Systems
Simian Army
Gatling
Akka MultiNode Testing

More Related Content

Similar to Reactive Revealed Part 3 of 3: Resiliency, Failures vs Errors, Isolation, Delegation and Replication in Reactive Systems

Black Sky Thinking: Wide-Area Power Failure
Black Sky Thinking: Wide-Area Power FailureBlack Sky Thinking: Wide-Area Power Failure
Black Sky Thinking: Wide-Area Power Failure
Prof. David E. Alexander (UCL)
Resilience of Critical Infrastructures to Climate Change
Resilience of Critical Infrastructures to Climate ChangeResilience of Critical Infrastructures to Climate Change
Resilience of Critical Infrastructures to Climate Change
Resilience of Critical Infrastructures to Climate Change (old)
Resilience of Critical Infrastructures to Climate Change (old)Resilience of Critical Infrastructures to Climate Change (old)
Resilience of Critical Infrastructures to Climate Change (old)
Software Availability by Resiliency
Software Availability by ResiliencySoftware Availability by Resiliency
Software Availability by Resiliency
Reza Samei
Better by Measure: Two Tales of Disruption (Class 3, SVA Products of Design 2...
Better by Measure: Two Tales of Disruption (Class 3, SVA Products of Design 2...Better by Measure: Two Tales of Disruption (Class 3, SVA Products of Design 2...
Better by Measure: Two Tales of Disruption (Class 3, SVA Products of Design 2...
Rebecca Gard Silver

Similar to Reactive Revealed Part 3 of 3: Resiliency, Failures vs Errors, Isolation, Delegation and Replication in Reactive Systems (6)

Black Sky Thinking: Wide-Area Power Failure
Black Sky Thinking: Wide-Area Power FailureBlack Sky Thinking: Wide-Area Power Failure
Black Sky Thinking: Wide-Area Power Failure
Resilience of Critical Infrastructures to Climate Change
Resilience of Critical Infrastructures to Climate ChangeResilience of Critical Infrastructures to Climate Change
Resilience of Critical Infrastructures to Climate Change
Resilience of Critical Infrastructures to Climate Change (old)
Resilience of Critical Infrastructures to Climate Change (old)Resilience of Critical Infrastructures to Climate Change (old)
Resilience of Critical Infrastructures to Climate Change (old)
Software Availability by Resiliency
Software Availability by ResiliencySoftware Availability by Resiliency
Software Availability by Resiliency
Better by Measure: Two Tales of Disruption (Class 3, SVA Products of Design 2...
Better by Measure: Two Tales of Disruption (Class 3, SVA Products of Design 2...Better by Measure: Two Tales of Disruption (Class 3, SVA Products of Design 2...
Better by Measure: Two Tales of Disruption (Class 3, SVA Products of Design 2...
Accountability 2010
Accountability 2010Accountability 2010
Accountability 2010


Reactive Revealed Part 3 of 3: Resiliency, Failures vs Errors, Isolation, Delegation and Replication in Reactive Systems

  • 1. Resiliency, Failures vs Errors, Isolation, Delegation and Replication in Reactive Systems Jonas Bonér CTO Typesafe @jboner
  • 2. “But it ain’t how hard you’re hit; it’s about how hard you can get hit, and keep moving forward. How much you can take, and keep moving forward. That’s how winning is done.” - Rocky Balboa
  • 3. This is Fault Tolerance
  • 5. Resilience “The ability of a substance or object to spring back into shape. The capacity to recover quickly from difficulties.” - Merriam-Webster
  • 8. “We can model and understand in isolation. But, when released into competitive nominally regulated societies, their connections proliferate, their interactions and interdependencies multiply, their complexities mushroom. And we are caught short.” - Sidney Dekker Drift into Failure - Sidney Dekker
  • 9. We Need to Study Resilience in Complex Systems
  • 16. “Counterintuitive. That’s [Jay] Forrester’s word to describe complex systems. Leverage points are not intuitive. Or if they are, we intuitively use them backward, systematically worsening whatever problems we are trying to solve.” - Donella Meadows Leverage Points: Places to Intervene in a System - Donella Meadows
  • 17. ‘‘Going solid’’: a model of system dynamics and consequences for patient safety - R Cook, J Rasmussen Resilience in complex adaptive systems: Operating at the Edge of Failure - Richard Cook - Talk at Velocity NY 2013 Operating at the Edge of Failure
  • 18. Operating at the Edge of Failure: Economic Failure Boundary
  • 19. Operating at the Edge of Failure: Economic Failure Boundary, Unacceptable Workload Boundary
  • 20. Operating at the Edge of Failure: Economic Failure Boundary, Unacceptable Workload Boundary, Accident Boundary
  • 21. Operating at the Edge of Failure: Economic Failure Boundary, Unacceptable Workload Boundary, Operating Point, Accident Boundary
  • 22. Operating at the Edge of Failure: Economic Failure Boundary, Unacceptable Workload Boundary, Accident Boundary
  • 23. Operating at the Edge of Failure: Economic Failure Boundary, Unacceptable Workload Boundary, FAILURE, Accident Boundary
  • 24. Operating at the Edge of Failure: Economic Failure Boundary, Unacceptable Workload Boundary, Accident Boundary
  • 25. Operating at the Edge of Failure: Economic Failure Boundary, Unacceptable Workload Boundary, Accident Boundary, Management Pressure Towards Economic Efficiency
  • 26. Operating at the Edge of Failure: Economic Failure Boundary, Unacceptable Workload Boundary, Accident Boundary, Management Pressure Towards Economic Efficiency, Gradient Towards Least Effort
  • 27. Operating at the Edge of Failure: Economic Failure Boundary, Unacceptable Workload Boundary, Accident Boundary, Management Pressure Towards Economic Efficiency, Gradient Towards Least Effort
  • 28. Operating at the Edge of Failure: Economic Failure Boundary, Unacceptable Workload Boundary, Accident Boundary, Management Pressure Towards Economic Efficiency, Gradient Towards Least Effort, Counter Gradient For More Resilience
  • 29. Operating at the Edge of Failure: Economic Failure Boundary, Unacceptable Workload Boundary, Accident Boundary
  • 30. Operating at the Edge of Failure: Economic Failure Boundary, Unacceptable Workload Boundary, Accident Boundary, Error Margin, Marginal Boundary
  • 31. Operating at the Edge of Failure: Economic Failure Boundary, Unacceptable Workload Boundary, Accident Boundary, Error Margin, Marginal Boundary
  • 32. Operating at the Edge of Failure: Economic Failure Boundary, Unacceptable Workload Boundary, Accident Boundary, Error Margin, Marginal Boundary
  • 33. Operating at the Edge of Failure: Accident Boundary, Marginal Boundary
  • 34. Operating at the Edge of Failure: Marginal Boundary ?
  • 35. Operating at the Edge of Failure: Accident Boundary, Marginal Boundary
  • 36. Operating at the Edge of Failure: Accident Boundary, Marginal Boundary
  • 37. Operating at the Edge of Failure: Accident Boundary, Marginal Boundary
  • 38. Operating at the Edge of Failure: Accident Boundary, Marginal Boundary
  • 41. Simple Critical Infrastructure Map: Understanding vital services, and how they keep you safe. 6 ways to die, 3 sets of essential services, 7 layers of protection. Dealing in Security - Mike Bennet, Vinay Gupta
  • 42. 7 Principles for Building Resilience in Social Systems 1. Maintain diversity & redundancy 2. Manage connectivity 3. Manage slow variables & feedback 4. Foster complex adaptive systems thinking 5. Encourage learning 6. Broaden participation 7. Promote polycentric governance Principles for Building Resilience: Sustaining Ecosystem Services in Social-Ecological Systems - Reinette Biggs et al.
  • 44. Meerkats. Puppies! Now that I’ve got your attention, complexity theory - Nicolas Perony, TED talk
  • 45. What We Can Learn From Biological Systems 1. Feature Diversity and redundancy 2. Inter-Connected network structure 3. Wide distribution across all scales 4. Capacity to self-adapt & self-organize Toward Resilient Architectures 1: Biology Lessons - Michael Mehaffy, Nikos A. Salingaros
  • 46. “Animals show extraordinary social complexity, and this allows them to adapt and respond to changes in their environment. In three words, in the animal kingdom, simplicity leads to complexity which leads to resilience.” - Nicolas Perony Puppies! Now that I’ve got your attention, complexity theory - Nicolas Perony, TED talk
  • 49. “Complex systems run in degraded mode.” “Complex systems run as broken systems.” - Richard Cook How Complex Systems Fail - Richard Cook
  • 50. Resilience is by Design Photo courtesy of FEMA/Joselyne Augustino
  • 51. We Need to Manage Failure
  • 52. “Post-accident attribution to a ‘root cause’ is fundamentally wrong: Because overt failure requires multiple faults, there is no isolated ‘cause’ of an accident.” - Richard Cook How Complex Systems Fail - Richard Cook
  • 54. Crash-Only Software: Stop = Crash Safely, Start = Recover Fast. Crash-Only Software - George Candea, Armando Fox
  • 55. “To make a system of interconnected components crash-only, it must be designed so that components can tolerate the crashes and temporary unavailability of their peers. This means we require: [1] strong modularity with relatively impermeable component boundaries, [2] timeout-based communication and lease-based resource allocation, and [3] self-describing requests that carry a time-to-live and information on whether they are idempotent.” - George Candea, Armando Fox Crash-Only Software - George Candea, Armando Fox
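
Requirements [2] and [3] in the quote above can be sketched in plain Scala. This is only an illustration under stated assumptions, not code from the talk or from the Crash-Only Software paper: `Request`, `Outcome`, `send` and the `attempt` function are hypothetical names, and time is modeled by assuming every failed attempt consumes one full timeout.

```scala
import scala.annotation.tailrec

object CrashOnlySketch {
  // Hypothetical self-describing request: it carries its own time-to-live
  // and states whether it is safe to repeat (idempotent).
  final case class Request(payload: String, ttlMillis: Long, idempotent: Boolean)

  sealed trait Outcome
  final case class Done(result: String) extends Outcome
  case object Expired extends Outcome // the time-to-live ran out
  case object GaveUp extends Outcome  // peer failed and a retry would be unsafe

  // `attempt` stands in for one timeout-bounded call to a peer: None means the
  // peer crashed or did not answer within `timeoutMillis` (an assumption of
  // this sketch; no real clock or network is involved).
  def send(req: Request, timeoutMillis: Long)(attempt: String => Option[String]): Outcome = {
    @tailrec
    def loop(remainingMillis: Long, firstTry: Boolean): Outcome =
      if (!firstTry && !req.idempotent) GaveUp           // never repeat a non-idempotent request
      else if (remainingMillis < timeoutMillis) Expired  // not enough time-to-live for another try
      else attempt(req.payload) match {
        case Some(result) => Done(result)
        case None         => loop(remainingMillis - timeoutMillis, firstTry = false)
      }
    loop(req.ttlMillis, firstTry = true)
  }
}
```

Because the request describes itself, the sender can decide locally whether a retry is allowed without knowing anything about the state of the peer that crashed.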
  • 56. Recursive Restartability Turning the Crash-Only Sledgehammer into a Scalpel Recursive Restartability: Turning the Reboot Sledgehammer into a Scalpel - George Candea, Armando Fox
  • 57. Services need to accept NO for an answer
  • 58. “Software components should be designed such that they can deny service for any request or call. Then, if an underlying component can say No, apps must be designed to take No for an answer and decide how to proceed: give up, wait and retry, reduce fidelity, etc.” - George Candea, Armando Fox Recursive Restartability: Turning the Reboot Sledgehammer into a Scalpel - George Candea, Armando Fox Learn to take NO for an answer
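
The three reactions named in the quote (give up, retry, reduce fidelity) can be made explicit in the caller's types. The following is a minimal sketch, assuming hypothetical names (`Reply`, `Decision`, `fetch`) that do not appear in the talk; the "wait" before a retry (for example a backoff delay) is deliberately left out to keep it short.

```scala
import scala.annotation.tailrec

object TakeNoSketch {
  // A component may answer Yes with a value, or deny service with No.
  sealed trait Reply
  final case class Yes(value: String) extends Reply
  final case class No(reason: String) extends Reply

  // What the caller decides to do once the answer is in.
  sealed trait Decision
  final case class Use(value: String) extends Decision      // full-fidelity answer
  final case class Degraded(value: String) extends Decision // reduced fidelity, e.g. a cached value
  case object GiveUp extends Decision

  // Ask `primary`; on No retry up to `maxRetries` times, then fall back to a
  // lower-fidelity value if one exists, otherwise give up.
  def fetch(primary: () => Reply, maxRetries: Int, fallback: Option[String]): Decision = {
    @tailrec
    def attempt(retriesLeft: Int): Decision = primary() match {
      case Yes(value)               => Use(value)
      case No(_) if retriesLeft > 0 => attempt(retriesLeft - 1)
      case No(_)                    => fallback.fold[Decision](GiveUp)(Degraded(_))
    }
    attempt(maxRetries)
  }
}
```

Modeling No as an ordinary reply rather than an exception forces every caller to choose one of the strategies instead of ignoring the refusal.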
  • 59. “The explosive growth of software has added greatly to systems’ interactive complexity. With software, the possible states that a system can end up in become mind-boggling.” - Sidney Dekker Drift into Failure - Sidney Dekker
  • 60. Classification of State: Static Data; Scratch Data; Dynamic Data (recomputable, not recomputable)
  • 61. Classification of State: Static Data; Scratch Data; Dynamic Data (recomputable, not recomputable). Critical
  • 62. We Need a Way Out of the State Tar Pit. Out of the Tar Pit - Ben Moseley, Peter Marks
  • 63. We Need a Way Out of the State Tar Pit: Essential State
  • 64. We Need a Way Out of the State Tar Pit: Essential State, Essential Logic
  • 65. We Need a Way Out of the State Tar Pit: Essential State, Essential Logic, Accidental State and Control
  • 66. Traditional State Management: Object; Critical state that needs protection; Client; Thread boundary
  • 67. Traditional State Management: Object; Critical state that needs protection; Client; Thread boundary
  • 68. Traditional State Management: Object; Critical state that needs protection; Client; Thread boundary
  • 69. Traditional State Management: Object; Critical state that needs protection; Client; Thread boundary; Synchronous dispatch; Thread boundary
  • 70. Traditional State Management: Object; Critical state that needs protection; Client; Thread boundary; Synchronous dispatch; Thread boundary
  • 71. Traditional State Management: Object; Critical state that needs protection; Client; Thread boundary; Synchronous dispatch; Thread boundary; ?
  • 72. Traditional State Management: Object; Critical state that needs protection; Client; Thread boundary; Synchronous dispatch; Thread boundary; ? Utterly broken
  • 73. Requirements for a Sane Failure Mode Failures need to be: 1. Contained 2. Reified as messages 3. Signalled asynchronously 4. Observed by 1 to N 5. Managed
  • 75. Akka Actors case class Greeting(who: String) class GreetingActor extends Actor with ActorLogging { def receive = { case Greeting(who) => log.info(s"Hello ${who}") } }
  • 76. Akka Actors case class Greeting(who: String) class GreetingActor extends Actor with ActorLogging { def receive = { case Greeting(who) => log.info(s"Hello ${who}") } } Define the message(s) the Actor should be able to respond to
  • 77. Akka Actors case class Greeting(who: String) class GreetingActor extends Actor with ActorLogging { def receive = { case Greeting(who) => log.info(s"Hello ${who}") } } Define the message(s) the Actor should be able to respond to Define the Actor class
  • 78. Akka Actors case class Greeting(who: String) class GreetingActor extends Actor with ActorLogging { def receive = { case Greeting(who) => log.info(s"Hello ${who}") } } Define the message(s) the Actor should be able to respond to Define the Actor class Define the Actor’s behavior
  • 79. Akka Actors case class Greeting(who: String) class GreetingActor extends Actor with ActorLogging { def receive = { case Greeting(who) => log.info(s"Hello ${who}") } }
  • 80. Akka Actors case class Greeting(who: String) class GreetingActor extends Actor with ActorLogging { def receive = { case Greeting(who) => log.info(s"Hello ${who}") } } val system = ActorSystem("MySystem") val greeter = system.actorOf(Props[GreetingActor], "greeter")
  • 81. Akka Actors case class Greeting(who: String) class GreetingActor extends Actor with ActorLogging { def receive = { case Greeting(who) => log.info(s"Hello ${who}") } } val system = ActorSystem("MySystem") val greeter = system.actorOf(Props[GreetingActor], "greeter") Create an Actor system
  • 82. Akka Actors case class Greeting(who: String) class GreetingActor extends Actor with ActorLogging { def receive = { case Greeting(who) => log.info(s"Hello ${who}") } } val system = ActorSystem("MySystem") val greeter = system.actorOf(Props[GreetingActor], "greeter") Create an Actor system Actor configuration
  • 83. Akka Actors case class Greeting(who: String) class GreetingActor extends Actor with ActorLogging { def receive = { case Greeting(who) => log.info(s"Hello ${who}") } } val system = ActorSystem("MySystem") val greeter = system.actorOf(Props[GreetingActor], "greeter") Give it a name Create an Actor system Actor configuration
  • 84. Akka Actors case class Greeting(who: String) class GreetingActor extends Actor with ActorLogging { def receive = { case Greeting(who) => log.info(s"Hello ${who}") } } val system = ActorSystem("MySystem") val greeter = system.actorOf(Props[GreetingActor], "greeter") Give it a name Create the Actor Create an Actor system Actor configuration
  • 85. Akka Actors case class Greeting(who: String) class GreetingActor extends Actor with ActorLogging { def receive = { case Greeting(who) => log.info(s"Hello ${who}") } } val system = ActorSystem("MySystem") val greeter = system.actorOf(Props[GreetingActor], "greeter") Give it a name Create the Actor You get an ActorRef back Create an Actor system Actor configuration
  • 86. Akka Actors case class Greeting(who: String) class GreetingActor extends Actor with ActorLogging { def receive = { case Greeting(who) => log.info(s"Hello ${who}") } } val system = ActorSystem("MySystem") val greeter = system.actorOf(Props[GreetingActor], "greeter")
  • 87. Akka Actors case class Greeting(who: String) class GreetingActor extends Actor with ActorLogging { def receive = { case Greeting(who) => log.info(s"Hello ${who}") } } val system = ActorSystem("MySystem") val greeter = system.actorOf(Props[GreetingActor], "greeter") greeter ! Greeting("Charlie Parker")
  • 88. Akka Actors case class Greeting(who: String) class GreetingActor extends Actor with ActorLogging { def receive = { case Greeting(who) => log.info(s"Hello ${who}") } } val system = ActorSystem("MySystem") val greeter = system.actorOf(Props[GreetingActor], "greeter") greeter ! Greeting("Charlie Parker") Send the message asynchronously
  • 89. Akka Actors case class Greeting(who: String) class GreetingActor extends Actor with ActorLogging { def receive = { case Greeting(who) => log.info(s"Hello ${who}") } } val system = ActorSystem("MySystem") val greeter = system.actorOf(Props[GreetingActor], "greeter") greeter ! Greeting("Charlie Parker")
  • 102. Think Vending Machine Programmer Inserts coins Out of coffee beans error Coffee Machine
  • 103. Think Vending Machine Programmer Inserts coins Out of coffee beans error WRONG Coffee Machine
  • 105. Think Vending Machine Programmer Inserts coins Out of coffee beans failure Coffee Machine
  • 106. Think Vending Machine Programmer Service Guy Inserts coins Out of coffee beans failure Coffee Machine
  • 107. Think Vending Machine Programmer Service Guy Inserts coins Out of coffee beans failure Adds more beans Coffee Machine
  • 108. Think Vending Machine Programmer Service Guy Inserts coins Gets coffee Out of coffee beans failure Adds more beans Coffee Machine
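The vending machine slides separate errors, which go back to whoever made the request, from failures, which go to someone able to repair the component. A rough sketch of that routing in plain Scala follows. All names are hypothetical, and the "not enough coins" case is an assumed example of a user error that does not appear on the slides.

```scala
object VendingMachine {
  // Who should be told about an event.
  sealed trait Recipient
  case object User extends Recipient
  case object ServiceTechnician extends Recipient

  sealed trait Event
  case object CoffeeServed extends Event
  // Error: caused by the request itself, so it is reported to the user (assumed example).
  final case class ValidationError(reason: String) extends Event
  // Failure: the machine cannot do its job; the user can do nothing about it.
  final case class MachineFailure(reason: String) extends Event

  def insertCoins(coins: Int, price: Int, beansLeft: Int): Event =
    if (coins < price) ValidationError("not enough coins")
    else if (beansLeft <= 0) MachineFailure("out of coffee beans")
    else CoffeeServed

  // Failures are routed to the supervisor (the service technician), never to the user.
  def recipientOf(event: Event): Recipient = event match {
    case MachineFailure(_) => ServiceTechnician
    case _                 => User
  }
}
```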
  • 118. Error Kernel Pattern Onion-layered state & Failure management Making reliable distributed systems in the presence of software errors - Joe Armstrong On Erlang, State and Crashes - Jesper Louis Andersen
  • 119. Onion Layered State Management Object Critical state that needs protection Client Thread boundary
  • 120. Onion Layered State Management Object Critical state that needs protection Client Thread boundary
  • 121. Onion Layered State Management Error Kernel Object Critical state that needs protection Client Thread boundary
  • 122. Onion Layered State Management Error Kernel Object Critical state that needs protection Client Thread boundary
  • 123. Onion Layered State Management Error Kernel Object Critical state that needs protection Client Supervision Thread boundary
  • 124. Onion Layered State Management Error Kernel Object Critical state that needs protection Client Supervision Supervision Thread boundary
  • 133. Supervision in Akka Every actor has a default supervisor strategy. Which can, and often should, be overridden. class Supervisor extends Actor { override val supervisorStrategy = OneForOneStrategy(maxNrOfRetries = 10, withinTimeRange = 1 minute) { case _: ArithmeticException => Resume case _: NullPointerException => Restart case _: Exception => Escalate } val worker = context.actorOf(Props[Worker], name = "worker") def receive = { case number: Int => worker.forward(number) } }
  • 134. Supervision in Akka Every actor has a default supervisor strategy. Which can, and often should, be overridden. class Supervisor extends Actor { override val supervisorStrategy = OneForOneStrategy(maxNrOfRetries = 10, withinTimeRange = 1 minute) { case _: ArithmeticException => Resume case _: NullPointerException => Restart case _: Exception => Escalate } val worker = context.actorOf(Props[Worker], name = "worker") def receive = { case number: Int => worker.forward(number) } } Parent actor
  • 135. Supervision in Akka Every actor has a default supervisor strategy. Which can, and often should, be overridden. class Supervisor extends Actor { override val supervisorStrategy = OneForOneStrategy(maxNrOfRetries = 10, withinTimeRange = 1 minute) { case _: ArithmeticException => Resume case _: NullPointerException => Restart case _: Exception => Escalate } val worker = context.actorOf(Props[Worker], name = "worker") def receive = { case number: Int => worker.forward(number) } } Parent actor All its children have their life-cycle managed through this declarative supervision strategy
  • 136. Supervision in Akka Every actor has a default supervisor strategy. Which can, and often should, be overridden. class Supervisor extends Actor { override val supervisorStrategy = OneForOneStrategy(maxNrOfRetries = 10, withinTimeRange = 1 minute) { case _: ArithmeticException => Resume case _: NullPointerException => Restart case _: Exception => Escalate } val worker = context.actorOf(Props[Worker], name = "worker") def receive = { case number: Int => worker.forward(number) } } Create a supervised child actor Parent actor All its children have their life-cycle managed through this declarative supervision strategy
  • 137. Monitor through Death Watch class Watcher extends Actor { val child = context.actorOf(Props.empty, "child") context.watch(child) def receive = { case Terminated(`child`) => … // handle child termination } }
  • 138. Monitor through Death Watch class Watcher extends Actor { val child = context.actorOf(Props.empty, "child") context.watch(child) def receive = { case Terminated(`child`) => … // handle child termination } } Create a child actor
  • 139. Monitor through Death Watch class Watcher extends Actor { val child = context.actorOf(Props.empty, "child") context.watch(child) def receive = { case Terminated(`child`) => … // handle child termination } } Create a child actor Watch it
  • 140. Monitor through Death Watch class Watcher extends Actor { val child = context.actorOf(Props.empty, "child") context.watch(child) def receive = { case Terminated(`child`) => … // handle child termination } } Create a child actor Watch it Receive termination signal
  • 143. Akka Routing akka { actor { deployment { /service/router { router = round-robin-pool resizer { lower-bound = 12 upper-bound = 15 } } } } }
  • 144. Akka Cluster akka { actor { deployment { /service/router { router = round-robin-pool resizer { lower-bound = 12 upper-bound = 15 } } } provider = "akka.cluster.ClusterActorRefProvider" } cluster { seed-nodes = [ "akka.tcp://ClusterSystem@", "akka.tcp://ClusterSystem@" ] } }
  • 147. We are living in the Looming Shadow of Impossibility Theorems
  • 148. We are living in the Looming Shadow of Impossibility Theorems CAP: Consistency is impossible
  • 149. We are living in the Looming Shadow of Impossibility Theorems CAP: Consistency is impossible FLP: Consensus is impossible
  • 152. Resilient Protocols Depend on Asynchronous Communication Eventual Consistency
  • 153. Resilient Protocols • are tolerant to • Message loss • Message reordering • Message duplication Depend on Asynchronous Communication Eventual Consistency
  • 154. Resilient Protocols • are tolerant to • Message loss • Message reordering • Message duplication • Embrace ACID 2.0 • Associative • Commutative • Idempotent • Distributed Depend on Asynchronous Communication Eventual Consistency
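To make the "ACID 2.0" properties concrete, here is a small grow-only counter written in plain Scala. It is a sketch of the general idea, not Akka Distributed Data's implementation, and `GCounter` with its methods is a hypothetical name. Because `merge` takes the per-node maximum, replicas reach the same state even when updates are reordered or delivered more than once.

```scala
// Grow-only counter: one non-negative count per node id.
final case class GCounter(counts: Map[String, Long] = Map.empty) {

  // Each node only ever increments its own entry.
  def increment(node: String, by: Long = 1L): GCounter =
    GCounter(counts.updated(node, counts.getOrElse(node, 0L) + by))

  // The counter's value is the sum over all nodes.
  def value: Long = counts.values.sum

  // Per-node maximum: associative, commutative and idempotent.
  def merge(that: GCounter): GCounter =
    GCounter((counts.keySet ++ that.counts.keySet).map { node =>
      node -> math.max(counts.getOrElse(node, 0L), that.counts.getOrElse(node, 0L))
    }.toMap)
}
```

Merging the same replica state twice changes nothing, which is what lets a protocol tolerate message duplication without coordination.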
  • 155. Akka Distributed Data class DataBot extends Actor with ActorLogging { val replicator = DistributedData(context.system).replicator implicit val node = Cluster(context.system) val tickTask = context.system.scheduler.schedule( 5.seconds, 5.seconds, self, Tick) val DataKey = ORSetKey[String]("key") replicator ! Subscribe(DataKey, self)
  • 156. Akka Distributed Data class DataBot extends Actor with ActorLogging { val replicator = DistributedData(context.system).replicator implicit val node = Cluster(context.system) val tickTask = context.system.scheduler.schedule( 5.seconds, 5.seconds, self, Tick) val DataKey = ORSetKey[String]("key") replicator ! Subscribe(DataKey, self) def receive = { case Tick => val s = ThreadLocalRandom.current().nextInt(97, 123).toChar.toString if (ThreadLocalRandom.current().nextBoolean()) replicator ! Update(DataKey, ORSet.empty[String], WriteLocal)(_ + s) else replicator ! Update(DataKey, ORSet.empty[String], WriteLocal)(_ - s) case c @ Changed(DataKey) => val data = c.get(DataKey) case _: UpdateResponse[_] => // ignore } }
  • 157. “An escalator can never break: it can only become stairs. You should never see an Escalator Temporarily Out Of Order sign, just Escalator Temporarily Stairs. Sorry for the convenience.” - Mitch Hedberg
  • 160. Always Rely on Asynchronous Communication …And If Not Possible, Always Use Timeouts
  • 162. Circuit Breaker in Akka val breaker = new CircuitBreaker( context.system.scheduler, maxFailures = 5, callTimeout = 10.seconds, resetTimeout = 1.minute ).onOpen(…) .onHalfOpen(…) val result = breaker.withCircuitBreaker(Future(dangerousCall()))
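The behaviour behind a circuit breaker can be sketched without Akka. The class below is a hypothetical, single-threaded illustration and not Akka's `CircuitBreaker`: it has no call timeout and no callbacks, and the clock is injected as `now` so the example stays deterministic. After `maxFailures` consecutive failures it fails fast, and once `resetTimeoutMs` has elapsed it lets a trial call through (the half-open state).

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical minimal breaker; not thread-safe and not Akka's implementation.
final class SimpleBreaker(maxFailures: Int, resetTimeoutMs: Long, now: () => Long) {
  private var failures = 0
  private var openedAt: Option[Long] = None

  // Open while the reset timeout since the last trip has not yet elapsed.
  def isOpen: Boolean = openedAt.exists(t => now() - t < resetTimeoutMs)

  def call[A](body: => A): Try[A] =
    if (isOpen) Failure(new IllegalStateException("circuit open: failing fast"))
    else {
      // Closed, or half-open after the reset timeout: let the call through.
      val result = Try(body)
      result match {
        case Success(_) =>
          failures = 0
          openedAt = None
        case Failure(_) =>
          failures += 1
          if (failures >= maxFailures) openedAt = Some(now()) // trip, or re-trip from half-open
      }
      result
    }
}
```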
  • 163. Little’s Law W: Response Time L: Queue Length
  • 164. Little’s Law Queue Length = Arrival Rate * Response Time W: Response Time L: Queue Length
  • 165. Little’s Law Response Time = Queue Length / Arrival Rate W: Response Time L: Queue Length
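Written with the slide's symbols, plus $\lambda$ for the arrival rate (a symbol the slides do not name), the two forms of Little's Law are shown below. The numbers in the worked example are illustrative assumptions, not measurements from the talk.

```latex
L = \lambda W
\qquad\Longleftrightarrow\qquad
W = \frac{L}{\lambda}

% Illustrative example: 50 requests queued, arriving at 200 requests per second
W = \frac{50\ \text{requests}}{200\ \text{requests/s}} = 0.25\ \text{s}
```

At a fixed arrival rate, response time grows in proportion to queue length, so capping the queue is one way to cap response time.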
  • 167. Flow Control Always Apply Back Pressure
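One simple way to see back pressure is a bounded buffer that refuses new work when it is full, so the producer has to slow down or shed load. The sketch below uses `java.util.concurrent.ArrayBlockingQueue` from the JDK. `BoundedInbox` is a hypothetical name, and this is not how Akka Streams or Reactive Streams signal demand, which they do by having the consumer request elements.

```scala
import java.util.concurrent.ArrayBlockingQueue

// Hypothetical bounded inbox: a full buffer says "no" instead of growing without limit.
final class BoundedInbox[A <: AnyRef](capacity: Int) {
  private val queue = new ArrayBlockingQueue[A](capacity)

  // Returns false when the buffer is full, telling the producer to back off.
  def offer(item: A): Boolean = queue.offer(item)

  // Returns the next item in FIFO order, or None when the buffer is empty.
  def poll(): Option[A] = Option(queue.poll())

  def size: Int = queue.size
}
```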
  • 170. Akka Streams in ~> f1 ~> bcast ~> f2 ~> merge ~> f3 ~> out bcast ~> f4 ~> merge
  • 171. Akka Streams in ~> f1 ~> bcast ~> f2 ~> merge ~> f3 ~> out bcast ~> f4 ~> merge val g = FlowGraph.closed() { implicit builder: FlowGraph.Builder => import FlowGraph.Implicits._ val in = Source(1 to 10) val out = Sink.ignore val bcast = builder.add(Broadcast[Int](2)) val merge = builder.add(Merge[Int](2)) val f1, f2, f3, f4 = Flow[Int].map(_ + 10) }
  • 172. Akka Streams in ~> f1 ~> bcast ~> f2 ~> merge ~> f3 ~> out bcast ~> f4 ~> merge val g = FlowGraph.closed() { implicit builder: FlowGraph.Builder => import FlowGraph.Implicits._ val in = Source(1 to 10) val out = Sink.ignore val bcast = builder.add(Broadcast[Int](2)) val merge = builder.add(Merge[Int](2)) val f1, f2, f3, f4 = Flow[Int].map(_ + 10) } Set up the context
  • 173. Akka Streams in ~> f1 ~> bcast ~> f2 ~> merge ~> f3 ~> out bcast ~> f4 ~> merge val g = FlowGraph.closed() { implicit builder: FlowGraph.Builder => import FlowGraph.Implicits._ val in = Source(1 to 10) val out = Sink.ignore val bcast = builder.add(Broadcast[Int](2)) val merge = builder.add(Merge[Int](2)) val f1, f2, f3, f4 = Flow[Int].map(_ + 10) } Set up the context Create the Source and Sink
  • 174. Akka Streams in ~> f1 ~> bcast ~> f2 ~> merge ~> f3 ~> out bcast ~> f4 ~> merge val g = FlowGraph.closed() { implicit builder: FlowGraph.Builder => import FlowGraph.Implicits._ val in = Source(1 to 10) val out = Sink.ignore val bcast = builder.add(Broadcast[Int](2)) val merge = builder.add(Merge[Int](2)) val f1, f2, f3, f4 = Flow[Int].map(_ + 10) } Set up the context Create the Source and Sink Create the fan out and fan in stages
  • 175. Akka Streams in ~> f1 ~> bcast ~> f2 ~> merge ~> f3 ~> out bcast ~> f4 ~> merge val g = FlowGraph.closed() { implicit builder: FlowGraph.Builder => import FlowGraph.Implicits._ val in = Source(1 to 10) val out = Sink.ignore val bcast = builder.add(Broadcast[Int](2)) val merge = builder.add(Merge[Int](2)) val f1, f2, f3, f4 = Flow[Int].map(_ + 10) } Set up the context Create the Source and Sink Create the fan out and fan in stages Create a set of transformations
  • 176. Akka Streams in ~> f1 ~> bcast ~> f2 ~> merge ~> f3 ~> out bcast ~> f4 ~> merge val g = FlowGraph.closed() { implicit builder: FlowGraph.Builder => import FlowGraph.Implicits._ val in = Source(1 to 10) val out = Sink.ignore val bcast = builder.add(Broadcast[Int](2)) val merge = builder.add(Merge[Int](2)) val f1, f2, f3, f4 = Flow[Int].map(_ + 10) } Set up the context Create the Source and Sink Create the fan out and fan in stages Create a set of transformations Define the graph processing blueprint
  • 178. What can we learn from Arnold?
  • 180. What can we learn from Arnold? Blow things up
  • 182. Pull the Plug …and see what happens
  • 185. “Complex systems run in degraded mode.” “Complex systems run as broken systems.” - Richard Cook How Complex Systems Fail - Richard Cook
  • 186. Resilience is by Design Photo courtesy of FEMA/Joselyne Augustino
  • 187. References • Antifragile: Things That Gain from Disorder • Drift into Failure • How Complex Systems Fail • Leverage Points: Places to Intervene in a System • Going Solid: A Model of System Dynamics and Consequences for Patient Safety • Resilience in Complex Adaptive Systems: Operating at the Edge of Failure • Dealing in Security • Principles for Building Resilience: Sustaining Ecosystem Services in Social-Ecological Systems • Puppies! Now that I’ve got your attention, Complexity Theory • How Bacteria Becomes Resistant • Towards Resilient Architectures: Biology Lessons • Crash-Only Software • Recursive Restartability: Turning the Reboot Sledgehammer into a Scalpel • Out of the Tar Pit • Bulkhead Pattern • Making Reliable Distributed Systems in the Presence of Software Errors • On Erlang, State and Crashes • Akka Supervision • Release It!: Design and Deploy Production-Ready Software • Hystrix • Akka Circuit Breaker • Reactive Streams • Akka Streams • RxJava • Feedback Control for Computer Systems • Simian Army • Gatling • Akka MultiNode Testing