Preparing for distributed system failures using akka #ScalaMatsuri

Copyright © 2017 TIS Inc. All rights reserved.
Preparing for distributed
system failures using Akka
2017.2.25 Scala Matsuri
Yugo Maede
@yugolf

Who am I?
@yugolf
TIS Inc. provides a "Reactive Systems Consulting Service":
- support PoC projects
- review designs
- review code
- etc.
https://twitter.com/okapies/status/781439220330164225

Today's Topics
- What are Architectural Safety Measures in distributed systems?
- How to realize them with Akka

Microservices mean distributed systems
from Monolith to Microservices

"Moore's law is dead" means "the era of distributed systems is beginning"
limitation of CPU performance

confronting distributed systems
Building large-scale systems requires distributed systems.
There is no business success without distributed systems.

building distributed systems is not easy
- increasing the number of servers means increasing the number of failure points
- we face a new enemy: the network

Architectural Safety Measures
define Cross-Functional Requirements
- availability
- response time and latency

systems based on failure
- we need Antifragile Organizations
- we need systems designed on the assumption that failures happen

Architectural Safety Measures we need
- timeout
- bulkhead
- circuit breaker
- ...

Akka is here
Akka has tools to deal with distributed system failures.

Akka Actor
An actor processes messages in order of arrival, which makes asynchronous processing simple to implement.
[Figure: a participant, the host actor, its mailbox holding three $10 messages, and the status $30]

Supervisor Hierarchy
let it crash
A supervisor supervises its child actors. When a child signals a failure, the supervisor decides how to handle it:
- restart
- resume
- stop
- escalate

timeout

request-response needs a timeout
The response to a request may be slow, or may never come back.

message passing
- tell (!): fire and forget; prefer tell
- ask (?): request-response; set the timeout appropriately (the figure uses 1s)

timeout configuration (setting the ask timeout)
import akka.pattern.{ask, AskTimeoutException}
import akka.util.Timeout
import scala.concurrent.duration._
import scala.util.{Failure, Success}
import system.dispatcher // ExecutionContext used by onComplete

// the ask fails with AskTimeoutException if no reply arrives within 5 seconds
implicit val timeout: Timeout = Timeout(5.seconds)

val response = kitchen ? KitchenActor.DripCoffee(count)

response.mapTo[OrderCompleted] onComplete {
  case Success(result) =>
    log.info(s"success: ${result.message}")
  case Failure(e: AskTimeoutException) =>
    log.info(s"failure: ${e.getMessage}")
  case Failure(t) =>
    log.info(s"failure: ${t.getMessage}")
}

if a receiver has a problem
- What happens to an ask (timeout 1s) when the receiver has a problem?
- The supervisor handles the failure (restart, resume, stop, or escalate) and never returns the failure to the sender.
- Timeout! Because no response comes back, the sender needs a timeout.

implementation of the ask pattern 1/2
The ? method delegates to internalAsk:
def ?(message: Any)(implicit timeout: Timeout, sender: ActorRef = Actor.noSender): Future[Any] =
  internalAsk(message, timeout, sender)

private[pattern] def internalAsk(message: Any, timeout: Timeout, sender: ActorRef): Future[Any] =
  actorSel.anchor match {
    case ref: InternalActorRef ⇒
      if (timeout.duration.length <= 0)
        Future.failed[Any](
          new IllegalArgumentException(s"""Timeout length must not be negative, question not sent to [$actorSel]. Sender[$sender] sent the message of type "${message.getClass.getName}"."""))
      else {
        val a = PromiseActorRef(ref.provider, timeout,
          targetName = actorSel, message.getClass.getName, sender)
        actorSel.tell(message, a)
        a.result.future
      }
    case _ ⇒ Future.failed[Any](new IllegalArgumentException(s"""Unsupported recipient ActorRef type, question not sent to [$actorSel]. Sender[$sender] sent the message of type "${message.getClass.getName}"."""))
  }

implementation of the ask pattern 2/2
akka.pattern.PromiseActorRef registers a task with the scheduler; when the timeout elapses, the task completes the promise with an AskTimeoutException:
def apply(provider: ActorRefProvider, timeout: Timeout, targetName: Any,
          messageClassName: String, sender: ActorRef = Actor.noSender): PromiseActorRef = {
  val result = Promise[Any]()
  val scheduler = provider.guardian.underlying.system.scheduler
  val a = new PromiseActorRef(provider, result, messageClassName)
  implicit val ec = a.internalCallingThreadExecutionContext
  val f = scheduler.scheduleOnce(timeout.duration) {
    result tryComplete Failure(
      new AskTimeoutException(s"""Ask timed out on [$targetName] after [${timeout.duration.toMillis} ms]. Sender[$sender] sent message of type "${a.messageClassName}"."""))
  }
  result.future onComplete { _ ⇒ try a.stop() finally f.cancel() }
  a
}
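The same idea can be tried without Akka. The following is a minimal sketch using only the Scala and Java standard libraries: a promise is raced against a scheduled task, and whichever completes first wins. The object name AskTimeoutSketch and the helper promiseWithTimeout are hypothetical and are not part of Akka.

```scala
import java.util.concurrent.{Executors, ScheduledExecutorService, ThreadFactory, TimeUnit, TimeoutException}
import scala.concurrent.{ExecutionContext, Future, Promise}
import scala.concurrent.duration._
import scala.util.Failure

// Hypothetical helper, not part of Akka: names and shape are illustrative.
object AskTimeoutSketch {
  private val factory: ThreadFactory = (r: Runnable) => {
    val t = new Thread(r, "timeout-scheduler")
    t.setDaemon(true) // do not keep the JVM alive because of the timer thread
    t
  }

  // One daemon thread plays the role of the actor system's scheduler.
  private val scheduler: ScheduledExecutorService =
    Executors.newSingleThreadScheduledExecutor(factory)

  /** Returns the promise that a reply should complete, and the future the caller observes. */
  def promiseWithTimeout[A](timeout: FiniteDuration): (Promise[A], Future[A]) = {
    val result = Promise[A]()
    val task: Runnable = () => {
      // tryComplete is a no-op if the reply already arrived.
      result.tryComplete(Failure(new TimeoutException(s"timed out after ${timeout.toMillis} ms")))
      ()
    }
    val timer = scheduler.schedule(task, timeout.toMillis, TimeUnit.MILLISECONDS)
    // Once the future is completed (reply or timeout), the timer is no longer needed.
    result.future.onComplete(_ => timer.cancel(false))(ExecutionContext.global)
    (result, result.future)
  }
}
```

A caller hands the promise to whoever produces the reply and waits on the future; if nothing completes the promise in time, the future fails with a TimeoutException.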

circuit breaker

a receiver is down
The service we want to query may be down.

response latency will rise
- normal: 100ms
- abnormal (timeout=1s): 1s
Responses degrade, and the resulting overload spreads the performance degradation.

apply a circuit breaker
A circuit breaker stops requests from being sent to a service that is down.

what is a circuit breaker?
Once the failures reach a certain threshold, the circuit breaker trips and further calls are blocked.
https://martinfowler.com/bliki/CircuitBreaker.html

a circuit breaker has three states
- Closed: messages can be sent
- Open: messages cannot be sent
- Half-Open: after the reset timeout, one trial call is let through; success closes the breaker, failure reopens it
http://doc.akka.io/docs/akka/current/common/circuitbreaker.html

decrease the latency
Stopping pointless requests keeps latency from building up.
- normal: 100ms
- abnormal (timeout=1s), breaker Closed: 1s
- abnormal, breaker Open: x ms

apply a circuit breaker: implementation
After 5 failures the breaker opens, and no messages are sent for 1 minute.
val breaker =
  new CircuitBreaker(
    context.system.scheduler,
    maxFailures = 5,
    callTimeout = 10.seconds,
    resetTimeout = 1.minute).onOpen(notifyMeOnOpen())

def receive = {
  case "dangerousCall" =>
    breaker.withCircuitBreaker(Future(dangerousCall)) pipeTo sender()
}
http://doc.akka.io/docs/akka/current/common/circuitbreaker.html
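To make the Closed / Open / Half-Open transitions concrete, here is a deliberately simplified, single-threaded sketch in plain Scala. It is not Akka's CircuitBreaker: the class SimpleBreaker, its parameters, and the injected clock are assumptions made for illustration, and call timeouts and Futures are left out.

```scala
// Hypothetical, simplified breaker for illustration only; Akka's CircuitBreaker
// additionally handles call timeouts, Futures, and concurrent access.
object BreakerSketch {
  sealed trait State
  case object Closed extends State
  case object Open extends State
  case object HalfOpen extends State

  final class SimpleBreaker(maxFailures: Int, resetTimeoutMillis: Long, now: () => Long) {
    private var state: State = Closed
    private var failures = 0
    private var openedAt = 0L

    def currentState: State = {
      // After the reset timeout, an open breaker allows one trial call.
      if (state == Open && now() - openedAt >= resetTimeoutMillis) state = HalfOpen
      state
    }

    /** Runs `call` unless the breaker is open; Left means fail-fast or a failed call. */
    def protect[A](call: => A): Either[String, A] = currentState match {
      case Open => Left("circuit open: failing fast")
      case _ =>
        try {
          val a = call
          failures = 0
          state = Closed // a success (including the trial call) closes the breaker
          Right(a)
        } catch {
          case scala.util.control.NonFatal(e) =>
            failures += 1
            // A failed trial call, or too many failures, opens the breaker.
            if (state == HalfOpen || failures >= maxFailures) {
              state = Open
              openedAt = now()
            }
            Left(s"call failed: ${e.getMessage}")
        }
    }
  }
}
```

While the breaker is open, protect returns immediately without running the call, which is what turns a 1s timeout into a fast failure.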

blocked threads
Blocking calls exhaust the threads, and the latency propagates to callers.

prevention of propagation
Cutting off the failing service keeps the problem from propagating upstream.

CAP trade-off
- read: return old information vs. don't return anything
- write: just do my work vs. need to synchronize with others
Is it acceptable to return old information? Is it acceptable to proceed without synchronizing with others?
[Figure labels: cache, push]

rate limiting
A rate limiter protects the service from bursts of requests from the same client:
no more than 100 requests in any 3-second interval
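One way to enforce such a rule is a sliding-window limiter that remembers the timestamps of accepted requests per client. The sketch below is an assumption about how this could look in plain Scala; the class SlidingWindowLimiter and its in-memory, single-threaded design are hypothetical and do not come from the talk or from Akka.

```scala
import scala.collection.mutable

// Hypothetical limiter for illustration; not thread-safe and kept in memory only.
object RateLimiterSketch {
  final class SlidingWindowLimiter(maxRequests: Int, windowMillis: Long) {
    // Timestamps of accepted requests per client, oldest first.
    private val accepted = mutable.Map.empty[String, mutable.Queue[Long]]

    /** Returns true if a request from `clientId` is allowed at time `nowMillis`. */
    def tryAcquire(clientId: String, nowMillis: Long): Boolean = {
      val q = accepted.getOrElseUpdate(clientId, mutable.Queue.empty[Long])
      // Forget requests that have left the window (now - window, now].
      while (q.nonEmpty && q.head <= nowMillis - windowMillis) q.dequeue()
      if (q.size < maxRequests) { q.enqueue(nowMillis); true } else false
    }
  }
}
```

With maxRequests = 100 and windowMillis = 3000, the 101st request from one client inside three seconds is rejected, while other clients are unaffected.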

bulkhead

Even if there is damage next door, are you OK?
It is bad luck to be affected when an unrelated neighbor goes down.

bulkhead blocks the damage
Put a bulkhead between the actor that blocks threads and the actors that would otherwise be affected.

isolating the blocking calls in actors
Isolate blocking code in its own actor so that it does not share resources with the others.
val blockingActor = context.actorOf(
  Props[BlockingActor].withDispatcher("blocking-actor-dispatcher"),
  "blocking-actor")

class BlockingActor extends Actor {
  def receive = {
    case GetCustomer(id) =>
      // calling database
      …
  }
}

blocking in a Future
Future {
  // blocking
}
Futures that block exhaust the dispatcher's threads.

using the default dispatcher
http://www.slideshare.net/ktoso/zen-of-akka#44

isolating the blocking Future
Future {
  // blocking
}
Run the blocking work on separate threads.

using a dedicated dispatcher
http://www.slideshare.net/ktoso/zen-of-akka#44
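The same isolation can be sketched with the standard library alone by passing a dedicated ExecutionContext to the blocking Future. The pool size of 4 and the names below are assumptions for illustration; in an Akka application the equivalent would be a dispatcher defined in configuration rather than a hand-built pool.

```scala
import java.util.concurrent.{Executors, ThreadFactory}
import scala.concurrent.{ExecutionContext, Future}

// Illustrative only: pool size and names are assumptions, not from the talk.
object BlockingIsolationSketch {
  private val factory: ThreadFactory = (r: Runnable) => {
    val t = new Thread(r, "blocking-io")
    t.setDaemon(true)
    t
  }

  // A dedicated, bounded pool acts as the bulkhead for blocking work.
  val blockingEc: ExecutionContext =
    ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(4, factory))

  /** Runs a blocking call on the dedicated pool and reports which thread ran it. */
  def blockingCall(millis: Long): Future[String] =
    Future {
      Thread.sleep(millis) // stands in for a blocking database or HTTP call
      Thread.currentThread().getName
    }(blockingEc)
}
```

Because the blocking Future never touches the default execution context, at most four threads can be tied up by slow calls, and the rest of the application keeps its own threads.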

CQRS: Command and Query Responsibility Segregation
Separate commands (write) from queries (read).

cluster

hardware will fail
If there are 365 machines, each failing once a year, then on average one machine fails every day.
Wouldn't a machine break even when it's hosted in the cloud?

availability of AWS
Example: a site that tracks the availability of AWS
https://cloudharmony.com/status-of-compute-and-storage-and-cdn-and-dns-for-aws

preparing for hardware failure
- minimize single points of failure
- allow recovery of state by persisting it

monitor each other by sending heartbeats
The members of the cluster (node1 to node4) send heartbeats to each other to detect failures.

recovering state
With akka-persistence, events are persisted, and the state can be restored by replaying the persisted events.
[Figure: nodes node1 to node4 in a cluster; events are persisted and replayed to rebuild state]
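The persist-and-replay idea can be illustrated in a few lines of plain Scala. This is a sketch of the principle only, not the akka-persistence API; the Account state and the Deposited / Withdrawn events are hypothetical.

```scala
// Illustrative event-sourcing sketch; event types are hypothetical and this is
// plain Scala, not the akka-persistence API.
object ReplaySketch {
  sealed trait Event
  final case class Deposited(amount: Int) extends Event
  final case class Withdrawn(amount: Int) extends Event

  final case class Account(balance: Int) {
    // Applying an event is the only way the state changes.
    def applyEvent(e: Event): Account = e match {
      case Deposited(a) => copy(balance = balance + a)
      case Withdrawn(a) => copy(balance = balance - a)
    }
  }

  /** Rebuilds the state by replaying the persisted events in order. */
  def recover(journal: Seq[Event]): Account =
    journal.foldLeft(Account(0))((acc, e) => acc.applyEvent(e))
}
```

Since the state is a pure function of the journal, any node that can read the journal can rebuild the same state after a failure.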

the database may be down or overloaded
If the database has not started yet, every replay attempt fails. Do not retry blindly while the persistence store has not recovered.

BackoffSupervisor
The supervisor tries to start the child at increasing intervals of 3, 6, 12, ... seconds, up to maxBackoff.
val childProps = Props(classOf[EchoActor])

val supervisor = BackoffSupervisor.props(
  Backoff.onStop(
    childProps,
    childName = "myEcho",
    minBackoff = 3.seconds,
    maxBackoff = 30.seconds,
    randomFactor = 0.2 // adds 20% "noise" to vary the intervals slightly
  ))

system.actorOf(supervisor, name = "echoSupervisor")
http://doc.akka.io/docs/akka/current/general/supervision.html#Delayed_restarts_with_the_BackoffSupervisor_pattern
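How the 3, 6, 12, ... sequence arises can be shown with a small calculation. The function below is a sketch of exponential backoff with jitter under the parameters used above; the exact formula inside Akka's BackoffSupervisor may differ, and the random value is passed in explicitly here (an assumption) so that the result is reproducible.

```scala
import scala.concurrent.duration._

// A sketch of exponential backoff with jitter; the exact formula used by
// Akka's BackoffSupervisor may differ.
object BackoffSketch {
  /** Delay before restart number `restartCount` (0, 1, 2, ...); `random` must be in [0, 1). */
  def delay(restartCount: Int, minBackoff: FiniteDuration, maxBackoff: FiniteDuration,
            randomFactor: Double, random: Double): FiniteDuration = {
    // Double the delay on every restart, limiting the exponent to avoid overflow.
    val exponential = minBackoff.toMillis * math.pow(2, math.min(restartCount, 30))
    // Never wait longer than maxBackoff (before jitter is applied).
    val capped = math.min(exponential, maxBackoff.toMillis.toDouble)
    // Jitter spreads restarts out so that many children do not retry at once.
    (capped * (1.0 + random * randomFactor)).toLong.millis
  }
}
```

With minBackoff = 3 seconds, maxBackoff = 30 seconds and no jitter, restarts 0 to 4 wait 3, 6, 12, 24 and 30 seconds.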

split brain resolver

In the case of network partitions
The network between the nodes of a cluster (node1 to node4) can be cut.

using split brain resolver
When a partition splits the cluster in two (Cluster1 with node1 and node4, Cluster2 with node2, node3 and node5), apply a split brain resolver to keep the cluster consistent.

strategy 1/4: Static Quorum
quorum-size = 3
Which side can survive?
- the side whose number of nodes is quorum-size or more

strategy 2/4: Keep Majority
Which side can survive?
- the side that has more than 50% of the nodes

strategy 3/4: Keep Oldest
Which side can survive?
- the side that contains the oldest node (node1 in the figure)

strategy 4/4: Keep Referee
Which side can survive?
- the side that contains the given referee node
address = "akka.tcp://system@node1:port"
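The four survival rules can be written down as simple predicates. The sketch below covers only the decision rules as stated on the slides; the object and function names are hypothetical, and a real resolver also has to deal with timing, unreachability detection, and node roles.

```scala
// Decision rules only, as described on the slides; names are hypothetical.
object SbrSketch {
  // `side` is the set of nodes this side of the partition can still reach;
  // `all` is the cluster membership before the partition.

  /** Static Quorum: survive if this side has at least quorumSize nodes. */
  def staticQuorum(side: Set[String], quorumSize: Int): Boolean =
    side.size >= quorumSize

  /** Keep Majority: survive if this side has more than half of all nodes. */
  def keepMajority(side: Set[String], all: Set[String]): Boolean =
    side.size * 2 > all.size

  /** Keep Oldest: survive if this side contains the oldest node. */
  def keepOldest(side: Set[String], oldest: String): Boolean =
    side.contains(oldest)

  /** Keep Referee: survive if this side contains the configured referee node. */
  def keepReferee(side: Set[String], referee: String): Boolean =
    side.contains(referee)
}
```

For a five-node cluster split into {node1, node4} and {node2, node3, node5}, Static Quorum (size 3) and Keep Majority keep the three-node side, while Keep Oldest and Keep Referee keep the side containing node1.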

SBR is included in Lightbend Reactive Platform
- Lightbend Reactive Platform: http://doc.akka.io/docs/akka/rp-current/scala/split-brain-resolver.html
- akka-cluster-custom-downing: https://github.com/TanUkkii007/akka-cluster-custom-downing

idempotence

Failed to receive ack message
"coffee please!" is sent as Order(coffee,1), then resent as Order(coffee,1)
If the ack is not received and the message is resent, it becomes a duplicate order.

idempotence
"coffee, please!" is sent as Order(id1, coffee, 1), then resent as Order(id1, coffee, 1)
Design for idempotence so that receiving the same message multiple times is harmless and consistency is maintained.
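A minimal sketch of such a receiver, assuming that each order carries a unique id as in Order(id1, coffee, 1): the receiver remembers the ids it has processed and ignores repeats. The Kitchen state and its field names are hypothetical.

```scala
// Hypothetical order handler for illustration; names are assumptions.
object IdempotenceSketch {
  final case class Order(id: String, item: String, count: Int)

  final case class Kitchen(processed: Set[String] = Set.empty, cups: Int = 0) {
    /** Applies the order once; a resend with the same id leaves the state unchanged. */
    def receive(order: Order): Kitchen =
      if (processed.contains(order.id)) this
      else Kitchen(processed + order.id, cups + order.count)
  }
}
```

Resending the same order after a lost ack therefore produces one cup of coffee, not two.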

summary
- Microservices mean distributed systems
- define Cross-Functional Requirements
- design for failure
Failures will happen, so accept them.

summary
Akka provides a toolkit for dealing with distributed system failures:
- timeout
- circuit breaker
- bulkhead
- cluster
- backoff
- split brain resolver
- ...

reference materials
- Building Microservices
- Reactive Design Patterns
- Reactive Application Development
- Effective Akka
- http://akka.io/

now translating: akka-doc-ja
Translation contributors for akka.io wanted! Please join us on Gitter.
https://gitter.im/akka-ja/akka-doc-ja
https://github.com/akka-ja/akka-doc-ja/

THANK YOU
RADIUS-Omnichannel Interaction System
RADIUS15 views
handbook for web 3 adoption.pdf by Liveplex
handbook for web 3 adoption.pdfhandbook for web 3 adoption.pdf
handbook for web 3 adoption.pdf
Liveplex19 views
[2023] Putting the R! in R&D.pdf by Eleanor McHugh
[2023] Putting the R! in R&D.pdf[2023] Putting the R! in R&D.pdf
[2023] Putting the R! in R&D.pdf
Eleanor McHugh38 views
Voice Logger - Telephony Integration Solution at Aegis by Nirmal Sharma
Voice Logger - Telephony Integration Solution at AegisVoice Logger - Telephony Integration Solution at Aegis
Voice Logger - Telephony Integration Solution at Aegis
Nirmal Sharma17 views
Digital Product-Centric Enterprise and Enterprise Architecture - Tan Eng Tsze by NUS-ISS
Digital Product-Centric Enterprise and Enterprise Architecture - Tan Eng TszeDigital Product-Centric Enterprise and Enterprise Architecture - Tan Eng Tsze
Digital Product-Centric Enterprise and Enterprise Architecture - Tan Eng Tsze
NUS-ISS19 views
Five Things You SHOULD Know About Postman by Postman
Five Things You SHOULD Know About PostmanFive Things You SHOULD Know About Postman
Five Things You SHOULD Know About Postman
Postman27 views
How the World's Leading Independent Automotive Distributor is Reinventing Its... by NUS-ISS
How the World's Leading Independent Automotive Distributor is Reinventing Its...How the World's Leading Independent Automotive Distributor is Reinventing Its...
How the World's Leading Independent Automotive Distributor is Reinventing Its...
NUS-ISS15 views
The Importance of Cybersecurity for Digital Transformation by NUS-ISS
The Importance of Cybersecurity for Digital TransformationThe Importance of Cybersecurity for Digital Transformation
The Importance of Cybersecurity for Digital Transformation
NUS-ISS27 views
Attacking IoT Devices from a Web Perspective - Linux Day by Simone Onofri
Attacking IoT Devices from a Web Perspective - Linux Day Attacking IoT Devices from a Web Perspective - Linux Day
Attacking IoT Devices from a Web Perspective - Linux Day
Simone Onofri15 views

Preparing for distributed system failures using akka #ScalaMatsuri

  • 1. Copyright © 2017 TIS Inc. All rights reserved. Preparing for distributed system failures using Akka 2017.2.25 Scala Matsuri Yugo Maede @yugolf
  • 2. Who am I? @yugolf, TIS Inc. TIS provides a “Reactive Systems Consulting Service” (https://twitter.com/okapies/status/781439220330164225): supporting PoC projects, reviewing designs, reviewing code, etc.
  • 3. Today's topics: What are Architectural Safety Measures in distributed systems? How can they be realized with Akka?
  • 4. Microservices mean distributed systems: from monolith to microservices.
  • 5. "Moore's law is dead" means "the era of distributed systems is beginning": CPU performance has reached its limits.
  • 6. Confronting distributed systems: building large-scale systems requires distributed systems. There is no business success without distributed systems.
  • 7. Building distributed systems is not easy: increasing the number of servers means increasing the number of failure points, and we face a new enemy, the network.
  • 8. Architectural Safety Measures: define Cross-Functional Requirements, such as availability and response time/latency.
  • 9. Systems based on failure: we need Antifragile Organizations, and we need systems designed on the premise of failure.
  • 10. Architectural Safety Measures needed: timeout, bulkhead, circuit breaker, ...
  • 11. Akka is here: Akka has tools to deal with distributed system failures.
  • 12. Akka Actor: an Actor processes messages in order of arrival, which makes asynchronous processing simple to implement. (Diagram labels: participant, host, a mailbox holding $10 messages, and a status of $30.)
  • 13. Supervisor Hierarchy: "let it crash". A supervisor monitors its child actors and handles their failures. When a child signals a failure, the supervisor can restart, resume, stop, or escalate.
  • 14. timeout
  • 15. Request-response needs a timeout: the response may be slow, or may never come back.
  • 16. Message passing: use tell (!, fire and forget); if you use ask (?), set the timeout appropriately (1s in the diagram).
  • 17. timeout configuration: setting the timeout for ask
 import akka.pattern.{ask, AskTimeoutException}
 import akka.util.Timeout
 import scala.concurrent.duration._
 import scala.util.{Failure, Success}
 import system.dispatcher
 
 // the implicit Timeout is picked up by `?` (ask)
 implicit val timeout: Timeout = Timeout(5.seconds)
 val response = kitchen ? KitchenActor.DripCoffee(count)
 
 response.mapTo[OrderCompleted] onComplete {
   case Success(result) =>
     log.info(s"success: ${result.message}")
   // no reply arrived within the timeout
   case Failure(e: AskTimeoutException) =>
     log.info(s"failure: ${e.getMessage}")
   case Failure(t) =>
     log.info(s"failure: ${t.getMessage}")
 }
  • 18. What if the receiver has a problem? (ask with a 1s timeout)
  • 19. If the receiver has a problem: the supervisor never returns the failure to the sender. It only restarts, resumes, stops, or escalates.
  • 20. Timeout! If the receiver has a problem, no response comes back, so a timeout is required.
  • 21. implementation of the ask pattern 1/2: ? and internalAsk
 def ?(message: Any)(implicit timeout: Timeout, sender: ActorRef = Actor.noSender): Future[Any] =
   internalAsk(message, timeout, sender)
 
 private[pattern] def internalAsk(message: Any, timeout: Timeout, sender: ActorRef): Future[Any] =
   actorSel.anchor match {
     case ref: InternalActorRef ⇒
       if (timeout.duration.length <= 0)
         Future.failed[Any](
           new IllegalArgumentException(s"""Timeout length must not be negative, question not sent to [$actorSel]. Sender[$sender] sent the message of type "${message.getClass.getName}"."""))
       else {
         // a temporary PromiseActorRef receives the reply and completes the Future
         val a = PromiseActorRef(ref.provider, timeout, targetName = actorSel, message.getClass.getName, sender)
         actorSel.tell(message, a)
         a.result.future
       }
     case _ ⇒ Future.failed[Any](new IllegalArgumentException(s"""Unsupported recipient ActorRef type, question not sent to [$actorSel]. Sender[$sender] sent the message of type "${message.getClass.getName}"."""))
   }
  • 22. implementation of the ask pattern 2/2: akka.pattern.PromiseActorRef schedules a task that completes the promise with an AskTimeoutException when the timeout elapses.
 def apply(provider: ActorRefProvider, timeout: Timeout, targetName: Any, messageClassName: String, sender: ActorRef = Actor.noSender): PromiseActorRef = {
   val result = Promise[Any]()
   val scheduler = provider.guardian.underlying.system.scheduler
   val a = new PromiseActorRef(provider, result, messageClassName)
   implicit val ec = a.internalCallingThreadExecutionContext
   // fail the promise if no reply has arrived when the timeout fires
   val f = scheduler.scheduleOnce(timeout.duration) {
     result tryComplete Failure(
       new AskTimeoutException(s"""Ask timed out on [$targetName] after [${timeout.duration.toMillis} ms]. Sender[$sender] sent message of type "${a.messageClassName}"."""))
   }
   // on completion (reply or timeout), stop the temporary actor and cancel the timer
   result.future onComplete { _ ⇒ try a.stop() finally f.cancel() }
   a
 }
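The same idea (race the real reply against a scheduled failure on one promise) can be sketched without Akka. This is a minimal illustration using only the Scala and Java standard libraries; `TimeoutSketch`, `withTimeout`, and the single-thread scheduler are hypothetical names for this sketch, not part of Akka.

```scala
import java.util.concurrent.{Executors, ThreadFactory, TimeUnit, TimeoutException}
import scala.concurrent.{ExecutionContext, Future, Promise}
import scala.concurrent.duration._
import scala.util.Failure

object TimeoutSketch {
  // One daemon thread standing in for the actor system's scheduler.
  private val scheduler = Executors.newSingleThreadScheduledExecutor(new ThreadFactory {
    def newThread(r: Runnable): Thread = {
      val t = new Thread(r, "timeout-sketch-scheduler")
      t.setDaemon(true)
      t
    }
  })

  def withTimeout[A](timeout: FiniteDuration)(body: => Future[A])(implicit ec: ExecutionContext): Future[A] = {
    val result = Promise[A]()
    // Schedule the failure; tryComplete does nothing if the real result arrived first.
    val timer = scheduler.schedule(new Runnable {
      def run(): Unit = {
        result.tryComplete(Failure(new TimeoutException(s"timed out after ${timeout.toMillis} ms")))
        ()
      }
    }, timeout.toMillis, TimeUnit.MILLISECONDS)
    // The real result competes for the same promise.
    body.onComplete(r => result.tryComplete(r))
    // Whichever side won, cancel the timer so it does not linger.
    result.future.onComplete(_ => timer.cancel(false))
    result.future
  }
}
```

A caller wraps any `Future` it cannot afford to wait on forever, for example `TimeoutSketch.withTimeout(1.second)(callService())`, where `callService` is whatever asynchronous call is being protected.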
  • 23. circuit breaker
  • 24. The receiver is down: the service you want to query may be down.
  • 25. Response latency will rise: 100ms when normal, 1s when abnormal (timeout=1s). Responses degrade, and the resulting overload spreads the degradation.
  • 26. Apply a circuit breaker: use a circuit breaker so that requests are not sent to a service that is down.
  • 27. What is a circuit breaker? "Once the failures reach a certain threshold, the circuit breaker trips" (https://martinfowler.com/bliki/CircuitBreaker.html). After a certain number of repeated failures, further calls are blocked.
  • 28. A circuit breaker has three states (http://doc.akka.io/docs/akka/current/common/circuitbreaker.html). Closed: messages can be sent. Open: messages cannot be sent. Half-Open: after the reset timeout, a trial call is let through to decide whether to close again.
  • 29. Decrease the latency: stop sending pointless requests so that the latency does not occur. Normal: 100ms. Abnormal (timeout=1s): 1s while the breaker is Closed, x ms once it is Open.
  • 30. apply circuit breaker: implementation (http://doc.akka.io/docs/akka/current/common/circuitbreaker.html). After 5 failures the breaker opens, and no messages are sent for 1 minute.
 import akka.pattern.{CircuitBreaker, pipe}
 import scala.concurrent.Future
 import scala.concurrent.duration._
 
 val breaker =
   new CircuitBreaker(
     context.system.scheduler,
     maxFailures = 5,
     callTimeout = 10.seconds,
     resetTimeout = 1.minute).onOpen(notifyMeOnOpen())
 
 def receive = {
   case "dangerousCall" =>
     // the call is rejected immediately while the breaker is Open
     breaker.withCircuitBreaker(Future(dangerousCall)) pipeTo sender()
 }
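To make the Closed / Open / Half-Open transitions concrete, here is a minimal, single-threaded sketch of the state machine in plain Scala. It is not Akka's `CircuitBreaker`: `BreakerSketch`, `SimpleBreaker`, and the injectable `now` clock are hypothetical names for illustration, and call timeouts and thread safety are left out.

```scala
import scala.util.{Failure, Success, Try}

object BreakerSketch {
  sealed trait State
  case object Closed extends State
  case object Open extends State
  case object HalfOpen extends State

  final class OpenCircuitException extends RuntimeException("circuit breaker is open")

  // Not thread-safe: illustrates the state transitions only.
  final class SimpleBreaker(maxFailures: Int, resetTimeoutMs: Long,
                            now: () => Long = () => System.currentTimeMillis()) {
    private var state: State = Closed
    private var failures = 0
    private var openedAt = 0L

    def currentState: State = {
      // Open becomes HalfOpen once the reset timeout has elapsed.
      if (state == Open && now() - openedAt >= resetTimeoutMs) state = HalfOpen
      state
    }

    def call[A](body: => A): Try[A] = currentState match {
      case Open => Failure(new OpenCircuitException) // fail fast; body is not evaluated
      case s =>
        val result = Try(body)
        result match {
          case Success(_) =>
            failures = 0
            state = Closed
          case Failure(_) =>
            failures += 1
            // A failed trial call in HalfOpen reopens the breaker immediately.
            if (s == HalfOpen || failures >= maxFailures) {
              state = Open
              openedAt = now()
            }
        }
        result
    }
  }
}
```

While the breaker is Open the protected call is never attempted, so the caller gets an error immediately instead of waiting for a timeout.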
  • 31. Blocked threads: blocking calls exhaust the threads, and latency propagates.
  • 32. Prevention of propagation: by cutting off the failing service, the problem does not propagate upstream.
  • 33. CAP trade-off: return old information vs. don't return anything; just do my own work vs. need to synchronize with others. Is it acceptable to return stale information? Is it fine to proceed without synchronizing with others? (Diagram labels: cache, push, read, write.)
  • 34. Rate limiting: a rate limiter protects the service from bursts of requests from the same client, e.g. no more than 100 requests in any 3 sec interval.
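A rule such as "no more than 100 requests in any 3 sec interval" can be sketched as a per-client sliding window. The deck does not show an implementation, so the following is only an assumption about one simple way to do it; `RateLimiterSketch` and `SlidingWindowLimiter` are hypothetical names, and timestamps are passed in explicitly to keep the behaviour easy to follow.

```scala
import scala.collection.mutable

object RateLimiterSketch {
  // Sliding-window limiter: at most `maxRequests` accepted per client in any window of `windowMs`.
  // Not thread-safe; a sketch of the bookkeeping only.
  final class SlidingWindowLimiter(maxRequests: Int, windowMs: Long) {
    private val accepted = mutable.Map.empty[String, mutable.Queue[Long]]

    def tryAcquire(clientId: String, nowMs: Long): Boolean = {
      val times = accepted.getOrElseUpdate(clientId, mutable.Queue.empty[Long])
      // Forget accepted requests that have fallen out of the window ending at nowMs.
      while (times.nonEmpty && nowMs - times.head >= windowMs) times.dequeue()
      if (times.size < maxRequests) {
        times.enqueue(nowMs)
        true
      } else false
    }
  }
}
```

With `new SlidingWindowLimiter(maxRequests = 100, windowMs = 3000)`, the 101st request from one client inside 3 seconds is rejected, while other clients are unaffected.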
  • 35. bulkhead
  • 36. Even if there is damage next door, are you OK? It is an unlucky event to be affected when an unrelated neighbor goes down.
  • 37. The bulkhead blocks the damage: put a bulkhead between the actor that blocks threads and the actors that would otherwise be affected.
  • 38. isolating the blocking calls to actors: separate blocking code together with its actor so that it does not share resources with other actors.
 // run the blocking actor on its own dispatcher, configured under this name
 val blockingActor = context.actorOf(
   Props[BlockingActor].withDispatcher("blocking-actor-dispatcher"),
   "blocking-actor")
 
 class BlockingActor extends Actor {
   def receive = {
     case GetCustomer(id) =>
       // calling database
       …
   }
 }
  • 39. Blocking in a Future: a blocking Future { /* blocking */ } exhausts the dispatcher's threads.
  • 40. Using the default dispatcher (http://www.slideshare.net/ktoso/zen-of-akka#44).
  • 41. Isolating the blocking Future: run Future { /* blocking */ } on separate threads.
  • 42. Using a dedicated dispatcher (http://www.slideshare.net/ktoso/zen-of-akka#44).
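Outside of an actor system, the same isolation can be sketched with a dedicated `ExecutionContext` backed by its own bounded thread pool. This is a plain-Scala illustration rather than an Akka dispatcher configuration; `BlockingPoolSketch`, the pool size of 4, and the thread names are assumptions made for the example.

```scala
import java.util.concurrent.{Executors, ThreadFactory}
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{ExecutionContext, Future}

object BlockingPoolSketch {
  private val counter = new AtomicInteger(0)

  // Named daemon threads, so it is visible where the blocking work actually runs.
  private val namedDaemonThreads = new ThreadFactory {
    def newThread(r: Runnable): Thread = {
      val t = new Thread(r, s"blocking-pool-${counter.incrementAndGet()}")
      t.setDaemon(true)
      t
    }
  }

  // Dedicated, bounded pool: blocking calls can exhaust these 4 threads but no others.
  val blockingEc: ExecutionContext =
    ExecutionContext.fromExecutor(Executors.newFixedThreadPool(4, namedDaemonThreads))

  // Runs `body` on the dedicated pool instead of the caller's execution context.
  def blocking[A](body: => A): Future[A] = Future(body)(blockingEc)
}
```

Passing the execution context explicitly, as in `Future(body)(blockingEc)`, keeps blocking work off whatever implicit context the rest of the code uses.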
  • 43. CQRS (Command and Query Responsibility Segregation): separate commands (write) from queries (read).
  • 44. cluster
  • 45. Hardware will fail: if there are 365 machines that each fail once a year, on average one machine fails every day. Wouldn't a machine break even when it's hosted on the cloud?
  • 46. Availability of AWS: for example, a site that tracks AWS availability: https://cloudharmony.com/status-of-compute-and-storage-and-cdn-and-dns-for-aws
  • 47. Preparing for hardware failure: minimize single points of failure, and allow recovery of state by persisting it.
  • 48. Cluster: the members (node1 to node4) monitor each other by sending heartbeats to detect failures.
  • 49. Recovering state: with akka-persistence, state can be restored by replaying the events that were persisted. (Diagram: nodes in a cluster persist events and replay them to rebuild state.)
  • 50. The database may be down or overloaded: while the persistence store has not recovered (e.g. the db has not started yet), do not retry the replay blindly.
  • 51. BackoffSupervisor (http://doc.akka.io/docs/akka/current/general/supervision.html#Delayed_restarts_with_the_BackoffSupervisor_pattern): the child is started again at increasing intervals of 3, 6, 12, ... seconds.
 import akka.pattern.{Backoff, BackoffSupervisor}
 import scala.concurrent.duration._
 
 val childProps = Props(classOf[EchoActor])
 
 val supervisor = BackoffSupervisor.props(
   Backoff.onStop(
     childProps,
     childName = "myEcho",
     minBackoff = 3.seconds,
     maxBackoff = 30.seconds,
     randomFactor = 0.2 // adds 20% "noise" to vary the intervals slightly
   ))
 
 system.actorOf(supervisor, name = "echoSupervisor")
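The "3, 6, 12, ..." schedule is exponential backoff with a cap and some random noise. The sketch below shows that arithmetic in plain Scala under the parameters from the slide; `BackoffSketch.delay` and its explicit `noise` argument are hypothetical and only illustrate the idea, not Akka's internal calculation.

```scala
import scala.concurrent.duration._

object BackoffSketch {
  // Delay before restart number `restartCount` (0-based):
  // minBackoff doubled on every restart, capped at maxBackoff, then stretched by up to randomFactor.
  // `noise` is expected in [0, 1); pass 0.0 to see the deterministic base delays.
  def delay(restartCount: Int,
            minBackoff: FiniteDuration,
            maxBackoff: FiniteDuration,
            randomFactor: Double,
            noise: Double = scala.util.Random.nextDouble()): FiniteDuration = {
    val baseMs = math.min(maxBackoff.toMillis.toDouble, minBackoff.toMillis * math.pow(2, restartCount))
    (baseMs * (1.0 + noise * randomFactor)).toLong.millis
  }
}
```

With `minBackoff = 3.seconds` and `maxBackoff = 30.seconds`, the base delays are 3, 6, 12 and 24 seconds and then stay at 30 seconds; the noise keeps many restarting actors from hitting a recovering database at the same instant.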
  • 52. split brain resolver
  • 53. In the case of network partitions: the network between cluster nodes (node1 to node4) can be cut.
  • 54. Using a split brain resolver: when a partition splits the nodes into separate clusters (Cluster1 and Cluster2 in the diagram), apply a split brain resolver to keep them consistent.
  • 55. Strategy 1/4: Static Quorum (quorum-size = 3). Which side can survive? The side whose number of nodes is quorum-size or more.
  • 56. Strategy 2/4: Keep Majority. Which side can survive? The side that has more than 50% of the nodes.
  • 57. Strategy 3/4: Keep Oldest. Which side can survive? The side that contains the oldest node (node1 in the diagram).
  • 58. Strategy 4/4: Keep Referee. Which side can survive? The side that contains the given referee node (address = "akka.tcp://system@node1:port").
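The four survival rules can be written as small predicates over "the nodes this side can still reach". This is only a sketch of the decision rules as stated on the slides; `SplitBrainSketch`, `Side`, and the node names are hypothetical, and a real split brain resolver has to handle more cases (ties, unstable membership, and so on).

```scala
object SplitBrainSketch {
  // One side of a partition: the members it can still reach,
  // out of `clusterSize` members known before the partition.
  final case class Side(reachable: Set[String], clusterSize: Int)

  // Static Quorum: survive if this side has at least quorumSize nodes.
  def staticQuorum(side: Side, quorumSize: Int): Boolean =
    side.reachable.size >= quorumSize

  // Keep Majority: survive if this side has more than 50% of the nodes.
  def keepMajority(side: Side): Boolean =
    side.reachable.size * 2 > side.clusterSize

  // Keep Oldest: survive if this side contains the oldest node.
  def keepOldest(side: Side, oldest: String): Boolean =
    side.reachable.contains(oldest)

  // Keep Referee: survive if this side contains the configured referee node.
  def keepReferee(side: Side, referee: String): Boolean =
    side.reachable.contains(referee)
}
```

For a five-node cluster split into {node1, node2} and {node3, node4, node5}, Static Quorum with quorum-size 3 and Keep Majority keep the larger side, while Keep Oldest or Keep Referee with node1 keep the smaller one.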
  • 59. SBR (split brain resolver) is provided as part of the Lightbend Reactive Platform (http://doc.akka.io/docs/akka/rp-current/scala/split-brain-resolver.html). The slide also lists akka-cluster-custom-downing (https://github.com/TanUkkii007/akka-cluster-custom-downing).
  • 60. idempotence
  • 61. Failed to receive the ack message: "coffee, please!" is sent as Order(coffee,1). If the ack is not received and Order(coffee,1) is resent, it becomes a duplicate order.
  • 62. Idempotence: "coffee, please!" is sent as Order(id1, coffee, 1), so applying it multiple times is not harmful. Design for idempotence so that consistency is kept even if a message is received more than once.
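One way to get this property is to key the receiver's state by the order id, so that a resent message overwrites nothing and adds nothing. The following is a minimal sketch in plain Scala; `IdempotentOrders`, `Kitchen`, and `totalOf` are hypothetical names for illustration.

```scala
object IdempotentOrders {
  final case class Order(id: String, item: String, quantity: Int)

  // State is keyed by order id, so re-applying the same Order leaves it unchanged.
  final case class Kitchen(orders: Map[String, Order] = Map.empty) {
    def receive(order: Order): Kitchen =
      if (orders.contains(order.id)) this // duplicate delivery: already recorded
      else copy(orders = orders + (order.id -> order))

    def totalOf(item: String): Int =
      orders.values.filter(_.item == item).map(_.quantity).sum
  }
}
```

Receiving `Order("id1", "coffee", 1)` once or twice yields the same `Kitchen`, so the sender can safely resend whenever an ack is lost.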
  • 63. summary
  • 64. summary: Microservices mean distributed systems. Define Cross-Functional Requirements. Design for failure: failures will happen, so accept them.
  • 65. summary: timeout, circuit breaker, bulkhead, cluster, backoff, split brain resolver, ... by using Akka. Akka provides a toolkit for dealing with distributed system failures.
  • 66. reference materials: Building Microservices; Reactive Design Patterns; Reactive Application Development; Effective Akka; http://akka.io/
  • 67. Now translating: contributors are wanted for the Japanese translation of akka.io. Please join us on Gitter. https://gitter.im/akka-ja/akka-doc-ja https://github.com/akka-ja/akka-doc-ja/