2.
Who am I?
• Yuta Okamoto (@okapies)
• Software Engineer in the manufacturing industry
• Lover of Scala and Scala OSS projects
• Also specializes in DevOps technologies
• A member of the ScalaMatsuri 2016 committee
My name is Yuta Okamoto (@okapies)!
I am ScalaMatsuri organizing committee member no. 2
6.
Challenges facing today's software
• Asynchronous & event-driven programming
• Concurrent / parallel processing
• Scalability of the system & the organization
• Fault tolerance (Resilience)
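Challenges in recent software development: asynchronous event-driven programming,
concurrent and parallel processing, scalability of the system and the organization, fault tolerance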
(Microservices)
7.
Reactive is everywhere!
• From frontend GUIs to backend servers
• Lots of similar but different concepts:
• Reactive Programming
• Reactive Streams
• Reactive Manifesto
From the frontend to the backend,
"reactive" has become a keyword in many different contexts
8.
Frontend GUI
Create a multi-clicks stream
https://gist.github.com/staltz/868e7e9bc2a7b8c1f754
accumulate clicks
within 250 msecs
map each list
to its length
Asynchronous click stream in RxJS
Example: detecting multi-clicks from an asynchronous click stream
9.
GUI and network service
https://github.com/reark/reark
A demo app in
RxJava + Android
Combine async GUI
events with JSON
APIs in a concurrent
fashion
Example: combining asynchronous GUI events and JSON API calls
as concurrent processing
10.
Microservices
val userAndTweets = Future.join(
userService.findByUserId(userId),
tweetService.findByUserId(userId)
)
[Diagram: userId is sent to both User Service and Tweet Service (find); the two results are joined into userAndTweets]
http://www.slideshare.net/knoldus/finagle-by-twitter-engineer/16
Query other microservices and combine both responses into a tuple asynchronously, in Twitter Finagle
Example: bundling asynchronous queries to two microservices into a single output
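The same join pattern can be sketched with the Scala standard library instead of Finagle. This is only an illustration: `User`, `Tweet`, `findUser` and `findTweets` are hypothetical stand-ins for the two services, and `zip` on `scala.concurrent.Future` plays the role that `Future.join` plays in the slide.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object JoinSketch {
  // Hypothetical domain types and services, for illustration only.
  final case class User(id: Long, name: String)
  final case class Tweet(userId: Long, text: String)

  def findUser(userId: Long): Future[User] =
    Future(User(userId, "okapies"))

  def findTweets(userId: Long): Future[List[Tweet]] =
    Future(List(Tweet(userId, "hello"), Tweet(userId, "reactive")))

  // Both futures are created before zip combines them, so the two
  // queries run concurrently; the result completes when both are done.
  def userAndTweets(userId: Long): Future[(User, List[Tweet])] =
    findUser(userId).zip(findTweets(userId))

  // Blocking is used here only to observe the result in a small demo.
  def demo(): (User, List[Tweet]) =
    Await.result(userAndTweets(42L), 5.seconds)
}
```

Calling `JoinSketch.demo()` yields the user and the tweet list as one tuple, without either query waiting for the other to finish first.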
11.
Bigdata processing
https://speakerdeck.com/googlecloudjapan/google-cloud-dataflowwoli-jie-suru
Distributed and parallel processing of big data in Google Cloud Dataflow
Example: distributed parallel processing of large-scale data
15.
Foundations of Reactive
• Reactive component
• Reactive data flow
Foundations of reactive: components and data flows
[Diagram: (1) a component with in and out ports; (2) a data flow connecting A and B to C]
16.
Reactive Component
• Only react to input, and output some data
• Self-contained
[Diagram: a component with in and out ports]
A self-contained component that reacts only to its input
Function?
Object?
Actor?
Sub-system?
17.
Reactive Data Flow
[Diagram: two Sources flow through components (in, out) into a Sink]
A pipeline delivering data between the inputs and outputs of components
18.
Reactive ○○ = Programming model vs Runtime engine vs Architecture
Let's start with Reactive Programming
19.
Challenges (revisited)
• Asynchronous & event-driven programming
• Concurrent / parallel processing
• Scalability of the system & the organization
• Fault tolerance (Resilience)
Challenge: we want to make asynchronous, concurrent and parallel programming easier
20.
Why not use callbacks?
// asynchronous event
def mouseClick(f: (Int, Int) => Unit): Unit

// register a callback as an argument
mouseClick { case (x, y) => println(s"x: $x, y: $y") }
Why not simply use a callback?
Callback
21.
Callback Hell
• Hard to modularize the code
• Hard to manage state changes (side effects)
and data dependencies
• Hard to control the order of execution
(driven by external events)
Callback hell:
modularization, managing side effects, and controlling execution order all become difficult
22.
Pyramid of Doom
var g = ...

step1 { a =>
  step2 { b =>
    step3 { c =>
      step4 { d =>
        // do something with a, b, c, d and g
      }
    }
  }
}
[Diagram: operators +1, −1, ×2 and +]
Pyramid of Doom: dependencies pile up like a pyramid,
and the code implicitly refers to outer state, so modularity is low
Dependent async steps
stacked like a pyramid
Implicit reference to states
in the outer scopes
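One common way to flatten such a pyramid, shown here only as a sketch and not as the approach the slides go on to describe, is to have each step return a `scala.concurrent.Future` and sequence the steps with a for-comprehension. The step bodies below are hypothetical placeholders, and `g` is passed in explicitly rather than captured from an outer scope.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FlattenSketch {
  // Hypothetical asynchronous steps; each one needs the previous result.
  def step1(): Future[Int] = Future(1)
  def step2(a: Int): Future[Int] = Future(a + 1)
  def step3(b: Int): Future[Int] = Future(b * 2)
  def step4(c: Int): Future[Int] = Future(c - 1)

  // The data dependencies are still explicit (a, then b, c, d), but the
  // nesting is gone and g is an ordinary parameter, not hidden outer state.
  def run(g: Int): Future[Int] =
    for {
      a <- step1()
      b <- step2(a)
      c <- step3(b)
      d <- step4(c)
    } yield a + b + c + d + g

  // Blocking is used here only to observe the result in a small demo.
  def demo(): Int = Await.result(run(10), 5.seconds)
}
```

With these placeholder steps, a = 1, b = 2, c = 4 and d = 3, so `FlattenSketch.demo()` returns 20.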
23.
Reactive Component
• Reacts to inputs only when the asynchronous
event is supplied through the data flows
[Diagram: a component with in and out ports]
Reacts only when an event arrives through the data flow
24.
Self Containment
• Each component isolates its internal state
from others, and has independent lifecycle
• These properties are suited for asynchrony
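Self-containment: each component isolates its internal state from the others.
Because each has an independent lifecycle, components are well suited to asynchronous processing.
[Diagram: a component with in and out ports; references to variables outside it are crossed out]
Avoid using outer variables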
25.
Benefits
• Better modularity (composability)
• Better containment of state and failures
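Modularity is high, and state and failures are easy to contain
[Diagram: a component with in and out ports; references to the outside are crossed out]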
26.
Execution order and dependencies
• How can execution be ordered in async
programming?
• Solution: Reactive Programming
How do we control the execution order in asynchronous programming?
→ Reactive programming
28.
Data flow
[Diagram: Sources, operators f, g1, g2 and h, and a Sink connected as a directed graph]
A directed graph of the data flowing between operators
29.
Conventional (Imperative) Style
A = 1;
B = 2;
C = (A+1) + (B-1)*2;
A conventional imperative program executes from top to bottom, in order
30.
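Imperative code in data flow
A = 1;
B = 2;
C = (A+1) + (B-1)*2;
[Diagram: A = 1 passes through +1 (giving 2); B = 2 passes through −1 (giving 1) and ×2 (giving 2); the two results are added to give C = 4]
Let's map the imperative program onto a data flow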
31.
Execution model
• The dataflow just describes the dependencies
among the variables and operators
• Execution model specifies how the graph is
executed
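[Diagram: A passes through +1, B passes through −1 and ×2, and + combines both into C]
A = 1;
B = 2;
C = (A+1) + (B-1)*2;
How the data flow is computed is determined by the execution model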
32.
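Reassignment to variables
Imperative Execution Model
A = 1; B = 2;
C = (A+1) + (B-1)*2;
A = 2;
[Diagram: A changes 1 → 2 while B = 2 and C stays 4; every edge of the graph is crossed out]
Never propagated to C
In the imperative execution model, changing A after C has been computed (naturally) has no effect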
33.
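Reassignment to variables
Reactive Execution Model
A := 1; B := 2;
C := (A+1) + (B-1)*2;
A := 2;
[Diagram: A changes 1 → 2, so A+1 changes 2 → 3 and C changes 4 → 5; B = 2, B−1 = 1, (B−1)×2 = 2]
Update of A is propagated to C
In the reactive execution model, a change to A propagates along the graph and C is recomputed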
34.
Reassignment to variables
Reactive Execution Model
C := (A+1) + (B-1)*2;
A := 2;
B := 3;
[Diagram: B changes 2 → 3, so B−1 changes 1 → 2, (B−1)×2 changes 2 → 4 and C changes 5 → 7; A = 2, A+1 = 3]
Update of B is propagated to C
Next, a change to B propagates in the same way
35.
Reassignment to variables
Reactive Execution Model
A := 2;
B := 3;
A := 0;
[Diagram: A changes 2 → 0, so A+1 changes 3 → 1 and C changes 7 → 5; B = 3, B−1 = 2, (B−1)×2 = 4]
Update of A is propagated to C
Changing A once more is reflected as well
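The reactive execution model on these slides can be imitated in a few lines of plain Scala. The sketch below is an assumption-laden toy, not how any particular library works: `Var` and `Derived` are hypothetical names, and a derived cell simply recomputes itself whenever one of its declared inputs is reassigned with `:=`.

```scala
import scala.collection.mutable.ListBuffer

object CellSketch {
  // An input cell that notifies its dependents on every reassignment.
  final class Var(private var value: Int) {
    private val listeners = ListBuffer.empty[() => Unit]
    def get: Int = value
    def onChange(f: () => Unit): Unit = listeners += f
    def :=(v: Int): Unit = { value = v; listeners.foreach(_.apply()) }
  }

  // A derived cell: recomputed whenever any of its inputs changes.
  final class Derived(inputs: Seq[Var])(compute: => Int) {
    private var value: Int = compute
    inputs.foreach(_.onChange(() => value = compute))
    def get: Int = value
  }

  // Replays the sequence of assignments from the slides and records C.
  def demo(): List[Int] = {
    val a = new Var(1)
    val b = new Var(2)
    val c = new Derived(Seq(a, b))((a.get + 1) + (b.get - 1) * 2)
    val seen = ListBuffer(c.get) // C = 4
    a := 2; seen += c.get        // C = 5
    b := 3; seen += c.get        // C = 7
    a := 0; seen += c.get        // C = 5
    seen.toList
  }
}
```

`CellSketch.demo()` returns `List(4, 5, 7, 5)`, matching the values of C shown in the diagrams.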
37.
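e.g. (Akka Streams):
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Sink, Source}

implicit val system = ActorSystem()
implicit val mat = ActorMaterializer()

val a = Source(...)
val b = Source(...)

val a1 = a.map(_ + 1)
val b1 = b.map(_ - 1).map(_ * 2)

val c = (a1 zip b1).map { case (a, b) => a + b }

c.runWith(Sink.foreach(println))(mat)
[Diagram: A passes through +1, B passes through −1 and ×2, and + combines both into C]
Build the data flow using a functional DSL
An example of describing the earlier data flow in functional code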
38.
e.g. (Akka Streams):
implicit val system = ActorSystem()
implicit val mat = ActorMaterializer()

val a = Source(...)
val b = Source(...)

val a1 = a.map(_ + 1)
val b1 = b.map(_ - 1).map(_ * 2)

val c = (a1 zip b1).map { case (a, b) => a + b }

c.runWith(Sink.foreach(println))(mat)
Stick functions together with map
Function
Input
[Diagram: A passes through +1, B passes through −1 and ×2, and + combines both into C]
Chain the functions applied to the input together with the higher-order function map
39.
Functional style and RP
• Why is functional programming suited for
reactive programming?
• To answer this question, we should know
“Why Functional Programming Matters”
Why is the functional style suited to RP?
Why does functional programming “matter”?
40.
Why Functional Programming Matters
• An influential paper authored by John
Hughes (also known for QuickCheck and
Quviq)
• 1st version appeared in 1984 (30 years ago!)
• Discusses how to improve modularity in
programming by leveraging the FP features
A well-known paper by John Hughes
that discusses how functional programming improves modularity
http://www.cse.chalmers.se/~rjmh/Papers/whyfp.html
41.
Glues in FP languages
• Two vital glues in FP languages:
• Lazy evaluation
• Higher-order functions (combinators)
“The ways in which one can divide up the
original problem depend directly on the ways
in which one can glue solutions together.”
The two important glues of functional programming: lazy evaluation and higher-order functions
42.
Lazy evaluation
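// Conceptually: class Cons[A](hd: A, tl: => List[A]), with a by-name (lazy) tail.
// The standard library's LazyList (Scala 2.13+) provides exactly this behaviour.
def nats(n: Int): LazyList[Int] = n #:: nats(n + 1)

def fizzbuzz(n: Int): String = n match {
  case _ if n % 15 == 0 => "FizzBuzz"
  case _ if n % 3 == 0 => "Fizz"
  case _ if n % 5 == 0 => "Buzz"
  case _ => n.toString
}

// nats(1) is the generator (an infinite list); take(100) is the selector
nats(1).map(fizzbuzz).take(100).foreach(println)
New values are evaluated only as they are needed
The code is modularized into a generator and a selector
Call by need
(pull-based)
Modularize in generator and selector
Infinite list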
43.
Higher-order function
• Modularize a program into a general higher-order function & your specializing functions
• It enables separating your business logic from the context of the underlying datatype
Modularize the program into higher-order functions and user functions
Separate the business logic from the context of the data type
set.map(_ + 1)  // Set[A]
map.map(_ + 1)  // Map[A, B]
list.map(_ + 1) // List[A]
Localized context (set, map, list)
Shared business logic (_ + 1)
44.
Adopting FP glues to RP
• Lazy evaluation:
• Organize your program into a pipeline,
consisting of generator/selector to handle
async events piece by piece (push-based)
• Higher-order function:
• Separate your business logic from its
underlying async, event-driven behavior
1. Process asynchronous events piece by piece with generators and selectors
2. Separate the business logic from the behavior of asynchronous events
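To make the push-based side concrete, here is a minimal sketch of an event stream whose `map` and `filter` keep the business logic separate from event delivery. `Events` and `fromEvents` are hypothetical names invented for this illustration, and the source emits a fixed sequence synchronously, whereas a real source would push events as they occur.

```scala
import scala.collection.mutable.ListBuffer

object PushSketch {
  // A push-based stream: the source calls the subscriber for each event.
  final class Events[A](subscribeFn: (A => Unit) => Unit) {
    def subscribe(onNext: A => Unit): Unit = subscribeFn(onNext)

    // Higher-order functions wrap the subscriber; user code never sees
    // how or when events are delivered.
    def map[B](f: A => B): Events[B] =
      new Events[B](onNext => subscribe(a => onNext(f(a))))

    def filter(p: A => Boolean): Events[A] =
      new Events[A](onNext => subscribe(a => if (p(a)) onNext(a)))
  }

  // Hypothetical generator: pushes the given events one by one.
  def fromEvents[A](events: Seq[A]): Events[A] =
    new Events[A](onNext => events.foreach(onNext))

  def demo(): List[String] = {
    val out = ListBuffer.empty[String]
    fromEvents(Seq(1, 2, 3, 4))
      .filter(_ % 2 == 0)      // selector
      .map(n => s"event $n")   // business logic
      .subscribe(out += _)
    out.toList
  }
}
```

`PushSketch.demo()` returns `List("event 2", "event 4")`; swapping the generator for a truly asynchronous one would not require touching the `filter` and `map` stages.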
45.
FP glues in FRP
implicit val system = ActorSystem()
implicit val mat = ActorMaterializer()

val a = Source(...)
val b = Source(...)

val a1 = a.map(_ + 1)
val b1 = b.map(_ - 1).map(_ * 2)

val c = (a1 zip b1).map { case (a, b) => a + b }

c.runWith(Sink.foreach(println))
[Diagram: A passes through +1, B passes through −1 and ×2, and + combines both into C]
generator
selector
Localized async context
Use higher-order functions (map, zip, etc.) that localize the asynchronous context
to turn the business logic into a pipeline
46.
Separate the what from the how
• Most FRP data flow DSLs are declarative: the data flow constructed by combinators is actually scheduled and executed by the runtime
implicit val system = ActorSystem()
implicit val mat = ActorMaterializer()

val c = (a1 zip b1).map { case (a, b) => a + b }

c.runWith(Sink.foreach(println))(mat)
runtime!
Many FRP data flow DSLs build the data flow declaratively
and then execute it on a runtime
47.
Separate the what from the how
[Diagram: Inputs flow through the data flow to an Output]
(1) Programming Model (DSL): describing the what = data flow
(2) Runtime: executing the how = propagation of change
Describe the “what” in a DSL and execute the “how” on a runtime
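The split between the what and the how can be sketched by representing the data flow as plain data and giving it to two different interpreters. Everything here is a hypothetical toy (`Flow`, `Input`, `MapOp`, `ZipWith`, `runSeq`, `runAsync`), not the design of any real runtime; the point is only that the same description of C = (A+1) + (B−1)×2 can be executed in more than one way.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object WhatHowSketch {
  // The "what": a data flow graph described as plain data.
  sealed trait Flow
  final case class Input(name: String) extends Flow
  final case class MapOp(src: Flow, f: Int => Int) extends Flow
  final case class ZipWith(l: Flow, r: Flow, f: (Int, Int) => Int) extends Flow

  // C = (A+1) + (B-1)*2
  val c: Flow = ZipWith(
    MapOp(Input("A"), _ + 1),
    MapOp(MapOp(Input("B"), _ - 1), _ * 2),
    _ + _)

  // The "how", variant 1: a sequential interpreter.
  def runSeq(flow: Flow, env: Map[String, Int]): Int = flow match {
    case Input(n)         => env(n)
    case MapOp(s, f)      => f(runSeq(s, env))
    case ZipWith(l, r, f) => f(runSeq(l, env), runSeq(r, env))
  }

  // The "how", variant 2: an asynchronous interpreter in which the two
  // branches of a ZipWith are evaluated as independent Futures.
  def runAsync(flow: Flow, env: Map[String, Int])(
      implicit ec: ExecutionContext): Future[Int] = flow match {
    case Input(n)    => Future(env(n))
    case MapOp(s, f) => runAsync(s, env).map(f)
    case ZipWith(l, r, f) =>
      runAsync(l, env).zip(runAsync(r, env)).map { case (x, y) => f(x, y) }
  }

  // Blocking is used here only to observe the result in a small demo.
  def demo(): Int =
    Await.result(runAsync(c, Map("A" -> 2, "B" -> 3))(ExecutionContext.global), 5.seconds)
}
```

The graph `c` never changes; only the interpreter does. A real runtime would add scheduling, optimization and distribution behind the same kind of boundary.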
48.
Reactive ○○ = Programming model vs Runtime engine vs Architecture
The reactive programming model and the runtime
are closely related
49.
Portability (multi-platform)
• A reactive program can be mapped onto
different architectures by the runtime
• Single machine
• GPU cards
• Distributed environment
A reactive program
can be mapped onto a variety of architectures
50.
Optimization
• The runtime can optimize data flow graphs
to improve performance and stability
• Fusion, data-locality, balancing & caching
• Parallelization and distribution
• Verification and validation
The runtime can optimize the data flow:
fusion, data locality, parallelization and distribution, verification, and so on
51.
e.g. Fusing
• A new feature in Akka Streams 2.0
This new abstraction … is called fusing. This feature
… will be now possible to execute multiple stream
processing steps inside one actor, reducing the
number of thread-hops where they are not
necessary … will increase performance for various
use cases, including HTTP.
The fusing feature combines multiple processing steps into one
http://akka.io/news/2015/11/05/akka-streams-2.0-M1-released.html
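The idea behind fusing can be illustrated by analogy with function composition. The sketch below is not how Akka Streams implements fusing (which runs several stages inside one actor); it only shows, with hypothetical names, why collapsing consecutive stages into one removes intermediate hand-offs without changing the result.

```scala
object FusionSketch {
  // A linear pipeline is modelled as a list of stages.
  type Stage = Int => Int

  val stages: List[Stage] = List(_ + 1, _ * 2)

  // Unfused: every stage makes its own pass over the data and produces
  // an intermediate collection, analogous to a hop between stages.
  def runUnfused(stages: List[Stage], data: List[Int]): List[Int] =
    stages.foldLeft(data)((acc, stage) => acc.map(stage))

  // Fused: the stages are composed into a single function first,
  // so the data is traversed only once.
  def fuse(stages: List[Stage]): Stage =
    stages.foldLeft((x: Int) => x)(_ andThen _)

  def runFused(stages: List[Stage], data: List[Int]): List[Int] =
    data.map(fuse(stages))
}
```

Both `runUnfused` and `runFused` map `List(1, 2, 3)` to `List(4, 6, 8)`; because the pipeline is declared as data, a runtime is free to choose the fused form.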
52.
Examples
• The Dataflow DSL + Runtime architecture is
ubiquitous in recent years
• Akka Streams, ReactiveX, …
• Scientific computing: TensorFlow, Halide
• Bigdata processing: Spark, Google Cloud
Dataflow, Asakusa Framework, Gearpump
The combination of a data flow DSL and a runtime
has been applied in many fields in recent years
53.
e.g. TensorFlow
http://download.tensorflow.org/paper/whitepaper2015.pdf
54.
Challenges
• Asynchronous & event-driven programming
• Concurrent / parallel processing
• Scalability of the system & the organization
• Fault tolerance (Resilience)
DONE!
55.
Reactive ○○ = Programming model vs Runtime engine vs Architecture
Reactive Architecture
56.
Single machine
Distributed system
In recent years, distributed systems have been becoming commonplace
57.
Why distributed systems?
• Asynchronous & event-driven programming
• Concurrent / parallel processing
• Scalability of the system & the organization
• Fault tolerance (Resilience)
→ Reactive Systems
Why do we need distributed systems?
→ Scalability, fault tolerance, and the move to microservices
(Microservices)
59.
Changes in application requirements
               A few years ago               Recent years
Configuration  tens of servers               thousands of multi-core processors
Response time  seconds                       milliseconds
Availability   hours of offline maintenance  100% uptime
Data size      gigabytes                     petabytes
We need systems that handle large clusters and large data sets
while achieving millisecond response times and 100% availability
60.
Changes in the way the systems are built
“We call these Reactive Systems.
…
The largest systems in the world rely upon
architectures based on these properties and
serve the needs of billions of people daily.”
http://www.reactivemanifesto.org/
Let's call the architecture that the world's large-scale systems have already adopted
a “Reactive System”
61.
Traits of Reactive Systems
The four traits of Reactive Systems:
responsive, elastic, resilient, and message driven
http://www.slideshare.net/Typesafe_Inc/going-reactive-2016-data-preview/6
62.
Traits of Reactive Systems
Responsive: rapid and consistent response time to users
Resilient: stay responsive under failures
Elastic: stay responsive under varying workload
Message Driven: asynchronous message-passing is the foundation
Asynchronous message passing is the foundation of everything
http://www.slideshare.net/Typesafe_Inc/going-reactive-2016-data-preview/6
63.
Reactive Component in RSs
• Communicate only via messages
• Self-contained & isolated with asynchronous
(binary) boundaries
Actor?
Sub-system?
[Diagram: a component with in and out ports]
Components isolated by asynchronous (binary) boundaries
communicate with each other only via messages
64.
Message Driven Architecture
• Enables elasticity through scaling, sharding,
replication and location transparency
• Enables resilience through replication,
isolation and task/error delegation (“Let it
crash”)
Being message driven is what achieves elasticity and resilience
65.
How do we build Reactive Systems?
We do not want to be prescriptive in the manifesto
as to how this is achieved. — Martin Thompson
• The manifesto just describes the properties
and qualities of reactive component/system
• Let’s examine a real world use case, called
the microservice architecture (MSA)
http://www.infoq.com/news/2014/10/thompson-reactive-manifesto-2
The manifesto does not say how such a system should be realized
→ Let's examine microservices as a real-world example
66.
Microservice Architecture
• A distributed system that improves the scalability of large organizations such as Amazon, Netflix and Twitter
• Organizes small, independent and modular services around business capabilities (cf. Conway’s Law)
• Can be regarded as an instance of a reactive system
A methodology for scaling a huge development organization
It can be seen as one real-world instance of a reactive system
67.
e.g. Real world microservices in Twitter
[Diagram: Twitter's service layers. Routing: TFE. Presentation: API, Web, Monorail. Logic: Tweet, User, Timeline, Social Graph, DMs. Storage & retrieval: Redis, Memcache, Flock, T-Bird, MySQL. Protocols between layers: HTTP, Thrift, “Stuff”]
http://monkey.org/~marius/scala2015.pdf
68.
System Level Dataflow
• MSA is composed of (reactive) components
and their business dependencies
• This structure forms a system level dataflow
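• Can we describe the whole distributed system as code?
[Diagram: a data flow connecting A and B to C]
MSA forms a data flow based on the dependencies between business capabilities
→ Can we describe the whole distributed system as code?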
69.
Reactive Big Data™
• Reactive architecture on bigdata processing
• Spark
• Google Cloud Dataflow
• Gearpump (Intel)
• Asakusa Framework
Recent big data processing frameworks
often adopt a reactive architecture
70.
https://speakerdeck.com/googlecloudjapan/google-cloud-dataflowwoli-jie-suru
Pipeline representing a DAG of steps in Google Cloud Dataflow
Describe the data processing pipeline as a DAG
71.
DAG on Spark
https://cloud.githubusercontent.com/assets/2133137/7625997/e0878f8c-f9b4-11e4-8df3-7dd611b13c87.png
An example of visualizing a running Spark job as a DAG
73.
http://knowledge.sakura.ad.jp/tech/4016/
http://docs.asakusafw.com/preview/ja/html/asakusa-on-spark/user-guide.html
Asakusa Framework
A single DSL can be run on multiple big data frameworks
Code written with Asakusa Framework
runs portably on both Hadoop and Spark
74.
Apache Dataflow (New!)
http://googlecloudplatform.blogspot.co.uk/2016/01/Dataflow-and-open-source-proposal-to-join-the-Apache-Incubator.html
A Google-led effort to standardize how data flows are described
75.
https://speakerdeck.com/googlecloudjapan/google-cloud-dataflowwoli-jie-suru
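Google Cloud as the runtime engine
[Diagram: flow of a pipeline. User code & SDK produce the data flow definition; Google Cloud optimizes and schedules it; a monitoring UI shows the running job]
As the data flow runtime, Google Cloud
optimizes the data flow and schedules the tasks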
76.
General overview
Let's generalize the architecture
of reactive systems in big data
[Diagram: Dataflow, Cloud-level Runtime, Reactive System]
77.
General overview
• The cloud-level runtime:
• Optimizes the specified dataflow and schedules
reactive components
• Obtains resources from a distributed resource
manager such as YARN and
• Deploys the components to the allocated
resources and runs them as the reactive system
The cloud-level runtime deploys the data flow
as a reactive system and executes it
[Diagram: Dataflow, Cloud-level Runtime, Reactive System]
78.
Web Service Orchestration
• The modern DevOps toolchain focuses mainly on configuring a system on each node in an imperative manner (such as a Dockerfile)
• The immutable infrastructure should be composed of reactive components and described as a declarative dataflow (?)
Could the reactive system + data flow approach
also be applied to web service orchestration?
79.
Summary
• Reactive Programming, in concert with Reactive
Architecture, offers solutions to the modern challenges:
• Asynchronous & event-driven programming
• Concurrent / parallel processing
• Scalability of the system & the organization
• Fault tolerance (Resilience)
Reactive addresses the challenges of asynchronous event-driven programming,
concurrent and parallel processing, scalability, and fault tolerance
80.
Summary
• Reactive components and the data flows are
great tools to cope with asynchrony and
distribution at every layer of the system
• The capability (performance, fault-tolerance
and operability) provided by the runtimes
weighs more than just the programming
models in the current era
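“Reactive” is a concept that is effective at every layer of the system
This is an era in which the capability of the runtime matters more than the programming model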