PAPIs 2015
Akka & Data Science: Making real-time predictions
Brian Gawalt
2nd International Conference on Predictive APIs and Apps
August 7, 2015

[A] Sometimes, data scientists need to worry about throughput.

[B] One way to increase throughput is with concurrency.

[C] The Actor Model is an easy way to build a concurrent system.

[D] Scala+Akka provides an easy-to-use Actor Model context.

[A + B + C + D ⇒ E] Data scientists should check out Scala+Akka.

Consider:
● building a model,
● vs. using a model

Lots of ways to practice building a model

The Classic Process
1. Load your data set’s raw materials
2. Produce feature vectors:
   o Training,
   o Validation,
   o Testing
3. Build the model with training and validation vectors
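
Step 2 of the Classic Process can be sketched as a shuffle followed by a three-way split. This is a minimal illustration, assuming the examples fit in memory; the function name `split_examples`, the 60/20/20 percentages, and the seed are hypothetical choices, not taken from the talk.

```python
import random


def split_examples(examples, train_pct=60, valid_pct=20, seed=0):
    """Shuffle examples and split them into training, validation, and test sets.

    The percentages and the seed are illustrative assumptions.
    """
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * train_pct // 100
    n_valid = len(shuffled) * valid_pct // 100
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]  # whatever remains is held out for testing
    return train, valid, test


train, valid, test = split_examples(range(100))
print(len(train), len(valid), len(test))
```

Feature extraction would then run separately on each of the three sets, with only the training and validation vectors used to build the model.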
The Classic Process: One-time Testing
Build:
1. Load train/valid./test materials
2. Make train/valid./test feature vectors
3. Train Model
Use:
1. Make test predictions

The Classic Process: Repeated Testing
Build:
1. Load train/valid. materials
2. Make train/valid. feature vectors
3. Train Model
(saved model)
Use (repeat every K minutes):
1. Load test/new materials
2. Make test/new feature vectors
3. Make test/new predictions
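
The build-once, use-repeatedly pattern can be sketched as follows. The "model" here is a toy threshold rule and the file name `model.pkl` is hypothetical; both are assumptions made only to keep the sketch self-contained, and a real model would come from a proper learning library.

```python
import pickle
import tempfile
from pathlib import Path


def build_model(train_vectors, train_labels):
    """Stand-in for training: a threshold halfway between the two class means."""
    positives = [x for x, y in zip(train_vectors, train_labels) if y == 1]
    negatives = [x for x, y in zip(train_vectors, train_labels) if y == 0]
    pos_mean = sum(positives) / len(positives)
    neg_mean = sum(negatives) / len(negatives)
    return {"threshold": (pos_mean + neg_mean) / 2}


def use_model(model_path, new_vectors):
    """Load the saved model and predict on a fresh batch of feature vectors."""
    with open(model_path, "rb") as handle:
        model = pickle.load(handle)
    return [1 if x > model["threshold"] else 0 for x in new_vectors]


# Build: runs once, then saves the model.
model_path = Path(tempfile.mkdtemp()) / "model.pkl"
with open(model_path, "wb") as handle:
    pickle.dump(build_model([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1]), handle)

# Use: in production this call would repeat every K minutes on new data.
batch_predictions = use_model(model_path, [0.05, 0.95])
print(batch_predictions)
```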

Sometimes my tasks work like that, too!

But this talk is about the other kind of tasks.

[A] Sometimes, data scientists need to worry about throughput.

Example: Freelancer availability on Upwork

Hiring Freelancers on Upwork
1. Post a job
2. Search for freelancers
3. Find someone you like
4. Ask them to interview
   o Request Accepted!
   o or rejected/ignored...

THE TASK: Look at recent freelancer behavior, and predict, at the time of Step 2, who’s likely to accept an invite at the time of Step 4.

Building this model is business as usual:

Building Availability Model
1. Load raw materials:
   o Examples of accepts/rejects
   o Histories of freelancer site activity
      ▪ Job applications sent or received
      ▪ Hours worked
      ▪ Click logs
      ▪ Profile updates
2. Produce feature vectors

Data sources labeled on the slide: Greenplum, Amazon S3, Internal Service

Using Availability Model
1. Load train/valid. materials
2. Make train/valid. feature vectors
3. Train Model
(saved model)
Repeat every 60 minutes:
1. Load test/new materials
2. Make test/new feature vectors
3. Make test/new predictions

Using Availability Model: loading test/new materials
● Load job app data (4 min.)
● Load click log data (30 min.)
● Load work hours data (5 min.)
● Load profile data (20 ms/profile)

● Left with under 21 minutes to collect profile data
   ○ Rate limit: 20 ms/profile
   ○ At most, 63K profiles per hour
● Six million freelancers who need availability predictions: expect ~90 hours between re-scoring any individual
● Still need to spend time actually building vectors and exporting scores!
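
The time budget above can be reproduced with a few lines of arithmetic. All figures come from the slide; the constant names are illustrative.

```python
# Figures taken from the slide; the names are illustrative.
CYCLE_MIN = 60
LOAD_JOB_APPS_MIN = 4
LOAD_CLICK_LOG_MIN = 30
LOAD_WORK_HOURS_MIN = 5
MS_PER_PROFILE = 20          # rate limit: 20 ms per profile
FREELANCERS = 6_000_000

# Minutes left in the hourly cycle once the three bulk loads finish.
minutes_left = CYCLE_MIN - LOAD_JOB_APPS_MIN - LOAD_CLICK_LOG_MIN - LOAD_WORK_HOURS_MIN

# Profiles that fit into the remaining time at the rate limit.
profiles_per_hour = minutes_left * 60 * 1000 // MS_PER_PROFILE

# Hours needed to cycle through every freelancer once.
hours_between_rescoring = FREELANCERS / profiles_per_hour

print(minutes_left)                    # 21
print(profiles_per_hour)               # 63000
print(round(hours_between_rescoring))  # 95
```

The result, roughly 95 hours, is in line with the slide's estimate of ~90 hours, and it is an upper bound on freshness because it leaves no time for building vectors or exporting scores.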

[B] One way to increase throughput is with concurrency.

Expensive Option: Major infrastructure overhaul

… but that takes a lot of time, attention, and cooperation…

Simpler Option: The Actor Model

[C] The Actor Model is an easy way to build a concurrent system.

An Actor
● Imagine a mailbox with a brain
● Computation only begins when/if a message arrives
● Keeps its thoughts private:
   ○ No other actor can actively read this actor’s state
   ○ Other actors will have to wait to hear a message from this actor

The Actor Model of Concurrency
● Lots of Actors, and each has:
   ○ Private message queue
   ○ Private state, shared only by sending more messages
● Execution context:
   ○ Manages threading of each Actor’s computation
   ○ Handles asynchronous message routing
   ○ Can send prescheduled messages
● Each received message’s computation is fully completed before the Actor moves on to the next message in its queue
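
The properties listed above can be illustrated with a toy actor. This is a sketch in plain Python, not Akka: a standard-library thread and queue stand in for the execution context, and the class name `CounterActor` and its message strings are hypothetical.

```python
import queue
import threading


class CounterActor:
    """A toy actor: a private mailbox plus private state.

    Only the actor's own thread touches `_count`; other code interacts
    with it solely by sending messages.
    """

    _STOP = object()  # sentinel message that ends the actor's loop

    def __init__(self):
        self._mailbox = queue.Queue()  # private message queue
        self._count = 0                # private state
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def send(self, message, reply_to=None):
        """Asynchronously enqueue a message; returns immediately."""
        self._mailbox.put((message, reply_to))

    def stop(self):
        """Let the actor finish queued messages, then wait for it to exit."""
        self._mailbox.put((self._STOP, None))
        self._thread.join()

    def _run(self):
        # Each message is fully processed before the next one is taken.
        while True:
            message, reply_to = self._mailbox.get()
            if message is self._STOP:
                break
            if message == "increment":
                self._count += 1
            elif message == "report" and reply_to is not None:
                # State leaves the actor only inside a message.
                reply_to.put(self._count)


actor = CounterActor()
for _ in range(3):
    actor.send("increment")
replies = queue.Queue()
actor.send("report", reply_to=replies)
reported = replies.get(timeout=5)
actor.stop()
print(reported)
```

Because the mailbox is first-in, first-out and handled by a single thread, the report is processed only after the three increments, with no locks around the counter.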

The Actor Model of Concurrency
[Diagram; label: Execution Context]

Parallelizing predictions
● Refresh job apps (schedule: fetch once per hour)
● Refresh click log (schedule: fetch once per hour)
● Refresh work hours (schedule: fetch once per hour)
● Fetch 10 profiles (schedule: fetch every 300 ms)
● Vectorizer (receives raw data):
   ○ Keep copies of raw data
   ○ Emit vector for each new profile received
● Apply model; export prediction
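
The Vectorizer's role can be sketched as a message handler that keeps the latest raw data and emits one vector per profile received. This is a plain Python sketch, not the talk's code; the message shapes, the source names, and the `emit` callback are hypothetical.

```python
class Vectorizer:
    """Keeps the latest raw data and emits a vector per profile received.

    Hypothetical message shapes:
      ("raw", source_name, {user_id: value})  refreshes one raw-data source
      ("profile", user_id, profile_value)     triggers a vector for that user
    """

    SOURCES = ("job_apps", "click_log", "work_hours")

    def __init__(self, emit):
        self._raw = {source: {} for source in self.SOURCES}  # private copies
        self._emit = emit  # callback standing in for "send to the next actor"

    def receive(self, message):
        kind = message[0]
        if kind == "raw":
            _, source, data = message
            self._raw[source] = dict(data)  # replace with the fresh copy
        elif kind == "profile":
            _, user_id, profile_value = message
            # Missing raw data defaults to 0 (an assumption of this sketch).
            vector = [self._raw[s].get(user_id, 0) for s in self.SOURCES]
            vector.append(profile_value)
            self._emit((user_id, vector))


emitted = []
vectorizer = Vectorizer(emit=emitted.append)
vectorizer.receive(("raw", "job_apps", {"u1": 3}))
vectorizer.receive(("raw", "work_hours", {"u1": 12}))
vectorizer.receive(("profile", "u1", 1))
print(emitted)
```

Hourly refreshes and frequent profile fetches arrive as independent messages, so a slow refresh never blocks vectors from being emitted with the most recent copy on hand.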

Serial processing (repeat every 60 minutes)
● 55 min:
   ○ Refresh job apps (4 min)
   ○ Refresh work hours (5 min)
   ○ Refresh click log (30 min)
   ○ Fetch ~50K profiles (55 - 4 - 5 - 30 = 16 min...)
● 5 min:
   ○ Make feature vectors
   ○ Export predictions
Throughput: 48K users/hr
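
The serial throughput figure follows from the timings on the slide and the 20 ms/profile rate limit quoted earlier in the talk. A short check, with illustrative constant names:

```python
# Timings from the slide; the names are illustrative.
LOADING_BUDGET_MIN = 55
REFRESH_JOB_APPS_MIN = 4
REFRESH_WORK_HOURS_MIN = 5
REFRESH_CLICK_LOG_MIN = 30
MS_PER_PROFILE = 20  # rate limit quoted earlier in the talk

# Minutes left for profile fetching after the three serial refreshes.
fetch_minutes = (LOADING_BUDGET_MIN - REFRESH_JOB_APPS_MIN
                 - REFRESH_WORK_HOURS_MIN - REFRESH_CLICK_LOG_MIN)

# Profiles fetched, and so users scored, in each hourly cycle.
serial_users_per_hour = fetch_minutes * 60 * 1000 // MS_PER_PROFILE

print(fetch_minutes)          # 16
print(serial_users_per_hour)  # 48000
```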

Parallel Processing with Actors
[Timeline diagram: messages sent and received across four threads (Thr. 1 to Thr. 4). Labeled steps: Refresh job apps, Refresh click log, Refresh work hrs., Fetch pro., Rx data, Store, Vectorize, Export. Rate labels: 1/hr., 3/sec., (as rx’ed). Msg. processing time not to scale.]
Throughput: 180K users/hr
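
The 180K users/hr figure is consistent with profile fetching running for the whole hour at the 20 ms/profile rate limit instead of waiting for the bulk refreshes. A sketch of that arithmetic, assuming the rate limit is the only bottleneck; the constant names are illustrative.

```python
MS_PER_PROFILE = 20             # rate limit quoted earlier in the talk
FREELANCERS = 6_000_000         # population quoted earlier in the talk
SERIAL_USERS_PER_HOUR = 48_000  # throughput of the serial version

# With fetching no longer blocked, the full hour is available for profiles.
parallel_users_per_hour = 60 * 60 * 1000 // MS_PER_PROFILE

speedup = parallel_users_per_hour / SERIAL_USERS_PER_HOUR
hours_between_rescoring = FREELANCERS / parallel_users_per_hour

print(parallel_users_per_hour)            # 180000
print(speedup)                            # 3.75
print(round(hours_between_rescoring, 1))  # 33.3
```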

[D] Scala+Akka provides an easy-to-use Actor Model context.

Message passing, scheduling, & computation behavior defined in 445 lines.

Scala+Akka Actors
● Create Scala class, mix in Actor trait
● Implement the required partial function: receive: PartialFunction[Any, Unit]
● Define family of message objects this actor’s planning to handle
● Define behavior for each message case in receive

Scala+Akka Actors (code annotations)
● Mix in the same code used for export in the non-Actor version
● Private, mutable state: stored scores
● Private, mutable state: time of last export
● If receiving new scores: store them!
● If storing lots of scores, or if it’s been a while: upload what’s stored, then erase them
● If told to shut down, stop accepting new scores
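
The export behavior annotated above can be sketched as a plain message handler. This is a Python sketch of the logic only, not the talk's Scala code and not the Akka API; `ScoreExporter`, `upload`, `max_stored`, `max_age_s`, and the message shapes are hypothetical, and flushing on shutdown is an added assumption.

```python
import time


class ScoreExporter:
    """Stores incoming scores and uploads them in batches."""

    def __init__(self, upload, max_stored=1000, max_age_s=60.0, clock=time.monotonic):
        self._upload = upload
        self._max_stored = max_stored
        self._max_age_s = max_age_s
        self._clock = clock              # injectable so timing can be tested
        self._scores = []                # private, mutable state: stored scores
        self._last_export = clock()      # private, mutable state: time of last export
        self._accepting = True

    def receive(self, message):
        kind, payload = message
        if kind == "scores" and self._accepting:
            self._scores.extend(payload)  # new scores: store them
            self._export_if_due()
        elif kind == "shutdown":
            self._accepting = False       # stop accepting new scores
            self._export()                # assumption: flush what is left

    def _export_if_due(self):
        too_many = len(self._scores) >= self._max_stored
        too_old = self._clock() - self._last_export >= self._max_age_s
        if too_many or too_old:
            self._export()

    def _export(self):
        if self._scores:
            self._upload(list(self._scores))
        self._scores.clear()              # erase what was uploaded
        self._last_export = self._clock()


uploaded = []
exporter = ScoreExporter(upload=uploaded.append, max_stored=3)
exporter.receive(("scores", [0.1, 0.2]))  # stored, below the threshold
exporter.receive(("scores", [0.3]))       # third score triggers an upload
exporter.receive(("shutdown", None))
exporter.receive(("scores", [0.9]))       # ignored after shutdown
print(uploaded)
```

Because an actor processes one message at a time, the stored scores and the last-export timestamp need no locking even though they are mutable.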

Scala+Akka Pros
● Easy to get productive in the Scala language
● SBT dependency management makes it easy to move to any box with a JRE
● No global interpreter lock!

Scala+Akka Cons
● Moderate Scala learning curve
● Object representation on the JVM has pretty lousy memory efficiency
● Not a lot of great options for building models in Scala (compared to R, Python, Julia)

[A] Sometimes, data scientists need to worry about throughput.

[B] One way to increase throughput is with concurrency.

[C] The Actor Model is an easy way to build a concurrent system.

[D] Scala+Akka provides an easy-to-use Actor Model context.

[A + B + C + D ⇒ E] Data scientists should check out Scala+Akka.

Thanks!
Questions?
bgawalt@{upwork, gmail}.com
twitter.com/bgawalt

How to predict the future of shopping - Ulrich Kerzel @ PAPIs Connect
 
The emergent opportunity of Big Data for Social Good - Nuria Oliver @ PAPIs C...
The emergent opportunity of Big Data for Social Good - Nuria Oliver @ PAPIs C...The emergent opportunity of Big Data for Social Good - Nuria Oliver @ PAPIs C...
The emergent opportunity of Big Data for Social Good - Nuria Oliver @ PAPIs C...
 


[Research] deploying predictive models with the actor framework - Brian Gawalt

  • 1. PAPIs 2015. Akka & Data Science: Making real-time predictions. Brian Gawalt. 2nd International Conference on Predictive APIs and Apps, August 7, 2015
  • 2. [A] Sometimes, data scientists need to worry about throughput.
  • 3. [B] One way to increase throughput is with concurrency.
  • 4. [C] The Actor Model is an easy way to build a concurrent system.
  • 5. [D] Scala+Akka provides an easy-to-use Actor Model context.
  • 6. [A + B + C + D ⇒ E] Data scientists should check out Scala+Akka.
  • 7. Consider: ● building a model, ● vs. using a model
  • 8. Lots of ways to practice building a model
  • 9. The Classic Process: 1. Load your data set’s raw materials 2. Produce feature vectors: ○ Training, ○ Validation, ○ Testing 3. Build the model with training and validation vectors
  • 10. The Classic Process: One-time Testing. Build: Load train/valid./test materials → Make train/valid./test feature vectors → Train Model. Use: Make test predictions.
  • 11. The Classic Process: Repeated Testing. Build: Load train/valid. materials → Make train/valid. feature vectors → Train Model → (saved model). Use (repeat every K minutes): Load test/new materials → Make test/new feature vectors → Make test/new predictions.
  • 12. Sometimes my tasks work like that, too!
  • 13. But this talk is about the other kind of tasks.
  • 14. [A] Sometimes, data scientists need to worry about throughput.
  • 16. Hiring Freelancers on Upwork: 1. Post a job 2. Search for freelancers 3. Find someone you like 4. Ask them to interview ○ Request accepted! ○ or rejected/ignored... THE TASK: Look at recent freelancer behavior and predict, at the time of Step 2, who’s likely to accept an invite at the time of Step 4.
  • 17. Building this model is business as usual:
  • 18. Building Availability Model: 1. Load raw materials: ○ Examples of accepts/rejects ○ Histories of freelancer site activity: job applications sent or received; hours worked; click logs; profile updates 2. Produce feature vectors [data-source labels on slide: Greenplum, Amazon S3, Internal Service]
  • 19. Using Availability Model: Load train/valid. materials → Make train/valid. feature vectors → Train Model → (saved model) → Load test/new materials → Make test/new feature vectors → Make test/new predictions (repeat every 60 minutes)
  • 20. Using Availability Model: (saved model) → Load test/new materials → Make test/new feature vectors → Make test/new predictions (repeat every 60 minutes). Loading steps: Load job app data (4 min.); Load click log data (30 min.); Load work hours data (5 min.); Load profile data (20 ms/profile)
  • 21. Using Availability Model. Loading steps: Load job app data (4 min.); Load click log data (30 min.); Load work hours data (5 min.); Load profile data (20 ms/profile) ● Left with under 21 minutes to collect profile data ○ Rate limit: 20 ms/profile ○ At most, 63K profiles per hour ● Six million freelancers who need avail. predictions: expect ~90 hours between re-scoring any individual ● Still need to spend time actually building vectors and exporting scores!
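The time budget on slide 21 can be checked with a few lines of arithmetic. This is a minimal sketch using only the figures quoted on the slide; the object and value names are hypothetical. It gives roughly 95 hours for a full pass over six million freelancers, in line with the slide's rounded "~90 hours".

```scala
object ProfileBudget {
  // All figures below are taken from the slide; the names are illustrative.
  val cycleMinutes = 60
  val jobAppMinutes = 4
  val clickLogMinutes = 30
  val workHoursMinutes = 5
  val msPerProfile = 20
  val freelancers = 6000000L

  // Minutes left in each hourly cycle for fetching profiles.
  val profileMinutes: Int =
    cycleMinutes - jobAppMinutes - clickLogMinutes - workHoursMinutes

  // Upper bound on profiles fetched per cycle at the rate limit.
  val profilesPerHour: Long = profileMinutes.toLong * 60 * 1000 / msPerProfile

  // Hours needed to re-score every freelancer once.
  val hoursPerFullPass: Double = freelancers.toDouble / profilesPerHour

  def main(args: Array[String]): Unit =
    println(f"$profileMinutes min left, $profilesPerHour profiles/hr, $hoursPerFullPass%.1f h per full pass")
}
```

The bound ignores the time spent building vectors and exporting scores, which the slide notes still has to come out of the same hour.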
  • 22. [B] One way to increase throughput is with concurrency.
  • 23. Expensive Option: Major infrastructure overhaul
  • 24. … but that takes a lot of time, attention, and cooperation…
  • 26. [C] The Actor Model is an easy way to build a concurrent system.
  • 27. An Actor ● Imagine a mailbox with a brain ● Computation only begins when/if a message arrives ● Keeps its thoughts private: ○ No other actor can actively read this actor’s state ○ Other actors have to wait to receive a message from this actor
  • 28. The Actor Model of Concurrency ● Lots of Actors, and each has: ○ Private message queue ○ Private state, shared only by sending more messages ● Execution context: ○ Manages threading of each Actor’s computation ○ Handles async message routing ○ Can send prescheduled messages ● Each received message’s computation is fully completed before the Actor moves on to the next message in its queue
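The properties listed on slides 27 and 28 (private mailbox, private state, one message fully processed at a time) can be illustrated without any library. The following is a toy sketch, not Akka and not the speaker's code: a single-threaded executor's task queue stands in for the mailbox, and `ToyActor`, `Counter`, and the message names are all hypothetical.

```scala
import java.util.concurrent.{Executors, TimeUnit}

// A toy actor, NOT Akka: the single-threaded executor's queue plays the mailbox.
abstract class ToyActor[M] {
  private val mailbox = Executors.newSingleThreadExecutor()

  // Behavior for one message; it runs to completion before the next one starts.
  protected def receive(msg: M): Unit

  // Asynchronous send: enqueue the message and return immediately.
  final def send(msg: M): Unit = mailbox.execute(new Runnable {
    def run(): Unit = receive(msg)
  })

  // Stop accepting messages, drain what is queued, and wait for it to finish.
  final def stopAndWait(): Unit = {
    mailbox.shutdown()
    mailbox.awaitTermination(10, TimeUnit.SECONDS)
  }
}

// The family of messages the counter understands.
sealed trait CounterMsg
case object Increment extends CounterMsg
final case class Report(replyTo: Int => Unit) extends CounterMsg

final class Counter extends ToyActor[CounterMsg] {
  // Private state: only ever touched inside receive, so no locks are needed.
  private var count = 0

  protected def receive(msg: CounterMsg): Unit = msg match {
    case Increment       => count += 1
    case Report(replyTo) => replyTo(count) // state leaves only via a message
  }
}
```

Several threads can call `send(Increment)` at once without corrupting `count`, because only the mailbox thread ever runs `receive`.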
  • 29. The Actor Model of Concurrency [diagram: actors inside an Execution Context]
  • 30. Parallelizing predictions [diagram] ● Refresh work hours, Refresh job apps, Refresh click log (each on a schedule: fetch once per hour) send raw data to the Vectorizer ● Fetch 10 profiles (schedule: fetch every 300ms) ● Vectorizer: ○ Keep copies of raw data ○ Emit vector for each new profile received ● Apply model; export prediction
  • 31. Serial processing (repeat every 60 minutes) ● Loading, 55 min: Refresh job apps (4 min) → Refresh work hours (5 min) → Refresh click log (30 min) → Fetch ~50K profiles (55 - 4 - 5 - 30 = 16 min) ● Make feature vectors, export predictions: 5 min
  • 32. Serial processing (same timeline as slide 31). Throughput: 48K users/hr
  • 33. Parallel Processing with Actors [timeline diagram across four threads, Thr. 1 to Thr. 4, marking each message sent and received. Message labels: Refresh job apps, Refresh click log, Refresh work hrs., Fetch pro., Rx data, Store, Vectorize, Export. Rates shown: 1/hr., 3/sec., (as rx’ed). Msg. processing time not to scale.]
  • 34. Parallel Processing with Actors (same timeline as slide 33). Throughput: 180K users/hr
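The two throughput figures can be reproduced from the 20 ms/profile rate limit. In this sketch the serial number follows the slide's own arithmetic (16 minutes of fetching per hour). The concurrent number assumes profile fetching runs at the rate limit for the whole hour; that reading is an assumption, since the slide states only the total. All names are hypothetical.

```scala
object ThroughputComparison {
  val msPerProfile = 20 // rate limit quoted on the slides

  // Profiles that can be fetched in a window of the given length at the rate limit.
  def profilesIn(minutes: Int): Long = minutes.toLong * 60 * 1000 / msPerProfile

  // Serial: 55 minutes of loading, minus the three bulk refreshes.
  val serialFetchMinutes: Int = 55 - 4 - 5 - 30
  val serialPerHour: Long = profilesIn(serialFetchMinutes)

  // Concurrent (assumed): fetching never waits on the bulk refreshes.
  val parallelPerHour: Long = profilesIn(60)

  val speedup: Double = parallelPerHour.toDouble / serialPerHour

  def main(args: Array[String]): Unit =
    println(s"serial=$serialPerHour/hr parallel=$parallelPerHour/hr speedup=${speedup}x")
}
```

Under these assumptions a full pass over six million freelancers drops from about 125 hours (at 48K/hr) to about 33 hours (at 180K/hr).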
  • 35. [D] Scala+Akka provides an easy-to-use Actor Model context.
  • 36. Message passing, scheduling, & computation behavior defined in 445 lines.
  • 37. Scala+Akka Actors ● Create a Scala class, mix in the Actor trait ● Implement the required partial function: receive: PartialFunction[Any, Unit] ● Define the family of message objects this actor plans to handle ● Define behavior for each message case in receive
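The four steps on slide 37 might look like the following with Akka's classic actor API. This is a minimal sketch, not code from the talk: it assumes the `akka-actor` library is on the classpath (version 2.4 or later, for `terminate()`), and `ProfileCounter`, `NewProfile`, `HowMany`, and `ProfileCounterDemo` are hypothetical names.

```scala
import akka.actor.{Actor, ActorSystem, Props}
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.Await
import scala.concurrent.duration._

// The family of messages this actor plans to handle (hypothetical).
sealed trait ProfileMsg
final case class NewProfile(freelancerId: Long, hoursWorked: Double) extends ProfileMsg
case object HowMany extends ProfileMsg

// A Scala class with the Actor trait mixed in.
class ProfileCounter extends Actor {
  private var seen = 0L // private, mutable state

  // The required partial function, with one case per message.
  def receive: PartialFunction[Any, Unit] = {
    case NewProfile(_, _) => seen += 1
    case HowMany          => sender() ! seen // reply with a message
  }
}

object ProfileCounterDemo {
  // Sends n profiles, then asks the actor how many it has processed.
  def countAfter(n: Int): Long = {
    val system = ActorSystem("demo")
    try {
      val counter = system.actorOf(Props(new ProfileCounter), "profile-counter")
      (1 to n).foreach(i => counter ! NewProfile(i.toLong, 0.0))
      implicit val timeout: Timeout = Timeout(3.seconds)
      Await.result((counter ? HowMany).mapTo[Long], 3.seconds)
    } finally system.terminate()
  }
}
```

Because messages from one sender to one actor are delivered in order, `HowMany` is handled only after every `NewProfile` sent before it.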
  • 38. Scala+Akka Actors (slide annotations) ● Mix in the same code used for export in the non-Actor version ● Private, mutable state: stored scores ● Private, mutable state: time of last export ● If receiving new scores: store them! ● If storing lots of scores, or if it’s been a while: upload what’s stored, then erase it ● If told to shut down, stop accepting new scores
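The code that slide 38 annotates is not in the transcript, so the following is only a sketch of an actor that matches those annotations. It assumes Akka's classic actor API with `akka-actor` on the classpath. The `ScoreUploader` trait, the message types, and the two thresholds are hypothetical, and flushing the buffer on shutdown is an added assumption.

```scala
import akka.actor.Actor
import scala.collection.mutable

// Hypothetical stand-in for the export code shared with the non-Actor version.
trait ScoreUploader {
  def upload(scores: Seq[(Long, Double)]): Unit =
    println(s"uploading ${scores.size} scores")
}

// Hypothetical messages: a batch of (freelancerId, score) pairs, and a stop signal.
final case class NewScores(scores: Seq[(Long, Double)])
case object ShutDown

class ScoreExporter(maxBuffered: Int, maxWaitMs: Long)
    extends Actor with ScoreUploader {

  // Private, mutable state: stored scores and the time of the last export.
  private val stored = mutable.ArrayBuffer.empty[(Long, Double)]
  private var lastExportMs = System.currentTimeMillis()

  // Upload what is stored, then erase it.
  private def flush(): Unit = {
    if (stored.nonEmpty) upload(stored.toList)
    stored.clear()
    lastExportMs = System.currentTimeMillis()
  }

  def receive: Receive = {
    case NewScores(scores) =>
      stored ++= scores // store the new scores
      val waitedTooLong = System.currentTimeMillis() - lastExportMs >= maxWaitMs
      if (stored.size >= maxBuffered || waitedTooLong) flush()

    case ShutDown =>
      flush()            // assumption: export whatever is still buffered
      context.stop(self) // later messages are no longer accepted
  }
}
```

The elapsed-time check runs only when a message arrives; a scheduled message from the execution context could trigger exports during quiet periods.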
  • 39. Scala+Akka Pros ● Easy to get productive in the Scala language ● SBT dependency management makes it easy to move to any box with a JRE ● No global interpreter lock!
  • 40. Scala+Akka Cons ● Moderate Scala learning curve ● Object representation on the JVM has pretty lousy memory efficiency ● Not a lot of great options for building models in Scala (compared to R, Python, Julia)
  • 41. [A] Sometimes, data scientists need to worry about throughput.
  • 42. [B] One way to increase throughput is with concurrency.
  • 43. [C] The Actor Model is an easy way to build a concurrent system.
  • 44. [D] Scala+Akka provides an easy-to-use Actor Model context.
  • 45. [A + B + C + D ⇒ Z] Data scientists should check out Scala+Akka