SlideShare a Scribd company logo
Mantis: Netflix's Event 
Stream Processing System 
Justin Becker Danny Yuan 
10/26/2014
Motivation
Traditional TV just works
Netflix wants Internet TV to work just as well
Especially when things aren’t working
Tracking “big signals” is not 
enough
Need to track all kinds of permutations
Detect quickly, to resolve ASAP
Cheaper than product services
Here comes the problem
12 ~ 20 million metrics updates per 
second
750 billion metrics updates per day
4 ~ 18 Million App Events Per Second 
> 300 Billion Events Per Day
They Solve Specific Problems
They Require Lots of Domain Knowledge
A System to Rule Them All
Mantis: A Stream Processing System
3
1. Versatile User Demands
Asynchronous Computation Everywhere
3. Same Source, Multiple Consumers
We want Internet TV to just work
One problem we need to solve, 
detect movies that are failing?
Do it fast → limit impact, fix early 
Do it at scale → for all permutations 
Do it cheap → cost detect <<< serve
Work through the details for how 
to solve this problem in Mantis
Goal is to highlight unique and 
interesting design features
… begin with batch approach, 
the non-Mantis approach 
Batch algorithm, runs every N minutes 
for(play in playAttempts()){ 
Stats movieStats = getStats(play.movieId); 
updateStats(movieStats, play); 
if (movieStats.failRatio > THRESHOLD){ 
alert(movieId, failRatio, timestamp); 
} 
}
First problem, each run requires 
reads + writes to data store per run 
Batch algorithm, runs every N minutes 
for(play in playAttempts()){ 
Stats movieStats = getStats(play.movieId); 
updateStats(movieStats, play); 
if (movieStats.failRatio > THRESHOLD){ 
alert(movieId, failRatio, timestamp); 
} 
}
For Mantis don’t want to pay that 
cost: for latency or storage 
Batch algorithm, runs every N minutes 
for(play in playAttempts()){ 
Stats movieStats = getStats(play.movieId); 
updateStats(movieStats, play); 
if (movieStats.failRatio > THRESHOLD){ 
alert(movieId, failRatio, timestamp); 
} 
}
Next problem, “pull” model great for 
batch processing, bit awkward for 
stream processing 
Batch algorithm, runs every N minutes 
for(play in playAttempts()){ 
Stats movieStats = getStats(play.movieId); 
updateStats(movieStats, play); 
if (movieStats.failRatio > THRESHOLD){ 
alert(movieId, failRatio, timestamp); 
} 
}
By definition, batch processing 
requires batches. How do I chunk 
my data? Or, how often do I run? 
Batch algorithm, runs every N minutes 
for(play in playAttempts()){ 
Stats movieStats = getStats(play.movieId); 
updateStats(movieStats, play); 
if (movieStats.failRatio > THRESHOLD){ 
alert(movieId, failRatio, timestamp); 
} 
}
For Mantis, prefer “push” model, 
natural approach to data-in-motion 
processing 
Batch algorithm, runs every N minutes 
for(play in playAttempts()){ 
Stats movieStats = getStats(play.movieId); 
updateStats(movieStats, play); 
if (movieStats.failRatio > THRESHOLD){ 
alert(movieId, failRatio, timestamp); 
} 
}
For our “push” API we decided 
to use Reactive Extensions (Rx)
Two reasons for choosing Rx: 
theoretical, practical 
1. Observable is a natural abstraction for 
stream processing, Observable = stream 
2. Rx already leveraged throughout the 
company
So, what is an Observable? 
A sequence of events, aka a stream 
Observable<String> o = 
Observable.just(“hello”, 
“qcon”, “SF”); 
o.subscribe(x->{println x;})
What can I do with an Observable? 
Apply operators → New observable 
Subscribe → Observer of data 
Operators, familiar lambda functions 
map(), flatMap(), scan(), ...
What is the connection with Mantis? 
In Mantis, a job (code-to-run) is the 
collection of operators applied to a 
sourced observable where the 
output is sinked to observers
Think of a “source” observable as 
the input to a job. 
Think of a “sink” observer as the 
output of the job.
Let’s refactor previous problem 
using Mantis API terminology 
Source: Play attempts 
Operators: Detection logic 
Sink: Alerting service
Sounds OK, but how will this scale? 
For pull model luxury of requesting 
data at specified rates/time 
Analogous to drinking 
from a straw
In contrast, push is the firehose 
No explicit control to limit the 
flow of data
In Mantis, we solve this problem 
by scaling horizontally
Horizontal scale is accomplished 
by arranging operators into 
logical “stages”, explicitly by job 
writer or implicitly with fancy 
tooling (future work)
A stage is a boundary between 
computations. The boundary 
may be a network boundary, 
process boundary, etc.
So, to scale, Mantis job is really, 
Source → input observable 
Stage(s) → operators 
Sink → output observer
Let’s refactor previous problem to 
follow the Mantis job API structure 
MantisJob 
.source(Netflix.PlayAttempts()) 
.stage({ // detection logic }) 
.sink(Alerting.email())
We need to provide a computation 
boundary to scale horizontally 
MantisJob 
.source(Netflix.PlayAttempts()) 
.stage({ // detection logic }) 
.sink(Alerting.email())
For our problem, scale is a function 
of the number of movies tracking 
MantisJob 
.source(Netflix.PlayAttempts()) 
.stage({ // detection logic }) 
.sink(Alerting.email())
Lets create two stages, one producing 
groups of movies, other to run detection 
MantisJob 
.source(Netflix.PlayAttempts()) 
.stage({ // groupBy movieId }) 
.stage({ // detection logic }) 
.sink(Alerting.email())
OK, computation logic is split, how is the 
code scheduled to resources for execution? 
MantisJob 
.source(Netflix.PlayAttempts()) 
.stage({ // groupBy movieId }) 
.stage({ // detection logic }) 
.sink(Alerting.email())
One, you tell the Mantis Scheduler explicitly 
at submit time: number of instances, CPU 
cores, memory, disk, network, per instance
Two, Mantis Scheduler learns how to 
schedule job (work in progress) 
Looks at previous run history 
Looks at history for source input 
Over/under provision, auto adjust
A scheduled job creates a topology 
Stage 1 Stage 2 Stage 3 
Worker 
Worker Worker 
Worker 
Worker 
Worker 
Worker 
Application 1 
Mantis cluster 
Application 
clusters 
Application 2 
Application 3 
cluster boundary
Computation is split, code is scheduled, 
how is data transmitted over stage 
boundary? 
Worker Worker ?
Depends on the service level agreement 
(SLA) for the Job, transport is pluggable 
Worker Worker ?
Decision is usually a trade-off 
between latency and fault tolerance 
Worker Worker ?
A “weak” SLA job might trade-off fault 
tolerance for speed, using TCP as the 
transport 
Worker Worker ?
A “strong” SLA job might trade-off speed 
for fault tolerance, using a queue/broker as 
a transport 
Worker Worker ?
Forks and joins require data partitioning, 
collecting over boundaries 
Worker 
Worker 
Worker 
Fork Worker 
Worker 
Worker 
Join 
Need to partition data Need to collect data
Mantis has native support for partitioning, 
collecting over scalars (T) and groups (K,V) 
Worker 
Worker 
Worker 
Fork Worker 
Worker 
Worker 
Join 
Need to partition data Need to collect data
Let’s refactor job to include SLA, for the 
detection use case we prefer low latency 
MantisJob 
.source(Netflix.PlayAttempts()) 
.stage({ // groupBy movieId }) 
.stage({ // detection logic }) 
.sink(Alerting.email()) 
.config(SLA.weak())
The job is scheduled and running what 
happens when the input-data’s volume 
changes? 
Previous scheduling decision may not hold 
Prefer not to over provision, goal is for cost 
insights <<< product
Good news, Mantis Scheduler has the 
ability to grow and shrink (autoscale) the 
cluster and jobs
The cluster can scale up/down for two 
reasons: more/less job (demand) or jobs 
themselves are growing/shrinking 
For cluster we can use submit pending 
queue depth as a proxy for demand 
For jobs we use backpressure as a proxy to 
grow shrink the job
Backpressure is “build up” in a system 
Imagine we have a two stage Mantis job, 
Second stage is performing a complex 
calculation, causing data to queue up in the 
previous stage 
We can use this signal to increase nodes at 
second stage
Having touched on key points in Mantis 
architecture, want to show a complete job 
definition
Source 
Stage 1 
Stage 2 
Sink
Play Attempts 
Grouping by 
movie Id 
Detection 
algorithm 
Email alert
Sourcing play attempts 
Static set of 
sources
Grouping by movie Id 
GroupBy operator 
returns key 
selector function
Simple detection algorithms 
Windows for 10 minutes 
or 1000 play events 
Reduce the window to 
an experiment, update 
counts 
Filter out if less than 
threshold
Sink alerts

More Related Content

What's hot

S4: Distributed Stream Computing Platform
S4: Distributed Stream Computing PlatformS4: Distributed Stream Computing Platform
S4: Distributed Stream Computing Platform
Aleksandar Bradic
 
Storm Anatomy
Storm AnatomyStorm Anatomy
Storm Anatomy
Eiichiro Uchiumi
 
Slide #1:Introduction to Apache Storm
Slide #1:Introduction to Apache StormSlide #1:Introduction to Apache Storm
Slide #1:Introduction to Apache Storm
Md. Shamsur Rahim
 
Apache Storm Concepts
Apache Storm ConceptsApache Storm Concepts
Apache Storm ConceptsAndré Dias
 
Yahoo compares Storm and Spark
Yahoo compares Storm and SparkYahoo compares Storm and Spark
Yahoo compares Storm and Spark
Chicago Hadoop Users Group
 
Multi-Tenant Storm Service on Hadoop Grid
Multi-Tenant Storm Service on Hadoop GridMulti-Tenant Storm Service on Hadoop Grid
Multi-Tenant Storm Service on Hadoop GridDataWorks Summit
 
Improved Reliable Streaming Processing: Apache Storm as example
Improved Reliable Streaming Processing: Apache Storm as exampleImproved Reliable Streaming Processing: Apache Storm as example
Improved Reliable Streaming Processing: Apache Storm as example
DataWorks Summit/Hadoop Summit
 
Storm: The Real-Time Layer - GlueCon 2012
Storm: The Real-Time Layer  - GlueCon 2012Storm: The Real-Time Layer  - GlueCon 2012
Storm: The Real-Time Layer - GlueCon 2012
Dan Lynn
 
Streams processing with Storm
Streams processing with StormStreams processing with Storm
Streams processing with StormMariusz Gil
 
Real Time Data Streaming using Kafka & Storm
Real Time Data Streaming using Kafka & StormReal Time Data Streaming using Kafka & Storm
Real Time Data Streaming using Kafka & Storm
Ran Silberman
 
PHP Backends for Real-Time User Interaction using Apache Storm.
PHP Backends for Real-Time User Interaction using Apache Storm.PHP Backends for Real-Time User Interaction using Apache Storm.
PHP Backends for Real-Time User Interaction using Apache Storm.
DECK36
 
Real-Time Streaming with Apache Spark Streaming and Apache Storm
Real-Time Streaming with Apache Spark Streaming and Apache StormReal-Time Streaming with Apache Spark Streaming and Apache Storm
Real-Time Streaming with Apache Spark Streaming and Apache Storm
Davorin Vukelic
 
Storm and Cassandra
Storm and Cassandra Storm and Cassandra
Storm and Cassandra
T Jake Luciani
 
Introduction to Apache Beam & No Shard Left Behind: APIs for Massive Parallel...
Introduction to Apache Beam & No Shard Left Behind: APIs for Massive Parallel...Introduction to Apache Beam & No Shard Left Behind: APIs for Massive Parallel...
Introduction to Apache Beam & No Shard Left Behind: APIs for Massive Parallel...
Dan Halperin
 
Apache Storm Internals
Apache Storm InternalsApache Storm Internals
Apache Storm Internals
Humoyun Ahmedov
 
Apache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing dataApache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing data
DataWorks Summit/Hadoop Summit
 
Realtime Risk Management Using Kafka, Python, and Spark Streaming by Nick Evans
Realtime Risk Management Using Kafka, Python, and Spark Streaming by Nick EvansRealtime Risk Management Using Kafka, Python, and Spark Streaming by Nick Evans
Realtime Risk Management Using Kafka, Python, and Spark Streaming by Nick Evans
Spark Summit
 
Introduction to Apache Storm
Introduction to Apache StormIntroduction to Apache Storm
Introduction to Apache Storm
Tiziano De Matteis
 
A Deeper Dive into EXPLAIN
A Deeper Dive into EXPLAINA Deeper Dive into EXPLAIN
A Deeper Dive into EXPLAIN
EDB
 
Distributed and Fault Tolerant Realtime Computation with Apache Storm, Apache...
Distributed and Fault Tolerant Realtime Computation with Apache Storm, Apache...Distributed and Fault Tolerant Realtime Computation with Apache Storm, Apache...
Distributed and Fault Tolerant Realtime Computation with Apache Storm, Apache...
Folio3 Software
 

What's hot (20)

S4: Distributed Stream Computing Platform
S4: Distributed Stream Computing PlatformS4: Distributed Stream Computing Platform
S4: Distributed Stream Computing Platform
 
Storm Anatomy
Storm AnatomyStorm Anatomy
Storm Anatomy
 
Slide #1:Introduction to Apache Storm
Slide #1:Introduction to Apache StormSlide #1:Introduction to Apache Storm
Slide #1:Introduction to Apache Storm
 
Apache Storm Concepts
Apache Storm ConceptsApache Storm Concepts
Apache Storm Concepts
 
Yahoo compares Storm and Spark
Yahoo compares Storm and SparkYahoo compares Storm and Spark
Yahoo compares Storm and Spark
 
Multi-Tenant Storm Service on Hadoop Grid
Multi-Tenant Storm Service on Hadoop GridMulti-Tenant Storm Service on Hadoop Grid
Multi-Tenant Storm Service on Hadoop Grid
 
Improved Reliable Streaming Processing: Apache Storm as example
Improved Reliable Streaming Processing: Apache Storm as exampleImproved Reliable Streaming Processing: Apache Storm as example
Improved Reliable Streaming Processing: Apache Storm as example
 
Storm: The Real-Time Layer - GlueCon 2012
Storm: The Real-Time Layer  - GlueCon 2012Storm: The Real-Time Layer  - GlueCon 2012
Storm: The Real-Time Layer - GlueCon 2012
 
Streams processing with Storm
Streams processing with StormStreams processing with Storm
Streams processing with Storm
 
Real Time Data Streaming using Kafka & Storm
Real Time Data Streaming using Kafka & StormReal Time Data Streaming using Kafka & Storm
Real Time Data Streaming using Kafka & Storm
 
PHP Backends for Real-Time User Interaction using Apache Storm.
PHP Backends for Real-Time User Interaction using Apache Storm.PHP Backends for Real-Time User Interaction using Apache Storm.
PHP Backends for Real-Time User Interaction using Apache Storm.
 
Real-Time Streaming with Apache Spark Streaming and Apache Storm
Real-Time Streaming with Apache Spark Streaming and Apache StormReal-Time Streaming with Apache Spark Streaming and Apache Storm
Real-Time Streaming with Apache Spark Streaming and Apache Storm
 
Storm and Cassandra
Storm and Cassandra Storm and Cassandra
Storm and Cassandra
 
Introduction to Apache Beam & No Shard Left Behind: APIs for Massive Parallel...
Introduction to Apache Beam & No Shard Left Behind: APIs for Massive Parallel...Introduction to Apache Beam & No Shard Left Behind: APIs for Massive Parallel...
Introduction to Apache Beam & No Shard Left Behind: APIs for Massive Parallel...
 
Apache Storm Internals
Apache Storm InternalsApache Storm Internals
Apache Storm Internals
 
Apache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing dataApache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing data
 
Realtime Risk Management Using Kafka, Python, and Spark Streaming by Nick Evans
Realtime Risk Management Using Kafka, Python, and Spark Streaming by Nick EvansRealtime Risk Management Using Kafka, Python, and Spark Streaming by Nick Evans
Realtime Risk Management Using Kafka, Python, and Spark Streaming by Nick Evans
 
Introduction to Apache Storm
Introduction to Apache StormIntroduction to Apache Storm
Introduction to Apache Storm
 
A Deeper Dive into EXPLAIN
A Deeper Dive into EXPLAINA Deeper Dive into EXPLAIN
A Deeper Dive into EXPLAIN
 
Distributed and Fault Tolerant Realtime Computation with Apache Storm, Apache...
Distributed and Fault Tolerant Realtime Computation with Apache Storm, Apache...Distributed and Fault Tolerant Realtime Computation with Apache Storm, Apache...
Distributed and Fault Tolerant Realtime Computation with Apache Storm, Apache...
 

Similar to QConSF 2014 talk on Netflix Mantis, a stream processing system

Bdu -stream_processing_with_smack_final
Bdu  -stream_processing_with_smack_finalBdu  -stream_processing_with_smack_final
Bdu -stream_processing_with_smack_final
manishduttpurohit
 
IMC Summit 2016 Breakout - Matt Coventon - Test Driving Streaming and CEP on ...
IMC Summit 2016 Breakout - Matt Coventon - Test Driving Streaming and CEP on ...IMC Summit 2016 Breakout - Matt Coventon - Test Driving Streaming and CEP on ...
IMC Summit 2016 Breakout - Matt Coventon - Test Driving Streaming and CEP on ...
In-Memory Computing Summit
 
Intelligent Monitoring
Intelligent MonitoringIntelligent Monitoring
Intelligent Monitoring
Intelie
 
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
Data Con LA
 
Moving Towards a Streaming Architecture
Moving Towards a Streaming ArchitectureMoving Towards a Streaming Architecture
Moving Towards a Streaming Architecture
Gabriele Modena
 
Monitoring as Software Validation
Monitoring as Software ValidationMonitoring as Software Validation
Monitoring as Software Validation
BioDec
 
Building Continuous Application with Structured Streaming and Real-Time Data ...
Building Continuous Application with Structured Streaming and Real-Time Data ...Building Continuous Application with Structured Streaming and Real-Time Data ...
Building Continuous Application with Structured Streaming and Real-Time Data ...
Databricks
 
Deep learning with kafka
Deep learning with kafkaDeep learning with kafka
Deep learning with kafka
Nitin Kumar
 
Counting with Prometheus (CloudNativeCon+Kubecon Europe 2017)
Counting with Prometheus (CloudNativeCon+Kubecon Europe 2017)Counting with Prometheus (CloudNativeCon+Kubecon Europe 2017)
Counting with Prometheus (CloudNativeCon+Kubecon Europe 2017)
Brian Brazil
 
PART-3 : Mastering RTOS FreeRTOS and STM32Fx with Debugging
PART-3 : Mastering RTOS FreeRTOS and STM32Fx with DebuggingPART-3 : Mastering RTOS FreeRTOS and STM32Fx with Debugging
PART-3 : Mastering RTOS FreeRTOS and STM32Fx with Debugging
FastBit Embedded Brain Academy
 
Embedded Intro India05
Embedded Intro India05Embedded Intro India05
Embedded Intro India05
Rajesh Gupta
 
Natural Laws of Software Performance
Natural Laws of Software PerformanceNatural Laws of Software Performance
Natural Laws of Software Performance
Gibraltar Software
 
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
MLconf
 
Flink 0.10 @ Bay Area Meetup (October 2015)
Flink 0.10 @ Bay Area Meetup (October 2015)Flink 0.10 @ Bay Area Meetup (October 2015)
Flink 0.10 @ Bay Area Meetup (October 2015)
Stephan Ewen
 
Big data 2.0, deep learning and financial Usecases
Big data 2.0, deep learning and financial UsecasesBig data 2.0, deep learning and financial Usecases
Big data 2.0, deep learning and financial Usecases
Arvind Rapaka
 
Spark streaming
Spark streamingSpark streaming
Spark streaming
Venkateswaran Kandasamy
 
Real time stream processing presentation at General Assemb.ly
Real time stream processing presentation at General Assemb.lyReal time stream processing presentation at General Assemb.ly
Real time stream processing presentation at General Assemb.ly
Varun Vijayaraghavan
 
Improved implementation of a Deadline Monotonic algorithm for aperiodic traff...
Improved implementation of a Deadline Monotonic algorithm for aperiodic traff...Improved implementation of a Deadline Monotonic algorithm for aperiodic traff...
Improved implementation of a Deadline Monotonic algorithm for aperiodic traff...
Andrea Tino
 
strata_spark_streaming.ppt
strata_spark_streaming.pptstrata_spark_streaming.ppt
strata_spark_streaming.ppt
rveiga100
 
C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian
C* Summit 2013: Time is Money Jake Luciani and Carl YeksigianC* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian
C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian
DataStax Academy
 

Similar to QConSF 2014 talk on Netflix Mantis, a stream processing system (20)

Bdu -stream_processing_with_smack_final
Bdu  -stream_processing_with_smack_finalBdu  -stream_processing_with_smack_final
Bdu -stream_processing_with_smack_final
 
IMC Summit 2016 Breakout - Matt Coventon - Test Driving Streaming and CEP on ...
IMC Summit 2016 Breakout - Matt Coventon - Test Driving Streaming and CEP on ...IMC Summit 2016 Breakout - Matt Coventon - Test Driving Streaming and CEP on ...
IMC Summit 2016 Breakout - Matt Coventon - Test Driving Streaming and CEP on ...
 
Intelligent Monitoring
Intelligent MonitoringIntelligent Monitoring
Intelligent Monitoring
 
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
 
Moving Towards a Streaming Architecture
Moving Towards a Streaming ArchitectureMoving Towards a Streaming Architecture
Moving Towards a Streaming Architecture
 
Monitoring as Software Validation
Monitoring as Software ValidationMonitoring as Software Validation
Monitoring as Software Validation
 
Building Continuous Application with Structured Streaming and Real-Time Data ...
Building Continuous Application with Structured Streaming and Real-Time Data ...Building Continuous Application with Structured Streaming and Real-Time Data ...
Building Continuous Application with Structured Streaming and Real-Time Data ...
 
Deep learning with kafka
Deep learning with kafkaDeep learning with kafka
Deep learning with kafka
 
Counting with Prometheus (CloudNativeCon+Kubecon Europe 2017)
Counting with Prometheus (CloudNativeCon+Kubecon Europe 2017)Counting with Prometheus (CloudNativeCon+Kubecon Europe 2017)
Counting with Prometheus (CloudNativeCon+Kubecon Europe 2017)
 
PART-3 : Mastering RTOS FreeRTOS and STM32Fx with Debugging
PART-3 : Mastering RTOS FreeRTOS and STM32Fx with DebuggingPART-3 : Mastering RTOS FreeRTOS and STM32Fx with Debugging
PART-3 : Mastering RTOS FreeRTOS and STM32Fx with Debugging
 
Embedded Intro India05
Embedded Intro India05Embedded Intro India05
Embedded Intro India05
 
Natural Laws of Software Performance
Natural Laws of Software PerformanceNatural Laws of Software Performance
Natural Laws of Software Performance
 
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
 
Flink 0.10 @ Bay Area Meetup (October 2015)
Flink 0.10 @ Bay Area Meetup (October 2015)Flink 0.10 @ Bay Area Meetup (October 2015)
Flink 0.10 @ Bay Area Meetup (October 2015)
 
Big data 2.0, deep learning and financial Usecases
Big data 2.0, deep learning and financial UsecasesBig data 2.0, deep learning and financial Usecases
Big data 2.0, deep learning and financial Usecases
 
Spark streaming
Spark streamingSpark streaming
Spark streaming
 
Real time stream processing presentation at General Assemb.ly
Real time stream processing presentation at General Assemb.lyReal time stream processing presentation at General Assemb.ly
Real time stream processing presentation at General Assemb.ly
 
Improved implementation of a Deadline Monotonic algorithm for aperiodic traff...
Improved implementation of a Deadline Monotonic algorithm for aperiodic traff...Improved implementation of a Deadline Monotonic algorithm for aperiodic traff...
Improved implementation of a Deadline Monotonic algorithm for aperiodic traff...
 
strata_spark_streaming.ppt
strata_spark_streaming.pptstrata_spark_streaming.ppt
strata_spark_streaming.ppt
 
C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian
C* Summit 2013: Time is Money Jake Luciani and Carl YeksigianC* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian
C* Summit 2013: Time is Money Jake Luciani and Carl Yeksigian
 

More from Danny Yuan

Streaming Analytics in Uber
Streaming Analytics in Uber Streaming Analytics in Uber
Streaming Analytics in Uber
Danny Yuan
 
Streaming Processing in Uber Marketplace for Kafka Summit 2016
Streaming Processing in Uber Marketplace for Kafka Summit 2016Streaming Processing in Uber Marketplace for Kafka Summit 2016
Streaming Processing in Uber Marketplace for Kafka Summit 2016
Danny Yuan
 
QCon SF-2015 Stream Processing in uber
QCon SF-2015 Stream Processing in uberQCon SF-2015 Stream Processing in uber
QCon SF-2015 Stream Processing in uber
Danny Yuan
 
Elasticsearch in Netflix
Elasticsearch in NetflixElasticsearch in Netflix
Elasticsearch in Netflix
Danny Yuan
 
netflix-real-time-data-strata-talk
netflix-real-time-data-strata-talknetflix-real-time-data-strata-talk
netflix-real-time-data-strata-talk
Danny Yuan
 
Strata lightening-talk
Strata lightening-talkStrata lightening-talk
Strata lightening-talk
Danny Yuan
 

More from Danny Yuan (6)

Streaming Analytics in Uber
Streaming Analytics in Uber Streaming Analytics in Uber
Streaming Analytics in Uber
 
Streaming Processing in Uber Marketplace for Kafka Summit 2016
Streaming Processing in Uber Marketplace for Kafka Summit 2016Streaming Processing in Uber Marketplace for Kafka Summit 2016
Streaming Processing in Uber Marketplace for Kafka Summit 2016
 
QCon SF-2015 Stream Processing in uber
QCon SF-2015 Stream Processing in uberQCon SF-2015 Stream Processing in uber
QCon SF-2015 Stream Processing in uber
 
Elasticsearch in Netflix
Elasticsearch in NetflixElasticsearch in Netflix
Elasticsearch in Netflix
 
netflix-real-time-data-strata-talk
netflix-real-time-data-strata-talknetflix-real-time-data-strata-talk
netflix-real-time-data-strata-talk
 
Strata lightening-talk
Strata lightening-talkStrata lightening-talk
Strata lightening-talk
 

Recently uploaded

Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
TravisMalana
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
axoqas
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
ewymefz
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
ukgaet
 
Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
benishzehra469
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
ewymefz
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
vcaxypu
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
NABLAS株式会社
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
ewymefz
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
ewymefz
 
Business update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIBusiness update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMI
AlejandraGmez176757
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
nscud
 
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape ReportSOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
Subhajit Sahu
 
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdfSample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Linda486226
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
vcaxypu
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Subhajit Sahu
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
yhkoc
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
Tabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflowsTabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflows
alex933524
 

Recently uploaded (20)

Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
 
Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
 
Business update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIBusiness update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMI
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
 
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape ReportSOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape Report
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
 
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdfSample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
Tabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflowsTabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflows
 

QConSF 2014 talk on Netflix Mantis, a stream processing system

  • 1. Mantis: Netflix's Event Stream Processing System Justin Becker Danny Yuan 10/26/2014
  • 4. Netflix wants Internet TV to work just as well
  • 5.
  • 6. Especially when things aren’t working
  • 7.
  • 9. Need to track all kinds of permutations
  • 10. Detect quickly, to resolve ASAP
  • 12. Here comes the problem
  • 13. 12 ~ 20 million metrics updates per second
  • 14. 750 billion metrics updates per day
  • 15.
  • 16.
  • 17. 4 ~ 18 Million App Events Per Second > 300 Billion Events Per Day
  • 18.
  • 19.
  • 20.
  • 21.
  • 22.
  • 23.
  • 24.
  • 25.
  • 27. They Require Lots of Domain Knowledge
  • 28. A System to Rule Them All
  • 29.
  • 30.
  • 31. Mantis: A Stream Processing System
  • 32.
  • 33. 3
  • 34. 1. Versatile User Demands
  • 35.
  • 36.
  • 37.
  • 38.
  • 39.
  • 41.
  • 42.
  • 43.
  • 44.
  • 45.
  • 46. 3. Same Source, Multiple Consumers
  • 47.
  • 48.
  • 49.
  • 50. We want Internet TV to just work
  • 51. One problem we need to solve, detect movies that are failing?
  • 52. Do it fast → limit impact, fix early Do it at scale → for all permutations Do it cheap → cost detect <<< serve
  • 53. Work through the details for how to solve this problem in Mantis
  • 54. Goal is to highlight unique and interesting design features
  • 55. … begin with batch approach, the non-Mantis approach Batch algorithm, runs every N minutes for(play in playAttempts()){ Stats movieStats = getStats(play.movieId); updateStats(movieStats, play); if (movieStats.failRatio > THRESHOLD){ alert(movieId, failRatio, timestamp); } }
  • 56. First problem, each run requires reads + writes to data store per run Batch algorithm, runs every N minutes for(play in playAttempts()){ Stats movieStats = getStats(play.movieId); updateStats(movieStats, play); if (movieStats.failRatio > THRESHOLD){ alert(movieId, failRatio, timestamp); } }
  • 57. For Mantis don’t want to pay that cost: for latency or storage Batch algorithm, runs every N minutes for(play in playAttempts()){ Stats movieStats = getStats(play.movieId); updateStats(movieStats, play); if (movieStats.failRatio > THRESHOLD){ alert(movieId, failRatio, timestamp); } }
  • 58. Next problem, “pull” model great for batch processing, bit awkward for stream processing Batch algorithm, runs every N minutes for(play in playAttempts()){ Stats movieStats = getStats(play.movieId); updateStats(movieStats, play); if (movieStats.failRatio > THRESHOLD){ alert(movieId, failRatio, timestamp); } }
  • 59. By definition, batch processing requires batches. How do I chunk my data? Or, how often do I run? Batch algorithm, runs every N minutes for(play in playAttempts()){ Stats movieStats = getStats(play.movieId); updateStats(movieStats, play); if (movieStats.failRatio > THRESHOLD){ alert(movieId, failRatio, timestamp); } }
  • 60. For Mantis, prefer “push” model, natural approach to data-in-motion processing Batch algorithm, runs every N minutes for(play in playAttempts()){ Stats movieStats = getStats(play.movieId); updateStats(movieStats, play); if (movieStats.failRatio > THRESHOLD){ alert(movieId, failRatio, timestamp); } }
  • 61. For our “push” API we decided to use Reactive Extensions (Rx)
  • 62. Two reasons for choosing Rx: theoretical, practical 1. Observable is a natural abstraction for stream processing, Observable = stream 2. Rx already leveraged throughout the company
  • 63. So, what is an Observable? A sequence of events, aka a stream Observable<String> o = Observable.just(“hello”, “qcon”, “SF”); o.subscribe(x->{println x;})
  • 64. What can I do with an Observable? Apply operators → New observable Subscribe → Observer of data Operators, familiar lambda functions map(), flatMap(), scan(), ...
  • 65. What is the connection with Mantis? In Mantis, a job (code-to-run) is the collection of operators applied to a sourced observable where the output is sinked to observers
  • 66. Think of a “source” observable as the input to a job. Think of a “sink” observer as the output of the job.
  • 67. Let’s refactor previous problem using Mantis API terminology Source: Play attempts Operators: Detection logic Sink: Alerting service
  • 68. Sounds OK, but how will this scale? For pull model luxury of requesting data at specified rates/time Analogous to drinking from a straw
  • 69. In contrast, push is the firehose No explicit control to limit the flow of data
  • 70. In Mantis, we solve this problem by scaling horizontally
  • 71. Horizontal scale is accomplished by arranging operators into logical “stages”, explicitly by job writer or implicitly with fancy tooling (future work)
  • 72. A stage is a boundary between computations. The boundary may be a network boundary, process boundary, etc.
  • 73. So, to scale, Mantis job is really, Source → input observable Stage(s) → operators Sink → output observer
  • 74. Let’s refactor previous problem to follow the Mantis job API structure MantisJob .source(Netflix.PlayAttempts()) .stage({ // detection logic }) .sink(Alerting.email())
  • 75. We need to provide a computation boundary to scale horizontally MantisJob .source(Netflix.PlayAttempts()) .stage({ // detection logic }) .sink(Alerting.email())
  • 76. For our problem, scale is a function of the number of movies tracking MantisJob .source(Netflix.PlayAttempts()) .stage({ // detection logic }) .sink(Alerting.email())
  • 77. Lets create two stages, one producing groups of movies, other to run detection MantisJob .source(Netflix.PlayAttempts()) .stage({ // groupBy movieId }) .stage({ // detection logic }) .sink(Alerting.email())
  • 78. OK, computation logic is split, how is the code scheduled to resources for execution? MantisJob .source(Netflix.PlayAttempts()) .stage({ // groupBy movieId }) .stage({ // detection logic }) .sink(Alerting.email())
  • 79. One, you tell the Mantis Scheduler explicitly at submit time: number of instances, CPU cores, memory, disk, network, per instance
  • 80. Two, Mantis Scheduler learns how to schedule job (work in progress) Looks at previous run history Looks at history for source input Over/under provision, auto adjust
  • 81. A scheduled job creates a topology Stage 1 Stage 2 Stage 3 Worker Worker Worker Worker Worker Worker Worker Application 1 Mantis cluster Application clusters Application 2 Application 3 cluster boundary
  • 82. Computation is split, code is scheduled, how is data transmitted over stage boundary? Worker Worker ?
  • 83. Depends on the service level agreement (SLA) for the Job, transport is pluggable Worker Worker ?
  • 84. Decision is usually a trade-off between latency and fault tolerance Worker Worker ?
  • 85. A “weak” SLA job might trade-off fault tolerance for speed, using TCP as the transport Worker Worker ?
  • 86. A “strong” SLA job might trade-off speed for fault tolerance, using a queue/broker as a transport Worker Worker ?
  • 87. Forks and joins require data partitioning, collecting over boundaries Worker Worker Worker Fork Worker Worker Worker Join Need to partition data Need to collect data
  • 88. Mantis has native support for partitioning, collecting over scalars (T) and groups (K,V) Worker Worker Worker Fork Worker Worker Worker Join Need to partition data Need to collect data
  • 89. Let’s refactor job to include SLA, for the detection use case we prefer low latency MantisJob .source(Netflix.PlayAttempts()) .stage({ // groupBy movieId }) .stage({ // detection logic }) .sink(Alerting.email()) .config(SLA.weak())
  • 90. The job is scheduled and running what happens when the input-data’s volume changes? Previous scheduling decision may not hold Prefer not to over provision, goal is for cost insights <<< product
  • 91. Good news, Mantis Scheduler has the ability to grow and shrink (autoscale) the cluster and jobs
  • 92. The cluster can scale up/down for two reasons: more/less job (demand) or jobs themselves are growing/shrinking For cluster we can use submit pending queue depth as a proxy for demand For jobs we use backpressure as a proxy to grow shrink the job
  • 93. Backpressure is “build up” in a system Imagine we have a two stage Mantis job, Second stage is performing a complex calculation, causing data to queue up in the previous stage We can use this signal to increase nodes at second stage
  • 94. Having touched on key points in Mantis architecture, want to show a complete job definition
  • 95. Source Stage 1 Stage 2 Sink
  • 96. Play Attempts Grouping by movie Id Detection algorithm Email alert
  • 97. Sourcing play attempts Static set of sources
  • 98. Grouping by movie Id GroupBy operator returns key selector function
  • 99. Simple detection algorithms Windows for 10 minutes or 1000 play events Reduce the window to an experiment, update counts Filter out if less than threshold