Mantis: Netflix's Event Stream Processing System
Justin Becker, Danny Yuan
10/26/2014
Motivation
Traditional TV just works
Netflix wants Internet TV to work just as well
Especially when things aren’t working
Tracking “big signals” is not enough
Need to track all kinds of permutations
Detect quickly, to resolve ASAP
Cheaper than product services
Here comes the problem
12 ~ 20 million metrics updates per second
750 billion metrics updates per day
4 ~ 18 Million App Events Per Second 
> 300 Billion Events Per Day
Existing tools solve specific problems
They require lots of domain knowledge
A System to Rule Them All
Mantis: A Stream Processing System
Three requirements:
1. Versatile User Demands
2. Asynchronous Computation Everywhere
3. Same Source, Multiple Consumers
We want Internet TV to just work
One problem we need to solve: detect movies that are failing
Do it fast → limit impact, fix early 
Do it at scale → for all permutations 
Do it cheap → cost to detect <<< cost to serve
Work through the details of how to solve this problem in Mantis
Goal is to highlight unique and interesting design features
… begin with the batch approach, the non-Mantis approach

Batch algorithm, runs every N minutes:

for (play in playAttempts()) {
    Stats movieStats = getStats(play.movieId);
    updateStats(movieStats, play);
    if (movieStats.failRatio > THRESHOLD) {
        alert(play.movieId, movieStats.failRatio, timestamp);
    }
}
First problem: each run requires reads and writes to the data store
For Mantis, we don’t want to pay that cost, in latency or storage
Next problem: the “pull” model is great for batch processing, but awkward for stream processing
By definition, batch processing requires batches. How do I chunk my data? Or, how often do I run?
For Mantis, we prefer the “push” model, the natural approach to data-in-motion processing
For our “push” API we decided to use Reactive Extensions (Rx)
Two reasons for choosing Rx: one theoretical, one practical
1. Observable is a natural abstraction for stream processing, Observable = stream
2. Rx is already leveraged throughout the company
So, what is an Observable?
A sequence of events, aka a stream

Observable<String> o =
    Observable.just("hello", "qcon", "SF");
o.subscribe(x -> System.out.println(x));
What can I do with an Observable?
Apply operators → New observable
Subscribe → Observer of data
Operators are familiar lambda functions: map(), flatMap(), scan(), ...
What is the connection with Mantis?
In Mantis, a job (code-to-run) is the collection of operators applied to a sourced observable, where the output is sinked to observers
Think of a “source” observable as the input to a job.
Think of a “sink” observer as the output of the job.
Let’s refactor the previous problem using Mantis API terminology
Source: Play attempts
Operators: Detection logic
Sink: Alerting service
Sounds OK, but how will this scale?
The pull model has the luxury of requesting data at specified rates/times
Analogous to drinking from a straw
In contrast, push is the firehose
No explicit control to limit the flow of data
In Mantis, we solve this problem by scaling horizontally
Horizontal scale is accomplished by arranging operators into logical “stages”, explicitly by the job writer or implicitly with fancy tooling (future work)
A stage is a boundary between computations. The boundary may be a network boundary, process boundary, etc.
So, to scale, a Mantis job is really:
Source → input observable 
Stage(s) → operators 
Sink → output observer
Let’s refactor the previous problem to follow the Mantis job API structure

MantisJob
    .source(Netflix.PlayAttempts())
    .stage({ // detection logic })
    .sink(Alerting.email())
We need to provide a computation boundary to scale horizontally
For our problem, scale is a function of the number of movies being tracked
Let’s create two stages: one producing groups of movies, the other running detection logic

MantisJob
    .source(Netflix.PlayAttempts())
    .stage({ // groupBy movieId })
    .stage({ // detection logic })
    .sink(Alerting.email())
OK, the computation logic is split; how is the code scheduled onto resources for execution?
One, you tell the Mantis Scheduler explicitly at submit time: number of instances, CPU cores, memory, disk, and network, per instance
Two, the Mantis Scheduler learns how to schedule the job (work in progress)
Looks at previous run history
Looks at history for the source input
Over/under provisions, auto-adjusts
A scheduled job creates a topology
[diagram: workers arranged into Stage 1 → Stage 2 → Stage 3 inside the Mantis cluster, fed by application clusters (Application 1, 2, 3) across the cluster boundary]
Computation is split and code is scheduled; how is data transmitted over a stage boundary?
It depends on the service level agreement (SLA) for the job; the transport is pluggable
The decision is usually a trade-off between latency and fault tolerance
A “weak” SLA job might trade off fault tolerance for speed, using TCP as the transport
A “strong” SLA job might trade off speed for fault tolerance, using a queue/broker as the transport
Forks and joins require partitioning and collecting data over boundaries
[diagram: a fork partitions data across downstream workers; a join collects it]
Mantis has native support for partitioning and collecting over scalars (T) and groups (K,V)
Let’s refactor the job to include an SLA; for the detection use case we prefer low latency

MantisJob
    .source(Netflix.PlayAttempts())
    .stage({ // groupBy movieId })
    .stage({ // detection logic })
    .sink(Alerting.email())
    .config(SLA.weak())
The job is scheduled and running; what happens when the input data’s volume changes?
The previous scheduling decision may not hold
Prefer not to over-provision; the goal is cost of insights <<< cost of product
Good news: the Mantis Scheduler has the ability to grow and shrink (autoscale) the cluster and jobs
The cluster can scale up/down for two reasons: more/less job demand, or the jobs themselves growing/shrinking
For the cluster, we can use submit-pending queue depth as a proxy for demand
For jobs, we use backpressure as a proxy to grow or shrink the job
Backpressure is “build up” in a system
Imagine a two-stage Mantis job where the second stage performs a complex calculation, causing data to queue up in the previous stage
We can use this signal to increase nodes at the second stage
Having touched on the key points of the Mantis architecture, we want to show a complete job definition
Source → Stage 1 → Stage 2 → Sink
Play Attempts → Grouping by movie Id → Detection algorithm → Email alert
Sourcing play attempts: a static set of sources
Grouping by movie Id: the groupBy operator takes a key selector function
Simple detection algorithm:
Windows of 10 minutes or 1000 play events
Reduce each window to an experiment, update counts
Filter out if less than threshold
Sink alerts
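Finally, a minimal sketch of the alerting sink as the terminal subscriber; Alerting.email()'s internals aren't shown in the deck, so sendEmail here is a stand-in:

import rx.Observable;

class EmailSinkSketch {
    public static void main(String[] args) {
        Observable<String> alerts = Observable.just("movie 42 failRatio=0.7");
        // Terminal observer: each alert event becomes one email.
        alerts.subscribe(
                alert -> sendEmail("oncall@example.com", "Mantis alert", alert),
                err -> System.err.println("sink error: " + err));
    }

    // Hypothetical helper standing in for a real mail client.
    static void sendEmail(String to, String subject, String body) {
        System.out.printf("email to %s: %s - %s%n", to, subject, body);
    }
}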
About this talk: Justin and I gave this talk at QCon SF 2014 about Mantis, a stream processing system that features a reactive programming API, auto scaling, and stream locality.
