Sangeeta Narayanan
Edge Developer Experience @Netflix
@sangeetan
Operational Visibility at Global Scale
Velocity Conf Santa Clara 2016
A Determined Subscriber and a Passion-filled Telenovela
> 81M Subscribers
[Chart: Subscribers per year, in millions (value axis 20 to 80), 2007 to 2015; a second version of the chart extends to 2016, marked "??"]
Retention
Acquisition
> 81M DELIGHTED Subscribers
and counting…
How do you measure delight?
Measurement
Visibility Insights
Business Insights
Operational Insights
Business Insights + Operational Insights
The Rebelde Problem
Granularity
http://bit.ly/1YBdMrT
Latency
Perspective
http://bit.ly/28LPb37
Context
http://bit.ly/28KOUZ5
Fine Grained
Realtime
Varying Perspectives
Contextual
SCALABLE
What we have built
http://bit.ly/1ZYO5Qx
Realtime Insights Applications
• Fine-grained event stream querying
• Aggregate Health Dashboards
• Anomaly detection
• Alerting
• Correlations
Stream Processing Platform
Mantis
High Throughput
Low latency
Cost Efficient
Stream Processing Platform
http://techblog.netflix.com/2016/03/stream-processing-with-mantis.html
• Cloud Native - Elastic worker jobs & Elastic clusters
• Mixed workload on same cluster
• Data filtering at source
• Built-in backpressure
• Deep integration with Netflix ecosystem
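Two of the bullets above, data filtering at source and built-in backpressure, can be illustrated with a small sketch. This is not the Mantis API: the names `produce`, `consume`, and `is_tv_search_error`, the event fields, and the queue size are all hypothetical. It only shows the general idea of a producer that drops non-matching events before they leave it and blocks when the consumer falls behind.

```python
import queue
import threading
from typing import Any, Callable, Dict, Iterable, List

Event = Dict[str, Any]
_DONE = object()  # sentinel marking the end of the stream


def produce(events: Iterable[Event],
            predicate: Callable[[Event], bool],
            out: "queue.Queue[Any]") -> None:
    """Filter at the source, then hand matching events to a bounded queue."""
    for event in events:
        if predicate(event):   # non-matching events never leave the producer
            out.put(event)     # blocks while the queue is full (backpressure)
    out.put(_DONE)


def consume(inp: "queue.Queue[Any]") -> List[Event]:
    """Drain the queue until the sentinel arrives."""
    received: List[Event] = []
    while True:
        item = inp.get()
        if item is _DONE:
            return received
        received.append(item)


def is_tv_search_error(event: Event) -> bool:
    """Hypothetical filter: server errors on /search from TV devices."""
    return (event["device"] == "tv"
            and event["path"] == "/search"
            and event["status"] >= 500)


# Hypothetical sample data: even ids come from TVs, odd ids from phones.
sample_events = [
    {"id": i, "device": "tv" if i % 2 == 0 else "phone",
     "path": "/search", "status": 500}
    for i in range(10)
]

channel: "queue.Queue[Any]" = queue.Queue(maxsize=2)  # small bound forces blocking
producer = threading.Thread(
    target=produce, args=(sample_events, is_tv_search_error, channel))
producer.start()
matched = consume(channel)
producer.join()
```

The bounded queue is the simplest stand-in for backpressure: a slow consumer slows the producer instead of letting unprocessed events pile up in memory.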
Realtime Insights Applications: Fine-grained event stream querying
Device Type
Filtered down to 5 events
On-Demand Metric Definition
Rapid iterations
Recipes
Persistence for later use
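One way to picture on-demand metric definition with saved recipes is a metric described as plain data: a filter, a grouping dimension, and a count. The sketch below is an assumption made for illustration, not the tool shown in the talk; `MetricDefinition`, `evaluate`, `to_recipe`, `from_recipe`, and the event fields are hypothetical names, and the filter supports equality matches only.

```python
import json
from collections import Counter
from dataclasses import asdict, dataclass
from typing import Any, Dict, Iterable

Event = Dict[str, Any]


@dataclass
class MetricDefinition:
    name: str
    where: Dict[str, Any]  # field -> required value (equality match only)
    group_by: str          # dimension to count by, e.g. device type

    def matches(self, event: Event) -> bool:
        return all(event.get(k) == v for k, v in self.where.items())


def evaluate(metric: MetricDefinition, events: Iterable[Event]) -> Counter:
    """Count matching events per value of the group_by dimension."""
    counts: Counter = Counter()
    for event in events:
        if metric.matches(event):
            counts[event[metric.group_by]] += 1
    return counts


def to_recipe(metric: MetricDefinition) -> str:
    """Serialize a definition so it can be stored and reused later."""
    return json.dumps(asdict(metric), sort_keys=True)


def from_recipe(text: str) -> MetricDefinition:
    return MetricDefinition(**json.loads(text))


# Hypothetical sample events.
events = [
    {"device_type": "tv", "path": "/search", "status": 500},
    {"device_type": "phone", "path": "/search", "status": 500},
    {"device_type": "tv", "path": "/search", "status": 200},
    {"device_type": "tv", "path": "/search", "status": 500},
    {"device_type": "tv", "path": "/play", "status": 500},
]

search_errors = MetricDefinition(
    name="search_errors_by_device",
    where={"path": "/search", "status": 500},
    group_by="device_type",
)
counts = evaluate(search_errors, events)
restored = from_recipe(to_recipe(search_errors))
```

Because the definition is data rather than code, it can be changed and re-evaluated quickly, and the JSON form can be kept as a recipe for later use.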
Realtime Insights Applications: Aggregate Health Dashboards
/someDevice/search
Highly responsive, interactive UI
Dynamic data visualizations
Dimensions/Facets
Integration with multiple data sources
Realtime Insights Applications: Anomaly detection, Alerting, Correlations
Anomaly Detection
Correlation Analysis
http://bit.ly/1sMbUAe
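The slides do not say which anomaly detection method is used. As a generic illustration only, here is a minimal rolling z-score detector over a time series; `detect_anomalies`, the window size, the threshold, and the sample data are all assumptions chosen for the example.

```python
from collections import deque
from statistics import mean, pstdev
from typing import Deque, Iterable, List, Tuple


def detect_anomalies(values: Iterable[float],
                     window: int = 5,
                     threshold: float = 3.0) -> List[Tuple[int, float]]:
    """Flag points that lie more than `threshold` standard deviations
    from the mean of the preceding `window` points."""
    history: Deque[float] = deque(maxlen=window)
    anomalies: List[Tuple[int, float]] = []
    for index, value in enumerate(values):
        if len(history) == window:       # wait until the window is full
            mu = mean(history)
            sigma = pstdev(history)
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                anomalies.append((index, value))
        history.append(value)            # the oldest point drops out automatically
    return anomalies


# Hypothetical errors-per-minute series with one spike at index 6.
errors_per_minute = [10, 12, 11, 9, 10, 11, 60, 10, 12]
spikes = detect_anomalies(errors_per_minute)
```

A detector this simple lets a spike inflate the baseline for the next few points, which is one reason production systems tend to use more robust statistics.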
Event Streams
Time series Data
Device Telemetry
Dist sys & per node traces
Infra Changes
Aggregate Health Dashboards
Fine grained event stream querying
Anomaly detection & Alerting
App Session Profiling
Request Profiling
Instance level profiling
Cassandra, ES, …
Back to Rebelde
Anomaly Detection + Alerting -> Rapid detection
Event Stream Querying -> Precise debugging
Dynamic Metrics -> Short feedback loop
Profiling -> ‘Shift left’ Insights
> 81M DELIGHTED Customers
and counting…
Thank You
Stream Processing with Mantis
Atlas: Netflix Telemetry Platform
http://netflix.github.io/
