
Unbounded / Bounded Data - Strange Loop 2016 - Monal Daxini

The need to glean answers from data in real time is moving from nicety to necessity. There are few options for analyzing a never-ending stream of unbounded data at scale. Let’s compare and contrast the core principles and technologies of the different open source solutions available to help with this endeavor, and consider where processing engines need to evolve to meet processing needs at scale. These findings are based on the experience of continuing to build a scalable solution in the cloud that processes over 700 billion events at Netflix, and on how we are embarking on the next journey to evolve unbounded data processing engines.


  1. 1. DEEP DIVE INTO UNBOUNDED DATA PROCESSING SYSTEMS Monal Daxini, Engineering Manager, Stream Processing, Real Time Data Infrastructure @monaldax @Netflix #keystone Sep 17 2016
  2. 2. The Unbounded Data Domain Photo: Monal Daxini
  3. 3. Overloaded Term ● Streaming is used to mean ○ an infinite set (streaming data) or ○ a type of data processing engine (stream processor)
  4. 4. Overloaded Term ● Batch is used to mean ○ a finite set (batched) or ○ a type of execution engine (batch processor)
  5. 5. Let’s be precise ● Unbounded → infinite data elements (order not implied) ● Bounded → finite data elements (order not implied) ● Streaming / Batch → used exclusively to describe the execution engine
  6. 6. Hence.. ● Streaming execution engine could process out-of-order unbounded or bounded data ● Batch execution engine could process out-of-order unbounded or bounded data
  7. 7. How do I choose an engine? Based on tradeoffs between ● Latency ● Accuracy (correctness) ● Cost Choose an engine that lets you make these tradeoffs for each use case.
  8. 8. Processing semantics ● At-most-once processing ● At-least-once processing ● Exactly-once processing*
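
These semantics largely come down to when offsets or checkpoints are committed relative to the processing side effects. A minimal sketch, assuming a plain Kafka Java consumer (recent kafka-clients) rather than any particular engine, of why committing only after processing gives at-least-once; the topic and group names are made up:

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class AtLeastOnceConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("group.id", "click-counter");
            props.put("enable.auto.commit", "false");   // commit manually, after processing
            props.put("key.deserializer",
                      "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                      "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("clicks"));
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<String, String> record : records) {
                        process(record.value());        // side effect happens first...
                    }
                    consumer.commitSync();              // ...then the offset is committed
                    // A crash between processing and commitSync() means the batch is
                    // reprocessed, never lost: at-least-once. Committing before
                    // processing would instead give at-most-once.
                }
            }
        }

        private static void process(String event) {
            System.out.println("processed: " + event);
        }
    }
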
  9. 9. Accurate bounded data processing Easy, just ● Reprocess the finite set again on failure ○ More efficient with checkpointing
  10. 10. Accurate unbounded data processing Needs 1. Consistent state (correctness) - across failure via checkpointing 2. Tools / techniques to reason about time
  11. 11. Events & Time ● Event Time - time the event was created ● Ingest Time - time the event was ingested into the engine ● Processing Time - time the event is processed
  12. 12. Events & Time Image: Flink 1.2 documentation
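
As a concrete illustration (a minimal sketch using the Flink 1.x DataStream API, the engine whose documentation the image comes from, not code from the talk), the choice between these notions of time is typically a one-line engine setting:

    import org.apache.flink.streaming.api.TimeCharacteristic;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class TimeSetup {
        public static void main(String[] args) {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            // Windows and watermarks are driven by the time the event was created
            env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
            // Alternatives: TimeCharacteristic.IngestionTime  (time the event entered the engine)
            //               TimeCharacteristic.ProcessingTime (wall clock of the processing operator)
        }
    }
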
  13. 13. Windowing ● Not needed for operating on each element ● Needed for some operations on unbounded data ○ grouping: aggregations, outer joins
  14. 14. Windowing Types ● Aligned - count or time based ○ Sliding ○ Fixed - Tumbling, Hopping ● Unaligned ○ Session / Dynamic
  15. 15. Event-Time Based Tumbling Windows [diagram: input arriving in processing time, output grouped into event-time windows, 10:00-15:00] Adapted from: The Apache Beam Model, Tyler Akidau, Frances Perry
  16. 16. Process-Time Based Sliding Windows [diagram: input and output both along processing time, 10:00-15:00] Adapted from: The Apache Beam Model, Tyler Akidau, Frances Perry
  17. 17. Session Windows (Unaligned) [diagram: input in processing time, output grouped into per-key sessions split by the gap duration] Adapted from: The Apache Beam Model, Tyler Akidau, Frances Perry
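
The three shapes above map directly onto window assigners in, for example, the Flink DataStream API. A hedged sketch, where "clicks" and the Click type (with movieId and count fields) are hypothetical placeholders, not anything from the talk:

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.windowing.assigners.EventTimeSessionWindows;
    import org.apache.flink.streaming.api.windowing.assigners.SlidingProcessingTimeWindows;
    import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
    import org.apache.flink.streaming.api.windowing.time.Time;

    /** Sketch only: Click is a hypothetical POJO with public movieId and count fields. */
    public class WindowShapes {
        static void examples(DataStream<Click> clicks) {
            // Aligned, fixed (tumbling) windows in event time
            clicks.keyBy(c -> c.movieId)
                  .window(TumblingEventTimeWindows.of(Time.minutes(4)))
                  .sum("count");

            // Aligned, sliding windows in processing time (10-minute window, sliding every 2 minutes)
            clicks.keyBy(c -> c.movieId)
                  .window(SlidingProcessingTimeWindows.of(Time.minutes(10), Time.minutes(2)))
                  .sum("count");

            // Unaligned session windows, closed after a 30-minute gap in activity
            clicks.keyBy(c -> c.movieId)
                  .window(EventTimeSessionWindows.withGap(Time.minutes(30)))
                  .sum("count");
        }
    }
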
  18. 18. Triggers ● When to compute and materialize window results ○ Before a window completes (early firing) ○ At window completion ○ After window completion (late firing)
  19. 19. Click count example - click impressions for a movie within a row ● What - click count per listed movie, enriched with movie metadata ● Where - every 4 mins of event time ● When - trigger every 2 mins (wall clock) and at 4 mins (event time) ● How - update the count for late events (mobile reconnects) Reference: Dataflow Model
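
Expressed in the Apache Beam API (introduced a few slides later), the four questions line up with window, trigger, allowed lateness, and accumulation mode. A hedged sketch, assuming a hypothetical PCollection of (movieId, 1) pairs that already carry event-time timestamps; the 1-hour allowed lateness is an illustrative choice, and the metadata enrichment is omitted:

    import org.apache.beam.sdk.transforms.Sum;
    import org.apache.beam.sdk.transforms.windowing.AfterPane;
    import org.apache.beam.sdk.transforms.windowing.AfterProcessingTime;
    import org.apache.beam.sdk.transforms.windowing.AfterWatermark;
    import org.apache.beam.sdk.transforms.windowing.FixedWindows;
    import org.apache.beam.sdk.transforms.windowing.Window;
    import org.apache.beam.sdk.values.KV;
    import org.apache.beam.sdk.values.PCollection;
    import org.joda.time.Duration;

    /** Sketch only: clicks is assumed to be (movieId, 1) pairs with event-time timestamps attached. */
    public class ClickCount {
        static PCollection<KV<String, Long>> count(PCollection<KV<String, Long>> clicks) {
            return clicks
                // Where: 4-minute event-time windows
                .apply(Window.<KV<String, Long>>into(FixedWindows.of(Duration.standardMinutes(4)))
                    // When: at the watermark, plus an early pane every 2 minutes of wall-clock
                    // time, plus a late pane for every element behind the watermark
                    .triggering(AfterWatermark.pastEndOfWindow()
                        .withEarlyFirings(AfterProcessingTime.pastFirstElementInPane()
                            .plusDelayOf(Duration.standardMinutes(2)))
                        .withLateFirings(AfterPane.elementCountAtLeast(1)))
                    .withAllowedLateness(Duration.standardHours(1))
                    // How: late panes refine (update) the previously emitted count
                    .accumulatingFiredPanes())
                // What: click count per movie (metadata enrichment omitted)
                .apply(Sum.longsPerKey());
        }
    }
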
  20. 20. Watermark - Reasoning about completeness Watermarks describe event-time progress: "No timestamp earlier than the watermark will be seen." [diagram: processing time vs. event time, ideal watermark] Adapted from: The Apache Beam Model, Tyler Akidau, Frances Perry
  21. 21. Watermark example
  22. 22. Watermark in Practice ● Often heuristic-based ○ events with timestamp < watermark can still show up ● Too slow? Results are delayed. ● Too fast? Some data is late. [diagram: processing time vs. event time, watermark skew] Adapted from: The Apache Beam Model, Tyler Akidau, Frances Perry
  23. 23. Late Data Handling (challenging) The watermark is heuristic-based. For late data ● Emit the late click count and let the sink or a downstream app accumulate ● Emit the correct value - fetch the earlier count, add the late click count, emit to the sink Reference: Dataflow Model
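
In Flink's DataStream API, for example, the heuristic watermark and the "emit an updated value" option can be sketched like this. A hedged sketch: Click, its fields, and the 30-second / 10-minute bounds are assumptions for illustration, not values from the talk:

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
    import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
    import org.apache.flink.streaming.api.windowing.time.Time;

    /** Sketch only: Click is a hypothetical POJO with movieId, count, and eventTimeMillis fields. */
    public class LateDataExample {
        static DataStream<Click> countsWithLateUpdates(DataStream<Click> clicks) {
            // Heuristic watermark: trail the maximum observed event time by 30 seconds
            DataStream<Click> stamped = clicks.assignTimestampsAndWatermarks(
                new BoundedOutOfOrdernessTimestampExtractor<Click>(Time.seconds(30)) {
                    @Override
                    public long extractTimestamp(Click c) {
                        return c.eventTimeMillis;
                    }
                });

            // Keep windows around for 10 extra minutes; an event behind the watermark
            // triggers a late firing that re-emits an updated count (the first option above).
            return stamped
                .keyBy(c -> c.movieId)
                .window(TumblingEventTimeWindows.of(Time.minutes(4)))
                .allowedLateness(Time.minutes(10))
                .sum("count");
        }
    }
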
  24. 24. Apache Beam (incubating)
  25. 25. Stream Processing Requirements
  26. 26. Dataflow Functionality (review) ● Time support ○ Event, Processing, and Ingestion time ● Windowing ○ Fixed, Sliding, Session / Dynamic ● Watermark (completeness) ● Deal with late data ● Event Processing Semantics ● Checkpoints / Savepoints ○ Metadata & Data
  27. 27. Functional Features ● Map, Filter, projection, grouping, etc. ● Joining - streams with other streams or static data ● Chain functionality - DAG of transformations ● Support for different event sources and sinks ● Streaming SQL
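
A hedged sketch of that functional surface in Flink-flavoured Java (recent Flink; the event format and the extractMovieId helper are hypothetical). Stream-to-stream joins follow the same chained style, pairing two keyed streams over a common window:

    import org.apache.flink.api.common.typeinfo.Types;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStream;

    /** Sketch only: filtering, projection, and grouping chained into a small DAG. */
    public class FunctionalSurface {
        static DataStream<Tuple2<String, Long>> playCounts(DataStream<String> rawEvents) {
            return rawEvents
                .filter(line -> line.contains("\"type\":\"play\""))        // filtering
                .map(line -> Tuple2.of(extractMovieId(line), 1L))          // projection
                .returns(Types.TUPLE(Types.STRING, Types.LONG))            // type hint (Java erasure)
                .keyBy(0)                                                  // grouping by movieId
                .sum(1);
        }

        // Hypothetical helper: a real job would use a proper deserializer / schema.
        static String extractMovieId(String line) {
            return line.replaceAll(".*\"movieId\":\"([^\"]*)\".*", "$1");
        }
    }
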
  28. 28. Operational Features ● Job level monitoring & alerts ● Job lifecycle management ● Backpressure ● Auto scaling ● Dynamic rebalancing ● Event traceability ● Multi-tenant ● Quick prototyping (REPL / Notebook)
  29. 29. Runtime & Data flow Execution Engines
  30. 30. What architecture would one end up with? [diagram: Event Producers → Temp Event Store → Runtime Execution Engine with Local State → Sinks]
  31. 31. Kafka - topic with 4 partitions Image: Kafka 0.9 documentation
  32. 32. Samza 0.9 Architecture ● Single threaded loop ○ Process ○ Window ○ Commit Image: Samza documentation
  33. 33. Samza State Management Local state Image: Samza documentation
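
A hedged sketch of what slides 32-33 describe, using the Samza 0.9-era task API: process() handles one element at a time on the single task thread, the local KeyValueStore holds the running counts, and window() periodically emits them downstream. All store, stream, and field names here are made up, not Keystone's:

    import org.apache.samza.config.Config;
    import org.apache.samza.storage.kv.Entry;
    import org.apache.samza.storage.kv.KeyValueIterator;
    import org.apache.samza.storage.kv.KeyValueStore;
    import org.apache.samza.system.IncomingMessageEnvelope;
    import org.apache.samza.system.OutgoingMessageEnvelope;
    import org.apache.samza.system.SystemStream;
    import org.apache.samza.task.InitableTask;
    import org.apache.samza.task.MessageCollector;
    import org.apache.samza.task.StreamTask;
    import org.apache.samza.task.TaskContext;
    import org.apache.samza.task.TaskCoordinator;
    import org.apache.samza.task.WindowableTask;

    public class ClickCountTask implements StreamTask, WindowableTask, InitableTask {
        private static final SystemStream OUTPUT = new SystemStream("kafka", "click-counts");
        private KeyValueStore<String, Integer> counts;

        @Override
        @SuppressWarnings("unchecked")
        public void init(Config config, TaskContext context) {
            // Local (e.g. RocksDB-backed) state, changelogged to Kafka for recovery
            counts = (KeyValueStore<String, Integer>) context.getStore("click-counts-store");
        }

        @Override
        public void process(IncomingMessageEnvelope envelope, MessageCollector collector,
                            TaskCoordinator coordinator) {
            // "process": called once per element on the single task thread
            String movieId = (String) envelope.getKey();
            Integer current = counts.get(movieId);
            counts.put(movieId, current == null ? 1 : current + 1);
        }

        @Override
        public void window(MessageCollector collector, TaskCoordinator coordinator) {
            // "window": called periodically on the same thread; flush counts downstream
            KeyValueIterator<String, Integer> it = counts.all();
            try {
                while (it.hasNext()) {
                    Entry<String, Integer> e = it.next();
                    collector.send(new OutgoingMessageEnvelope(OUTPUT, e.getKey(), e.getValue()));
                }
            } finally {
                it.close();
            }
            // "commit": offsets are checkpointed by the container at its configured interval
        }
    }
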
  34. 34. Samza Job Digraph Kafka Topic Image: Samza documentation
  35. 35. Spark Streaming - Microbatch [diagram: each microbatch is an immutable RDD] Image adapted from: Tathagata Das, Deep Dive into Spark, 2016
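
A minimal, self-contained sketch of the microbatch model using the Spark Streaming Java API (Spark 2.x signatures): every 2-second interval becomes an immutable RDD on which the same batch-style transformations run; the host, port, and local master are placeholders:

    import java.util.Arrays;
    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaDStream;
    import org.apache.spark.streaming.api.java.JavaPairDStream;
    import org.apache.spark.streaming.api.java.JavaReceiverInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import scala.Tuple2;

    public class MicrobatchWordCount {
        public static void main(String[] args) throws InterruptedException {
            SparkConf conf = new SparkConf().setAppName("microbatch-word-count").setMaster("local[2]");
            // Each 2-second batch interval is materialized as an immutable RDD
            JavaStreamingContext ssc = new JavaStreamingContext(conf, Durations.seconds(2));

            JavaReceiverInputDStream<String> lines = ssc.socketTextStream("localhost", 9999);
            JavaDStream<String> words =
                lines.flatMap(line -> Arrays.asList(line.split("\\s+")).iterator());
            JavaPairDStream<String, Integer> counts = words
                .mapToPair(w -> new Tuple2<>(w, 1))
                .reduceByKey(Integer::sum);

            counts.print();              // one small batch result per 2-second interval
            ssc.start();
            ssc.awaitTermination();
        }
    }
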
  36. 36. Image: Source - link
  37. 37. Pipelining RDD Partition Image: Lisa Hua, 2014
  38. 38. Spark Structured Streaming (2.0, experimental) (abstraction atop microbatch) Image: Tathagata Das, Deep Dive into Spark, 2016
  39. 39. Spark Structured Streaming (2.0, experimental) (abstraction atop microbatch) Image: Tathagata Das, Deep Dive into Spark, 2016
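
A minimal sketch of the Structured Streaming abstraction (Spark 2.0 Java API): the unbounded socket input is treated as an ever-growing table and the familiar DataFrame operations apply to it; the socket source and console sink are illustrative choices:

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;
    import org.apache.spark.sql.streaming.StreamingQuery;

    public class StructuredClickCount {
        public static void main(String[] args) throws Exception {
            SparkSession spark = SparkSession.builder().appName("structured-click-count").getOrCreate();

            Dataset<Row> clicks = spark.readStream()
                .format("socket")
                .option("host", "localhost")
                .option("port", 9999)
                .load();                                 // unbounded table with a "value" column

            Dataset<Row> counts = clicks.groupBy("value").count();

            StreamingQuery query = counts.writeStream()
                .outputMode("complete")                  // re-emit the full updated result table
                .format("console")
                .start();
            query.awaitTermination();
        }
    }
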
  40. 40. Spark Execution Image: Spark 2.0 documentation
  41. 41. Image: Flink 1.2 documentation Flink
  42. 42. Image: Flink 1.2 documentation Flink
  43. 43. Flink Image: Flink 1.2 documentation
  44. 44. Flink Image: Flink 1.2 documentation
  45. 45. Flink Image: Flink 1.2 documentation
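
For comparison with the Spark sketches above, a minimal self-contained Flink DataStream job (Flink 1.x API, matching the documentation credited on these slides); the socket source, 5-second window, and print sink are illustrative choices, not from the talk:

    import org.apache.flink.api.common.functions.FlatMapFunction;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.windowing.time.Time;
    import org.apache.flink.util.Collector;

    public class StreamingWordCount {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            DataStream<Tuple2<String, Integer>> counts = env
                .socketTextStream("localhost", 9999)
                .flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
                    @Override
                    public void flatMap(String line, Collector<Tuple2<String, Integer>> out) {
                        for (String word : line.split("\\s+")) {
                            out.collect(new Tuple2<>(word, 1));
                        }
                    }
                })
                .keyBy(0)                                // key by word
                .timeWindow(Time.seconds(5))             // processing-time window by default
                .sum(1);                                 // count per word per window

            counts.print();
            env.execute("streaming word count");
        }
    }
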
  46. 46. Ponder: the unbounded data processing paradigm & runtime as a platform to build applications?
  47. 47. Unbounded Data Processing in Practice
  48. 48. Keystone [components: Keystone Stream Processing (SPaaS), Keystone Management, Keystone Messaging, Schema Support] 100% in AWS
  49. 49. Our Philosophy Create DuploⓇ Blocks: let reusability drive new value
  50. 50. Netflix Service Scale
  51. 51. World’s Leading Internet Streaming Service (Global launch Jan 6, 2016)
  52. 52. ● 83+ Million Members, 190+ Countries ● 1000+ device types ● 35% of downstream Internet traffic Netflix Service Scale
  53. 53. Netflix Service Scale - Daily viewing hours 125,000,000,000+ Whoa!
  54. 54. Events Processed / day 1,000,000,000,000+ 1.4 PB That’s a huge number!
  55. 55. Event Scale Peak ● 1T unique events ingested/day ● 16M / sec ● 43GB / sec ● 10MB / message Daily Averages ● 1T+ events processed ● 600B unique events ingested ● 1.4 PB / day ● 4K / event
  56. 56. Keystone Scale 99.99%+ Availability / Four 9s
  57. 57. Keystone Events Trend ● 1/2014 - 80B / day ● 1/2015 - 300B / day ● 1/2016 - 1T+ / day
  58. 58. Evolution SPaaS
  59. 59. Where are we? Season 0
  60. 60. Keystone Management
  61. 61. Per Stream Auto Dashboard
  62. 62. Keystone [diagram: Event Producer → KSProxy → Fronting Kafka → Router → Consumer Kafka / EMR / Stream Consumers; Control Plane with Self Service UI] 100% in AWS, 24 x 7, Region failover
  63. 63. Keystone Messaging Kafka Clusters
  64. 64. Keystone [diagram: Event Producer → Fronting Kafka → Samza Router → Consumer Kafka / EMR / Stream Consumers; Control Plane; Checkpoint Cluster]
  65. 65. Kafka Kong - at least once a week
  66. 66. Failover [diagram: Event Producer switches from the failed Fronting Kafka (X) to a Stand-In Fronting Kafka feeding the Samza Router]
  67. 67. Fully Automated Failover ● Time is of the essence - failover in as little as 5 minutes
  68. 68. Event Flow
  69. 69. Keystone [diagram: Event Producer (KSLib) → KSProxy → Fronting Kafka → Samza Router → Consumer Kafka / EMR / Stream Consumers; Control Plane with Self Service UI]
  70. 70. Keystone [same pipeline diagram]
  71. 71. Keystone [same pipeline diagram]
  72. 72. Keystone [same pipeline diagram]
  73. 73. Keystone [same pipeline diagram, plus the Checkpoint Cluster]
  74. 74. Details.. ○ Massively parallel use-case ■ Per element processing - declarative filtering & projection ○ Stateless except Kafka offset checkpointing state
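
To make "declarative filtering & projection" concrete, here is a purely hypothetical illustration (not Keystone's actual API or code) of what a stateless per-element router step does with each event; the only durable state elsewhere is the Kafka offset checkpoint:

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public class RouteStep {
        private final String filterField;
        private final String filterValue;
        private final List<String> projectedFields;

        public RouteStep(String filterField, String filterValue, List<String> projectedFields) {
            this.filterField = filterField;
            this.filterValue = filterValue;
            this.projectedFields = projectedFields;
        }

        /** Returns the projected event, or null if the event is filtered out. */
        public Map<String, Object> apply(Map<String, Object> event) {
            if (!filterValue.equals(event.get(filterField))) {
                return null;                               // filtered: nothing reaches the sink
            }
            Map<String, Object> projected = new HashMap<>();
            for (String field : projectedFields) {
                if (event.containsKey(field)) {
                    projected.put(field, event.get(field)); // projection: only requested fields
                }
            }
            return projected;                              // no state kept beyond the Kafka offset
        }
    }
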
  75. 75. Routing Infrastructure + Checkpointing Cluster [Samza 0.9.1, Go, C]
  76. 76. KSNode Container Runtime [diagram: EC2 instances each running jobs under a KSNodeAgent; Zookeeper for instance id assignment; Checkpointing Cluster; immutable job config from the Self-Service UI / Control Plane; reconcile loop every 1 min]
  77. 77. Keystone Scale ● 4000+ Kafka brokers ● 16,000+ Samza jobs in Docker containers ○ On 2000+ nodes
  78. 78. What’s Next? Season 1 - Pilot & Plot: SPaaS
  79. 79. Stream Processing As a Service (SPaaS)
  80. 80. SPaaS Vision (plot) ● Self Service ● Multi-tenant support for stateful stream processing apps ● Autoscaling managed infrastructure ● Support for schemas
  81. 81. SPaaS Architecture (plot) [diagram: Tooling / Dashboard submits a job (DSL / SQL) to the SPaaS Manager via a framework-specific API or the common API (Beam); the manager 1. creates and 2. submits a Dockerized job, then 3. launches the runner as a running job on the Titus container runtime]
  82. 82. SPaaS - “Beam Me Up, Scotty!” Why Apache Beam? ○ Portable API layer to build sophisticated data processing apps ■ supports multiple execution engines ○ Unified model & API over bounded and unbounded data sources ○ MillWheel, FlumeJava, Dataflow model lineage
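
The portability argument in code form: a hedged sketch of selecting the execution engine through Beam's pipeline options (these are real Beam classes, but the wiring is illustrative rather than SPaaS's actual setup):

    import org.apache.beam.runners.flink.FlinkPipelineOptions;
    import org.apache.beam.runners.flink.FlinkRunner;
    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.options.PipelineOptionsFactory;

    public class PortablePipeline {
        public static void main(String[] args) {
            // The pipeline is written once against the Beam API; the engine is a launch-time choice
            FlinkPipelineOptions options =
                PipelineOptionsFactory.fromArgs(args).as(FlinkPipelineOptions.class);
            options.setRunner(FlinkRunner.class);   // swap in DirectRunner, SparkRunner, ... instead

            Pipeline pipeline = Pipeline.create(options);
            // ... apply the same transforms as the click-count sketch earlier (read, window, trigger, count) ...
            pipeline.run().waitUntilFinish();
        }
    }
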
  83. 83. SPaaS - Pilot Iterative build-out ● First - Flink on Titus in VPC, AWS ○ Titus is a cloud runtime platform for container-based jobs ● Next - Apache Beam with the Flink runner
  84. 84. SPaaS-Flink Pilot Use Cases [diagram: Event Producer → Fronting Kafka → 1. Flink Router → Consumer Kafka / EMR / Stream Consumers, 2. Demux, 3. Merge; Control Plane with Self Service UI]
  85. 85. Flink Program Deployment (prod shadow) [diagram: Job Manager (master) and Job Manager (standby) coordinated via Zookeeper; Task Managers running in Titus Jobs across Titus Hosts 1-5, each with its own IP via AWS VPC ENI]
  86. 86. Titus High Level Architecture [diagram: Titus UI and CI/CD → Titus API (Rhea) → Titus Master (job management & scheduler, Fenzo) backed by Cassandra and Zookeeper → Mesos Master → Titus Agents running Docker containers (Titus executor, logging and metrics agents, zfs, SPaaS-Flink containers); Docker Registry, S3, EC2 Autoscaling API]
  87. 87. CD
  88. 88. Flink Router perf test (YMMV) ○ Note ■ The tests were performed on a specific use case, ■ running in a specific environment, ■ with one specific event stream and setup.
  89. 89. Keystone [pipeline diagram: use case 1, the Samza Router path from Fronting Kafka to Consumer Kafka / EMR / Stream Consumers]
  90. 90. Details.. ○ Different runtimes for Flink & Samza routers ○ Massively parallel use-case ■ Per element processing ○ Focused on net outcomes
  91. 91. Flink (1.2) Router [same deployment diagram: Job Managers with Zookeeper, Task Managers in Titus Jobs on Titus hosts, per-container IPs via AWS VPC ENI, backed state]
  92. 92. Flink Router Perf Test (YMMV) ○ Cost ≅ 17% savings ○ Memory utilization ≅ 16% better ○ CPU utilization ≅ 40% better ○ Network utilization ≅ 10% better ○ Msg. throughput ≅ 1% (avg) - 4% (peak) better
  93. 93. The story has just begun… We have lots of challenges ahead. Photo: Monal Daxini
  94. 94. More brain food... ● Netflix Keystone Pipeline Evolution ● Netflix Kafka in Keystone Pipeline ● Samza Meetup Presentation ● Titus talk ● Netflix OSS
