
Apache Flink @ Tel Aviv / Herzliya Meetup


Slides of my Apache Flink talk at the "Big Things" meetup at Proofpoint and the "Apache Flink" meetup at Clicktale in January 2018


  1. 1. Apache Flink: Building a Stream Processor for fast analytics, event-driven applications, event time, and tons of state. Robert Metzger, @rmetzger_, robert@data-artisans.com
  2. 2. Original creators of Apache Flink®. dA Platform 2: Open Source Apache Flink + dA Application Manager
  3. 3. Agenda for today
       § Streaming history / definition
       § Apache Flink intro
       § Production users and use cases
       § Building blocks
       § APIs
       § (Implementation)
       § Q&A
  4. 4. What is Streaming and Stream Processing?
       § First wave for streaming was lambda architecture
         • Aid batch systems to be more real-time
       § Second wave was analytics (real time and lag-time)
         • Based on distributed collections, functions, and windows
       § The next wave is much broader: a new architecture to bring together analytics and event-driven applications
  5. 5. Apache Flink 101 5
  6. 6. Apache Flink
       § Apache Flink is an open source stream processing framework
         • Low latency
         • High throughput
         • Stateful
         • Distributed
       § Developed at the Apache Software Foundation
       § Flink 1.4.0 is the latest release, used in production
  7. 7. What is Apache Flink? Stateful Computations Over Data Streams
       § Batch Processing: process static and historic data
       § Data Stream Processing: realtime results from data streams
       § Event-driven Applications: data-driven actions and services
  8. 8. What is Apache Flink? Stateful computations over streams, real-time and historic: fast, scalable, fault tolerant, in-memory, event time, large state, exactly-once. [Diagram labels: Queries, Applications, Devices, etc.; Database; Stream; File / Object Storage; Historic Data; Streams; Application]
  9. 9. Hardened at scale 9 AthenaX Streaming SQL Platform Service trillion messages per day Streaming Platform as a Service Fraud detection Streaming Analytics Platform 100s jobs, 1000s nodes, TBs state metrics, analytics, real time ML Streaming SQL as a platform
  10. 10. AthenaX @ Uber 10
  11. 11. Blink on Flink @ 11
  12. 12. DriveTribe: social network implemented using event sourcing and CQRS (Command Query Responsibility Segregation) on Kafka/Flink/Elasticsearch/Redis. More: https://data-artisans.com/blog/drivetribe-cqrs-apache-flink
  13. 13. Popular Apache Flink use cases
        § Streaming Analytics and Pipelines (e.g., Netflix)
        § Streaming ML and Search (e.g., Alibaba)
        § Streaming SQL Metrics, Analytics, Alerting Platform (e.g., Uber)
        § Streaming Fraud Detection (e.g., ING)
        § CQRS Social Network (e.g., DriveTribe)
        § Streaming Trade Processing (reference upon request)
          • Reporting (compliance), position keeping, balance sheets, risk
  14. 14. Building blocks 14
  15. 15. The Core Building Blocks
        § Event Streams: real-time and hindsight
        § State: complex business logic
        § (Event) Time: consistency with out-of-order data and late data
        § Snapshots: forking / versioning / time-travel
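  16. 16. Stateful Event & Stream Processing
        Streaming dataflow: Source → Transformation → Transformation → Sink (in this program: Source → Transform → Window (state read/write) → Sink)

          // Source: read raw lines from Kafka
          val lines: DataStream[String] = env.addSource(new FlinkKafkaConsumer09(…))

          // Transform: parse each line into an Event
          val events: DataStream[Event] = lines.map((line) => parse(line))

          // Window (state read/write): 5-second windows per sensor.
          // sum() only accepts a field name or index, so the custom
          // function is passed to aggregate() instead.
          val stats: DataStream[Statistic] = events
            .keyBy("sensor")
            .timeWindow(Time.seconds(5))
            .aggregate(new MyAggregationFunction())

          // Sink: write the results to files
          stats.addSink(new RollingSink(path))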
  17. 17. Stateful Event & Stream Processing: Source → Filter / Transform (state read/write) → Sink
  18. 18. Stateful Event & Stream Processing: scalable embedded state. Access at memory speed; scales with parallel operators
  19. 19. Stateful Event & Stream Processing: re-load state, reset positions in input streams, rolling back computation, re-processing
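  20. 20. Time: Different Notions of Time
        Pipeline: Event Producer → Message Queue (partition 1, partition 2) → Flink Data Source → Flink Window Operator
        § Event Time: at the event producer
        § Broker Time: at the message queue
        § Ingestion Time: at the Flink data source
        § Window Processing Time: at the Flink window operator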
  21. 21. Time: Event Time Example
        Processing time (release year): 1977 Episode IV, 1980 Episode V, 1983 Episode VI, 1999 Episode I, 2002 Episode II, 2005 Episode III, 2015 Episode VII
        Event time: the order of the episodes themselves (Episode I through Episode VII)
  22. 22. Recap: The Core Building Blocks
        § Event Streams: real-time and hindsight
        § State: complex business logic
        § (Event) Time: consistency with out-of-order data and late data
        § Snapshots: forking / versioning / time-travel
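The event time example above can be replayed in a few lines of plain Scala. This is only an illustrative sketch: the `Episode` case class and the choice to encode event time as the episode number are assumptions made here, not something taken from the slides or from Flink.

```scala
object EventTimeExample {
  // processingTime: release year, i.e. when the record reached the audience.
  // eventTime: position in the story, i.e. when the event "happened"
  // (encoded as the episode number, an assumption for this sketch).
  final case class Episode(title: String, processingTime: Int, eventTime: Int)

  val episodes: List[Episode] = List(
    Episode("Episode IV", 1977, 4),
    Episode("Episode V", 1980, 5),
    Episode("Episode VI", 1983, 6),
    Episode("Episode I", 1999, 1),
    Episode("Episode II", 2002, 2),
    Episode("Episode III", 2005, 3),
    Episode("Episode VII", 2015, 7)
  )

  // Order in which the records arrived.
  def byProcessingTime: List[String] =
    episodes.sortBy(_.processingTime).map(_.title)

  // Order in which the events actually occurred.
  def byEventTime: List[String] =
    episodes.sortBy(_.eventTime).map(_.title)
}
```

Sorting by processing time starts with Episode IV, while sorting by event time starts with Episode I, which is the out-of-order situation that event time processing is meant to handle.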
  23. 23. APIs 23
  24. 24. The APIs
        § Process Function (events, state, time)
        § DataStream API (streams, windows)
        § Table API (dynamic tables)
        § Stream SQL
        Use cases spanned by the API layers: Stateful Event-Driven Applications, Stream- & Batch Processing, Analytics
  25. 25. Process Function

          class MyFunction extends ProcessFunction[MyEvent, Result] {

            // declare state to use in the program
            lazy val state: ValueState[CountWithTimestamp] = getRuntimeContext().getState(…)

            override def processElement(event: MyEvent, ctx: Context, out: Collector[Result]): Unit = {
              // work with event and state
              (event, state.value) match { … }

              out.collect(…)   // emit events
              state.update(…)  // modify state

              // schedule a timer callback
              ctx.timerService.registerEventTimeTimer(event.timestamp + 500)
            }

            override def onTimer(timestamp: Long, ctx: OnTimerContext, out: Collector[Result]): Unit = {
              // handle callback when event-/processing-time instant is reached
            }
          }
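  26. 26. Data Stream API

          // Source: read raw lines from Kafka
          val lines: DataStream[String] = env.addSource(new FlinkKafkaConsumer09(…))

          // Transform: parse each line into an Event
          val events: DataStream[Event] = lines.map((line) => parse(line))

          // Window: 5-second windows per sensor.
          // sum() only accepts a field name or index, so the custom
          // function is passed to aggregate() instead.
          val stats: DataStream[Statistic] = events
            .keyBy("sensor")
            .timeWindow(Time.seconds(5))
            .aggregate(new MyAggregationFunction())

          // Sink: write the results to files
          stats.addSink(new RollingSink(path))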
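To make the keyed window step of the program above more concrete, here is a minimal sketch in plain Scala that groups a finite batch of events by sensor and by 5-second tumbling window and sums their values. It does not use Flink at all; the `Event` and `Statistic` fields, the `windowStart` helper, and the assumption of non-negative millisecond timestamps are all made up for illustration.

```scala
object TumblingWindowSketch {
  // Hypothetical record types; the slides do not show their fields.
  final case class Event(sensor: String, timestampMs: Long, value: Double)
  final case class Statistic(sensor: String, windowStartMs: Long, sum: Double)

  // Start of the tumbling window containing the timestamp
  // (assumes non-negative timestamps).
  def windowStart(timestampMs: Long, sizeMs: Long): Long =
    timestampMs - (timestampMs % sizeMs)

  // Group by (key, window) and sum the values, which is roughly what
  // keyBy + timeWindow + an aggregation compute on a stream.
  def aggregate(events: Seq[Event], sizeMs: Long): Seq[Statistic] =
    events
      .groupBy(e => (e.sensor, windowStart(e.timestampMs, sizeMs)))
      .map { case ((sensor, start), es) =>
        Statistic(sensor, start, es.map(_.value).sum)
      }
      .toSeq
      .sortBy(s => (s.sensor, s.windowStartMs))
}
```

A real stream processor cannot wait for all input before grouping; it keeps the per-key, per-window sum as state and emits it once time passes the end of the window.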
  27. 27. Table API & Stream SQL 27
  28. 28. Implementation 28
  29. 29. Implementation: Scheduling and Deployment 29
  30. 30. Processes, Scheduling, Deployment 30
  31. 31. Implementation: State Checkpointing 31
  32. 32. State, Snapshots, Recovery: coordination via markers, injected into the streams
  33. 33. 33
  34. 34. 34
  35. 35. 35
  36. 36. 36
  37. 37. Implementation: Rescaling 37
  38. 38. Rescaling State / Elasticity
        ▪ Similar to consistent hashing
        ▪ Split key space into key groups
        ▪ Assign key groups to tasks
        [Diagram: key space divided into Key group #1, Key group #2, Key group #3, Key group #4]
  39. 39. Rescaling State / Elasticity
        ▪ Rescaling changes key group assignment
        ▪ Maximum parallelism defined by #key groups
        ▪ Rescaling happens through restoring a savepoint using the new parallelism
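The key group idea from the two rescaling slides can be sketched as two small functions: one maps a key to a key group (fixed for the lifetime of the job), the other maps a key group to a parallel task (changes on rescaling). This is a simplified illustration; the function names are hypothetical, and the plain `hashCode` used here is an assumption that stands in for whatever hashing Flink applies internally.

```scala
object KeyGroupSketch {
  // Key -> key group. Depends only on the key and on the number of
  // key groups (the maximum parallelism), never on current parallelism.
  def keyGroupFor(key: Any, maxParallelism: Int): Int =
    Math.floorMod(key.hashCode, maxParallelism)

  // Key group -> task index. Each task receives a contiguous range
  // of key groups.
  def taskFor(keyGroup: Int, maxParallelism: Int, parallelism: Int): Int =
    keyGroup * parallelism / maxParallelism

  // Full assignment: task index -> key groups owned by that task.
  def assignment(maxParallelism: Int, parallelism: Int): Map[Int, Seq[Int]] =
    (0 until maxParallelism)
      .groupBy(kg => taskFor(kg, maxParallelism, parallelism))
}
```

With 4 key groups, a parallelism of 2 gives each task two key groups, and rescaling to 4 gives each task one. A key never changes its key group, so state can be redistributed by moving whole key groups, and no more than `maxParallelism` tasks can ever receive state.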
  40. 40. Implementation: Time-handling 40
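  41. 41. Time: Different Notions of Time
        Pipeline: Event Producer → Message Queue (partition 1, partition 2) → Flink Data Source → Flink Window Operator
        § Event Time: at the event producer
        § Broker Time: at the message queue
        § Ingestion Time: at the Flink data source
        § Window Processing Time: at the Flink window operator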
  42. 42. Time: Watermarks
        [Figure: two event streams, one in order and one out of order, each shown as a sequence of events labeled with their event timestamps and interleaved with watermarks (W(11), W(17), W(20)). Legend: Watermark, Event, Event timestamp]
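As a rough illustration of how watermarks relate to out-of-order events, the sketch below generates watermarks that trail the highest timestamp seen so far by a fixed bound, and then flags events that arrive after a watermark has already passed their timestamp. It is plain Scala, not Flink code; the type names, the `maxOutOfOrderness` parameter, and the policy of emitting a watermark whenever it advances are assumptions chosen for this example.

```scala
object WatermarkSketch {
  sealed trait Element
  final case class Event(timestamp: Long) extends Element
  // Watermark(t): no more events with timestamp <= t are expected.
  final case class Watermark(timestamp: Long) extends Element

  // Interleave watermarks that lag the max seen timestamp by a fixed bound.
  def withWatermarks(timestamps: Seq[Long], maxOutOfOrderness: Long): Seq[Element] = {
    var maxSeen = Long.MinValue
    var lastWatermark = Long.MinValue
    timestamps.flatMap { ts =>
      maxSeen = math.max(maxSeen, ts)
      val candidate = maxSeen - maxOutOfOrderness
      if (candidate > lastWatermark) {
        lastWatermark = candidate
        Seq[Element](Event(ts), Watermark(candidate))
      } else {
        Seq[Element](Event(ts))
      }
    }
  }

  // Events whose timestamp is already covered by an earlier watermark.
  def lateEvents(stream: Seq[Element]): List[Event] = {
    var currentWatermark = Long.MinValue
    val late = scala.collection.mutable.ListBuffer.empty[Event]
    stream.foreach {
      case Watermark(t) => currentWatermark = math.max(currentWatermark, t)
      case e @ Event(t) => if (t <= currentWatermark) late += e
    }
    late.toList
  }
}
```

A larger bound tolerates more disorder but delays results, because windows can only be closed once the watermark passes their end.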
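  43. 43. Time: Watermarks in Parallel
        [Figure: parallel dataflow Source (1), Source (2) → map (1), map (2) → window (1), window (2). Events are labeled id|timestamp (for example A|30, B|31, C|30, D|15); the sources generate watermarks such as W(33) and W(17); each operator shows the event time at its input streams (for example 29 and 14) and the resulting event time at the operator (for example 14). Legend: Watermark, Event [id|timestamp], Event time at the operator, Event time at input streams, Watermark generation]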
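The relationship between the event time at the input streams and the event time at the operator can be sketched as follows: an operator with several inputs tracks the latest watermark per input and takes the minimum as its own event time. The `OperatorClock` class and its method names are hypothetical and exist only for this illustration; it is plain Scala, not Flink's internal code.

```scala
object ParallelWatermarkSketch {
  final class OperatorClock(numInputs: Int) {
    // Latest watermark received on each input channel.
    private val inputWatermarks = Array.fill(numInputs)(Long.MinValue)
    private var current = Long.MinValue

    // Record a watermark from one input. Returns Some(newTime) only if
    // the operator's event time (minimum over all inputs) advanced.
    def onWatermark(input: Int, watermark: Long): Option[Long] = {
      inputWatermarks(input) = math.max(inputWatermarks(input), watermark)
      val min = inputWatermarks.min
      if (min > current) {
        current = min
        Some(min)
      } else {
        None
      }
    }

    def eventTime: Long = current
  }
}
```

With inputs at 29 and 14, the operator's event time is 14; it only moves forward once the slower input advances, which is why a single stalled source can hold back event time for everything downstream.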
  44. 44. Implementation: Stream SQL 44
  45. 45. Stream SQL: Intuition. Stream / Table Duality: Table with Primary Key, Table without Primary Key
  46. 46. Stream SQL: Implementation. Query Compilation; Differential Computation (add/mod/del)
  47. 47. Concluding… 47
  48. 48. 48
  49. 49. We are hiring! data-artisans.com/careers Positions: Software Engineers, Solution Architects, SREs, and more
  50. 50. Q & A. Get in touch via Twitter: @rmetzger_, @ApacheFlink, @dataArtisans. Get in touch via email: robert@data-artisans.com, info@data-artisans.com
  51. 51. Bonus Material 51
  52. 52. Implementation: Queryable State 52
  53. 53. Queryable State: Traditional vs. Queryable State
  54. 54. Queryable State: Implementation
        Components: Query Client; Job Manager (ExecutionGraph, State Location Server; deploy, status); Task Managers, each with a State Registry and window() / sum() operators that register their local state
        Query: /job/operation/state-name/key
          (1) Get location of "key-partition" for "operator" of "job"
          (2) Look up location
          (3) Respond location
          (4) Query state-name and key
