
January 2016 Flink Community Update & Roadmap 2016

This presentation from the 13th Flink Meetup in Berlin contains the regular community update for January and a walkthrough of the most important features planned for 2016.

Published in: Technology

  1. Community Update & Roadmap 2016. Robert Metzger (@rmetzger_, rmetzger@apache.org). Berlin Apache Flink Meetup, January 26, 2016
  2. January Community Update: What happened in the last month
  3. What happened?
      • Google proposed the Dataflow API to the Apache Incubator
      • Proposal discussions on the mailing list:
         ◦ SQL / Stream SQL support
         ◦ CEP (Complex Event Processing) library
      • Flink Kinesis Connector
      • Chengxiang Li added as committer
      • Discussions for releasing 1.0.0
  4. Now merged to master (1.0-SNAPSHOT)
      • Savepoints: manual checkpoints for restarting jobs with state
      • Kafka 0.9.0.0 integration
      • Job submission through the JobManager web interface
      • Checkpoint statistics in the JobManager web interface
      • Streaming examples are now in the binary distribution
  5. Reading List
      • Benchmarking Streaming Computation Engines at Yahoo!: http://yahooeng.tumblr.com/post/135321837876/benchmarking-streaming-computation-engines-at
      • Receiving metrics from Apache Flink applications: http://mnxfst.tumblr.com/post/136539620407/receiving-metrics-from-apache-flink-applications
      • Running Apache Flink on Amazon Elastic MapReduce: http://themodernlife.github.io/scala/hadoop/hdfs/sclading/flink/streaming/realtime/emr/aws/2016/01/06/running-apache-flink-on-amazon-elastic-mapreduce/
  6. Upcoming talks
      • FOSDEM Brussels (4 talks) (Jan 30-31)
      • Big Data Technology Summit Warsaw (Feb. 25-26)
      • QCon London (March 7-9)
      • Hadoop Summit Dublin (2 talks) (April 13-14)
      • Strata San Jose
      • Strata London
  7. Global Meetup Community
      • Brazil-Sao Paulo Apache Flink Meetup
      • Apache Flink Taiwan User Group
      • Also new groups in Delhi, Phoenix and Dallas
  8. GitHub stats
      • 900 stars
  9. Roadmap 2016: What's next?
  10. Overview
      • SQL / StreamSQL
      • CEP Library
      • Managed Operator State
      • Dynamic Scaling
      • Miscellaneous
  11. SQL and StreamSQL
  12. SQL / StreamSQL
      • Structured queries over data sets and streams
      • Add support for SQL
         ◦ Standard SQL queries over (batch) data sets
         ◦ Continuous StreamSQL queries over data streams
      • Keep and extend the Table API as a structured query API on data sets and streams
  13. Proposed Architecture. [Diagram labels. APIs: Table API, (Batch) SQL Query, StreamSQL Query. Internals: Apache Calcite, standard SQL parser, customized StreamSQL parser, optimizer, logical plan, DataSet program, DataStream program.]
  14. SQL integration into APIs
          // Define Kafka input stream
          val stream: DataStream[(String, Double, Int)] = env.addSource(new FlinkKafkaConsumer(...))
          // Define table environment
          val tabEnv = new TableEnvironment(env)
          tabEnv.registerStream(stream, "myStream", ("ID", "MEASURE", "COUNT"))
          // SQL query
          val sqlQuery = tabEnv.sql("SELECT ID, MEASURE FROM myStream WHERE COUNT > 17")
  15. Complex Event Processing
  16. CEP Library
      • Complex Event Processing: the analysis of complex patterns, such as correlations and sequence detection, across multiple sources
      • Most current systems are not distributed (beyond multi-threading)
      • Goal: provide an easy-to-use API for CEP, running on a distributed, high-throughput, low-latency engine
  17. CEP Example. [Diagram: a stream of real-time stock prices (15.1, 15.3, 15.2, 15.5) feeds a state machine that emits alerts. From the Start state, a price drop of at least $0.5 triggers an alert; other events are ignored.]
  18. Programming API for CEP
          // Convert the stream into a CEPStream of events
          CEPStream<Event> cepStream = CEP.from(inputDataStream);
          // Grouping
          GroupedCEPStream<Event> grouped = cepStream.groupBy("id");
          // Window events (time-based or count-based)
          WindowedCEPStream timeWindowed = grouped.timeWindow(Time.minutes(10), Time.minutes(1));
          WindowedCEPStream countWindowed = grouped.countWindow(10L, 1L);
          // Define a pattern to match
          CEPStream<Result> resultStream = CEP.from(input).groupBy(0).pattern(
              Pattern.<Event>next("e1").where( (evt) -> evt.id == 42 )
                  .followedBy("e2").where( (evt) -> evt.id == 1337 )
                  .within(Time.minutes(10))
          ).select( (Map<String, Event> patternElements) ->
              new Result(patternElements.get("e2").timestamp - patternElements.get("e1").timestamp) );
  19. DSL for CEP
          select e1.id, e1.price
          from every e1 = Event(price > 10)
              → e2 = Event(date == 42)
              → e3 = Event(price == 10)
              within 10 seconds
          where e1.id == e2.id
      • No programming required
      • Potentially integrated with SQL
  20. Managed Operator State
  21. State in Flink. [Diagram: the operator "count tweet impressions"; its user function retrieves/sets the count for a tweet ID in the operator state (impression counts).]
  22. State in Flink. [Same diagram.] What happens if the job crashes? Loss of data.
  23. Solution: Checkpoints. [Same diagram.] Periodic checkpoints of state to HDFS; restore from HDFS in case of failure.
  24. Solution: Checkpoints. [Same diagram.] Periodic checkpoints of state to HDFS; restore from HDFS in case of failure. This is the current state in Flink!
  25. State on Steroids. [Same diagram.]
  26. State on Steroids. [Same diagram.] What if state grows too big? Spill to disk; async/incremental snapshots; restore from HDFS in case of failure.
  27. State on Steroids. [Same diagram.] Spill to disk.
  28. State on Steroids. [Same diagram.] What if state grows too big? Spill to disk. Checkpointing stalls processing! Async/incremental snapshots; restore from HDFS in case of failure.
  29. State on Steroids. [Same diagram.] Spill to disk; async/incremental snapshots; restore from HDFS in case of failure.
  30. Dealing with Dynamic Resources
  31. Streams with varying data rate. [Plot: events/second over time.] With static resources: provision for the maximum rate, which leaves idle capacity.
  32. (1) Adjust Parallelism. [Diagram: initial configuration; scale out (for load); scale in (to save resources).]
  33. (1) Adjust Parallelism
      • Adjusting parallelism without (significantly) interrupting the program
      • Initial version:
         ◦ Checkpoint -> stop -> restart with different parallelism
      • Stateless operators: trivial
      • Stateful operators: repartition state
         ◦ Transparent for key/value state and windows
         ◦ Consistent hashing simplifies state reorganization
  34. (2) Dynamic Worker Pool. [Diagram: JobManager; Resource Manager; pool of cluster resources (YARN/Mesos/…); TaskManagers.]
  35. Miscellaneous
      • Support for Apache Mesos
      • Security
         ◦ Over-the-wire encryption of RPC (akka) and data transfers (netty)
      • More connectors
         ◦ Apache Cassandra
         ◦ Amazon Kinesis
      • Enhance metrics
         ◦ Throughput / latencies
         ◦ Backpressure monitoring
         ◦ Spilling / out of core
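
The CEP example on slide 17 (alert when a stock price drops by at least $0.5) can be illustrated without any CEP library. The following is a minimal plain-Java sketch, not the proposed Flink CEP API: the class and method names are hypothetical, and it assumes that a "drop" is measured against the immediately preceding price, which the slide does not specify. Prices are held in cents to avoid floating-point rounding.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the slide-17 state machine; not Flink's CEP API. */
class PriceDropSketch {

    /**
     * Returns the indices of all prices that are at least thresholdCents
     * lower than the price immediately before them (assumed semantics).
     */
    static List<Integer> findDrops(int[] pricesInCents, int thresholdCents) {
        List<Integer> alerts = new ArrayList<>();
        for (int i = 1; i < pricesInCents.length; i++) {
            int drop = pricesInCents[i - 1] - pricesInCents[i];
            if (drop >= thresholdCents) {
                alerts.add(i); // "Alert" transition
            }
            // Otherwise: "Ignore" transition, keep waiting in the start state.
        }
        return alerts;
    }
}
```

With the prices shown on the slide (15.1, 15.3, 15.2, 15.5, i.e. 1510, 1530, 1520, 1550 cents) and a threshold of 50 cents, no consecutive drop reaches the threshold, so the sketch reports no alerts.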

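The checkpoint-and-restore idea from slides 23 and 24 can be sketched in a few lines of plain Java. This is an illustration under stated assumptions, not Flink's implementation: the class name is hypothetical, and an in-memory copy of the map stands in for the periodic checkpoint that the slides write to HDFS.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the "count tweet impressions" operator state. */
class ImpressionCounterSketch {
    private Map<String, Long> counts = new HashMap<>();

    /** Retrieve/set the count for one tweet ID. */
    void recordImpression(String tweetId) {
        counts.merge(tweetId, 1L, Long::sum);
    }

    long countFor(String tweetId) {
        return counts.getOrDefault(tweetId, 0L);
    }

    /** Takes a snapshot; a real system would persist it durably (assumption: in-memory copy here). */
    Map<String, Long> checkpoint() {
        return new HashMap<>(counts);
    }

    /** Replaces the live state with a previously taken snapshot, as after a failure. */
    void restore(Map<String, Long> snapshot) {
        counts = new HashMap<>(snapshot);
    }
}
```

Anything counted after the last snapshot is gone after `restore`, so a complete recovery scheme also has to replay the input received since that checkpoint.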
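
Slide 33 notes that consistent hashing simplifies state reorganization when parallelism changes. The sketch below shows a generic consistent hash ring in plain Java to make that property concrete. It is not taken from Flink: the class name, the number of virtual nodes, and the hash-mixing function are all assumptions chosen for illustration.

```java
import java.util.TreeMap;

/** Hypothetical consistent hash ring mapping keys to parallel operator instances. */
class ConsistentHashRingSketch {
    // Ring position -> operator instance that owns the arc ending at that position.
    private final TreeMap<Integer, Integer> ring = new TreeMap<>();
    private final int virtualNodes;

    ConsistentHashRingSketch(int virtualNodes) {
        this.virtualNodes = virtualNodes;
    }

    /** Places several virtual points for the instance on the ring. */
    void addInstance(int instanceId) {
        for (int v = 0; v < virtualNodes; v++) {
            ring.put(position("instance-" + instanceId + "#" + v), instanceId);
        }
    }

    /** The owner is the first ring point at or after the key's position, wrapping around. */
    int ownerOf(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no instances on the ring");
        }
        Integer pos = ring.ceilingKey(position(key));
        return ring.get(pos != null ? pos : ring.firstKey());
    }

    /** Scrambles String.hashCode() so that similar strings spread over the ring. */
    private static int position(String s) {
        int h = s.hashCode();
        h ^= (h >>> 16);
        h *= 0x85ebca6b;
        h ^= (h >>> 13);
        h *= 0xc2b2ae35;
        h ^= (h >>> 16);
        return h;
    }
}
```

When an instance is added, a key either keeps its previous owner or moves to the new instance; keys never move between existing instances. For a scale-out, this means only the state that migrates to the new instance has to be transferred.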