
Bay Area Apache Flink Meetup Community Update August 2015

6,705 views

Bay Area Apache Flink Meetup Community Update August 2015 at MapR

Published in: Software

Bay Area Apache Flink Meetup Community Update August 2015

  1. Bay Area Apache Flink Meetup #2
     Distributed Stream and Graph Processing
     Community Update, August 2015
     Henry Saputra, Committer and PMC Member
     hsaputra@apache.org, @Kingwulf
  2. Apache Flink is …
     Apache Flink is an open source platform for scalable batch and stream data processing.
     • The core of Apache Flink is a distributed streaming dataflow engine.
     • Executing dataflows in parallel on clusters
     • Providing a reliable foundation for various workloads
     • DataSet and DataStream programming abstractions are the foundation for user programs and higher layers
  3. One engine for many use cases
     • Real-time streaming topologies
     • Machine Learning at scale
     • Graph Analysis
     • Long batch pipelines
  4. What happened? - 1
     • New PMC member: Maximilian Michels
     • New Committer: Chesnay Schepler
     • Discussions for a 0.9.1 release have started
     • Apache Flink is becoming more popular:
       – 1000+ Twitter followers
       – 500+ GitHub stars
       – Named an “open source Big Data project to watch” by ZDNet
       – Flink Forward schedule with great speakers announced
  5. What happened? - 2
     • Apache Flink on Wikipedia: https://en.wikipedia.org/wiki/Apache_Flink
     • New JobManager Dashboard
     • Apache SAMOA 0.3.0-incubating with Flink integration
     • New “Features” page
     • Contributors list (can you spot your name?): https://cwiki.apache.org/confluence/display/FLINK/List+of+contributors
  6. New JobManager Dashboard
  7. New Website Redesign and New Features page
  8. New Architecture diagram in 0.10 documentation
  9. More content in the Wiki for Internal Information
  10. In master (0.10-SNAPSHOT) - 1
      • Gelly Scala API
      • More improvements and fixes for YARN
      • Flink dropped Java 6 support
      • Streaming connector for Elasticsearch
      • Sampling operation on DataSet API
      • A lot of bug fixes:
        – Streaming: APIs, general stability, Kafka connector
  11. In master (0.10-SNAPSHOT) - 2
      • Low watermarks / Event time
      • New JM Dashboard
      • Akka messages are now aware of leader IDs (for HA)
      • ZooKeeper integration (for HA)
      • Live accumulators (runtime only)
      • Stability improvements
  12. Articles and Mentions
      • High-throughput, low-latency, and exactly-once stream processing with Apache Flink [1]
      • Introducing Gelly: Graph Processing with Apache Flink [2]
      • Apache Flink and the case for stream processing [3]
      • Crunching Parquet Files with Apache Flink [4]
      • The morning paper: Asynchronous Distributed Snapshots for Distributed Dataflows [5]
      • Five open source Big Data projects to watch [6]
      • Big Data Performance Engineering: Examples from Hadoop, Pig, HBase, Flink and Spark [7]
      [1] http://data-artisans.com/high-throughput-low-latency-and-exactly-once-stream-processing-with-apache-flink/
      [2] http://flink.apache.org/news/2015/08/24/introducing-flink-gelly.html
      [3] http://www.kdnuggets.com/2015/08/apache-flink-stream-processing.html
      [4] https://medium.com/@istanbul_techie/crunching-parquet-files-with-apache-flink-200bec90d8a7
      [5] http://blog.acolyer.org/2015/08/19/asynchronous-distributed-snapshots-for-distributed-dataflows/
      [6] http://www.zdnet.com/article/five-open-source-big-data-projects-to-watch/
      [7] http://www.bigsynapse.com/addressing-big-data-performance
  13. New Meetups and Events
      • Chicago: Flink Training @ Capital One
      • Bay Area: Stream & Graph Processing @ MapR
  14. GitHub stats
  15. Upcoming
      • Sept 15: Washington DC Area Apache Flink Meetup
      • Sept 17: StreamProcessing.be meetup
      • Sept 28-30: Flink Talks at ApacheCon Big Data Budapest
      New Meetup groups:
      • New York
      • Boston
  16. Flink Forward schedule published
      • http://flink-forward.org/?post_type=day
      • Talks by Google, Data Artisans, Huawei, Capital One, Bouygues, Ericsson, Amadeus, ResearchGate, Red Hat, and many more.
      50% off for this meetup's guests with code FlinkMeetupBayArea50
