Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
Robust
Stream Processing
with
Apache Flink
Jamie Grier
@jamiegrier
jamie@data-artisans.com
Who am I?
• Director of Applications Engineering at data
Artisans
• Previously working on streaming computation at
Twitter...
Overview
• What is Apache Flink?
• What is Stateful Stream Processing?
• Windowed computation over streams
• Robust Time H...
What is
Apache Flink?
Apache Flink is an open source platform for distributed
stream and batch data processing.
What is
Apache Flink?
Stream Processing
Your
Code
Data Stream Data Stream
Stateful
Stream Processing
Your
Code
Data Stream Data Stream
State
More Complex
Example
Kafka
Files
Rabb
itMQ
Filter
Map
Join /
Sum
Influx
DB
C*
Distributed and Parallel
Deployment
Kafka
Files
Rabb
it
MQ
Filter
Parse
Join /
Sum
Influx
DB
C*
Robust Stream Processing
with Apache Flink
Code Example!
Windowing
Processing Time
vs
Event Time
Windowing in Processing
Time
0 1 2 34 56 7 8 9 0 1 2 3 4 5 6 7 8 9
Processing Time
Event Time
Windowing in Event
Time
0 1 2 34 56 7 8 9 0 1 2 3 4 5 6 7 8 9
Event Time
Processing Time = Errors!
Event Time = Accuracy
Failure Handling
Downtime Handling
Data Reprocessing
We’re Hiring!
http://data-artisans.com/careers
Flink Forward 2016, Berlin
Submission deadline: June 30, 2016
Early bird deadline: July 15, 2016
www.flink-forward.org
Questions?
Thanks!
Upcoming SlideShare
Loading in …5
×

Jamie Grier - Robust Stream Processing with Apache Flink

1,165 views

Published on

http://flink-forward.org/kb_sessions/robust-stream-processing-with-apache-flink/

In this hands on talk and demonstration I’ll give a very short introduction to stream processing and then dive into writing code and demonstrating the features in Apache Flink that make truly robust stream processing possible. We’ll focus on correctness and robustness in stream processing. During this live demo we’ll be developing a realtime analytics application and modifying it on the fly based on the topics we’re working though. We’ll exercise Flink’s unique features, demonstrate fault-recovery, clearly explain and demonstrate why Event Time is such an important concept in robust stateful stream processing and talk about and demonstrate the features you need in a stream processor in production. Some of the topics covered will be: – Stateful Stream Processing – Event Time vs. Processing Time – Fault tolerance – State management in the face of faults – Savepoints – Data re-processing – Planned downtime and upgrades

Published in: Data & Analytics
  • Be the first to comment

Jamie Grier - Robust Stream Processing with Apache Flink

  1. 1. Robust Stream Processing with Apache Flink Jamie Grier @jamiegrier jamie@data-artisans.com
  2. 2. Who am I? • Director of Applications Engineering at data Artisans • Previously working on streaming computation at Twitter, Gnip and Boulder Imaging • Involved in various kinds of stream processing for about a decade • High-speed video, social media streaming, general frameworks for stream processing
  3. 3. Overview • What is Apache Flink? • What is Stateful Stream Processing? • Windowed computation over streams • Robust Time Handling (Event Time vs Processing Time) • Robust Failure Handling • Robust Planned Downtime Handling • Robust Reprocessing
  4. 4. What is Apache Flink? Apache Flink is an open source platform for distributed stream and batch data processing.
  5. 5. What is Apache Flink?
  6. 6. Stream Processing Your Code Data Stream Data Stream
  7. 7. Stateful Stream Processing Your Code Data Stream Data Stream State
  8. 8. More Complex Example Kafka Files Rabb itMQ Filter Map Join / Sum Influx DB C*
  9. 9. Distributed and Parallel Deployment Kafka Files Rabb it MQ Filter Parse Join / Sum Influx DB C*
  10. 10. Robust Stream Processing with Apache Flink
  11. 11. Code Example!
  12. 12. Windowing
  13. 13. Processing Time vs Event Time
  14. 14. Windowing in Processing Time 0 1 2 34 56 7 8 9 0 1 2 3 4 5 6 7 8 9 Processing Time Event Time
  15. 15. Windowing in Event Time 0 1 2 34 56 7 8 9 0 1 2 3 4 5 6 7 8 9 Event Time
  16. 16. Processing Time = Errors!
  17. 17. Event Time = Accuracy
  18. 18. Failure Handling
  19. 19. Downtime Handling
  20. 20. Data Reprocessing
  21. 21. We’re Hiring! http://data-artisans.com/careers
  22. 22. Flink Forward 2016, Berlin Submission deadline: June 30, 2016 Early bird deadline: July 15, 2016 www.flink-forward.org
  23. 23. Questions?
  24. 24. Thanks!

×