Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
Some Practical Information
Network name: Flink Forward Berlin
Password: #FlinkForward
Twitter handle: @flinkforward
Hashta...
A big Thank You! to our Sponsors
Xiaowei Jiang
Alibaba
Fabian Hueske
data Artisans
Chinmay Soman
Uber
Flavio Junqueira
Pravega by DellEMC
Kostas Kloudas
da...
A big Thank You! to our Speakers
A big Thank You! to our Speakers
A warm Welcome to all of You!
We hope you enjoy the conference
8
An example of conference
and community @ Flink Forward 2016
Last minute
talk cancellation
Impromptu discussion
session o...
A Flink Review
of 2017
@StephanEwen
Flink Forward, Berlin, 2017
9
10
2,100,000,000,000
That the number of events processed by
users attending today that filled out the survey
(~50% of you ...
11
Use case highlights
from this year so far…
12
@
 Apache Flink deployed as a streaming platform service
 Billions messages / petabytes of data per day
 Incremental...
@
13
14
 Various use cases
• Example: Stream ingestion, routing
• Example: Model user interaction sessions
 Mix of stateless ...
15
Detecting fraud in real time
As fraudsters get better, need
to update models without
downtime
Live 24/7 service
Credit ...
16
@
Social network implemented using event sourcing
and CQRS (Command Query Responsibility
Segregation) on Kafka/Flink/El...
Streaming and Streaming Processing
 First wave for streaming was lambda architecture
• Aid batch systems to be more real-...
Flink's APIs over the last year
18
Process Function (events, state, time)
DataStream API (streams, windows)
Stream SQL / T...
Some Flink features in progress
19
Dynamic Resources
Kafka exactly-once
Hadoop-free Flink
Massive dependency
reduction
Fas...
Building a
Stream Processing
Infrastructure
Flink Forward, Berlin, 2017
2
21
Let's take a step back…
Good old centralized architecture
22
The big mean
central database
$$$
The grumpy
DBA
Application Application Application ...
Enter Microservices…
23
Application Application Application
Application Application
decentralized infrastructure
DevOps
de...
Thinking about State and Databases
24
Kudos to Kiki Carter
for the Broccoli
Metaphor
… and Stateful Stream Processing
25
Application
Sensor
APIs
Application
Application
Application
very simple: state is just...
Compute, State, and Storage
26
Classic tiered architecture Streaming architecture
database
layer
compute
layer
application...
Performance
27
synchronous reads/writes
across tier boundary
asynchronous writes
of large blobs
all modifications
are loca...
Consistency
28
distributed transactions
at scale typically
at-most / at-least once
exactly once
per state =1 =1
Classic ti...
Scaling a Service
29
separately provision additional
database capacity
provision compute
and state together
Classic tiered...
Rolling out a new Service
30
provision a new database
(or add capacity to an existing one)
simply occupies some
additional...
Stateful Stream Processing
31
Application
Sensor
APIs
Application
Application
Application
very simple: state is just part
...
32
Does that solve
everything?
(of course not)
How does one easily do…?
 consistent stateful upgrades
• application evolution and bug fixes
 migration of application s...
34
?
Consistent Distributed
Snapshots
versioning, upgrading,
rollback, duplicating,
migrating, …
Continuous Applications
(...
35
Consistent Distributed
Snapshots
versioning, upgrading,
rollback, duplicating,
migrating, …
Continuous Applications
(a....
36
The dA Platform Architecture
37
dA Platform 2
Apache Flink
Stateful stream processing
Kubernetes
Container platform
Loggin...
Versioned Applications, not Jobs/Jars
38
Stream Processing
Application
Version 3
Version 2
Version 1
Code and Application
...
Deployments, not Flink Clusters
39
Testing / QA Kubernetes Cluster Production Kubernetes Cluster
Threat Metrics
App. Testi...
Hooks for CI/CD pipelines
40
Kubernetes Cluster
Application
Version 1
CI Service
dA Application
Manager
push
update
trigge...
41
Payments Dashboard
Demo Time!
42
Sign up for the
early access program!
https://data-artisans.com/da-platform-2
43
Learn more at
At the boothTalk at 11am today
@ Kesselhaus
Enjoy the Conference!
Flink Forward Berlin 2017: Stephan Ewen - The State of Flink and how to adopt Stream Processing
Upcoming SlideShare
Loading in …5
×

Flink Forward Berlin 2017: Stephan Ewen - The State of Flink and how to adopt Stream Processing

1,040 views

Published on

Data stream processing has redefined how many of us build data pipelines. Apache Flink is one of the systems at the forefront of that development: With its versatile APIs (event-time streaming, Stream SQL, events/state) and powerful execution model, Flink has been part of re-defining what stream processing can do. By now, Apache Flink powers some of the largest data stream processing pipelines in open source data stream processing. In this keynote, we will look at the evolution of Stream Processing and Apache Flink during the last year, and what we believe will be the next wave of stream processing applications. We show how the Flink community and users evolved, what use cases are coming up, and how new and upcoming features in Flink are making new types of applications possible. We will also discuss common challenges that companies are facing when adopting stream processing, and how we can help companies to rapidly adopt and roll out stream processing company-wide.

Published in: Data & Analytics
  • Hello! Get Your Professional Job-Winning Resume Here - Check our website! https://vk.cc/818RFv
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here

Flink Forward Berlin 2017: Stephan Ewen - The State of Flink and how to adopt Stream Processing

  1. 1. Some Practical Information Network name: Flink Forward Berlin Password: #FlinkForward Twitter handle: @flinkforward Hashtag: #FlinkForward All talks will be recorded and can be found on our YouTube channel “Flink Forward” after the conference Flink Fest today starting at 6.30 pm at Willner Brauerei
  2. 2. A big Thank You! to our Sponsors
  3. 3. Xiaowei Jiang Alibaba Fabian Hueske data Artisans Chinmay Soman Uber Flavio Junqueira Pravega by DellEMC Kostas Kloudas data Artisans Dean Wampler Lightbend Tyler Akidau Google A big Thank You! to our Program Committee
  4. 4. A big Thank You! to our Speakers
  5. 5. A big Thank You! to our Speakers
  6. 6. A warm Welcome to all of You! We hope you enjoy the conference
  7. 7. 8 An example of conference and community @ Flink Forward 2016 Last minute talk cancellation Impromptu discussion session on Streaming and SQL Stream SQL in Flink (and possibly others)
  8. 8. A Flink Review of 2017 @StephanEwen Flink Forward, Berlin, 2017 9
  9. 9. 10 2,100,000,000,000 That the number of events processed by users attending today that filled out the survey (~50% of you here)
  10. 10. 11 Use case highlights from this year so far…
  11. 11. 12 @  Apache Flink deployed as a streaming platform service  Billions messages / petabytes of data per day  Incrementally realizing more and more services • Growth: “how much did we earn in SF in the last 5 mins” • Intelligent alerting: Ban driver/rider if suspicious activity • Intelligent forecasting: Increase accuracy of ETA models
  12. 12. @ 13
  13. 13. 14  Various use cases • Example: Stream ingestion, routing • Example: Model user interaction sessions  Mix of stateless / moderate state / large state  Stream Processing as a Service • Launching, monitoring, scaling, updating @
  14. 14. 15 Detecting fraud in real time As fraudsters get better, need to update models without downtime Live 24/7 service Credit card transactions Notifications and alerts Evolving fraud models built by data scientists @
  15. 15. 16 @ Social network implemented using event sourcing and CQRS (Command Query Responsibility Segregation) on Kafka/Flink/Elasticsearch/Redis More: https://data-artisans.com/blog/drivetribe-cqrs-apache-flink
  16. 16. Streaming and Streaming Processing  First wave for streaming was lambda architecture • Aid batch systems to be more real-time  Second wave was analytics (real time and lag-time) • Based on distributed collections, functions, and windows  The next wave is much broader: A new architecture for a unified approach to data analytics and event-driven applications 17
  17. 17. Flink's APIs over the last year 18 Process Function (events, state, time) DataStream API (streams, windows) Stream SQL / Tables (dynamic tables) Stream- & Batch Data Processing High-level Analytics API Stateful Event- Driven Applications val stats = stream .keyBy("sensor") .timeWindow(Time.seconds(5)) .sum((a, b) -> a.add(b)) def processElement(event: MyEvent, ctx: Context, out: Collector[Result]) = { // work with event and state (event, state.value) match { … } out.collect(…) // emit events state.update(…) // modify state // schedule a timer callback ctx.timerService.registerEventTimeTimer(event.timestamp + 500) }
  18. 18. Some Flink features in progress 19 Dynamic Resources Kafka exactly-once Hadoop-free Flink Massive dependency reduction Faster checkpoints and restore Auto-tuning network latency SQL performance SQL / CEP integration Flink/Beam/Python RESTified ops APIs Deployment Versatility KinesisPravega Integration Eventually Consistent FileSystems SQL connectors Connectors & Ecosystem Engine APIs & Languages Development and Deployment Mesos Kubernetes (more)
  19. 19. Building a Stream Processing Infrastructure Flink Forward, Berlin, 2017 2
  20. 20. 21 Let's take a step back…
  21. 21. Good old centralized architecture 22 The big mean central database $$$ The grumpy DBA Application Application Application Application
  22. 22. Enter Microservices… 23 Application Application Application Application Application decentralized infrastructure DevOps decentralized responsibilities still involves managing databases everyone is a mini-DBA
  23. 23. Thinking about State and Databases 24 Kudos to Kiki Carter for the Broccoli Metaphor
  24. 24. … and Stateful Stream Processing 25 Application Sensor APIs Application Application Application very simple: state is just part of the application micro services on steroids! encourages to build even more lightweight and specialized apps
  25. 25. Compute, State, and Storage 26 Classic tiered architecture Streaming architecture database layer compute layer application state + backup compute + stream storage and snapshot storage (backup) application state
  26. 26. Performance 27 synchronous reads/writes across tier boundary asynchronous writes of large blobs all modifications are local Classic tiered architecture Streaming architecture
  27. 27. Consistency 28 distributed transactions at scale typically at-most / at-least once exactly once per state =1 =1 Classic tiered architecture Streaming architecture
  28. 28. Scaling a Service 29 separately provision additional database capacity provision compute and state together Classic tiered architecture Streaming architecture provision compute
  29. 29. Rolling out a new Service 30 provision a new database (or add capacity to an existing one) simply occupies some additional backup space Classic tiered architecture Streaming architecture provision compute and state together
  30. 30. Stateful Stream Processing 31 Application Sensor APIs Application Application Application very simple: state is just part of the application micro services on steroids! encourages to build even more lightweight and specialized apps
  31. 31. 32 Does that solve everything? (of course not)
  32. 32. How does one easily do…?  consistent stateful upgrades • application evolution and bug fixes  migration of application state • cluster migration, A/B testing  re-processing and reinstatement • fix corrupt results, bootstrap new applications  state evolution (schema evolution) 33
  33. 33. 34 ? Consistent Distributed Snapshots versioning, upgrading, rollback, duplicating, migrating, … Continuous Applications (a.k.a. savepoints)
  34. 34. 35 Consistent Distributed Snapshots versioning, upgrading, rollback, duplicating, migrating, … Continuous Applications (a.k.a. savepoints)
  35. 35. 36
  36. 36. The dA Platform Architecture 37 dA Platform 2 Apache Flink Stateful stream processing Kubernetes Container platform Logging Streams from Kafka, Kinesis, S3, HDFS, Databases, ... dA Application Manager Application lifecycle management Metrics CI/CD Real-time Analytics Anomaly- & Fraud Detection Real-time Data Integration Reactive Microservices (and more)
  37. 37. Versioned Applications, not Jobs/Jars 38 Stream Processing Application Version 3 Version 2 Version 1 Code and Application Snapshot upgrade upgrade New Application Version 3a Version 2a fork / duplicate
  38. 38. Deployments, not Flink Clusters 39 Testing / QA Kubernetes Cluster Production Kubernetes Cluster Threat Metrics App. Testing Activity Monitor Application Fraud Detection Application Fraud Detection App. Testing
  39. 39. Hooks for CI/CD pipelines 40 Kubernetes Cluster Application Version 1 CI Service dA Application Manager push update trigger CI upgrade API stateful upgrade Application Version 2
  40. 40. 41 Payments Dashboard Demo Time!
  41. 41. 42 Sign up for the early access program! https://data-artisans.com/da-platform-2
  42. 42. 43 Learn more at At the boothTalk at 11am today @ Kesselhaus
  43. 43. Enjoy the Conference!

×