SlideShare a Scribd company logo
1 of 47
Download to read offline
Welcome to
The first conference on Apache Flink
Sponsored by
Some practical info
§  Registration, cloakroom, and meals are in
Palais
§  Information point always staffed
§  WiFi is FlinkForward
§  Twitter hashtag is #ff15
§  Follow @FlinkForward
Some practical info
§  Need help? Look for a volunteer (pink badges)
§  All sessions are recorded and will be made
available online
§  This includes the training sessions
3
Getting around
4
Please go around while
talks are in progress
Our speaker organizations
5
Kostas Tzoumas and Stephan Ewen
@kostas_tzoumas | @StephanEwen
Apache FlinkTM: From
Incubation to Flink 1.0
7
1.  A bit of history
2.  The streaming era and Flink
3.  Inside Flink 0.10
4.  Towards Flink 1.0 and beyond
A bit of history
From incubation until now
8
9
DataSet API (Java/Scala)
Flink core
Local Remote Yarn
Apr 2014 Jun 2015Dec 2014
0.70.60.5 0.90.9-m1 0.10
Oct 2015
Top level
0.8
Gelly
Table
FlinkML
SAMOA
DataSet (Java/Scala/Python) DataStream (Java/Scala)
HadoopM/R
Flink dataflow engine
Local Remote Yarn Tez Embedded
Dataflow
Dataflow
Cascading
Table
Storm
Community growth
Flink is one of the largest and most active Apache
big data projects with well over 120 contributors
10
Flink meetups around the globe
11
Featured in
12
The streaming era
Welcome to
13
14
batch
event
based
need new systems
well served
15
Streaming is the biggest change in
data infrastructure since Hadoop
16
1.  Radically simplified infrastructure
2.  Internet of Things, on-demand services
3.  Can completely subsume batch
17
In a world of events and isolated apps, the stream processor is the
backbone of the data infrastructure
App App
App
local view
local viewlocal view
Consistent
movement,
analytics
App App App
Global view
Consistent store
18
§  Until now, stream processors were less
mature than batch processors
§  This led to
•  in-house solutions
•  abuse of batch processors
•  Lambda architectures
§  This is no longer the case
19
Flink 0.10
With the upcoming 0.10 release, Flink
significantly surpasses the state of the art in
open source stream processing systems.
And, we are heading to Flink 1.0 after that.
20
§  Streaming technology has matured
•  e.g., Flink, Kafka, Dataflow
§  Flink and Dataflow duality
•  a Google technology
•  an open source Apache project
+
21
§  Streaming is happening
§  Better adapt now
§  Flink 0.10: a ready to use open
source stream processor
Flink 0.10
Flink for the streaming era
22
Improved DataStream API
§  Stream data analysis differs from batch data
analysis by introducing time
§  Streams are unbounded and produce data
over time
§  Simple as batch API if handling time in a
simple way
§  Powerful if you want to handle time in an
advanced way (out-of-order records,
preliminary results, etc)
23
Improved DataStream API
24
case	
  class	
  Event(location:	
  Location,	
  numVehicles:	
  Long)	
  	
  
val	
  stream:	
  DataStream[Event]	
  =	
  …;	
  
	
  
stream	
  
	
  	
  	
  .filter	
  {	
  evt	
  =>	
  isIntersection(evt.location)	
  }	
  
Improved DataStream API
25
case	
  class	
  Event(location:	
  Location,	
  numVehicles:	
  Long)	
  	
  
val	
  stream:	
  DataStream[Event]	
  =	
  …;	
  
	
  
stream	
  
	
  	
  	
  .filter	
  {	
  evt	
  =>	
  isIntersection(evt.location)	
  }	
  	
  
	
  	
  	
  .keyBy("location")	
  
	
  	
  	
  .timeWindow(Time.of(15,	
  MINUTES),	
  Time.of(5,	
  MINUTES))	
  
	
  	
  	
  .sum("numVehicles")	
  
Improved DataStream API
26
case	
  class	
  Event(location:	
  Location,	
  numVehicles:	
  Long)	
  	
  
val	
  stream:	
  DataStream[Event]	
  =	
  …;	
  
	
  
stream	
  
	
  	
  	
  .filter	
  {	
  evt	
  =>	
  isIntersection(evt.location)	
  }	
  	
  
	
  	
  	
  .keyBy("location")	
  
	
  	
  	
  .timeWindow(Time.of(15,	
  MINUTES),	
  Time.of(5,	
  MINUTES))	
  
	
  	
  	
  .trigger(new	
  Threshold(200))	
  
	
  	
  	
  .sum("numVehicles")	
  
Improved DataStream API
27
case	
  class	
  Event(location:	
  Location,	
  numVehicles:	
  Long)	
  	
  
val	
  stream:	
  DataStream[Event]	
  =	
  …;	
  
	
  
stream	
  
	
  	
  	
  .filter	
  {	
  evt	
  =>	
  isIntersection(evt.location)	
  }	
  	
  
	
  	
  	
  .keyBy("location")	
  
	
  	
  	
  .timeWindow(Time.of(15,	
  MINUTES),	
  Time.of(5,	
  MINUTES))	
  
	
  	
  	
  .trigger(new	
  Threshold(200))	
  
	
  	
  	
  .sum("numVehicles")	
  
	
  
	
  	
  	
  .keyBy(	
  evt	
  =>	
  evt.location.grid	
  )	
  
	
  	
  	
  .mapWithState	
  {	
  (evt,	
  state:	
  Option[Model])	
  =>	
  {	
  
	
  	
  	
  	
  	
  	
  val	
  model	
  =	
  state.orElse(new	
  Model())	
  
	
  	
  	
  	
  	
  	
  (model.classify(evt),	
  Some(model.update(evt)))	
  
	
  	
  	
  	
  }}	
  
IoT / Mobile Applications
28
Events occur on devices
Queue / Log
Events analyzed in a
data streaming
system
Stream Analysis
Events stored in a log
IoT / Mobile Applications
29
IoT / Mobile Applications
30
IoT / Mobile Applications
31
IoT / Mobile Applications
32
Out of order !!!
First burst of events
Second burst of events
IoT / Mobile Applications
33
Event time windows
Arrival time windows
Instant event-at-a-time
Flink supports out of order time (event time) windows,
arrival time windows (and mixtures) plus low latency processing.
First burst of events
Second burst of events
High Availability and Consistency
34
No Single-Point-Of-Failure
any more
Exactly-once processing semantics
across pipeline
Checkpoints/Fault Tolerance is decoupled from windows
è Allows for highly flexible window implementations
ZooKeeper
ensemble
Multiple
Masters
failover
Performance
35
Continuous
streaming
Latency-bound
buffering
Distributed
Snapshots
High Throughput &
Low Latency
With configurable throughput/latency tradeoff
Batch and Streaming
36
case	
  class	
  WordCount(word:	
  String,	
  count:	
  Int)	
  	
  
val	
  text:	
  DataStream[String]	
  =	
  …;	
  
	
  
text	
  
	
  	
  .flatMap	
  {	
  line	
  =>	
  line.split("	
  ")	
  }	
  
	
  	
  .map	
  {	
  word	
  =>	
  new	
  WordCount(word,	
  1)	
  }	
  
	
  	
  .keyBy("word")	
  
	
  	
  .window(GlobalWindows.create())	
  
	
  	
  .trigger(new	
  EOFTrigger())	
  
	
  	
  .sum("count")	
  
Batch Word Count in the DataStream API
Batch and Streaming
37
Batch Word Count in the DataSet API
case	
  class	
  WordCount(word:	
  String,	
  count:	
  Int)	
  	
  
val	
  text:	
  DataStream[String]	
  =	
  …;	
  
	
  
text	
  
	
  	
  .flatMap	
  {	
  line	
  =>	
  line.split("	
  ")	
  }	
  
	
  	
  .map	
  {	
  word	
  =>	
  new	
  WordCount(word,	
  1)	
  }	
  
	
  	
  .keyBy("word")	
  
	
  	
  .window(GlobalWindows.create())	
  
	
  	
  .trigger(new	
  EOFTrigger())	
  
	
  	
  .sum("count")	
  
val	
  text:	
  DataSet[String]	
  =	
  …;	
  
	
  
text	
  
	
  	
  .flatMap	
  {	
  line	
  =>	
  line.split("	
  ")	
  }	
  
	
  	
  .map	
  {	
  word	
  =>	
  new	
  WordCount(word,	
  1)	
  }	
  
	
  	
  .groupBy("word")	
  
	
  	
  .sum("count")	
  
	
  
	
  
Batch and Streaming
38
Pipelined and
blocking operators Streaming Dataflow Runtime
Batch Parameters
DataSet DataStream
Relational
Optimizer
Window
Optimization
Pipelined and
windowed operators
Schedule lazily
Schedule eagerly
Recompute whole
operators Periodic checkpoints
Streaming data movement
Stateful operations
DAG recovery
Fully buffered streams DAG resource management
Streaming Parameters
Batch and Streaming
39
A full-fledged batch processor as well
Gelly
Table
FlinkML
SAMOA
DataSet (Java/Scala/Python) DataStream (Java/Scala)
HadoopM/R
Flink dataflow engine
Local Remote Yarn Tez Embedded
Dataflow
Dataflow
Cascading
Table
Storm
Batch and Streaming
40
A full-fledged batch processor as well
Gelly
Table
FlinkML
SAMOA
DataSet (Java/Scala/Python) DataStream (Java/Scala)
HadoopM/R
Flink dataflow engine
Local Remote Yarn Tez Embedded
Dataflow
Dataflow
Cascading
Table
Storm
More details at Dongwon Kim's Talk
"A comparative performance evaluation of Flink"
Integration (picture not complete)
41
POSIX	
   Java/Scala
Collections
POSIX	
  
Monitoring
42
Life system metrics and
user-defined accumulators/statistics
Get	
  http://flink-­‐m:8081/jobs/7684be6004e4e955c2a558a9bc463f65/accumulators	
  
Monitoring REST API for
custom monitoring tools
{	
  "id":	
  "dceafe2df1f57a1206fcb907cb38ad97",	
  "user-­‐accumulators":	
  [	
  
	
  	
  {	
  "name":"avglen",	
  "type":"DoubleCounter",	
  "value":"123.03259440000001"	
  },	
  
	
  	
  {	
  "name":"genwords",	
  "type":"LongCounter",	
  "value":"75000000"	
  }	
  ]	
  }	
  
Flink 0.10 Summary
§  Focus on operational readiness
•  high availability
•  monitoring
•  integration with other systems
§  First-class support for event time
§  Refined DataStream API: easy and
powerful
43
Towards Flink 1.0 and beyond
Where we see the project going
44
Towards Flink 1.0
§  Flink 1.0 is around the corner
§  Focus on defining public APIs and
automatic API compatibility checks
§  Guarantee backwards compatibility in all
Flink 1.X versions
45
Beyond Flink 1.0
§  Flink engine has most features in place
§  Focus on usability features on top of
DataStream API
•  e.g., SQL, ML, more connectors
§  Continue work on elasticity and memory
management
46
47	
  
Enjoy the rest of
The first conference on Apache Flink

More Related Content

What's hot

S. Bartoli & F. Pompermaier – A Semantic Big Data Companion
S. Bartoli & F. Pompermaier – A Semantic Big Data CompanionS. Bartoli & F. Pompermaier – A Semantic Big Data Companion
S. Bartoli & F. Pompermaier – A Semantic Big Data CompanionFlink Forward
 
Christian Kreuzfeld – Static vs Dynamic Stream Processing
Christian Kreuzfeld – Static vs Dynamic Stream ProcessingChristian Kreuzfeld – Static vs Dynamic Stream Processing
Christian Kreuzfeld – Static vs Dynamic Stream ProcessingFlink Forward
 
Apache Flink Training: System Overview
Apache Flink Training: System OverviewApache Flink Training: System Overview
Apache Flink Training: System OverviewFlink Forward
 
Flink Forward SF 2017: Eron Wright - Introducing Flink Tensorflow
Flink Forward SF 2017: Eron Wright - Introducing Flink TensorflowFlink Forward SF 2017: Eron Wright - Introducing Flink Tensorflow
Flink Forward SF 2017: Eron Wright - Introducing Flink TensorflowFlink Forward
 
Real-time Stream Processing with Apache Flink
Real-time Stream Processing with Apache FlinkReal-time Stream Processing with Apache Flink
Real-time Stream Processing with Apache FlinkDataWorks Summit
 
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan EwenAdvanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewenconfluent
 
Extending the Yahoo Streaming Benchmark
Extending the Yahoo Streaming BenchmarkExtending the Yahoo Streaming Benchmark
Extending the Yahoo Streaming BenchmarkJamie Grier
 
Architecture of Flink's Streaming Runtime @ ApacheCon EU 2015
Architecture of Flink's Streaming Runtime @ ApacheCon EU 2015Architecture of Flink's Streaming Runtime @ ApacheCon EU 2015
Architecture of Flink's Streaming Runtime @ ApacheCon EU 2015Robert Metzger
 
Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)
Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)
Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)Apache Flink Taiwan User Group
 
Introduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processingIntroduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processingTill Rohrmann
 
Aljoscha Krettek – Notions of Time
Aljoscha Krettek – Notions of TimeAljoscha Krettek – Notions of Time
Aljoscha Krettek – Notions of TimeFlink Forward
 
Functional Comparison and Performance Evaluation of Streaming Frameworks
Functional Comparison and Performance Evaluation of Streaming FrameworksFunctional Comparison and Performance Evaluation of Streaming Frameworks
Functional Comparison and Performance Evaluation of Streaming FrameworksHuafeng Wang
 
Apache Flink - Overview and Use cases of a Distributed Dataflow System (at pr...
Apache Flink - Overview and Use cases of a Distributed Dataflow System (at pr...Apache Flink - Overview and Use cases of a Distributed Dataflow System (at pr...
Apache Flink - Overview and Use cases of a Distributed Dataflow System (at pr...Stephan Ewen
 
Data Stream Processing with Apache Flink
Data Stream Processing with Apache FlinkData Stream Processing with Apache Flink
Data Stream Processing with Apache FlinkFabian Hueske
 
A Data Streaming Architecture with Apache Flink (berlin Buzzwords 2016)
A Data Streaming Architecture with Apache Flink (berlin Buzzwords 2016)A Data Streaming Architecture with Apache Flink (berlin Buzzwords 2016)
A Data Streaming Architecture with Apache Flink (berlin Buzzwords 2016)Robert Metzger
 
Stateful Distributed Stream Processing
Stateful Distributed Stream ProcessingStateful Distributed Stream Processing
Stateful Distributed Stream ProcessingGyula Fóra
 
Dongwon Kim – A Comparative Performance Evaluation of Flink
Dongwon Kim – A Comparative Performance Evaluation of FlinkDongwon Kim – A Comparative Performance Evaluation of Flink
Dongwon Kim – A Comparative Performance Evaluation of FlinkFlink Forward
 
Apache Flink internals
Apache Flink internalsApache Flink internals
Apache Flink internalsKostas Tzoumas
 
Apache Flink Overview at SF Spark and Friends
Apache Flink Overview at SF Spark and FriendsApache Flink Overview at SF Spark and Friends
Apache Flink Overview at SF Spark and FriendsStephan Ewen
 
Tran Nam-Luc – Stale Synchronous Parallel Iterations on Flink
Tran Nam-Luc – Stale Synchronous Parallel Iterations on FlinkTran Nam-Luc – Stale Synchronous Parallel Iterations on Flink
Tran Nam-Luc – Stale Synchronous Parallel Iterations on FlinkFlink Forward
 

What's hot (20)

S. Bartoli & F. Pompermaier – A Semantic Big Data Companion
S. Bartoli & F. Pompermaier – A Semantic Big Data CompanionS. Bartoli & F. Pompermaier – A Semantic Big Data Companion
S. Bartoli & F. Pompermaier – A Semantic Big Data Companion
 
Christian Kreuzfeld – Static vs Dynamic Stream Processing
Christian Kreuzfeld – Static vs Dynamic Stream ProcessingChristian Kreuzfeld – Static vs Dynamic Stream Processing
Christian Kreuzfeld – Static vs Dynamic Stream Processing
 
Apache Flink Training: System Overview
Apache Flink Training: System OverviewApache Flink Training: System Overview
Apache Flink Training: System Overview
 
Flink Forward SF 2017: Eron Wright - Introducing Flink Tensorflow
Flink Forward SF 2017: Eron Wright - Introducing Flink TensorflowFlink Forward SF 2017: Eron Wright - Introducing Flink Tensorflow
Flink Forward SF 2017: Eron Wright - Introducing Flink Tensorflow
 
Real-time Stream Processing with Apache Flink
Real-time Stream Processing with Apache FlinkReal-time Stream Processing with Apache Flink
Real-time Stream Processing with Apache Flink
 
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan EwenAdvanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen
 
Extending the Yahoo Streaming Benchmark
Extending the Yahoo Streaming BenchmarkExtending the Yahoo Streaming Benchmark
Extending the Yahoo Streaming Benchmark
 
Architecture of Flink's Streaming Runtime @ ApacheCon EU 2015
Architecture of Flink's Streaming Runtime @ ApacheCon EU 2015Architecture of Flink's Streaming Runtime @ ApacheCon EU 2015
Architecture of Flink's Streaming Runtime @ ApacheCon EU 2015
 
Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)
Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)
Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)
 
Introduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processingIntroduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processing
 
Aljoscha Krettek – Notions of Time
Aljoscha Krettek – Notions of TimeAljoscha Krettek – Notions of Time
Aljoscha Krettek – Notions of Time
 
Functional Comparison and Performance Evaluation of Streaming Frameworks
Functional Comparison and Performance Evaluation of Streaming FrameworksFunctional Comparison and Performance Evaluation of Streaming Frameworks
Functional Comparison and Performance Evaluation of Streaming Frameworks
 
Apache Flink - Overview and Use cases of a Distributed Dataflow System (at pr...
Apache Flink - Overview and Use cases of a Distributed Dataflow System (at pr...Apache Flink - Overview and Use cases of a Distributed Dataflow System (at pr...
Apache Flink - Overview and Use cases of a Distributed Dataflow System (at pr...
 
Data Stream Processing with Apache Flink
Data Stream Processing with Apache FlinkData Stream Processing with Apache Flink
Data Stream Processing with Apache Flink
 
A Data Streaming Architecture with Apache Flink (berlin Buzzwords 2016)
A Data Streaming Architecture with Apache Flink (berlin Buzzwords 2016)A Data Streaming Architecture with Apache Flink (berlin Buzzwords 2016)
A Data Streaming Architecture with Apache Flink (berlin Buzzwords 2016)
 
Stateful Distributed Stream Processing
Stateful Distributed Stream ProcessingStateful Distributed Stream Processing
Stateful Distributed Stream Processing
 
Dongwon Kim – A Comparative Performance Evaluation of Flink
Dongwon Kim – A Comparative Performance Evaluation of FlinkDongwon Kim – A Comparative Performance Evaluation of Flink
Dongwon Kim – A Comparative Performance Evaluation of Flink
 
Apache Flink internals
Apache Flink internalsApache Flink internals
Apache Flink internals
 
Apache Flink Overview at SF Spark and Friends
Apache Flink Overview at SF Spark and FriendsApache Flink Overview at SF Spark and Friends
Apache Flink Overview at SF Spark and Friends
 
Tran Nam-Luc – Stale Synchronous Parallel Iterations on Flink
Tran Nam-Luc – Stale Synchronous Parallel Iterations on FlinkTran Nam-Luc – Stale Synchronous Parallel Iterations on Flink
Tran Nam-Luc – Stale Synchronous Parallel Iterations on Flink
 

Viewers also liked

Chris Hillman – Beyond Mapreduce Scientific Data Processing in Real-time
Chris Hillman – Beyond Mapreduce Scientific Data Processing in Real-timeChris Hillman – Beyond Mapreduce Scientific Data Processing in Real-time
Chris Hillman – Beyond Mapreduce Scientific Data Processing in Real-timeFlink Forward
 
Fabian Hueske – Cascading on Flink
Fabian Hueske – Cascading on FlinkFabian Hueske – Cascading on Flink
Fabian Hueske – Cascading on FlinkFlink Forward
 
Apache Flink - Hadoop MapReduce Compatibility
Apache Flink - Hadoop MapReduce CompatibilityApache Flink - Hadoop MapReduce Compatibility
Apache Flink - Hadoop MapReduce CompatibilityFabian Hueske
 
Marc Schwering – Using Flink with MongoDB to enhance relevancy in personaliza...
Marc Schwering – Using Flink with MongoDB to enhance relevancy in personaliza...Marc Schwering – Using Flink with MongoDB to enhance relevancy in personaliza...
Marc Schwering – Using Flink with MongoDB to enhance relevancy in personaliza...Flink Forward
 
Anwar Rizal – Streaming & Parallel Decision Tree in Flink
Anwar Rizal – Streaming & Parallel Decision Tree in FlinkAnwar Rizal – Streaming & Parallel Decision Tree in Flink
Anwar Rizal – Streaming & Parallel Decision Tree in FlinkFlink Forward
 
Flink Apachecon Presentation
Flink Apachecon PresentationFlink Apachecon Presentation
Flink Apachecon PresentationGyula Fóra
 
Vasia Kalavri – Training: Gelly School
Vasia Kalavri – Training: Gelly School Vasia Kalavri – Training: Gelly School
Vasia Kalavri – Training: Gelly School Flink Forward
 
Martin Junghans – Gradoop: Scalable Graph Analytics with Apache Flink
Martin Junghans – Gradoop: Scalable Graph Analytics with Apache FlinkMartin Junghans – Gradoop: Scalable Graph Analytics with Apache Flink
Martin Junghans – Gradoop: Scalable Graph Analytics with Apache FlinkFlink Forward
 
Ufuc Celebi – Stream & Batch Processing in one System
Ufuc Celebi – Stream & Batch Processing in one SystemUfuc Celebi – Stream & Batch Processing in one System
Ufuc Celebi – Stream & Batch Processing in one SystemFlink Forward
 
Slim Baltagi – Flink vs. Spark
Slim Baltagi – Flink vs. SparkSlim Baltagi – Flink vs. Spark
Slim Baltagi – Flink vs. SparkFlink Forward
 
Flink Case Study: Bouygues Telecom
Flink Case Study: Bouygues TelecomFlink Case Study: Bouygues Telecom
Flink Case Study: Bouygues TelecomFlink Forward
 
Sebastian Schelter – Distributed Machine Learing with the Samsara DSL
Sebastian Schelter – Distributed Machine Learing with the Samsara DSLSebastian Schelter – Distributed Machine Learing with the Samsara DSL
Sebastian Schelter – Distributed Machine Learing with the Samsara DSLFlink Forward
 
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...Flink Forward
 
Apache Flink Training: DataSet API Basics
Apache Flink Training: DataSet API BasicsApache Flink Training: DataSet API Basics
Apache Flink Training: DataSet API BasicsFlink Forward
 
Michael Häusler – Everyday flink
Michael Häusler – Everyday flinkMichael Häusler – Everyday flink
Michael Häusler – Everyday flinkFlink Forward
 
Mikio Braun – Data flow vs. procedural programming
Mikio Braun – Data flow vs. procedural programming Mikio Braun – Data flow vs. procedural programming
Mikio Braun – Data flow vs. procedural programming Flink Forward
 
Mohamed Amine Abdessemed – Real-time Data Integration with Apache Flink & Kafka
Mohamed Amine Abdessemed – Real-time Data Integration with Apache Flink & KafkaMohamed Amine Abdessemed – Real-time Data Integration with Apache Flink & Kafka
Mohamed Amine Abdessemed – Real-time Data Integration with Apache Flink & KafkaFlink Forward
 
Apache Flink Training: DataStream API Part 1 Basic
 Apache Flink Training: DataStream API Part 1 Basic Apache Flink Training: DataStream API Part 1 Basic
Apache Flink Training: DataStream API Part 1 BasicFlink Forward
 
Streaming Data Flow with Apache Flink @ Paris Flink Meetup 2015
Streaming Data Flow with Apache Flink @ Paris Flink Meetup 2015Streaming Data Flow with Apache Flink @ Paris Flink Meetup 2015
Streaming Data Flow with Apache Flink @ Paris Flink Meetup 2015Till Rohrmann
 
Suneel Marthi – BigPetStore Flink: A Comprehensive Blueprint for Apache Flink
Suneel Marthi – BigPetStore Flink: A Comprehensive Blueprint for Apache FlinkSuneel Marthi – BigPetStore Flink: A Comprehensive Blueprint for Apache Flink
Suneel Marthi – BigPetStore Flink: A Comprehensive Blueprint for Apache FlinkFlink Forward
 

Viewers also liked (20)

Chris Hillman – Beyond Mapreduce Scientific Data Processing in Real-time
Chris Hillman – Beyond Mapreduce Scientific Data Processing in Real-timeChris Hillman – Beyond Mapreduce Scientific Data Processing in Real-time
Chris Hillman – Beyond Mapreduce Scientific Data Processing in Real-time
 
Fabian Hueske – Cascading on Flink
Fabian Hueske – Cascading on FlinkFabian Hueske – Cascading on Flink
Fabian Hueske – Cascading on Flink
 
Apache Flink - Hadoop MapReduce Compatibility
Apache Flink - Hadoop MapReduce CompatibilityApache Flink - Hadoop MapReduce Compatibility
Apache Flink - Hadoop MapReduce Compatibility
 
Marc Schwering – Using Flink with MongoDB to enhance relevancy in personaliza...
Marc Schwering – Using Flink with MongoDB to enhance relevancy in personaliza...Marc Schwering – Using Flink with MongoDB to enhance relevancy in personaliza...
Marc Schwering – Using Flink with MongoDB to enhance relevancy in personaliza...
 
Anwar Rizal – Streaming & Parallel Decision Tree in Flink
Anwar Rizal – Streaming & Parallel Decision Tree in FlinkAnwar Rizal – Streaming & Parallel Decision Tree in Flink
Anwar Rizal – Streaming & Parallel Decision Tree in Flink
 
Flink Apachecon Presentation
Flink Apachecon PresentationFlink Apachecon Presentation
Flink Apachecon Presentation
 
Vasia Kalavri – Training: Gelly School
Vasia Kalavri – Training: Gelly School Vasia Kalavri – Training: Gelly School
Vasia Kalavri – Training: Gelly School
 
Martin Junghans – Gradoop: Scalable Graph Analytics with Apache Flink
Martin Junghans – Gradoop: Scalable Graph Analytics with Apache FlinkMartin Junghans – Gradoop: Scalable Graph Analytics with Apache Flink
Martin Junghans – Gradoop: Scalable Graph Analytics with Apache Flink
 
Ufuc Celebi – Stream & Batch Processing in one System
Ufuc Celebi – Stream & Batch Processing in one SystemUfuc Celebi – Stream & Batch Processing in one System
Ufuc Celebi – Stream & Batch Processing in one System
 
Slim Baltagi – Flink vs. Spark
Slim Baltagi – Flink vs. SparkSlim Baltagi – Flink vs. Spark
Slim Baltagi – Flink vs. Spark
 
Flink Case Study: Bouygues Telecom
Flink Case Study: Bouygues TelecomFlink Case Study: Bouygues Telecom
Flink Case Study: Bouygues Telecom
 
Sebastian Schelter – Distributed Machine Learing with the Samsara DSL
Sebastian Schelter – Distributed Machine Learing with the Samsara DSLSebastian Schelter – Distributed Machine Learing with the Samsara DSL
Sebastian Schelter – Distributed Machine Learing with the Samsara DSL
 
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...
 
Apache Flink Training: DataSet API Basics
Apache Flink Training: DataSet API BasicsApache Flink Training: DataSet API Basics
Apache Flink Training: DataSet API Basics
 
Michael Häusler – Everyday flink
Michael Häusler – Everyday flinkMichael Häusler – Everyday flink
Michael Häusler – Everyday flink
 
Mikio Braun – Data flow vs. procedural programming
Mikio Braun – Data flow vs. procedural programming Mikio Braun – Data flow vs. procedural programming
Mikio Braun – Data flow vs. procedural programming
 
Mohamed Amine Abdessemed – Real-time Data Integration with Apache Flink & Kafka
Mohamed Amine Abdessemed – Real-time Data Integration with Apache Flink & KafkaMohamed Amine Abdessemed – Real-time Data Integration with Apache Flink & Kafka
Mohamed Amine Abdessemed – Real-time Data Integration with Apache Flink & Kafka
 
Apache Flink Training: DataStream API Part 1 Basic
 Apache Flink Training: DataStream API Part 1 Basic Apache Flink Training: DataStream API Part 1 Basic
Apache Flink Training: DataStream API Part 1 Basic
 
Streaming Data Flow with Apache Flink @ Paris Flink Meetup 2015
Streaming Data Flow with Apache Flink @ Paris Flink Meetup 2015Streaming Data Flow with Apache Flink @ Paris Flink Meetup 2015
Streaming Data Flow with Apache Flink @ Paris Flink Meetup 2015
 
Suneel Marthi – BigPetStore Flink: A Comprehensive Blueprint for Apache Flink
Suneel Marthi – BigPetStore Flink: A Comprehensive Blueprint for Apache FlinkSuneel Marthi – BigPetStore Flink: A Comprehensive Blueprint for Apache Flink
Suneel Marthi – BigPetStore Flink: A Comprehensive Blueprint for Apache Flink
 

Similar to K. Tzoumas & S. Ewen – Flink Forward Keynote

Apache Flink Meetup Munich (November 2015): Flink Overview, Architecture, Int...
Apache Flink Meetup Munich (November 2015): Flink Overview, Architecture, Int...Apache Flink Meetup Munich (November 2015): Flink Overview, Architecture, Int...
Apache Flink Meetup Munich (November 2015): Flink Overview, Architecture, Int...Robert Metzger
 
Flink 0.10 @ Bay Area Meetup (October 2015)
Flink 0.10 @ Bay Area Meetup (October 2015)Flink 0.10 @ Bay Area Meetup (October 2015)
Flink 0.10 @ Bay Area Meetup (October 2015)Stephan Ewen
 
Chicago Flink Meetup: Flink's streaming architecture
Chicago Flink Meetup: Flink's streaming architectureChicago Flink Meetup: Flink's streaming architecture
Chicago Flink Meetup: Flink's streaming architectureRobert Metzger
 
Apache Flink(tm) - A Next-Generation Stream Processor
Apache Flink(tm) - A Next-Generation Stream ProcessorApache Flink(tm) - A Next-Generation Stream Processor
Apache Flink(tm) - A Next-Generation Stream ProcessorAljoscha Krettek
 
Apache Flink @ Tel Aviv / Herzliya Meetup
Apache Flink @ Tel Aviv / Herzliya MeetupApache Flink @ Tel Aviv / Herzliya Meetup
Apache Flink @ Tel Aviv / Herzliya MeetupRobert Metzger
 
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Guido Schmutz
 
Serverless London 2019 FaaS composition using Kafka and CloudEvents
Serverless London 2019   FaaS composition using Kafka and CloudEventsServerless London 2019   FaaS composition using Kafka and CloudEvents
Serverless London 2019 FaaS composition using Kafka and CloudEventsNeil Avery
 
Introduction to Streaming with Apache Flink
Introduction to Streaming with Apache FlinkIntroduction to Streaming with Apache Flink
Introduction to Streaming with Apache FlinkTugdual Grall
 
Near real-time anomaly detection at Lyft
Near real-time anomaly detection at LyftNear real-time anomaly detection at Lyft
Near real-time anomaly detection at Lyftmarkgrover
 
Flink Streaming Hadoop Summit San Jose
Flink Streaming Hadoop Summit San JoseFlink Streaming Hadoop Summit San Jose
Flink Streaming Hadoop Summit San JoseKostas Tzoumas
 
Cloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azureCloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azureTimothy Spann
 
Why apache Flink is the 4G of Big Data Analytics Frameworks
Why apache Flink is the 4G of Big Data Analytics FrameworksWhy apache Flink is the 4G of Big Data Analytics Frameworks
Why apache Flink is the 4G of Big Data Analytics FrameworksSlim Baltagi
 
Apache Flink@ Strata & Hadoop World London
Apache Flink@ Strata & Hadoop World LondonApache Flink@ Strata & Hadoop World London
Apache Flink@ Strata & Hadoop World LondonStephan Ewen
 
Advanced Stream Processing with Flink and Pulsar - Pulsar Summit NA 2021 Keynote
Advanced Stream Processing with Flink and Pulsar - Pulsar Summit NA 2021 KeynoteAdvanced Stream Processing with Flink and Pulsar - Pulsar Summit NA 2021 Keynote
Advanced Stream Processing with Flink and Pulsar - Pulsar Summit NA 2021 KeynoteStreamNative
 
Flexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkFlexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkDataWorks Summit
 
Unifying Stream, SWL and CEP for Declarative Stream Processing with Apache Flink
Unifying Stream, SWL and CEP for Declarative Stream Processing with Apache FlinkUnifying Stream, SWL and CEP for Declarative Stream Processing with Apache Flink
Unifying Stream, SWL and CEP for Declarative Stream Processing with Apache FlinkDataWorks Summit/Hadoop Summit
 
Analitica de datos en tiempo real con Apache Flink y Apache BEAM
Analitica de datos en tiempo real con Apache Flink y Apache BEAMAnalitica de datos en tiempo real con Apache Flink y Apache BEAM
Analitica de datos en tiempo real con Apache Flink y Apache BEAMjavier ramirez
 
Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...
Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...
Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...Timothy Spann
 
WRENCH: Workflow Management System Simulation Workbench
WRENCH: Workflow Management System Simulation WorkbenchWRENCH: Workflow Management System Simulation Workbench
WRENCH: Workflow Management System Simulation WorkbenchRafael Ferreira da Silva
 

Similar to K. Tzoumas & S. Ewen – Flink Forward Keynote (20)

Apache Flink Meetup Munich (November 2015): Flink Overview, Architecture, Int...
Apache Flink Meetup Munich (November 2015): Flink Overview, Architecture, Int...Apache Flink Meetup Munich (November 2015): Flink Overview, Architecture, Int...
Apache Flink Meetup Munich (November 2015): Flink Overview, Architecture, Int...
 
Flink 0.10 @ Bay Area Meetup (October 2015)
Flink 0.10 @ Bay Area Meetup (October 2015)Flink 0.10 @ Bay Area Meetup (October 2015)
Flink 0.10 @ Bay Area Meetup (October 2015)
 
Chicago Flink Meetup: Flink's streaming architecture
Chicago Flink Meetup: Flink's streaming architectureChicago Flink Meetup: Flink's streaming architecture
Chicago Flink Meetup: Flink's streaming architecture
 
Apache Flink(tm) - A Next-Generation Stream Processor
Apache Flink(tm) - A Next-Generation Stream ProcessorApache Flink(tm) - A Next-Generation Stream Processor
Apache Flink(tm) - A Next-Generation Stream Processor
 
Apache Flink @ Tel Aviv / Herzliya Meetup
Apache Flink @ Tel Aviv / Herzliya MeetupApache Flink @ Tel Aviv / Herzliya Meetup
Apache Flink @ Tel Aviv / Herzliya Meetup
 
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
 
Serverless London 2019 FaaS composition using Kafka and CloudEvents
Serverless London 2019   FaaS composition using Kafka and CloudEventsServerless London 2019   FaaS composition using Kafka and CloudEvents
Serverless London 2019 FaaS composition using Kafka and CloudEvents
 
Introduction to Streaming with Apache Flink
Introduction to Streaming with Apache FlinkIntroduction to Streaming with Apache Flink
Introduction to Streaming with Apache Flink
 
Near real-time anomaly detection at Lyft
Near real-time anomaly detection at LyftNear real-time anomaly detection at Lyft
Near real-time anomaly detection at Lyft
 
Flink Streaming Hadoop Summit San Jose
Flink Streaming Hadoop Summit San JoseFlink Streaming Hadoop Summit San Jose
Flink Streaming Hadoop Summit San Jose
 
Cloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azureCloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azure
 
Why apache Flink is the 4G of Big Data Analytics Frameworks
Why apache Flink is the 4G of Big Data Analytics FrameworksWhy apache Flink is the 4G of Big Data Analytics Frameworks
Why apache Flink is the 4G of Big Data Analytics Frameworks
 
Apache Flink@ Strata & Hadoop World London
Apache Flink@ Strata & Hadoop World LondonApache Flink@ Strata & Hadoop World London
Apache Flink@ Strata & Hadoop World London
 
Advanced Stream Processing with Flink and Pulsar - Pulsar Summit NA 2021 Keynote
Advanced Stream Processing with Flink and Pulsar - Pulsar Summit NA 2021 KeynoteAdvanced Stream Processing with Flink and Pulsar - Pulsar Summit NA 2021 Keynote
Advanced Stream Processing with Flink and Pulsar - Pulsar Summit NA 2021 Keynote
 
Flexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkFlexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache Flink
 
Unifying Stream, SWL and CEP for Declarative Stream Processing with Apache Flink
Unifying Stream, SWL and CEP for Declarative Stream Processing with Apache FlinkUnifying Stream, SWL and CEP for Declarative Stream Processing with Apache Flink
Unifying Stream, SWL and CEP for Declarative Stream Processing with Apache Flink
 
Analitica de datos en tiempo real con Apache Flink y Apache BEAM
Analitica de datos en tiempo real con Apache Flink y Apache BEAMAnalitica de datos en tiempo real con Apache Flink y Apache BEAM
Analitica de datos en tiempo real con Apache Flink y Apache BEAM
 
Unified Stream and Batch Processing with Apache Flink
Unified Stream and Batch Processing with Apache FlinkUnified Stream and Batch Processing with Apache Flink
Unified Stream and Batch Processing with Apache Flink
 
Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...
Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...
Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...
 
WRENCH: Workflow Management System Simulation Workbench
WRENCH: Workflow Management System Simulation WorkbenchWRENCH: Workflow Management System Simulation Workbench
WRENCH: Workflow Management System Simulation Workbench
 

More from Flink Forward

Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...Flink Forward
 
Evening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkEvening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkFlink Forward
 
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...Flink Forward
 
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Flink Forward
 
Introducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes OperatorIntroducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes OperatorFlink Forward
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeFlink Forward
 
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...Flink Forward
 
One sink to rule them all: Introducing the new Async Sink
One sink to rule them all: Introducing the new Async SinkOne sink to rule them all: Introducing the new Async Sink
One sink to rule them all: Introducing the new Async SinkFlink Forward
 
Tuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptxTuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptxFlink Forward
 
Flink powered stream processing platform at Pinterest
Flink powered stream processing platform at PinterestFlink powered stream processing platform at Pinterest
Flink powered stream processing platform at PinterestFlink Forward
 
Apache Flink in the Cloud-Native Era
Apache Flink in the Cloud-Native EraApache Flink in the Cloud-Native Era
Apache Flink in the Cloud-Native EraFlink Forward
 
Where is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in FlinkWhere is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in FlinkFlink Forward
 
Using the New Apache Flink Kubernetes Operator in a Production Deployment
Using the New Apache Flink Kubernetes Operator in a Production DeploymentUsing the New Apache Flink Kubernetes Operator in a Production Deployment
Using the New Apache Flink Kubernetes Operator in a Production DeploymentFlink Forward
 
The Current State of Table API in 2022
The Current State of Table API in 2022The Current State of Table API in 2022
The Current State of Table API in 2022Flink Forward
 
Flink SQL on Pulsar made easy
Flink SQL on Pulsar made easyFlink SQL on Pulsar made easy
Flink SQL on Pulsar made easyFlink Forward
 
Dynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsDynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsFlink Forward
 
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotExactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotFlink Forward
 
Processing Semantically-Ordered Streams in Financial Services
Processing Semantically-Ordered Streams in Financial ServicesProcessing Semantically-Ordered Streams in Financial Services
Processing Semantically-Ordered Streams in Financial ServicesFlink Forward
 
Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Flink Forward
 
Batch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergBatch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergFlink Forward
 

More from Flink Forward (20)

Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...
 
Evening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkEvening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in Flink
 
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...
 
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
 
Introducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes OperatorIntroducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes Operator
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive Mode
 
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
 
One sink to rule them all: Introducing the new Async Sink
One sink to rule them all: Introducing the new Async SinkOne sink to rule them all: Introducing the new Async Sink
One sink to rule them all: Introducing the new Async Sink
 
Tuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptxTuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptx
 
Flink powered stream processing platform at Pinterest
Flink powered stream processing platform at PinterestFlink powered stream processing platform at Pinterest
Flink powered stream processing platform at Pinterest
 
Apache Flink in the Cloud-Native Era
Apache Flink in the Cloud-Native EraApache Flink in the Cloud-Native Era
Apache Flink in the Cloud-Native Era
 
Where is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in FlinkWhere is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in Flink
 
Using the New Apache Flink Kubernetes Operator in a Production Deployment
Using the New Apache Flink Kubernetes Operator in a Production DeploymentUsing the New Apache Flink Kubernetes Operator in a Production Deployment
Using the New Apache Flink Kubernetes Operator in a Production Deployment
 
The Current State of Table API in 2022
The Current State of Table API in 2022The Current State of Table API in 2022
The Current State of Table API in 2022
 
Flink SQL on Pulsar made easy
Flink SQL on Pulsar made easyFlink SQL on Pulsar made easy
Flink SQL on Pulsar made easy
 
Dynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsDynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data Alerts
 
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotExactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
 
Processing Semantically-Ordered Streams in Financial Services
Processing Semantically-Ordered Streams in Financial ServicesProcessing Semantically-Ordered Streams in Financial Services
Processing Semantically-Ordered Streams in Financial Services
 
Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...
 
Batch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergBatch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & Iceberg
 

Recently uploaded

Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Neo4j
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsHyundai Motor Group
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsAndrey Dotsenko
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 

Recently uploaded (20)

Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 

K. Tzoumas & S. Ewen – Flink Forward Keynote

  • 1. Welcome to The first conference on Apache Flink Sponsored by
  • 2. Some practical info §  Registration, cloakroom, and meals are in Palais §  Information point always staffed §  WiFi is FlinkForward §  Twitter hashtag is #ff15 §  Follow @FlinkForward
  • 3. Some practical info §  Need help? Look for a volunteer (pink badges) §  All sessions are recorded and will be made available online §  This includes the training sessions 3
  • 4. Getting around 4 Please go around while talks are in progress
  • 6. Kostas Tzoumas and Stephan Ewen @kostas_tzoumas | @StephanEwen Apache FlinkTM: From Incubation to Flink 1.0
  • 7. 7 1.  A bit of history 2.  The streaming era and Flink 3.  Inside Flink 0.10 4.  Towards Flink 1.0 and beyond
  • 8. A bit of history From incubation until now 8
  • 9. 9 DataSet API (Java/Scala) Flink core Local Remote Yarn Apr 2014 Jun 2015Dec 2014 0.70.60.5 0.90.9-m1 0.10 Oct 2015 Top level 0.8 Gelly Table FlinkML SAMOA DataSet (Java/Scala/Python) DataStream (Java/Scala) HadoopM/R Flink dataflow engine Local Remote Yarn Tez Embedded Dataflow Dataflow Cascading Table Storm
  • 10. Community growth Flink is one of the largest and most active Apache big data projects with well over 120 contributors 10
  • 11. Flink meetups around the globe 11
  • 15. 15 Streaming is the biggest change in data infrastructure since Hadoop
  • 16. 16 1.  Radically simplified infrastructure 2.  Internet of Things, on-demand services 3.  Can completely subsume batch
  • 17. 17 In a world of events and isolated apps, the stream processor is the backbone of the data infrastructure App App App local view local viewlocal view Consistent movement, analytics App App App Global view Consistent store
  • 18. 18 §  Until now, stream processors were less mature than batch processors §  This led to •  in-house solutions •  abuse of batch processors •  Lambda architectures §  This is no longer the case
  • 19. 19 Flink 0.10 With the upcoming 0.10 release, Flink significantly surpasses the state of the art in open source stream processing systems. And, we are heading to Flink 1.0 after that.
  • 20. 20 §  Streaming technology has matured •  e.g., Flink, Kafka, Dataflow §  Flink and Dataflow duality •  a Google technology •  an open source Apache project +
  • 21. 21 §  Streaming is happening §  Better adapt now §  Flink 0.10: a ready to use open source stream processor
  • 22. Flink 0.10 Flink for the streaming era 22
  • 23. Improved DataStream API §  Stream data analysis differs from batch data analysis by introducing time §  Streams are unbounded and produce data over time §  Simple as batch API if handling time in a simple way §  Powerful if you want to handle time in an advanced way (out-of-order records, preliminary results, etc) 23
  • 24. Improved DataStream API 24 case  class  Event(location:  Location,  numVehicles:  Long)     val  stream:  DataStream[Event]  =  …;     stream        .filter  {  evt  =>  isIntersection(evt.location)  }  
  • 25. Improved DataStream API 25 case  class  Event(location:  Location,  numVehicles:  Long)     val  stream:  DataStream[Event]  =  …;     stream        .filter  {  evt  =>  isIntersection(evt.location)  }          .keyBy("location")        .timeWindow(Time.of(15,  MINUTES),  Time.of(5,  MINUTES))        .sum("numVehicles")  
  • 26. Improved DataStream API 26 case  class  Event(location:  Location,  numVehicles:  Long)     val  stream:  DataStream[Event]  =  …;     stream        .filter  {  evt  =>  isIntersection(evt.location)  }          .keyBy("location")        .timeWindow(Time.of(15,  MINUTES),  Time.of(5,  MINUTES))        .trigger(new  Threshold(200))        .sum("numVehicles")  
  • 27. Improved DataStream API 27 case  class  Event(location:  Location,  numVehicles:  Long)     val  stream:  DataStream[Event]  =  …;     stream        .filter  {  evt  =>  isIntersection(evt.location)  }          .keyBy("location")        .timeWindow(Time.of(15,  MINUTES),  Time.of(5,  MINUTES))        .trigger(new  Threshold(200))        .sum("numVehicles")          .keyBy(  evt  =>  evt.location.grid  )        .mapWithState  {  (evt,  state:  Option[Model])  =>  {              val  model  =  state.orElse(new  Model())              (model.classify(evt),  Some(model.update(evt)))          }}  
  • 28. IoT / Mobile Applications 28 Events occur on devices Queue / Log Events analyzed in a data streaming system Stream Analysis Events stored in a log
  • 29. IoT / Mobile Applications 29
  • 30. IoT / Mobile Applications 30
  • 31. IoT / Mobile Applications 31
  • 32. IoT / Mobile Applications 32 Out of order !!! First burst of events Second burst of events
  • 33. IoT / Mobile Applications 33 Event time windows Arrival time windows Instant event-at-a-time Flink supports out of order time (event time) windows, arrival time windows (and mixtures) plus low latency processing. First burst of events Second burst of events
  • 34. High Availability and Consistency 34 No Single-Point-Of-Failure any more Exactly-once processing semantics across pipeline Checkpoints/Fault Tolerance is decoupled from windows è Allows for highly flexible window implementations ZooKeeper ensemble Multiple Masters failover
  • 36. Batch and Streaming 36 case  class  WordCount(word:  String,  count:  Int)     val  text:  DataStream[String]  =  …;     text      .flatMap  {  line  =>  line.split("  ")  }      .map  {  word  =>  new  WordCount(word,  1)  }      .keyBy("word")      .window(GlobalWindows.create())      .trigger(new  EOFTrigger())      .sum("count")   Batch Word Count in the DataStream API
  • 37. Batch and Streaming 37 Batch Word Count in the DataSet API case  class  WordCount(word:  String,  count:  Int)     val  text:  DataStream[String]  =  …;     text      .flatMap  {  line  =>  line.split("  ")  }      .map  {  word  =>  new  WordCount(word,  1)  }      .keyBy("word")      .window(GlobalWindows.create())      .trigger(new  EOFTrigger())      .sum("count")   val  text:  DataSet[String]  =  …;     text      .flatMap  {  line  =>  line.split("  ")  }      .map  {  word  =>  new  WordCount(word,  1)  }      .groupBy("word")      .sum("count")      
  • 38. Batch and Streaming 38 Pipelined and blocking operators Streaming Dataflow Runtime Batch Parameters DataSet DataStream Relational Optimizer Window Optimization Pipelined and windowed operators Schedule lazily Schedule eagerly Recompute whole operators Periodic checkpoints Streaming data movement Stateful operations DAG recovery Fully buffered streams DAG resource management Streaming Parameters
  • 39. Batch and Streaming 39 A full-fledged batch processor as well Gelly Table FlinkML SAMOA DataSet (Java/Scala/Python) DataStream (Java/Scala) HadoopM/R Flink dataflow engine Local Remote Yarn Tez Embedded Dataflow Dataflow Cascading Table Storm
  • 40. Batch and Streaming 40 A full-fledged batch processor as well Gelly Table FlinkML SAMOA DataSet (Java/Scala/Python) DataStream (Java/Scala) HadoopM/R Flink dataflow engine Local Remote Yarn Tez Embedded Dataflow Dataflow Cascading Table Storm More details at Dongwon Kim's Talk "A comparative performance evaluation of Flink"
  • 41. Integration (picture not complete) 41 POSIX   Java/Scala Collections POSIX  
  • 42. Monitoring 42 Life system metrics and user-defined accumulators/statistics Get  http://flink-­‐m:8081/jobs/7684be6004e4e955c2a558a9bc463f65/accumulators   Monitoring REST API for custom monitoring tools {  "id":  "dceafe2df1f57a1206fcb907cb38ad97",  "user-­‐accumulators":  [      {  "name":"avglen",  "type":"DoubleCounter",  "value":"123.03259440000001"  },      {  "name":"genwords",  "type":"LongCounter",  "value":"75000000"  }  ]  }  
  • 43. Flink 0.10 Summary §  Focus on operational readiness •  high availability •  monitoring •  integration with other systems §  First-class support for event time §  Refined DataStream API: easy and powerful 43
  • 44. Towards Flink 1.0 and beyond Where we see the project going 44
  • 45. Towards Flink 1.0 §  Flink 1.0 is around the corner §  Focus on defining public APIs and automatic API compatibility checks §  Guarantee backwards compatibility in all Flink 1.X versions 45
  • 46. Beyond Flink 1.0 §  Flink engine has most features in place §  Focus on usability features on top of DataStream API •  e.g., SQL, ML, more connectors §  Continue work on elasticity and memory management 46
  • 47. 47   Enjoy the rest of The first conference on Apache Flink