SlideShare a Scribd company logo
1 of 18
Download to read offline
Log Event Stream
Processing In Flink Way
George T. C. Lai
Who Am I?
● BlueWell Technology
○ Big Data Architect
○ Focuses
■ Open DC/OS
■ CoreOS
■ Kubernetes
■ Apache Flink
■ Data Science
What We Have Done?
A Glance of Log Events
Event Timestamp Compound Key Event Body
Brief Introduction To Apache Flink
Basic Structure
● For each Apache Flink DataStream Program
○ Obtain an execution environment.
■ StreamExecutionEnvironment.getExecutionEnvironment()
■ determine the method to handle timestamp (Time characteristic)
○ Load/create data sources.
■ read from file
■ read from socket
■ read from built-in sources (Kafka, RabbitMQ, etc.)
○ Execute transformations on them.
■ filter, map, reduce, etc. (Task chaining)
○ Specify where to save results of the computations.
■ stdout (print)
■ write to files
■ write to built-in sinks (elasticsearch, Kafka, etc.)
○ Trigger the program execution.
Time Handling & Time Characteristics
Logs have its own timestamp.
Windows
● The concept of Windows
○ cut an infinite stream into slices with finite elements.
○ based on timestamp or some criteria.
● Construction of Windows
○ Keyed Windows
■ an infinite DataStream is divided based on both window and key
■ elements with different keys can be processed concurrently
○ Non-keyed Windows
● We focus on the keyed (hosts) windowing.
Windows
● Basic Structure
○ Key
○ Window assigner
○ Window function
■ reduce()
■ fold()
■ apply()
val input: DataStream[T] = ...
input
.keyBy(<key selector>)
.window(<window assigner>)
.<windowed transformation>(<window function>)
Window Assigner - Global Windows
● Single per-key global window.
● Only useful if a custom trigger is
specified.
Window Assigner - Tumbling Windows
● Defined by window size.
● Windows are disjoint.
The window assigner we adopted!!
Window Assigner - Sliding Windows
● Defined by both window size and
sliding size.
● Windows may have overlap.
Window Assigner - Session Windows
● Defined by gap of time.
● Window time
○ starts at individual time points.
○ ends once there has been a certain period of
inactivity.
Window Functions
● WindowFunction
○ Cache elements internally
○ Provides Window meta information (e.g., start time, end time, etc.)
● ReduceFunction
○ Incrementally aggregation
○ No access to Window meta information
● FoldFunction
○ Incrementally aggregation
○ No access to Window meta information
● WindowFunction with ReduceFunction / FoldFunction
○ Incrementally aggregation
○ Has access to Window meta information
Dealing with Data Lateness
● Set allowed lateness to Windows
○ new in 1.1.0
○ watermark passes end timestamp of window + allowedLateness.
○ defaults to 0, drop event once it is late.
Flink VS Spark Based On My Limited Experience
● Data streams manipulation by means of built-in API
○ Flink DataStream API (fine-grained)
○ Spark Streaming (coarse-grained)
● Intrinsic
○ Flink is a stream-intrinsic engine, time window
○ Spark is a batch-intrinsic engine, mini-batch
We’re all set. Thank you!!!
Just Flink
It!

More Related Content

What's hot

Computer network (6)
Computer network (6)Computer network (6)
Computer network (6)
NYversity
 

What's hot (20)

Rust in TiKV
Rust in TiKVRust in TiKV
Rust in TiKV
 
Flink forward SF 2017: Ufuk Celebi - The Stream Processor as a Database: Buil...
Flink forward SF 2017: Ufuk Celebi - The Stream Processor as a Database: Buil...Flink forward SF 2017: Ufuk Celebi - The Stream Processor as a Database: Buil...
Flink forward SF 2017: Ufuk Celebi - The Stream Processor as a Database: Buil...
 
Diagnosing HotSpot JVM Memory Leaks with JFR and JMC
Diagnosing HotSpot JVM Memory Leaks with JFR and JMCDiagnosing HotSpot JVM Memory Leaks with JFR and JMC
Diagnosing HotSpot JVM Memory Leaks with JFR and JMC
 
How to build TiDB
How to build TiDBHow to build TiDB
How to build TiDB
 
FlowSim_presentation
FlowSim_presentationFlowSim_presentation
FlowSim_presentation
 
Dynomite - PerconaLive 2017
Dynomite  - PerconaLive 2017Dynomite  - PerconaLive 2017
Dynomite - PerconaLive 2017
 
Baker: Scaling OVN with Kubernetes API Server
Baker: Scaling OVN with Kubernetes API ServerBaker: Scaling OVN with Kubernetes API Server
Baker: Scaling OVN with Kubernetes API Server
 
A day in the life of a log message
A day in the life of a log messageA day in the life of a log message
A day in the life of a log message
 
Apache Cassandra Lunch #67: Moving Data from Cassandra to Datastax Astra
Apache Cassandra Lunch #67: Moving Data from Cassandra to Datastax AstraApache Cassandra Lunch #67: Moving Data from Cassandra to Datastax Astra
Apache Cassandra Lunch #67: Moving Data from Cassandra to Datastax Astra
 
Computer network (6)
Computer network (6)Computer network (6)
Computer network (6)
 
Skydive 31 janv. 2016
Skydive 31 janv. 2016Skydive 31 janv. 2016
Skydive 31 janv. 2016
 
Scale Relational Database with NewSQL
Scale Relational Database with NewSQLScale Relational Database with NewSQL
Scale Relational Database with NewSQL
 
BKK16-306 ART ii
BKK16-306 ART iiBKK16-306 ART ii
BKK16-306 ART ii
 
Tools in action jdk mission control and flight recorder
Tools in action  jdk mission control and flight recorderTools in action  jdk mission control and flight recorder
Tools in action jdk mission control and flight recorder
 
Aljoscha Krettek - Portable stateful big data processing in Apache Beam
Aljoscha Krettek - Portable stateful big data processing in Apache BeamAljoscha Krettek - Portable stateful big data processing in Apache Beam
Aljoscha Krettek - Portable stateful big data processing in Apache Beam
 
KDB+ Lite
KDB+ LiteKDB+ Lite
KDB+ Lite
 
.NET Memory Primer (Martin Kulov)
.NET Memory Primer (Martin Kulov).NET Memory Primer (Martin Kulov)
.NET Memory Primer (Martin Kulov)
 
Skydive 5/07/2016
Skydive 5/07/2016Skydive 5/07/2016
Skydive 5/07/2016
 
Skydive, real-time network analyzer, container integration
Skydive, real-time network analyzer, container integrationSkydive, real-time network analyzer, container integration
Skydive, real-time network analyzer, container integration
 
Mongodb meetup
Mongodb meetupMongodb meetup
Mongodb meetup
 

Viewers also liked

Viewers also liked (20)

HadoopCon'16, Taipei @myui
HadoopCon'16, Taipei @myuiHadoopCon'16, Taipei @myui
HadoopCon'16, Taipei @myui
 
Stream Processing with Apache Flink
Stream Processing with Apache FlinkStream Processing with Apache Flink
Stream Processing with Apache Flink
 
Hadoop con 2016_9_10_王經篤(Jing-Doo Wang)
Hadoop con 2016_9_10_王經篤(Jing-Doo Wang)Hadoop con 2016_9_10_王經篤(Jing-Doo Wang)
Hadoop con 2016_9_10_王經篤(Jing-Doo Wang)
 
Yarn Resource Management Using Machine Learning
Yarn Resource Management Using Machine LearningYarn Resource Management Using Machine Learning
Yarn Resource Management Using Machine Learning
 
How to plan a hadoop cluster for testing and production environment
How to plan a hadoop cluster for testing and production environmentHow to plan a hadoop cluster for testing and production environment
How to plan a hadoop cluster for testing and production environment
 
Apache Flink & Graph Processing
Apache Flink & Graph ProcessingApache Flink & Graph Processing
Apache Flink & Graph Processing
 
2016-07-12 Introduction to Big Data Platform Security
2016-07-12 Introduction to Big Data Platform Security2016-07-12 Introduction to Big Data Platform Security
2016-07-12 Introduction to Big Data Platform Security
 
Batch and Stream Graph Processing with Apache Flink
Batch and Stream Graph Processing with Apache FlinkBatch and Stream Graph Processing with Apache Flink
Batch and Stream Graph Processing with Apache Flink
 
2016 Hadoop Conf TW - 如何建置數據精靈
2016 Hadoop Conf TW - 如何建置數據精靈2016 Hadoop Conf TW - 如何建置數據精靈
2016 Hadoop Conf TW - 如何建置數據精靈
 
Apache Software Foundation: How To Contribute, with Apache Flink as Example (...
Apache Software Foundation: How To Contribute, with Apache Flink as Example (...Apache Software Foundation: How To Contribute, with Apache Flink as Example (...
Apache Software Foundation: How To Contribute, with Apache Flink as Example (...
 
HadoopCon 2016 - 用 Jupyter Notebook Hold 住一個上線 Spark Machine Learning 專案實戰
HadoopCon 2016  - 用 Jupyter Notebook Hold 住一個上線 Spark  Machine Learning 專案實戰HadoopCon 2016  - 用 Jupyter Notebook Hold 住一個上線 Spark  Machine Learning 專案實戰
HadoopCon 2016 - 用 Jupyter Notebook Hold 住一個上線 Spark Machine Learning 專案實戰
 
Streaming in the Wild with Apache Flink
Streaming in the Wild with Apache FlinkStreaming in the Wild with Apache Flink
Streaming in the Wild with Apache Flink
 
BI in Xuenn
BI in XuennBI in Xuenn
BI in Xuenn
 
Achieve big data analytic platform with lambda architecture on cloud
Achieve big data analytic platform with lambda architecture on cloudAchieve big data analytic platform with lambda architecture on cloud
Achieve big data analytic platform with lambda architecture on cloud
 
Apache Flink Training Workshop @ HadoopCon2016 - #1 System Overview
Apache Flink Training Workshop @ HadoopCon2016 - #1 System OverviewApache Flink Training Workshop @ HadoopCon2016 - #1 System Overview
Apache Flink Training Workshop @ HadoopCon2016 - #1 System Overview
 
Introduction to Apache Flink
Introduction to Apache FlinkIntroduction to Apache Flink
Introduction to Apache Flink
 
Data Stream Processing with Apache Flink
Data Stream Processing with Apache FlinkData Stream Processing with Apache Flink
Data Stream Processing with Apache Flink
 
Apache Flink at Strata San Jose 2016
Apache Flink at Strata San Jose 2016Apache Flink at Strata San Jose 2016
Apache Flink at Strata San Jose 2016
 
Continuous Processing with Apache Flink - Strata London 2016
Continuous Processing with Apache Flink - Strata London 2016Continuous Processing with Apache Flink - Strata London 2016
Continuous Processing with Apache Flink - Strata London 2016
 
SparkR - Play Spark Using R (20160909 HadoopCon)
SparkR - Play Spark Using R (20160909 HadoopCon)SparkR - Play Spark Using R (20160909 HadoopCon)
SparkR - Play Spark Using R (20160909 HadoopCon)
 

Similar to Log Event Stream Processing In Flink Way

Cassandra NYC 2011 Data Modeling
Cassandra NYC 2011 Data ModelingCassandra NYC 2011 Data Modeling
Cassandra NYC 2011 Data Modeling
Matthew Dennis
 
Initial presentation of swift (for montreal user group)
Initial presentation of swift (for montreal user group)Initial presentation of swift (for montreal user group)
Initial presentation of swift (for montreal user group)
Marcos García
 

Similar to Log Event Stream Processing In Flink Way (20)

Streaming Data Pipelines With Apache Beam
Streaming Data Pipelines With Apache BeamStreaming Data Pipelines With Apache Beam
Streaming Data Pipelines With Apache Beam
 
Building a Unified Logging Layer with Fluentd, Elasticsearch and Kibana
Building a Unified Logging Layer with Fluentd, Elasticsearch and KibanaBuilding a Unified Logging Layer with Fluentd, Elasticsearch and Kibana
Building a Unified Logging Layer with Fluentd, Elasticsearch and Kibana
 
Apache flink
Apache flinkApache flink
Apache flink
 
Performance & dev tools
Performance & dev toolsPerformance & dev tools
Performance & dev tools
 
Parallel programing in web applications - public.pptx
Parallel programing in web applications - public.pptxParallel programing in web applications - public.pptx
Parallel programing in web applications - public.pptx
 
Scaling xtext
Scaling xtextScaling xtext
Scaling xtext
 
Understanding time in structured streaming
Understanding time in structured streamingUnderstanding time in structured streaming
Understanding time in structured streaming
 
Flink Forward SF 2017: Kenneth Knowles - Back to Sessions overview
Flink Forward SF 2017: Kenneth Knowles - Back to Sessions overviewFlink Forward SF 2017: Kenneth Knowles - Back to Sessions overview
Flink Forward SF 2017: Kenneth Knowles - Back to Sessions overview
 
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...
 
Introduction to InfluxDB, an Open Source Distributed Time Series Database by ...
Introduction to InfluxDB, an Open Source Distributed Time Series Database by ...Introduction to InfluxDB, an Open Source Distributed Time Series Database by ...
Introduction to InfluxDB, an Open Source Distributed Time Series Database by ...
 
PGConf APAC 2018 - High performance json postgre-sql vs. mongodb
PGConf APAC 2018 - High performance json  postgre-sql vs. mongodbPGConf APAC 2018 - High performance json  postgre-sql vs. mongodb
PGConf APAC 2018 - High performance json postgre-sql vs. mongodb
 
Tips and tricks for developing streaming and table connectors - Eron Wright,...
Tips and tricks for developing streaming and table connectors  - Eron Wright,...Tips and tricks for developing streaming and table connectors  - Eron Wright,...
Tips and tricks for developing streaming and table connectors - Eron Wright,...
 
Tips and Tricks for Developing Streaming and table Connectors - Wron Wright, ...
Tips and Tricks for Developing Streaming and table Connectors - Wron Wright, ...Tips and Tricks for Developing Streaming and table Connectors - Wron Wright, ...
Tips and Tricks for Developing Streaming and table Connectors - Wron Wright, ...
 
Flink Connector Development Tips & Tricks
Flink Connector Development Tips & TricksFlink Connector Development Tips & Tricks
Flink Connector Development Tips & Tricks
 
Elasticsearch as a time series database
Elasticsearch as a time series databaseElasticsearch as a time series database
Elasticsearch as a time series database
 
Security Monitoring for big Infrastructures without a Million Dollar budget
Security Monitoring for big Infrastructures without a Million Dollar budgetSecurity Monitoring for big Infrastructures without a Million Dollar budget
Security Monitoring for big Infrastructures without a Million Dollar budget
 
Node.js Web Apps @ ebay scale
Node.js Web Apps @ ebay scaleNode.js Web Apps @ ebay scale
Node.js Web Apps @ ebay scale
 
Cassandra NYC 2011 Data Modeling
Cassandra NYC 2011 Data ModelingCassandra NYC 2011 Data Modeling
Cassandra NYC 2011 Data Modeling
 
Initial presentation of swift (for montreal user group)
Initial presentation of swift (for montreal user group)Initial presentation of swift (for montreal user group)
Initial presentation of swift (for montreal user group)
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization Opportunities
 

Recently uploaded

一比一原版西悉尼大学毕业证成绩单如何办理
一比一原版西悉尼大学毕业证成绩单如何办理一比一原版西悉尼大学毕业证成绩单如何办理
一比一原版西悉尼大学毕业证成绩单如何办理
pyhepag
 
一比一原版阿德莱德大学毕业证成绩单如何办理
一比一原版阿德莱德大学毕业证成绩单如何办理一比一原版阿德莱德大学毕业证成绩单如何办理
一比一原版阿德莱德大学毕业证成绩单如何办理
pyhepag
 
Exploratory Data Analysis - Dilip S.pptx
Exploratory Data Analysis - Dilip S.pptxExploratory Data Analysis - Dilip S.pptx
Exploratory Data Analysis - Dilip S.pptx
DilipVasan
 
一比一原版麦考瑞大学毕业证成绩单如何办理
一比一原版麦考瑞大学毕业证成绩单如何办理一比一原版麦考瑞大学毕业证成绩单如何办理
一比一原版麦考瑞大学毕业证成绩单如何办理
cyebo
 
Abortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotec
Abortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotecAbortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotec
Abortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
一比一原版加利福尼亚大学尔湾分校毕业证成绩单如何办理
一比一原版加利福尼亚大学尔湾分校毕业证成绩单如何办理一比一原版加利福尼亚大学尔湾分校毕业证成绩单如何办理
一比一原版加利福尼亚大学尔湾分校毕业证成绩单如何办理
pyhepag
 

Recently uploaded (20)

Artificial_General_Intelligence__storm_gen_article.pdf
Artificial_General_Intelligence__storm_gen_article.pdfArtificial_General_Intelligence__storm_gen_article.pdf
Artificial_General_Intelligence__storm_gen_article.pdf
 
How I opened a fake bank account and didn't go to prison
How I opened a fake bank account and didn't go to prisonHow I opened a fake bank account and didn't go to prison
How I opened a fake bank account and didn't go to prison
 
一比一原版西悉尼大学毕业证成绩单如何办理
一比一原版西悉尼大学毕业证成绩单如何办理一比一原版西悉尼大学毕业证成绩单如何办理
一比一原版西悉尼大学毕业证成绩单如何办理
 
Webinar One View, Multiple Systems No-Code Integration of Salesforce and ERPs
Webinar One View, Multiple Systems No-Code Integration of Salesforce and ERPsWebinar One View, Multiple Systems No-Code Integration of Salesforce and ERPs
Webinar One View, Multiple Systems No-Code Integration of Salesforce and ERPs
 
How can I successfully sell my pi coins in Philippines?
How can I successfully sell my pi coins in Philippines?How can I successfully sell my pi coins in Philippines?
How can I successfully sell my pi coins in Philippines?
 
Easy and simple project file on mp online
Easy and simple project file on mp onlineEasy and simple project file on mp online
Easy and simple project file on mp online
 
Slip-and-fall Injuries: Top Workers' Comp Claims
Slip-and-fall Injuries: Top Workers' Comp ClaimsSlip-and-fall Injuries: Top Workers' Comp Claims
Slip-and-fall Injuries: Top Workers' Comp Claims
 
一比一原版阿德莱德大学毕业证成绩单如何办理
一比一原版阿德莱德大学毕业证成绩单如何办理一比一原版阿德莱德大学毕业证成绩单如何办理
一比一原版阿德莱德大学毕业证成绩单如何办理
 
Exploratory Data Analysis - Dilip S.pptx
Exploratory Data Analysis - Dilip S.pptxExploratory Data Analysis - Dilip S.pptx
Exploratory Data Analysis - Dilip S.pptx
 
how can i exchange pi coins for others currency like Bitcoin
how can i exchange pi coins for others currency like Bitcoinhow can i exchange pi coins for others currency like Bitcoin
how can i exchange pi coins for others currency like Bitcoin
 
Pre-ProductionImproveddsfjgndflghtgg.pptx
Pre-ProductionImproveddsfjgndflghtgg.pptxPre-ProductionImproveddsfjgndflghtgg.pptx
Pre-ProductionImproveddsfjgndflghtgg.pptx
 
一比一原版麦考瑞大学毕业证成绩单如何办理
一比一原版麦考瑞大学毕业证成绩单如何办理一比一原版麦考瑞大学毕业证成绩单如何办理
一比一原版麦考瑞大学毕业证成绩单如何办理
 
Atlantic Grupa Case Study (Mintec Data AI)
Atlantic Grupa Case Study (Mintec Data AI)Atlantic Grupa Case Study (Mintec Data AI)
Atlantic Grupa Case Study (Mintec Data AI)
 
Abortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotec
Abortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotecAbortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotec
Abortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotec
 
Machine Learning for Accident Severity Prediction
Machine Learning for Accident Severity PredictionMachine Learning for Accident Severity Prediction
Machine Learning for Accident Severity Prediction
 
Supply chain analytics to combat the effects of Ukraine-Russia-conflict
Supply chain analytics to combat the effects of Ukraine-Russia-conflictSupply chain analytics to combat the effects of Ukraine-Russia-conflict
Supply chain analytics to combat the effects of Ukraine-Russia-conflict
 
2024 Q1 Tableau User Group Leader Quarterly Call
2024 Q1 Tableau User Group Leader Quarterly Call2024 Q1 Tableau User Group Leader Quarterly Call
2024 Q1 Tableau User Group Leader Quarterly Call
 
basics of data science with application areas.pdf
basics of data science with application areas.pdfbasics of data science with application areas.pdf
basics of data science with application areas.pdf
 
2024 Q2 Orange County (CA) Tableau User Group Meeting
2024 Q2 Orange County (CA) Tableau User Group Meeting2024 Q2 Orange County (CA) Tableau User Group Meeting
2024 Q2 Orange County (CA) Tableau User Group Meeting
 
一比一原版加利福尼亚大学尔湾分校毕业证成绩单如何办理
一比一原版加利福尼亚大学尔湾分校毕业证成绩单如何办理一比一原版加利福尼亚大学尔湾分校毕业证成绩单如何办理
一比一原版加利福尼亚大学尔湾分校毕业证成绩单如何办理
 

Log Event Stream Processing In Flink Way

  • 1. Log Event Stream Processing In Flink Way George T. C. Lai
  • 2. Who Am I? ● BlueWell Technology ○ Big Data Architect ○ Focuses ■ Open DC/OS ■ CoreOS ■ Kubernetes ■ Apache Flink ■ Data Science
  • 3. What We Have Done?
  • 4.
  • 5. A Glance of Log Events Event Timestamp Compound Key Event Body
  • 6. Brief Introduction To Apache Flink
  • 7. Basic Structure ● For each Apache Flink DataStream Program ○ Obtain an execution environment. ■ StreamExecutionEnvironment.getExecutionEnvironment() ■ determine the method to handle timestamp (Time characteristic) ○ Load/create data sources. ■ read from file ■ read from socket ■ read from built-in sources (Kafka, RabbitMQ, etc.) ○ Execute transformations on them. ■ filter, map, reduce, etc. (Task chaining) ○ Specify where to save results of the computations. ■ stdout (print) ■ write to files ■ write to built-in sinks (elasticsearch, Kafka, etc.) ○ Trigger the program execution.
  • 8. Time Handling & Time Characteristics Logs have its own timestamp.
  • 9. Windows ● The concept of Windows ○ cut an infinite stream into slices with finite elements. ○ based on timestamp or some criteria. ● Construction of Windows ○ Keyed Windows ■ an infinite DataStream is divided based on both window and key ■ elements with different keys can be processed concurrently ○ Non-keyed Windows ● We focus on the keyed (hosts) windowing.
  • 10. Windows ● Basic Structure ○ Key ○ Window assigner ○ Window function ■ reduce() ■ fold() ■ apply() val input: DataStream[T] = ... input .keyBy(<key selector>) .window(<window assigner>) .<windowed transformation>(<window function>)
  • 11. Window Assigner - Global Windows ● Single per-key global window. ● Only useful if a custom trigger is specified.
  • 12. Window Assigner - Tumbling Windows ● Defined by window size. ● Windows are disjoint. The window assigner we adopted!!
  • 13. Window Assigner - Sliding Windows ● Defined by both window size and sliding size. ● Windows may have overlap.
  • 14. Window Assigner - Session Windows ● Defined by gap of time. ● Window time ○ starts at individual time points. ○ ends once there has been a certain period of inactivity.
  • 15. Window Functions ● WindowFunction ○ Cache elements internally ○ Provides Window meta information (e.g., start time, end time, etc.) ● ReduceFunction ○ Incrementally aggregation ○ No access to Window meta information ● FoldFunction ○ Incrementally aggregation ○ No access to Window meta information ● WindowFunction with ReduceFunction / FoldFunction ○ Incrementally aggregation ○ Has access to Window meta information
  • 16. Dealing with Data Lateness ● Set allowed lateness to Windows ○ new in 1.1.0 ○ watermark passes end timestamp of window + allowedLateness. ○ defaults to 0, drop event once it is late.
  • 17. Flink VS Spark Based On My Limited Experience ● Data streams manipulation by means of built-in API ○ Flink DataStream API (fine-grained) ○ Spark Streaming (coarse-grained) ● Intrinsic ○ Flink is a stream-intrinsic engine, time window ○ Spark is a batch-intrinsic engine, mini-batch
  • 18. We’re all set. Thank you!!! Just Flink It!