© 2019 Ververica
The Apache Flink® Conference
Stream Processing | Event Driven | Real Time
San Francisco, April 1-2, 2019
A big thanks to our Sponsors
Platinum
Gold
Silver
Community
Flink Fest
Media
A big thanks to our Program Committee
Fabian Hueske, Dean Wampler, Sonali Sharma, Eric Sammer,
Stefan Richter, Tyler Akidau, Jamie Grier
A big thanks to our Speakers
Flink Forward App
Rate speakers and sessions with
a chance to win a Flybrix drone!
Get involved!
Apache Flink User Survey
Win a trip to one of the next Flink Forward
conferences of your choice!
Flink Forward Survey
Help us improve the quality of Flink
Forward. We appreciate your feedback!
Community Contribution
Sign up as a content contributor for blog
posts or speaking opportunities.
Flink Forward Social Feed - View & Engage!
Get involved
Kostas Tzoumas
INTRODUCING VERVERICA
Founded in 2014 by the original creators of Apache Flink to
commercialize the open source project and support the
community
Why?
• Alibaba has been the largest user of Flink and second largest
contributor for years
• Deeply committed to open source and creating technological impact
• Joining forces made sense for the two teams, allowing them to collaborate
even more closely and accelerate their contributions to Flink
Flink at Alibaba (a few examples)
Taobao is the largest e-commerce platform globally, with more than 600 million
monthly active users. Every time a user logs into the Taobao app, they see a
landing page personalized for them and based on the latest real-time activity
on the platform. Using Flink for real-time machine learning at Taobao
has resulted in an increase of over 20% in the purchase conversion rate. At peak during
Singles Day last year, the system processed over 1.7 billion events/sec.
In the Hangzhou City Brain project, Flink is used to process data in real time from
a variety of sensors (traffic cameras, map applications, etc.) and to manage traffic
signals at 128 intersections. The City Brain project has halved travel times for
ambulances and commuters. Traffic accidents can be detected immediately, and
help can reach the accident site within 5 minutes.
What’s in a name?
verum (“real” in Latin)
Understanding the truth about the world by
getting the real-time view
What is Ververica?
1. Double down on the open source community and improve its health
and diversity
2. Contribute a number of innovations to the open source project starting
with Alibaba’s Blink for batch processing
3. Create an ecosystem and foundation for the commercial success of
Flink projects and products across the world
Our #1 goal is to position Apache Flink for the next 10 years of its life
Ververica Commercial Products
Full continuation of our commercial products and
services
• Ververica Platform including Apache Flink,
Application Manager, and Streaming Ledger
• Apache Flink Training and Consulting Services
• Enterprise Support
A lot of innovation is coming here as well, leveraging
existing work in Alibaba Cloud
Announcing: Ververica Partner Program
We are looking for partners to help us develop the broader Flink
ecosystem
• Ververica Platform Partner
Preferred partners of our commercial products around the globe
• Ververica Services Partner
Service provider on Apache Flink certified by Ververica
Sign up here! ververica.com/partner-program
Stephan Ewen
Xiaowei Jiang
Robert Metzger
From Stream Processor to
Unified Data Processing System
Use Cases Presented Today
Apache Flink and Public Clouds
cloud
on-prem
Data Processing Applications
Spectrum from more lag time to more real time:
Batch Processing, Continuous Processing, Event-driven Applications, Transactional Processing
Stream Processing
Data Pipelines, Streaming Analytics
Batch Processing & Continuous Streaming
Analytics & Applications
Flink community's focus over the last releases
Recent Features
Time-versioned Joins
MATCH_RECOGNIZE
Schema Upgrades
SELECT *
FROM TaxiRides
MATCH_RECOGNIZE (
  PARTITION BY driverId
  ORDER BY rideTime
  MEASURES
    S.rideId AS sRideId
  AFTER MATCH SKIP PAST LAST ROW
  PATTERN (S M{2,} E)
  DEFINE
    S AS S.isStart = true,
    M AS M.rideId <> S.rideId,
    E AS E.isStart = false
         AND E.rideId = S.rideId
)
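To make the pattern in the query above concrete, here is a minimal plain-Python sketch of what `PATTERN (S M{2,} E)` selects: a ride start, then at least two rows belonging to other rides, then the end of the same ride, evaluated per driver in time order. The `Ride` tuple and `match_rides` function are hypothetical names for illustration, and this is only a simulation of the matching logic, not how Flink evaluates the query.

```python
from collections import defaultdict
from typing import NamedTuple


class Ride(NamedTuple):
    # Hypothetical row type mirroring the columns used in the query.
    driver_id: int
    ride_id: int
    ride_time: int
    is_start: bool


def match_rides(events):
    """Return (driver_id, sRideId) for each S M{2,} E match, per driver."""
    by_driver = defaultdict(list)
    for e in events:
        by_driver[e.driver_id].append(e)          # PARTITION BY driverId

    matches = []
    for driver_id, rows in by_driver.items():
        rows.sort(key=lambda r: r.ride_time)      # ORDER BY rideTime
        i = 0
        while i < len(rows):
            s = rows[i]
            if not s.is_start:                    # S AS S.isStart = true
                i += 1
                continue
            j = i + 1
            # M AS M.rideId <> S.rideId. M and E are mutually exclusive here,
            # so consuming M rows greedily cannot miss a match.
            while j < len(rows) and rows[j].ride_id != s.ride_id:
                j += 1
            m_count = j - i - 1
            # E AS E.isStart = false AND E.rideId = S.rideId
            e_ok = (j < len(rows)
                    and not rows[j].is_start
                    and rows[j].ride_id == s.ride_id)
            if m_count >= 2 and e_ok:
                matches.append((driver_id, s.ride_id))
                i = j + 1                         # AFTER MATCH SKIP PAST LAST ROW
            else:
                i += 1
    return matches


events = [
    Ride(1, 10, 1, True),
    Ride(1, 11, 2, True),
    Ride(1, 11, 3, False),
    Ride(1, 10, 4, False),
    Ride(2, 20, 1, True),
    Ride(2, 20, 2, False),
]
print(match_rides(events))  # [(1, 10)]
```

Driver 1's ride 10 matches because two rows of ride 11 fall between its start and end; driver 2's ride 20 has no intervening rows and is not reported.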
"Stream Processing takes on ACID"
by Seth Wiesman
11am, Nikko I
SQL and SQL Ecosystem / tools
Machine Learning
Graphs
Batch Performance
Batch Fault Tolerance
Interactive Queries
Dashboards
The Relationship between
Batch and Streaming
Everything Streams
That is about 60% of the truth…
The remaining 40% of the truth
Continuous Streaming: Data is incomplete; Latency SLAs; Completeness and Latency is a tradeoff
Batch Processing: Data is as complete as it gets within the job; No Low Latency SLAs
Streaming versus Batch Join
DataStream API: 2x RocksDB LSM-Trees; push-based operators; low-latency; minimize in-flight data
DataSet API: 1x Hybrid Hash Join; pull-based operators; flexible data flow control; high latency; no checkpoints
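The contrast on this slide can be sketched in a few lines of plain Python. This is an illustration under assumptions, not Flink code: the "2x" state is read here as one state store per input, modeled with in-memory dictionaries instead of RocksDB, and the batch side is a simple in-memory hash join without the spilling a hybrid hash join would do. The names `batch_hash_join` and `StreamingJoin` are hypothetical.

```python
from collections import defaultdict


def batch_hash_join(left, right):
    """Bounded inputs: build one hash table, then probe it with the other side."""
    table = defaultdict(list)
    for key, value in left:                 # build phase (single table)
        table[key].append(value)
    # Probe phase: results appear only after the build side was fully read.
    return [(key, l, r) for key, r in right for l in table.get(key, [])]


class StreamingJoin:
    """Unbounded inputs: keep state for both sides and emit results per record."""

    def __init__(self):
        self.left_state = defaultdict(list)   # stands in for one state store
        self.right_state = defaultdict(list)  # stands in for the second one

    def on_left(self, key, value):
        self.left_state[key].append(value)
        return [(key, value, r) for r in self.right_state[key]]

    def on_right(self, key, value):
        self.right_state[key].append(value)
        return [(key, l, value) for l in self.left_state[key]]
```

Both variants produce the same set of joined rows on finite input; the streaming version pays for low latency by storing both inputs, while the batch version stores only one side but cannot emit anything until that side is complete.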
Exploiting the Batch Special Case
Planner/Optimizer
See also: "Towards Flink 2.0: Rethinking the stack and APIs to unify Batch & Stream"
by Aljoscha Krettek, 2pm, Nikko II/III
Continuous Operators
Streaming Scheduler Rules
Additional Bounded Operators
Additional Scheduling Strategies
if (bounded && non-incremental): activates additional optimizer choices
Core operators, cover all cases
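The rule on this slide can be written as a tiny selection function. This is a hedged sketch: the function name and the operator names are hypothetical placeholders, chosen only to show that the continuous operators are always valid and that bounded, non-incremental inputs merely add options.

```python
def candidate_join_operators(bounded: bool, incremental: bool) -> list:
    """List the (hypothetical) join operators a planner could choose from."""
    # Core continuous operator: valid for every input, bounded or not.
    candidates = ["continuous-join"]
    if bounded and not incremental:
        # Special case from the slide: extra optimizer choices become available.
        candidates += ["hash-join", "sort-merge-join"]
    return candidates
```

An unbounded or incrementally updated query keeps only the core operator, so correctness never depends on the batch-only additions.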
Stream Processing,
Analytics, and Applications
Process Function (events, state, time): Stateful Event-Driven Applications
DataStream API (streams, windows): Stream- & Batch Data Processing
Stream SQL / Tables (dynamic tables): High-level Analytics API
val stats = stream
  .keyBy("sensor")
  .timeWindow(Time.seconds(5))
  .reduce((a, b) => a.add(b))
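As a rough illustration of what the keyed five-second window aggregation above computes, the following plain-Python sketch groups records by sensor and by aligned window start and adds up their values. It assumes event-time timestamps in milliseconds and a finite list of events; watermarks, triggers, and late data are not modeled, and `tumbling_window_sums` is a hypothetical helper name.

```python
from collections import defaultdict


def tumbling_window_sums(events, window_ms=5000):
    """events: iterable of (sensor, timestamp_ms, value) tuples."""
    sums = defaultdict(int)
    for sensor, ts, value in events:
        window_start = ts - (ts % window_ms)      # align to the window boundary
        sums[(sensor, window_start)] += value     # key by sensor, combine per window
    return dict(sums)


readings = [("s1", 1000, 1), ("s1", 4999, 2), ("s1", 5000, 3), ("s2", 100, 4)]
print(tumbling_window_sums(readings))
# {('s1', 0): 3, ('s1', 5000): 3, ('s2', 0): 4}
```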
def processElement(event: MyEvent, ctx: Context, out: Collector[Result]) = {
// work with event and state
(event, state.value) match { … }
out.collect(…) // emit events
state.update(…) // modify state
// schedule a timer callback
ctx.timerService.registerEventTimeTimer(event.timestamp + 500)
}
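The interplay of events, state, and timers in the snippet above can be modeled with a toy single-threaded class. This is a sketch under assumptions, not the Flink API: `KeyedProcessor`, its methods, and the per-key counter are hypothetical, and real timers are also deduplicated and checkpointed, which is omitted here.

```python
import heapq


class KeyedProcessor:
    """Toy model of keyed state plus event-time timers."""

    def __init__(self, timeout_ms=500):
        self.timeout_ms = timeout_ms
        self.state = {}      # key -> running count (the keyed state)
        self.timers = []     # min-heap of (fire_time, key)
        self.output = []

    def process_element(self, key, timestamp):
        # Work with the event and state, then schedule a timer callback.
        self.state[key] = self.state.get(key, 0) + 1
        heapq.heappush(self.timers, (timestamp + self.timeout_ms, key))

    def advance_watermark(self, watermark):
        # Fire every timer whose time is at or before the watermark.
        while self.timers and self.timers[0][0] <= watermark:
            fire_time, key = heapq.heappop(self.timers)
            self.output.append((key, fire_time, self.state[key]))  # emit
```

Calling `process_element("a", 100)` and `process_element("a", 200)` followed by `advance_watermark(600)` fires only the first timer, because the second is registered for time 700.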
How we showed the API stack in the past…
Applications (physical) | Analytics (declarative)
DataStream API | Table API
Types are Java / Scala classes | Logical Schema for Tables
Transformation Functions | Declarative Language (SQL, Table DSL)
Explicit control over State | State implicit in operations
Explicit control over Time | SLAs define when to trigger
Executes as described | Automatic Optimization
Rethinking the Flink Stack
Stream Operator & DAG API
Runtime
DataSet
(deprecated)
DataStream Table / SQL
Still possible to mix and
match within a program
SQL, Notebooks,
and Machine Learning
Table API & SQL -> Logical Plan -> Physical Plan -> JobGraph
Pluggable Catalog
Query Optimizer: SubQuery Decorrelation, Filter/Project Push-Down, Join Reorder, …
Adding a new Table API / SQL Query Processor (Blink)
ANSI Syntax, Sub-Query, Data Type, Join, Advanced Aggregates, Window Operator, Over Window, TPC-H/TPC-DS
Functional Improvements
Query Execution
Performant Operators: Operator codegen, HashAgg/Local-global Agg, Improved HashJoin, Semi/Anti join, Vectorization
Resource Optimizations: Stats based estimation, Dynamic memory allocation
Expression Optimizations: Record Format, Operate binary data, JVM intrinsics, Hot method codegen
Query Optimizer
Rich Stats: NDV, NULL count, Avg length, Max length, Min, Max
Cost Based: Join order, Join type, Agg strategy, …
Advanced Rules: Subplan reuse, Join condition expansion, Shuffle removal, Distinct Agg rewrite, …
Performance Improvements
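One item in the list above, local-global aggregation, is easy to show in miniature. The sketch below assumes it refers to the usual two-phase scheme: each partition pre-aggregates its own records, and only the much smaller partial results are merged globally. The function names are hypothetical, and a sum stands in for whatever aggregate the query uses.

```python
from collections import Counter


def local_aggregate(partition):
    """Pre-aggregate (key, value) records within one partition."""
    partial = Counter()
    for key, value in partition:
        partial[key] += value
    return partial


def global_aggregate(partials):
    """Merge per-partition partial sums into the final result."""
    total = Counter()
    for partial in partials:
        total.update(partial)     # Counter.update adds counts key by key
    return dict(total)


partitions = [[("a", 1), ("b", 2), ("a", 3)], [("a", 4), ("b", 5)]]
result = global_aggregate(local_aggregate(p) for p in partitions)
print(result)  # {'a': 8, 'b': 7}
```

The benefit is that the shuffle between the two phases carries at most one record per key and partition rather than every input row.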
Blink vs. Spark on TPC-DS queries at 1T, 10T, and 30T scale (bar charts; axis labels: 2000S, 20000S, 50000S ?)
Batch SQL Benchmark
1.7B | 10K | 10K | Sub-Second | 100TB
Flink SQL in Production at Alibaba
Common Implementation
Modular/Composable Applications
Dynamic Query Logic
Interactive Programming
Ease of Use
Table API
June, 2019: Table API Refactor (FLIP-32)
July, 2019: Initial Blink Runner Merge, Flink 1.9 Release
Oct, 2019: Full Merge
Table API Layer; Flink Runner; Blink Runner
Blink SQL Merge Plan
Flink + Hive
For more: "Integrate Flink with Hive Ecosystem", Xuefu Zhang & Bowen Li, Alibaba, 12:20pm - 1:00pm, Carmel
(Diagram labels: Flink, Hive, HMS, Data)
Hive Integration
• Zeppelin Integration
Zeppelin support for Flink
Unified Interface, ML algorithms, Common Utilities
For more:
"When Table meets AI: Build Flink AI Ecosystem on Table API", Shaoxuan Wang, Alibaba, 4:30pm - 5:10pm, Nikko II & III
"High performance ML library based on Flink", Xu Yang, Alibaba, 2:50pm - 3:10pm, Carmel
Clustering
K-means
Latent Dirichlet allocation (LDA)
Bisecting k-means
Gaussian Mixture Model (GMM)
Regression
Linear regression
Lasso regression
Ridge regression
Generalized linear regression
Survival regression
Isotonic regression
Classifier
Binomial logistic regression
Multinomial logistic regression
Multilayer perceptron classifier
Linear Support Vector Machine
Naive Bayes
Random Forest
GBDT
Decision Tree
Others
Collaborative filtering
FP-Growth
PrefixSpan
Proposal for Machine Learning
The Apache Flink Community
A growing Apache Flink Community
… not only Flink’s codebase that is growing massively …
230 Emails / week
Launch of a new Chinese language user support mailing list
Growing the Contributors Community
• Cleanup & reorganization of the Jira
components
• Flinkbot: Improve pull request reviews and
labeling
• Discussions about improving the contribution
workflow
• PMC mentoring new committer candidates
• Flink Community Packages website
Flink Community Packages
Closing
The Apache Flink community is
more active than ever
Apache Flink continues to evolve with the
Stream Processing space.
Seamlessly integrate analytics, machine learning, applications
and very fast batch processing on top of stream processing
Thank you!

More Related Content

What's hot

Scaling stream data pipelines with Pravega and Apache Flink
Scaling stream data pipelines with Pravega and Apache FlinkScaling stream data pipelines with Pravega and Apache Flink
Scaling stream data pipelines with Pravega and Apache FlinkTill Rohrmann
 
Flink Forward San Francisco 2018: Gregory Fee - "Bootstrapping State In Apach...
Flink Forward San Francisco 2018: Gregory Fee - "Bootstrapping State In Apach...Flink Forward San Francisco 2018: Gregory Fee - "Bootstrapping State In Apach...
Flink Forward San Francisco 2018: Gregory Fee - "Bootstrapping State In Apach...Flink Forward
 
Flink Forward San Francisco 2018 keynote: Srikanth Satya - "Stream Processin...
Flink Forward San Francisco 2018 keynote:  Srikanth Satya - "Stream Processin...Flink Forward San Francisco 2018 keynote:  Srikanth Satya - "Stream Processin...
Flink Forward San Francisco 2018 keynote: Srikanth Satya - "Stream Processin...Flink Forward
 
Flink Forward San Francisco 2018: Andrew Gao & Jeff Sharpe - "Finding Bad Ac...
Flink Forward San Francisco 2018: Andrew Gao &  Jeff Sharpe - "Finding Bad Ac...Flink Forward San Francisco 2018: Andrew Gao &  Jeff Sharpe - "Finding Bad Ac...
Flink Forward San Francisco 2018: Andrew Gao & Jeff Sharpe - "Finding Bad Ac...Flink Forward
 
Matching the Scale at Tinder with Kafka
Matching the Scale at Tinder with Kafka Matching the Scale at Tinder with Kafka
Matching the Scale at Tinder with Kafka confluent
 
Flink Forward San Francisco 2018: Andrew Torson - "Extending Flink metrics: R...
Flink Forward San Francisco 2018: Andrew Torson - "Extending Flink metrics: R...Flink Forward San Francisco 2018: Andrew Torson - "Extending Flink metrics: R...
Flink Forward San Francisco 2018: Andrew Torson - "Extending Flink metrics: R...Flink Forward
 
Maximilian Michels - Flink and Beam
Maximilian Michels - Flink and BeamMaximilian Michels - Flink and Beam
Maximilian Michels - Flink and BeamFlink Forward
 
HOP! Airlines Jets to Real Time
HOP! Airlines Jets to Real TimeHOP! Airlines Jets to Real Time
HOP! Airlines Jets to Real Timeconfluent
 
Do Flink on Web with FLOW
Do Flink on Web with FLOWDo Flink on Web with FLOW
Do Flink on Web with FLOWDongwon Kim
 
Flink Forward San Francisco 2018: Xu Yang - "Alibaba’s common algorithm platf...
Flink Forward San Francisco 2018: Xu Yang - "Alibaba’s common algorithm platf...Flink Forward San Francisco 2018: Xu Yang - "Alibaba’s common algorithm platf...
Flink Forward San Francisco 2018: Xu Yang - "Alibaba’s common algorithm platf...Flink Forward
 
Flink Forward San Francisco 2018: - Jinkui Shi and Radu Tudoran "Flink real-t...
Flink Forward San Francisco 2018: - Jinkui Shi and Radu Tudoran "Flink real-t...Flink Forward San Francisco 2018: - Jinkui Shi and Radu Tudoran "Flink real-t...
Flink Forward San Francisco 2018: - Jinkui Shi and Radu Tudoran "Flink real-t...Flink Forward
 
On Track with Apache Kafka®: Building a Streaming ETL Solution with Rail Data
On Track with Apache Kafka®: Building a Streaming ETL Solution with Rail DataOn Track with Apache Kafka®: Building a Streaming ETL Solution with Rail Data
On Track with Apache Kafka®: Building a Streaming ETL Solution with Rail Dataconfluent
 
Kafka as your Data Lake - is it Feasible? (Guido Schmutz, Trivadis) Kafka Sum...
Kafka as your Data Lake - is it Feasible? (Guido Schmutz, Trivadis) Kafka Sum...Kafka as your Data Lake - is it Feasible? (Guido Schmutz, Trivadis) Kafka Sum...
Kafka as your Data Lake - is it Feasible? (Guido Schmutz, Trivadis) Kafka Sum...HostedbyConfluent
 
Eventing Things - A Netflix Original! (Nitin Sharma, Netflix) Kafka Summit SF...
Eventing Things - A Netflix Original! (Nitin Sharma, Netflix) Kafka Summit SF...Eventing Things - A Netflix Original! (Nitin Sharma, Netflix) Kafka Summit SF...
Eventing Things - A Netflix Original! (Nitin Sharma, Netflix) Kafka Summit SF...confluent
 
Flink Forward San Francisco 2018: Fabian Hueske & Timo Walther - "Why and how...
Flink Forward San Francisco 2018: Fabian Hueske & Timo Walther - "Why and how...Flink Forward San Francisco 2018: Fabian Hueske & Timo Walther - "Why and how...
Flink Forward San Francisco 2018: Fabian Hueske & Timo Walther - "Why and how...Flink Forward
 
Bringing Streaming Data To The Masses: Lowering The “Cost Of Admission” For Y...
Bringing Streaming Data To The Masses: Lowering The “Cost Of Admission” For Y...Bringing Streaming Data To The Masses: Lowering The “Cost Of Admission” For Y...
Bringing Streaming Data To The Masses: Lowering The “Cost Of Admission” For Y...confluent
 
Failing to Cross the Streams – Lessons Learned the Hard Way | Philip Schmitt,...
Failing to Cross the Streams – Lessons Learned the Hard Way | Philip Schmitt,...Failing to Cross the Streams – Lessons Learned the Hard Way | Philip Schmitt,...
Failing to Cross the Streams – Lessons Learned the Hard Way | Philip Schmitt,...HostedbyConfluent
 
Flink Forward Berlin 2018: Shriya Arora - "Taming large-state to join dataset...
Flink Forward Berlin 2018: Shriya Arora - "Taming large-state to join dataset...Flink Forward Berlin 2018: Shriya Arora - "Taming large-state to join dataset...
Flink Forward Berlin 2018: Shriya Arora - "Taming large-state to join dataset...Flink Forward
 
Flink Forward San Francisco 2018: Ken Krugler - "Building a scalable focused ...
Flink Forward San Francisco 2018: Ken Krugler - "Building a scalable focused ...Flink Forward San Francisco 2018: Ken Krugler - "Building a scalable focused ...
Flink Forward San Francisco 2018: Ken Krugler - "Building a scalable focused ...Flink Forward
 
apidays LIVE Singapore 2021 - REST the Events - REST APIs for Event-Driven Ar...
apidays LIVE Singapore 2021 - REST the Events - REST APIs for Event-Driven Ar...apidays LIVE Singapore 2021 - REST the Events - REST APIs for Event-Driven Ar...
apidays LIVE Singapore 2021 - REST the Events - REST APIs for Event-Driven Ar...apidays
 

What's hot (20)

Scaling stream data pipelines with Pravega and Apache Flink
Scaling stream data pipelines with Pravega and Apache FlinkScaling stream data pipelines with Pravega and Apache Flink
Scaling stream data pipelines with Pravega and Apache Flink
 
Flink Forward San Francisco 2018: Gregory Fee - "Bootstrapping State In Apach...
Flink Forward San Francisco 2018: Gregory Fee - "Bootstrapping State In Apach...Flink Forward San Francisco 2018: Gregory Fee - "Bootstrapping State In Apach...
Flink Forward San Francisco 2018: Gregory Fee - "Bootstrapping State In Apach...
 
Flink Forward San Francisco 2018 keynote: Srikanth Satya - "Stream Processin...
Flink Forward San Francisco 2018 keynote:  Srikanth Satya - "Stream Processin...Flink Forward San Francisco 2018 keynote:  Srikanth Satya - "Stream Processin...
Flink Forward San Francisco 2018 keynote: Srikanth Satya - "Stream Processin...
 
Flink Forward San Francisco 2018: Andrew Gao & Jeff Sharpe - "Finding Bad Ac...
Flink Forward San Francisco 2018: Andrew Gao &  Jeff Sharpe - "Finding Bad Ac...Flink Forward San Francisco 2018: Andrew Gao &  Jeff Sharpe - "Finding Bad Ac...
Flink Forward San Francisco 2018: Andrew Gao & Jeff Sharpe - "Finding Bad Ac...
 
Matching the Scale at Tinder with Kafka
Matching the Scale at Tinder with Kafka Matching the Scale at Tinder with Kafka
Matching the Scale at Tinder with Kafka
 
Flink Forward San Francisco 2018: Andrew Torson - "Extending Flink metrics: R...
Flink Forward San Francisco 2018: Andrew Torson - "Extending Flink metrics: R...Flink Forward San Francisco 2018: Andrew Torson - "Extending Flink metrics: R...
Flink Forward San Francisco 2018: Andrew Torson - "Extending Flink metrics: R...
 
Maximilian Michels - Flink and Beam
Maximilian Michels - Flink and BeamMaximilian Michels - Flink and Beam
Maximilian Michels - Flink and Beam
 
HOP! Airlines Jets to Real Time
HOP! Airlines Jets to Real TimeHOP! Airlines Jets to Real Time
HOP! Airlines Jets to Real Time
 
Do Flink on Web with FLOW
Do Flink on Web with FLOWDo Flink on Web with FLOW
Do Flink on Web with FLOW
 
Flink Forward San Francisco 2018: Xu Yang - "Alibaba’s common algorithm platf...
Flink Forward San Francisco 2018: Xu Yang - "Alibaba’s common algorithm platf...Flink Forward San Francisco 2018: Xu Yang - "Alibaba’s common algorithm platf...
Flink Forward San Francisco 2018: Xu Yang - "Alibaba’s common algorithm platf...
 
Flink Forward San Francisco 2018: - Jinkui Shi and Radu Tudoran "Flink real-t...
Flink Forward San Francisco 2018: - Jinkui Shi and Radu Tudoran "Flink real-t...Flink Forward San Francisco 2018: - Jinkui Shi and Radu Tudoran "Flink real-t...
Flink Forward San Francisco 2018: - Jinkui Shi and Radu Tudoran "Flink real-t...
 
On Track with Apache Kafka®: Building a Streaming ETL Solution with Rail Data
On Track with Apache Kafka®: Building a Streaming ETL Solution with Rail DataOn Track with Apache Kafka®: Building a Streaming ETL Solution with Rail Data
On Track with Apache Kafka®: Building a Streaming ETL Solution with Rail Data
 
Kafka as your Data Lake - is it Feasible? (Guido Schmutz, Trivadis) Kafka Sum...
Kafka as your Data Lake - is it Feasible? (Guido Schmutz, Trivadis) Kafka Sum...Kafka as your Data Lake - is it Feasible? (Guido Schmutz, Trivadis) Kafka Sum...
Kafka as your Data Lake - is it Feasible? (Guido Schmutz, Trivadis) Kafka Sum...
 
Eventing Things - A Netflix Original! (Nitin Sharma, Netflix) Kafka Summit SF...
Eventing Things - A Netflix Original! (Nitin Sharma, Netflix) Kafka Summit SF...Eventing Things - A Netflix Original! (Nitin Sharma, Netflix) Kafka Summit SF...
Eventing Things - A Netflix Original! (Nitin Sharma, Netflix) Kafka Summit SF...
 
Flink Forward San Francisco 2018: Fabian Hueske & Timo Walther - "Why and how...
Flink Forward San Francisco 2018: Fabian Hueske & Timo Walther - "Why and how...Flink Forward San Francisco 2018: Fabian Hueske & Timo Walther - "Why and how...
Flink Forward San Francisco 2018: Fabian Hueske & Timo Walther - "Why and how...
 
Bringing Streaming Data To The Masses: Lowering The “Cost Of Admission” For Y...
Bringing Streaming Data To The Masses: Lowering The “Cost Of Admission” For Y...Bringing Streaming Data To The Masses: Lowering The “Cost Of Admission” For Y...
Bringing Streaming Data To The Masses: Lowering The “Cost Of Admission” For Y...
 
Failing to Cross the Streams – Lessons Learned the Hard Way | Philip Schmitt,...
Failing to Cross the Streams – Lessons Learned the Hard Way | Philip Schmitt,...Failing to Cross the Streams – Lessons Learned the Hard Way | Philip Schmitt,...
Failing to Cross the Streams – Lessons Learned the Hard Way | Philip Schmitt,...
 
Flink Forward Berlin 2018: Shriya Arora - "Taming large-state to join dataset...
Flink Forward Berlin 2018: Shriya Arora - "Taming large-state to join dataset...Flink Forward Berlin 2018: Shriya Arora - "Taming large-state to join dataset...
Flink Forward Berlin 2018: Shriya Arora - "Taming large-state to join dataset...
 
Flink Forward San Francisco 2018: Ken Krugler - "Building a scalable focused ...
Flink Forward San Francisco 2018: Ken Krugler - "Building a scalable focused ...Flink Forward San Francisco 2018: Ken Krugler - "Building a scalable focused ...
Flink Forward San Francisco 2018: Ken Krugler - "Building a scalable focused ...
 
apidays LIVE Singapore 2021 - REST the Events - REST APIs for Event-Driven Ar...
apidays LIVE Singapore 2021 - REST the Events - REST APIs for Event-Driven Ar...apidays LIVE Singapore 2021 - REST the Events - REST APIs for Event-Driven Ar...
apidays LIVE Singapore 2021 - REST the Events - REST APIs for Event-Driven Ar...
 

Similar to KEYNOTE Flink Forward San Francisco 2019: From Stream Processor to a Unified Data Processing System - Stephan Ewen & Xiaowei Jiang & Robert Metzger

Unified Data Processing with Apache Flink and Apache Pulsar_Seth Wiesman
Unified Data Processing with Apache Flink and Apache Pulsar_Seth WiesmanUnified Data Processing with Apache Flink and Apache Pulsar_Seth Wiesman
Unified Data Processing with Apache Flink and Apache Pulsar_Seth WiesmanStreamNative
 
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHortonworks
 
Apache Kafka as Event Streaming Platform for Microservice Architectures
Apache Kafka as Event Streaming Platform for Microservice ArchitecturesApache Kafka as Event Streaming Platform for Microservice Architectures
Apache Kafka as Event Streaming Platform for Microservice ArchitecturesKai Wähner
 
Spring on PAS - Fabio Marinelli
Spring on PAS - Fabio MarinelliSpring on PAS - Fabio Marinelli
Spring on PAS - Fabio MarinelliVMware Tanzu
 
apidays LIVE JAKARTA - Event Driven APIs by Phil Scanlon
apidays LIVE JAKARTA - Event Driven APIs by Phil Scanlonapidays LIVE JAKARTA - Event Driven APIs by Phil Scanlon
apidays LIVE JAKARTA - Event Driven APIs by Phil Scanlonapidays
 
Facilitez votre transition DevOps grâce à l'automatisation de votre infras...
 Facilitez votre transition DevOps grâce à l'automatisation de votre infras... Facilitez votre transition DevOps grâce à l'automatisation de votre infras...
Facilitez votre transition DevOps grâce à l'automatisation de votre infras...VMware Tanzu
 
Fast Data – Fast Cars: Wie Apache Kafka die Datenwelt revolutioniert
Fast Data – Fast Cars: Wie Apache Kafka die Datenwelt revolutioniertFast Data – Fast Cars: Wie Apache Kafka die Datenwelt revolutioniert
Fast Data – Fast Cars: Wie Apache Kafka die Datenwelt revolutioniertconfluent
 
Santander Stream Processing with Apache Flink
Santander Stream Processing with Apache FlinkSantander Stream Processing with Apache Flink
Santander Stream Processing with Apache Flinkconfluent
 
IoT and Event Streaming at Scale with Apache Kafka
IoT and Event Streaming at Scale with Apache KafkaIoT and Event Streaming at Scale with Apache Kafka
IoT and Event Streaming at Scale with Apache Kafkaconfluent
 
IoT Architectures for Apache Kafka and Event Streaming - Industry 4.0, Digita...
IoT Architectures for Apache Kafka and Event Streaming - Industry 4.0, Digita...IoT Architectures for Apache Kafka and Event Streaming - Industry 4.0, Digita...
IoT Architectures for Apache Kafka and Event Streaming - Industry 4.0, Digita...Kai Wähner
 
Spring and Pivotal Application Service - SpringOne Tour Dallas
Spring and Pivotal Application Service - SpringOne Tour DallasSpring and Pivotal Application Service - SpringOne Tour Dallas
Spring and Pivotal Application Service - SpringOne Tour DallasVMware Tanzu
 
Data Driven Enterprise with Apache Kafka
Data Driven Enterprise with Apache KafkaData Driven Enterprise with Apache Kafka
Data Driven Enterprise with Apache Kafkaconfluent
 
Progetta, crea e gestisci Modern Application per web e mobile su AWS
Progetta, crea e gestisci Modern Application per web e mobile su AWSProgetta, crea e gestisci Modern Application per web e mobile su AWS
Progetta, crea e gestisci Modern Application per web e mobile su AWSAmazon Web Services
 
Sviluppare un backend serverless in real time attraverso GraphQL
Sviluppare un backend serverless in real time attraverso GraphQLSviluppare un backend serverless in real time attraverso GraphQL
Sviluppare un backend serverless in real time attraverso GraphQLAmazon Web Services
 
Event Driven Streaming Analytics - Demostration on Architecture of IoT
Event Driven Streaming Analytics - Demostration on Architecture of IoTEvent Driven Streaming Analytics - Demostration on Architecture of IoT
Event Driven Streaming Analytics - Demostration on Architecture of IoTLei Xu
 
Building Event Driven (Micro)services with Apache Kafka
Building Event Driven (Micro)services with Apache KafkaBuilding Event Driven (Micro)services with Apache Kafka
Building Event Driven (Micro)services with Apache KafkaGuido Schmutz
 
Accelerating Mobile App Data Synchronization and Real-Time Data Development w...
Accelerating Mobile App Data Synchronization and Real-Time Data Development w...Accelerating Mobile App Data Synchronization and Real-Time Data Development w...
Accelerating Mobile App Data Synchronization and Real-Time Data Development w...Amazon Web Services
 
Hong Kong User Group 2019
Hong Kong User Group 2019Hong Kong User Group 2019
Hong Kong User Group 2019Solace
 
A New Approach to Continuous Monitoring in the Cloud: Migrate to AWS with NET...
A New Approach to Continuous Monitoring in the Cloud: Migrate to AWS with NET...A New Approach to Continuous Monitoring in the Cloud: Migrate to AWS with NET...
A New Approach to Continuous Monitoring in the Cloud: Migrate to AWS with NET...Amazon Web Services
 

Similar to KEYNOTE Flink Forward San Francisco 2019: From Stream Processor to a Unified Data Processing System - Stephan Ewen & Xiaowei Jiang & Robert Metzger (20)

Unified Data Processing with Apache Flink and Apache Pulsar_Seth Wiesman
Unified Data Processing with Apache Flink and Apache Pulsar_Seth WiesmanUnified Data Processing with Apache Flink and Apache Pulsar_Seth Wiesman
Unified Data Processing with Apache Flink and Apache Pulsar_Seth Wiesman
 
Flink SQL in Action
Flink SQL in ActionFlink SQL in Action
Flink SQL in Action
 
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
 
Apache Kafka as Event Streaming Platform for Microservice Architectures
Apache Kafka as Event Streaming Platform for Microservice ArchitecturesApache Kafka as Event Streaming Platform for Microservice Architectures
Apache Kafka as Event Streaming Platform for Microservice Architectures
 
Spring on PAS - Fabio Marinelli
Spring on PAS - Fabio MarinelliSpring on PAS - Fabio Marinelli
Spring on PAS - Fabio Marinelli
 
apidays LIVE JAKARTA - Event Driven APIs by Phil Scanlon
apidays LIVE JAKARTA - Event Driven APIs by Phil Scanlonapidays LIVE JAKARTA - Event Driven APIs by Phil Scanlon
apidays LIVE JAKARTA - Event Driven APIs by Phil Scanlon
 
Facilitez votre transition DevOps grâce à l'automatisation de votre infras...
 Facilitez votre transition DevOps grâce à l'automatisation de votre infras... Facilitez votre transition DevOps grâce à l'automatisation de votre infras...



KEYNOTE Flink Forward San Francisco 2019: From Stream Processor to a Unified Data Processing System - Stephan Ewen & Xiaowei Jiang & Robert Metzger

  • 1. © 2019 Ververica Presenter Name & Title PRESENTATION TITLE The Apache Flink® Conference Stream Processing | Event Driven | Real Time San Francisco 1-2, 2019
  • 2. A big thanks to our Sponsors: Platinum, Gold, Silver, Community, Flink Fest, Media
  • 3. A big thanks to our Program Committee: Fabian Hueske, Dean Wampler, Sonali Sharma, Eric Sammer, Stefan Richter, Tyler Akidau, Jamie Grier
  • 4. A big thanks to our Speakers
  • 5. Get involved! Flink Forward App: Rate speakers and sessions with a chance to win a Flybrix drone! Apache Flink User Survey: Win a trip to one of the next Flink Forward conferences of your choice! Flink Forward Survey: Help us improve the quality of Flink Forward. We appreciate your feedback! Community Contribution: Sign up as a content contributor for blog posts or speaking opportunities. Flink Forward Social Feed: View & Engage!
  • 6. Kostas Tzoumas: INTRODUCING VERVERICA
  • 7. Founded in 2014 by the original creators of Apache Flink to commercialize the open source project and support the community
  • 9. Why? • Alibaba has been the largest user of Flink and the second-largest contributor for years • Deeply committed to open source and creating technological impact • Joining forces made a lot of sense for the two teams, in order to collaborate even more closely and accelerate their contributions to Flink
  • 10. Flink at Alibaba (a few examples). Taobao is the largest e-commerce platform globally, with more than 600 million monthly active users. Every time a user logs into the Taobao app, they see a different landing page, personalized for the user and dependent on the latest real-time activity on the platform. Using Flink for real-time machine learning at Taobao has resulted in an increase of over 20% in purchase conversion rate. At peak during Singles Day last year, the system processed over 1.7 billion events/sec. In the Hangzhou City Brain Project, Flink is used to process real-time data from a variety of sensors (traffic cameras, map applications, etc.) and to manage traffic signals at 128 intersections. The City Brain project has halved traveling times for ambulances and commuters. Traffic accidents can be detected immediately, and help can reach the accident site within 5 minutes.
  • 11. What’s in a name? verum (“real” in Latin): Understanding the truth about the world by getting the real-time view
  • 12. What is Ververica? Our #1 goal is to position Apache Flink for the next 10 years of its life. 1. Double down on the open source community and improve its health and diversity 2. Contribute a number of innovations to the open source project, starting with Alibaba’s Blink for batch processing 3. Create an ecosystem and foundation for the commercial success of Flink projects and products across the world
  • 13. Ververica Commercial Products. Full continuation of our commercial products and services • Ververica Platform, including Apache Flink, Application Manager, and Streaming Ledger • Apache Flink Training and Consulting Services • Enterprise Support. A lot of innovation is coming here as well, leveraging existing work in Alibaba Cloud
  • 14. Announcing: Ververica Partner Program. We are looking for partners to help us develop the broader Flink ecosystem • Ververica Platform Partner: preferred partners of our commercial products around the globe • Ververica Services Partner: service provider on Apache Flink certified by Ververica. Sign up here! ververica.com/partner-program
  • 15. Stephan Ewen, Xiaowei Jiang, Robert Metzger: From Stream Processor to Unified Data Processing System
  • 16. Use Cases Presented Today
  • 17. Apache Flink and Public Clouds: cloud, on-prem
  • 18. Data Processing Applications
  • 19. Spectrum from more lag time to more real time: Batch Processing; Continuous Processing; Event-driven Applications; Transactional Processing. Labels: Stream Processing; Data Pipelines; Streaming Analytics
  • 20. Spectrum from more lag time to more real time: Batch Processing; Continuous Processing; Event-driven Applications; Transactional Processing. Labels: Stream Processing; Data Pipelines; Streaming Analytics; Batch Processing & Continuous Streaming Analytics & Applications
  • 21. Spectrum from more lag time to more real time: Batch Processing; Continuous Processing; Event-driven Applications; Transactional Processing. Labels: Stream Processing; Data Pipelines; Streaming Analytics; Flink community's focus over the last releases
  • 22. Spectrum from more lag time to more real time: Batch Processing; Continuous Processing; Event-driven Applications; Transactional Processing. Recent Features: Data Pipelines; Streaming Analytics; Time-versioned Joins; MATCH_RECOGNIZE; Schema Upgrades. Example query:
      SELECT *
      FROM TaxiRides
      MATCH_RECOGNIZE (
        PARTITION BY driverId
        ORDER BY rideTime
        MEASURES S.rideId AS sRideId
        AFTER MATCH SKIP PAST LAST ROW
        PATTERN (S M{2,} E)
        DEFINE
          S AS S.isStart = true,
          M AS M.rideId <> S.rideId,
          E AS E.isStart = false AND E.rideId = S.rideId
      )
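
To make the row pattern on slide 22 concrete, here is a minimal plain-Scala sketch of what `PATTERN (S M{2,} E)` selects. It is not Flink code and does not use the Flink API: the `Ride` case class and the `RidePattern` object are hypothetical names introduced only for illustration, and the input is assumed to be a small in-memory sequence rather than a stream.

```scala
// Hypothetical record type mirroring the columns used in the slide's query.
final case class Ride(driverId: Long, rideId: Long, rideTime: Long, isStart: Boolean)

object RidePattern {
  /** Returns S.rideId for each match, per driver, with rows ordered by rideTime. */
  def matches(rides: Seq[Ride]): Seq[Long] =
    rides.groupBy(_.driverId).toSeq.sortBy(_._1).flatMap { case (_, rs) =>
      scan(rs.sortBy(_.rideTime).toVector) // PARTITION BY driverId ORDER BY rideTime
    }

  private def scan(rows: Vector[Ride]): Seq[Long] = {
    val out = Seq.newBuilder[Long]
    var i = 0
    while (i < rows.length) {
      val s = rows(i)
      // M{2,}: the maximal run of rows after S that belong to a different ride
      var j = i + 1
      while (j < rows.length && rows(j).rideId != s.rideId) j += 1
      val mCount = j - (i + 1)
      // E: the next row has S's rideId (loop exit condition) and is an end event
      val matched = s.isStart && mCount >= 2 && j < rows.length && !rows(j).isStart
      if (matched) { out += s.rideId; i = j + 1 } // AFTER MATCH SKIP PAST LAST ROW
      else i += 1
    }
    out.result()
  }
}
```

Because the M and E conditions are mutually exclusive (a row either shares S's `rideId` or it does not), a single greedy scan is enough here; a general pattern matcher would need backtracking.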
  • 23. Spectrum from more lag time to more real time: Batch Processing; Continuous Processing; Event-driven Applications; Transactional Processing. Labels: Stream Processing; Data Pipelines; Streaming Analytics. See: "Stream Processing takes on ACID" by Seth Wiesman, 11am, Nikko I
  • 24. Spectrum from more lag time to more real time: Batch Processing; Continuous Processing; Event-driven Applications; Transactional Processing. Labels: Stream Processing; Data Pipelines; Streaming Analytics; SQL and SQL Ecosystem / tools; Machine Learning; Graphs; Batch Performance; Batch Fault Tolerance; Interactive Queries; Dashboards
  • 25. The Relationship between Batch and Streaming
  • 26. Everything Streams. That is about 60% of the truth…
  • 27. The remaining 40% of the truth. Continuous Streaming: data is incomplete; latency SLAs; completeness and latency is a tradeoff. Batch Processing: data is as complete as it gets within the job; no low-latency SLAs
  • 29. Streaming versus Batch Join
  • 30. Streaming versus Batch Join. DataStream API: 2x RocksDB LSM-Trees; push-based operators; low latency; minimize in-flight data. DataSet API: 1x Hybrid Hash Join; pull-based operators; flexible data flow control; high latency; no checkpoints
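
The contrast on slide 30 can be sketched in plain Scala, under the assumption that the streaming join keeps both inputs as state while the batch join builds a hash table on one input only. This is an illustrative sketch, not Flink's implementation: `JoinSketch`, `batchHashJoin`, and `streamingJoin` are hypothetical names, and in-memory maps stand in for RocksDB state and the hybrid hash join's spillable table.

```scala
import scala.collection.mutable

object JoinSketch {
  type Row = (Int, String) // (join key, payload)

  /** Batch style: build a hash table on the left input once, then probe with the right. */
  def batchHashJoin(left: Seq[Row], right: Seq[Row]): Seq[(Int, String, String)] = {
    val built = left.groupBy(_._1)
    for {
      (k, r) <- right
      (_, l) <- built.getOrElse(k, Seq.empty)
    } yield (k, l, r)
  }

  /** Streaming style: rows from both sides arrive interleaved, so both sides are kept as state. */
  def streamingJoin(events: Seq[Either[Row, Row]]): Seq[(Int, String, String)] = {
    val leftState = mutable.Map.empty[Int, Vector[String]]
    val rightState = mutable.Map.empty[Int, Vector[String]]
    val out = Vector.newBuilder[(Int, String, String)]
    events.foreach {
      case Left((k, l)) =>
        leftState(k) = leftState.getOrElse(k, Vector.empty) :+ l
        // emit matches against everything seen so far on the other side
        rightState.getOrElse(k, Vector.empty).foreach(r => out += ((k, l, r)))
      case Right((k, r)) =>
        rightState(k) = rightState.getOrElse(k, Vector.empty) :+ r
        leftState.getOrElse(k, Vector.empty).foreach(l => out += ((k, l, r)))
    }
    out.result()
  }
}
```

On bounded input both functions produce the same set of joined rows; the streaming version emits results incrementally at the cost of holding two state tables, which matches the "2x" versus "1x" contrast on the slide.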
  • 31. Exploiting the Batch Special Case. Components: Planner/Optimizer; Continuous Operators; Streaming Scheduler; Rules; Additional Bounded Operators; Additional Scheduling Strategies. Annotations: "Core operators, cover all cases"; "if (bounded && non-incremental) activates additional optimizer choices". See also: "Towards Flink 2.0: Rethinking the stack and APIs to unify Batch & Stream" by Aljoscha Krettek, 2pm, Nikko II/III
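
The rule quoted on slide 31, `if (bounded && non-incremental)`, can be read as a planner that always has the continuous operators available and unlocks extra choices for bounded input. The following is only a sketch of that reading: `Strategy`, `ContinuousOperator`, `BoundedOperator`, and `PlannerSketch` are hypothetical names, not Flink classes.

```scala
sealed trait Strategy
case object ContinuousOperator extends Strategy // core operators, cover all cases
case object BoundedOperator extends Strategy    // only valid when the input is bounded

object PlannerSketch {
  /** Lists the operator choices the optimizer may consider for a given input. */
  def choices(bounded: Boolean, incremental: Boolean): List[Strategy] = {
    val core: List[Strategy] = List(ContinuousOperator)
    // Bounded, non-incremental execution activates the additional choices.
    if (bounded && !incremental) core :+ BoundedOperator else core
  }
}
```

The point of the design is that batch is treated as a special case that adds options, rather than as a separate engine.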
  • 32. Stream Processing, Analytics, and Applications
  • 33. How we showed the API stack in the past… Stream SQL / Tables (dynamic tables): High-level Analytics API. DataStream API (streams, windows): Stream- & Batch Data Processing. Process Function (events, state, time): Stateful Event-Driven Applications.
      // DataStream API: keyed 5-second window aggregation
      val stats = stream
        .keyBy("sensor")
        .timeWindow(Time.seconds(5))
        .reduce((a, b) => a.add(b))
      // Process Function: explicit access to events, state, and timers
      def processElement(event: MyEvent, ctx: Context, out: Collector[Result]) = {
        // work with event and state
        (event, state.value) match { … }
        out.collect(…)   // emit events
        state.update(…)  // modify state
        // schedule a timer callback
        ctx.timerService.registerEventTimeTimer(event.timestamp + 500)
      }
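
As a rough illustration of what the keyed 5-second window aggregation on slide 33 computes, here is a plain-Scala sketch over an in-memory sequence. It does not use the Flink API and ignores watermarks and late data; `Reading` and `TumblingWindowSum` are hypothetical names, and timestamps are assumed to be non-negative milliseconds.

```scala
// Hypothetical sensor reading; the slide's event type is not shown.
final case class Reading(sensor: String, timestampMs: Long, value: Double)

object TumblingWindowSum {
  /** Sums values per (sensor, window start) for tumbling windows of `sizeMs` milliseconds. */
  def sumPerWindow(readings: Seq[Reading], sizeMs: Long): Map[(String, Long), Double] =
    readings
      // assign each reading to the window that starts at the previous multiple of sizeMs
      .groupBy(r => (r.sensor, r.timestampMs - (r.timestampMs % sizeMs)))
      .map { case (key, rs) => key -> rs.map(_.value).sum }
}
```

A stream processor produces the same per-window results incrementally, emitting each window once event time has passed its end, instead of grouping a complete data set.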
  • 34. Applications (physical) versus Analytics (declarative). DataStream API: types are Java / Scala classes; transformation functions; explicit control over state; explicit control over time; executes as described. Table API: logical schema for tables; declarative language (SQL, Table DSL); state implicit in operations; SLAs define when to trigger; automatic optimization
  • 35. Rethinking the Flink Stack: Runtime; Stream Operator & DAG API; DataStream; Table / SQL; DataSet (deprecated). Still possible to mix and match within a program
  • 36. SQL, Notebooks, and Machine Learning
  • 37. Adding a new Table API / SQL Query Processor (Blink). Plan stages: Table API & SQL; Logical Plan; Physical Plan; JobGraph. Pluggable Catalog. Query Optimizer: SubQuery Decorrelation; Filter/Project Push-Down; Join Reorder; …
  • 38. Functional Improvements: ANSI Syntax; Sub-Query; Data Type; Join; Advanced Aggregates; Window Operator; Over Window; TPC-H/TPC-DS
  • 39. Performance Improvements. Query Execution: Performant Operators (operator codegen, HashAgg/local-global Agg, improved HashJoin, semi/anti join, vectorization); Resource Optimizations (stats-based estimation, dynamic memory allocation); Expression Optimizations; Record Format; operate binary data; JVM intrinsics; hot method codegen. Query Optimizer: Rich Stats (NDV, NULL count, avg length, max length, min, max); Cost Based (join order, join type, agg strategy, …); Advanced Rules (subplan reuse, join condition expansion, shuffle removal, distinct agg rewrite, …)
  • 40. Batch SQL Benchmark (chart): Blink versus Spark on 1T, 10T, and 30T TPC-DS queries; chart labels: 2000S, 20000S, 50000S, ?
  • 41. Flink SQL in Production at Alibaba: 1.7B (peak events/s); 10K (cluster size); 10K (jobs); sub-second (latency); 100TB (largest batch job input)
  • 43. Blink SQL Merge Plan. June 2019: Table API refactor (FLIP-32). July 2019: initial Blink runner merge; Flink 1.9 release. Oct 2019: full merge. Layers: Table API Layer; Flink Runner; Blink Runner
  • 44. Hive Integration: Flink + Hive; HMS; Data. For more: "Integrate Flink with Hive Ecosystem", Xuefu Zhang & Bowen Li, Alibaba, 12:20pm - 1:00pm, Carmel
  • 45. Zeppelin support for Flink • Zeppelin Integration
  • 46. Proposal for Machine Learning: Unified Interface; ML algorithms; Common Utilities. Clustering: K-means; Latent Dirichlet allocation (LDA); Bisecting k-means; Gaussian Mixture Model (GMM). Regression: Linear regression; Lasso regression; Ridge regression; Generalized linear regression; Survival regression; Isotonic regression. Classifier: Binomial logistic regression; Multinomial logistic regression; Multilayer perceptron classifier; Linear Support Vector Machine; Naive Bayes; Random Forest; GBDT; Decision Tree. Others: Collaborative filtering; FP-Growth; PrefixSpan. For more: "When Table meets AI: Build Flink AI Ecosystem on Table API", Shaoxuan Wang, Alibaba, 4:30pm - 5:10pm, Nikko II & II; "High performance ML library based on Flink", Xu Yang, Alibaba, 2:50pm - 3:10pm, Carmel
  • 47. The Apache Flink Community
  • 48. A growing Apache Flink Community … not only Flink’s codebase that is growing massively … 230 Emails / week
  • 50. Launch of a new Chinese language user support mailing list
  • 51. Growing the Contributors Community • Cleanup & reorganization of the Jira components • Flinkbot: Improve pull request reviews and labeling • Discussions about improving the contribution workflow • PMC mentoring new committer candidates • Flink Community Packages website
  • 52. Flink Community Packages
  • 54. The Apache Flink community is more active than ever. Apache Flink continues to evolve with the Stream Processing space. Seamlessly integrate analytics, machine learning, applications and very fast batch processing on top of stream processing

Editor's Notes

  1. ANSI SQL syntax. Support for rich data types, including varchar, varbinary, and complex types such as structs …
  2. Sub-second latency. Peak throughput: 1.7B events/s. Largest batch job input size: 100TB. Flink cluster size: 10K. Flink jobs: 10K.
  3. Table API is the choice for analytics workloads: shared implementation with Flink SQL; modular/composable code; dynamic queries; interactive programming; ease of use.
  4. 2019.2-2019.6: FLIP-32: refactor the Table API, decouple the API from the implementation, and support multiple runners side by side. 2019.3-2019.6 (concurrently, one month after): initial merge of the Blink runner, supporting the major functionality of Blink SQL and production ready. 2019.6-2019.7: Flink 1.9.0 release. 2019.7-2019.10: continue to merge and improve the Blink runner; fully merge all Blink code. 2019.10-2019.11: new Flink release (1.10.0/2.0.0?).
  5. Supported execution modes: Local, Remote, and YARN. Supports Table API, batch SQL, and stream SQL. Visualizes static and dynamic tables. Supports savepoints in Flink jobs. Supports advanced functionality in ZeppelinContext.
  6. Unified the interface of algorithm functions: Table serves as the input and output of ML algorithms; supports ML Pipeline. Implemented some major ML algorithms: design and refine the algorithm implementations for higher performance; support large-scale data, running and validating in a production environment; find and solve performance and stability problems; get equal or better performance than Spark ML. Accumulated utilities for Flink ML developers: a basic local library (linear algebra, probability, etc.); a distributed computing library (statistics, parallel sort, etc.); refined data transmission, for example caching training data in memory.
  7. Like the codebase, Flink’s community is growing. It is now among the top 5 projects within Apache in terms of community size, and the numbers are all looking very positive. 1st number: dev@ activity (developer community growth): more contributions and more discussions about the project's future. 2nd number: pageviews (user community growth): constant growth over the past years. We expect these to grow in the coming years, also because we’ve started onboarding the huge Chinese Flink community …
  8. … 1300 people attended Flink Forward China in December this year. There’s been an effort in China to translate the Flink docs, and we are now collaborating on translating Flink’s official material: the Flink website is almost done, and the Flink docs are coming now. This is a great opportunity for Flink: documentation (and contribution opportunities) for one of the most active Flink communities in the world.
  9. Flink has thousands of users in China, with a WeChat (or DingTalk) group with thousands of members. The aims are to foster further growth of Flink in China, to bring these users into the Apache world and offer infrastructure in the Apache way, and to make the Chinese community more visible to other Flink users and within the ASF. English obviously stays the official language for all developer discussions, and we are open to adding more languages.
  10. Better Jira processes: cleanup and reorganization of the components. Flinkbot helps with the review of pull requests and labels PRs by component. There is a discussion about a stricter Jira workflow, to make sure people work on things that have consensus to be accepted. The PMC is actively working on onboarding new committers to help with the amount of incoming pull requests and the overall growing project.
  11. Started a project for a Flink Ecosystem Portal, where people can submit their own Flink extensions: connectors, metrics reporters, file systems, API utilities, etc. This is hosted at Apache Infra, run by the Flink project, and based on open source. Reach out to me during the conference if you want to talk community, such as starting your own meetup group or ideas for how to improve the community.
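
The "Table serves as input and output of ML algorithms" and "ML Pipeline" points in note 6 can be illustrated with a small sketch. This is not the Flink ML or Table API: the `Table` alias (a plain list of row dicts) and the classes `Transformer`, `Estimator`, `Scaler`, `MeanCenterer`, `Shift`, `Pipeline`, and `PipelineModel` are hypothetical names chosen only to show the table-in, table-out idea under that assumption.

```python
from dataclasses import dataclass
from typing import Dict, List

# Stand-in for a Flink Table: a list of row dicts. Assumption for illustration only.
Table = List[Dict[str, float]]


class Transformer:
    """Hypothetical stage: takes a table, returns a table."""

    def transform(self, table: Table) -> Table:
        raise NotImplementedError


class Estimator:
    """Hypothetical stage: learns from a table, returns a fitted Transformer."""

    def fit(self, table: Table) -> Transformer:
        raise NotImplementedError


@dataclass
class Scaler(Transformer):
    column: str
    factor: float

    def transform(self, table: Table) -> Table:
        # Multiply one column by a fixed factor; other columns pass through.
        return [{**row, self.column: row[self.column] * self.factor} for row in table]


@dataclass
class Shift(Transformer):
    column: str
    offset: float

    def transform(self, table: Table) -> Table:
        return [{**row, self.column: row[self.column] + self.offset} for row in table]


@dataclass
class MeanCenterer(Estimator):
    column: str

    def fit(self, table: Table) -> Transformer:
        # "Training" here is just computing the column mean.
        mean = sum(row[self.column] for row in table) / len(table)
        return Shift(self.column, -mean)


@dataclass
class PipelineModel(Transformer):
    stages: list

    def transform(self, table: Table) -> Table:
        for stage in self.stages:
            table = stage.transform(table)
        return table


@dataclass
class Pipeline(Estimator):
    stages: list

    def fit(self, table: Table) -> Transformer:
        fitted, current = [], table
        for stage in self.stages:
            # Estimators are fitted on the output of the previous stages.
            model = stage.fit(current) if isinstance(stage, Estimator) else stage
            current = model.transform(current)
            fitted.append(model)
        return PipelineModel(fitted)


rows = [{"x": 1.0}, {"x": 3.0}]
model = Pipeline([Scaler("x", 2.0), MeanCenterer("x")]).fit(rows)
print(model.transform(rows))
```

Because every stage consumes and produces the same table type, stages compose freely, which is the property the unified interface in note 6 is after.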