SlideShare a Scribd company logo
© 2020 Ververica
1
© 2020 Ververica
2
2
● Caito Scherr
Introduction
© 2020 Ververica
3
3
● Caito Scherr
Introduction
3
● Caito Scherr
● Developer Advocate
© 2020 Ververica
4
● Caito Scherr
● Developer Advocate
● Ververica, GmbH
Introduction
© 2020 Ververica
5
● Caito Scherr
● Developer Advocate
● Ververica, GmbH
● Portland, OR, USA
Introduction
© 2020 Ververica
6
Introduction
© 2020 Ververica
7
Demo credit: Marta Paes
Introduction
© 2020 Ververica
8
Agenda
● Pulsar + Flink
● Where SQL comes in
● Demo: Pulsar + Flink SQL
© 2020 Ververica
9
● Pulsar + Flink
● Where SQL comes in
● Demo: Pulsar + Flink SQL
Agenda
© 2020 Ververica
10
Agenda
● Pulsar + Flink
● Where SQL comes in
● Demo: Pulsar + Flink SQL
© 2020 Ververica
11
>> What is Flink?
Pulsar + Flink
● Stateful
● Stream processing engine
● Unified batch & streaming
© 2020 Ververica
12
>> What is Flink?
Pulsar + Flink
© 2020 Ververica
13
>> What is Flink?
Pulsar + Flink
© 2020 Ververica
14
>> Why Pulsar + Flink?
Pulsar + Flink
“Batch as a special case
of streaming”
“Stream as a unified
view on data”
© 2020 Ververica
15
>> Pulsar: Unified Storage
● Pub/Sub messaging layer
(Streaming)
● Durable storage layer
(Batch)
Pulsar + Flink
© 2020 Ververica
16
now
bounded query
unbounded query
past future
bounded query
start of the stream
unbounded query
>> Flink: Unified Processing
● Reuse code and logic
● Consistent semantics
● Simplify operations
● Mix historic and real-time
● Pub/Sub messaging layer (Stream)
● Durable storage layer (Batch)
Pulsar + Flink
© 2020 Ververica
17
Unified Processing Engine
(Batch / Streaming)
Unified Storage
(Segments / Pub/Sub)
>> A Unified Data Stack
Pulsar + Flink
© 2020 Ververica
18
Flink 1.6+
2018
Streaming Source/Sink Connectors
Table Sink Connector
>> Pulsar + Flink History
Pulsar + Flink
© 2020 Ververica
19
Flink 1.6+
2018
Streaming Source/Sink Connectors
Table Sink Connector
>> Pulsar + Flink History
Pulsar + Flink
Flink 1.9+
Pulsar Schema + Flink Catalog
Table API/SQL as 1st class citizens
Exactly-once Source
At-least once Sink
© 2020 Ververica
20
Flink 1.6+
2018
Streaming Source/Sink Connectors
Table Sink Connector
>> Pulsar + Flink History
Pulsar + Flink
Flink 1.9+
Pulsar Schema + Flink Catalog
Table API/SQL as 1st class citizens
Exactly-once Source
At-least once Sink
Flink 1.12
Upserts
DDL Computed Columns, Watermarks. Metadata
End-to-end Exactly-once
Key-shared Subscription Model
© 2020 Ververica
21
Flink Runtime
Stateful Computations over Data Streams
Stateful
Stream Processing
Streams, State, Time
Event-Driven
Applications
Stateful Functions
Streaming
Analytics & ML
SQL, PyFlink, Tables
>> Why Flink SQL?
Pulsar + Flink
© 2020 Ververica
22
>> Why Flink SQL?
● Focus on business logic, not implementation
● Mixed workloads (batch + streaming)
● Maximize developer speed and autonomy
ML Feature Generation
Unified Online/Offline Model
Training
E2E Streaming Analytics
Pipelines
Pulsar + Flink
© 2020 Ververica
23
user cnt
Mary 2
Bob 1
SELECT user_id,
COUNT(url) AS cnt
FROM clicks
GROUP BY user_id;
Take a snapshot when the
query starts
A final result is
produced
A row that was added after the query
was started is not considered
user cTime url
Mary 12:00:00 https://…
Bob 12:00:00 https://…
Mary 12:00:02 https://…
Liz 12:00:03 https://…
The query
terminates
Where SQL Fits In >> A Regular SQL Engine
© 2020 Ververica
24
user cTime url
user cnt
SELECT user_id,
COUNT(url) AS cnt
FROM clicks
GROUP BY user_id;
Mary 12:00:00 https://…
Bob 12:00:00 https://…
Mary 12:00:02 https://…
Liz 12:00:03 https://…
Bob 1
Liz 1
Mary 1
Mary 2
Ingest all changes as
they happen
Continuously update the
result
The result is identical to the one-time query (at this point)
Where SQL Fits In >> A Streaming SQL Engine
© 2020 Ververica
25
● Standard SQL syntax and semantics (i.e. not a “SQL-flavor”)
● Unified APIs for batch and streaming
● Support for advanced time handling and operations (e.g. CDC, pattern matching)
UDF Support
Python
Java
Scala
Execution
TPC-DS Coverage
Batch
Streaming
+
Formats
Native Connectors
Apache Kafka
Elasticsearch
FileSystems
JDBC HBase
+
Kinesis
Metastore Postgres (JDBC)
Data Catalogs
Debezium
Where SQL Fits In >> Flink SQL In A Nutshell
© 2020 Ververica
26
>> 1a. Twitter Firehose
Demo
© 2020 Ververica
27
Demo >> 1b. Data?
© 2020 Ververica
28
Demo >> 2. SQL Client + Pulsar
CREATE CATALOG pulsar WITH (
'type' = 'pulsar',
'service-url' = 'pulsar://pulsar:6650',
'admin-url' = 'http://pulsar:8080',
'format' = 'json'
);
Catalog DDL
© 2020 Ververica
29
Not cool. 👹
Demo
© 2020 Ververica
30
Demo
CREATE TABLE pulsar_tweets (
publishTime TIMESTAMP(3) METADATA,
WATERMARK FOR publishTime AS publishTime - INTERVAL '5'
SECOND
) WITH (
'connector' = 'pulsar',
'topic' = 'persistent://public/default/tweets',
'value.format' = 'json',
'service-url' = 'pulsar://pulsar:6650',
'admin-url' = 'http://pulsar:8080',
'scan.startup.mode' = 'earliest-offset'
)
LIKE tweets;
Derive schema from the original topic
Define the source connector (Pulsar)
Read and use Pulsar message metadata
>> 3. Get relevant timestamp
© 2020 Ververica
31
Demo >> 4. Windowed aggregation
CREATE TABLE pulsar_tweets_agg (
tmstmp TIMESTAMP(3),
tweet_cnt BIGINT
) WITH (
'connector'='pulsar',
'topic'='persistent://public/default/tweets_agg',
'value.format'='json',
'service-url'='pulsar://pulsar:6650',
'admin-url'='http://pulsar:8080'
);
Sink Table DDL
INSERT INTO pulsar_tweets_agg
SELECT TUMBLE_START(publishTime, INTERVAL '10'
SECOND) AS wStart,
COUNT(id) AS tweet_cnt
FROM pulsar_tweets
GROUP BY TUMBLE(publishTime, INTERVAL '10'
SECOND);
Continuous SQL Query
© 2020 Ververica
32
Demo >> 5. Tweet count in windows
© 2020 Ververica
33
What Next? >> Flink SQL Cookbook
© 2020 Ververica
Resources
● Flink Ahead: What Comes After Batch & Streaming: https://youtu.be/h5OYmy9Yx7Y
● Apache Pulsar as one Storage System for Real Time & Historical Data Analysis:
https://medium.com/streamnative/apache-pulsar-as-one-storage-455222c59017
● Flink Table API & SQL:
https://ci.apache.org/projects/flink/flink-docs-master/dev/table/sql/queries.html#operatio
ns
● Flink SQL Cookbook: https://github.com/ververica/flink-sql-cookbook
● When Flink & Pulsar Come Together: https://flink.apache.org/2019/05/03/pulsar-flink.html
● How to Query Pulsar Streams in Flink:
https://flink.apache.org/news/2019/11/25/query-pulsar-streams-using-apache-flink.html
● What’s New in the Flink/Pulsar Connector:
● https://flink.apache.org/2021/01/07/pulsar-flink-connector-270.html
● Marta’s Demo: https://github.com/morsapaes/flink-sql-pulsar
34
@Caito_200_OK
© 2020 Ververica
● Pulsar Conference staff!!
● Marta Paes
35
Thank You!
@Caito_200_OK
Scan here for links
& resources

More Related Content

What's hot

Web Application Firewall - Friend of your DevOps Chain?
Web Application Firewall - Friend of your DevOps Chain?Web Application Firewall - Friend of your DevOps Chain?
Web Application Firewall - Friend of your DevOps Chain?
Franziska Buehler
 
Reactive Applications with Apache Pulsar and Spring Boot
Reactive Applications with Apache Pulsar and Spring BootReactive Applications with Apache Pulsar and Spring Boot
Reactive Applications with Apache Pulsar and Spring Boot
VMware Tanzu
 
Spring Boot to Quarkus: A real app migration experience | DevNation Tech Talk
Spring Boot to Quarkus: A real app migration experience | DevNation Tech TalkSpring Boot to Quarkus: A real app migration experience | DevNation Tech Talk
Spring Boot to Quarkus: A real app migration experience | DevNation Tech Talk
Red Hat Developers
 
Developing a user-friendly OpenResty application
Developing a user-friendly OpenResty applicationDeveloping a user-friendly OpenResty application
Developing a user-friendly OpenResty application
Thibault Charbonnier
 
SpringBoot and Spring Cloud Service for MSA
SpringBoot and Spring Cloud Service for MSASpringBoot and Spring Cloud Service for MSA
SpringBoot and Spring Cloud Service for MSA
Oracle Korea
 
Reactive Spring Framework 5
Reactive Spring Framework 5Reactive Spring Framework 5
Reactive Spring Framework 5
Aliaksei Zhynhiarouski
 
A Series of Fortunate Events: Building an Operator in Java
A Series of Fortunate Events: Building an Operator in JavaA Series of Fortunate Events: Building an Operator in Java
A Series of Fortunate Events: Building an Operator in Java
VMware Tanzu
 
Leveraging Gradle @ Netflix (Madrid GUG Feb 2, 2021)
Leveraging Gradle @ Netflix (Madrid GUG Feb 2, 2021)Leveraging Gradle @ Netflix (Madrid GUG Feb 2, 2021)
Leveraging Gradle @ Netflix (Madrid GUG Feb 2, 2021)
Roberto Pérez Alcolea
 
Servlet vs Reactive Stacks in 5 Use Cases
Servlet vs Reactive Stacks in 5 Use CasesServlet vs Reactive Stacks in 5 Use Cases
Servlet vs Reactive Stacks in 5 Use Cases
VMware Tanzu
 
Spring framework 5: New Core and Reactive features
Spring framework 5: New Core and Reactive featuresSpring framework 5: New Core and Reactive features
Spring framework 5: New Core and Reactive features
Aliaksei Zhynhiarouski
 
Gatekeeper: API gateway
Gatekeeper: API gatewayGatekeeper: API gateway
Gatekeeper: API gateway
ChengHui Weng
 
Oracle Service Bus 12c (12.2.1) What You Always Wanted to Know
Oracle Service Bus 12c (12.2.1) What You Always Wanted to KnowOracle Service Bus 12c (12.2.1) What You Always Wanted to Know
Oracle Service Bus 12c (12.2.1) What You Always Wanted to Know
Frank Munz
 
Spring webflux
Spring webfluxSpring webflux
Spring webflux
Carlos E. Salazar
 
A Kong retrospective: from 0.10 to 0.13
A Kong retrospective: from 0.10 to 0.13A Kong retrospective: from 0.10 to 0.13
A Kong retrospective: from 0.10 to 0.13
Thibault Charbonnier
 
Sprint 17
Sprint 17Sprint 17
Sprint 17
ManageIQ
 
Oracle soa 10g to 11g migration
Oracle soa 10g to 11g migrationOracle soa 10g to 11g migration
Oracle soa 10g to 11g migration
Krishna R
 
CI/CD 기반의 Microservice 개발
 CI/CD 기반의 Microservice 개발 CI/CD 기반의 Microservice 개발
CI/CD 기반의 Microservice 개발
Oracle Korea
 
Upgrading Oracle SOA Suite 10g to 11g (whitepaper)
Upgrading Oracle SOA Suite 10g to 11g (whitepaper)Upgrading Oracle SOA Suite 10g to 11g (whitepaper)
Upgrading Oracle SOA Suite 10g to 11g (whitepaper)
Revelation Technologies
 
Challenges in a Microservices Age: Monitoring, Logging and Tracing on Red Hat...
Challenges in a Microservices Age: Monitoring, Logging and Tracing on Red Hat...Challenges in a Microservices Age: Monitoring, Logging and Tracing on Red Hat...
Challenges in a Microservices Age: Monitoring, Logging and Tracing on Red Hat...
Martin Etmajer
 
Java Microservices with Netflix OSS & Spring
Java Microservices with Netflix OSS & Spring Java Microservices with Netflix OSS & Spring
Java Microservices with Netflix OSS & Spring
Conor Svensson
 

What's hot (20)

Web Application Firewall - Friend of your DevOps Chain?
Web Application Firewall - Friend of your DevOps Chain?Web Application Firewall - Friend of your DevOps Chain?
Web Application Firewall - Friend of your DevOps Chain?
 
Reactive Applications with Apache Pulsar and Spring Boot
Reactive Applications with Apache Pulsar and Spring BootReactive Applications with Apache Pulsar and Spring Boot
Reactive Applications with Apache Pulsar and Spring Boot
 
Spring Boot to Quarkus: A real app migration experience | DevNation Tech Talk
Spring Boot to Quarkus: A real app migration experience | DevNation Tech TalkSpring Boot to Quarkus: A real app migration experience | DevNation Tech Talk
Spring Boot to Quarkus: A real app migration experience | DevNation Tech Talk
 
Developing a user-friendly OpenResty application
Developing a user-friendly OpenResty applicationDeveloping a user-friendly OpenResty application
Developing a user-friendly OpenResty application
 
SpringBoot and Spring Cloud Service for MSA
SpringBoot and Spring Cloud Service for MSASpringBoot and Spring Cloud Service for MSA
SpringBoot and Spring Cloud Service for MSA
 
Reactive Spring Framework 5
Reactive Spring Framework 5Reactive Spring Framework 5
Reactive Spring Framework 5
 
A Series of Fortunate Events: Building an Operator in Java
A Series of Fortunate Events: Building an Operator in JavaA Series of Fortunate Events: Building an Operator in Java
A Series of Fortunate Events: Building an Operator in Java
 
Leveraging Gradle @ Netflix (Madrid GUG Feb 2, 2021)
Leveraging Gradle @ Netflix (Madrid GUG Feb 2, 2021)Leveraging Gradle @ Netflix (Madrid GUG Feb 2, 2021)
Leveraging Gradle @ Netflix (Madrid GUG Feb 2, 2021)
 
Servlet vs Reactive Stacks in 5 Use Cases
Servlet vs Reactive Stacks in 5 Use CasesServlet vs Reactive Stacks in 5 Use Cases
Servlet vs Reactive Stacks in 5 Use Cases
 
Spring framework 5: New Core and Reactive features
Spring framework 5: New Core and Reactive featuresSpring framework 5: New Core and Reactive features
Spring framework 5: New Core and Reactive features
 
Gatekeeper: API gateway
Gatekeeper: API gatewayGatekeeper: API gateway
Gatekeeper: API gateway
 
Oracle Service Bus 12c (12.2.1) What You Always Wanted to Know
Oracle Service Bus 12c (12.2.1) What You Always Wanted to KnowOracle Service Bus 12c (12.2.1) What You Always Wanted to Know
Oracle Service Bus 12c (12.2.1) What You Always Wanted to Know
 
Spring webflux
Spring webfluxSpring webflux
Spring webflux
 
A Kong retrospective: from 0.10 to 0.13
A Kong retrospective: from 0.10 to 0.13A Kong retrospective: from 0.10 to 0.13
A Kong retrospective: from 0.10 to 0.13
 
Sprint 17
Sprint 17Sprint 17
Sprint 17
 
Oracle soa 10g to 11g migration
Oracle soa 10g to 11g migrationOracle soa 10g to 11g migration
Oracle soa 10g to 11g migration
 
CI/CD 기반의 Microservice 개발
 CI/CD 기반의 Microservice 개발 CI/CD 기반의 Microservice 개발
CI/CD 기반의 Microservice 개발
 
Upgrading Oracle SOA Suite 10g to 11g (whitepaper)
Upgrading Oracle SOA Suite 10g to 11g (whitepaper)Upgrading Oracle SOA Suite 10g to 11g (whitepaper)
Upgrading Oracle SOA Suite 10g to 11g (whitepaper)
 
Challenges in a Microservices Age: Monitoring, Logging and Tracing on Red Hat...
Challenges in a Microservices Age: Monitoring, Logging and Tracing on Red Hat...Challenges in a Microservices Age: Monitoring, Logging and Tracing on Red Hat...
Challenges in a Microservices Age: Monitoring, Logging and Tracing on Red Hat...
 
Java Microservices with Netflix OSS & Spring
Java Microservices with Netflix OSS & Spring Java Microservices with Netflix OSS & Spring
Java Microservices with Netflix OSS & Spring
 

Similar to Select Star: Unified Batch & Streaming with Flink SQL & Pulsar

Select Star: Flink SQL for Pulsar Folks - Pulsar Summit NA 2021
Select Star: Flink SQL for Pulsar Folks - Pulsar Summit NA 2021Select Star: Flink SQL for Pulsar Folks - Pulsar Summit NA 2021
Select Star: Flink SQL for Pulsar Folks - Pulsar Summit NA 2021
StreamNative
 
Don't Cross the Streams! (or do, we got you)
Don't Cross the Streams! (or do, we got you)Don't Cross the Streams! (or do, we got you)
Don't Cross the Streams! (or do, we got you)
Caito Scherr
 
Better, Faster, Stronger Streaming: Your First Dive into Flink SQL
Better, Faster, Stronger Streaming: Your First Dive into Flink SQLBetter, Faster, Stronger Streaming: Your First Dive into Flink SQL
Better, Faster, Stronger Streaming: Your First Dive into Flink SQL
Caito Scherr
 
Flink sql for continuous sql etl apps & Apache NiFi devops
Flink sql for continuous sql etl apps & Apache NiFi devopsFlink sql for continuous sql etl apps & Apache NiFi devops
Flink sql for continuous sql etl apps & Apache NiFi devops
Timothy Spann
 
PartnerSkillUp_Enable a Streaming CDC Solution
PartnerSkillUp_Enable a Streaming CDC SolutionPartnerSkillUp_Enable a Streaming CDC Solution
PartnerSkillUp_Enable a Streaming CDC Solution
Timothy Spann
 
Google Cloud Dataflow
Google Cloud DataflowGoogle Cloud Dataflow
Google Cloud Dataflow
Alex Van Boxel
 
ApacheCon 2020 - Flink SQL in 2020: Time to show off!
ApacheCon 2020 - Flink SQL in 2020: Time to show off!ApacheCon 2020 - Flink SQL in 2020: Time to show off!
ApacheCon 2020 - Flink SQL in 2020: Time to show off!
Timo Walther
 
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
StreamNative
 
What's new for Apache Flink's Table & SQL APIs?
What's new for Apache Flink's Table & SQL APIs?What's new for Apache Flink's Table & SQL APIs?
What's new for Apache Flink's Table & SQL APIs?
Timo Walther
 
Real time Messages at Scale with Apache Kafka and Couchbase
Real time Messages at Scale with Apache Kafka and CouchbaseReal time Messages at Scale with Apache Kafka and Couchbase
Real time Messages at Scale with Apache Kafka and Couchbase
Will Gardella
 
Unlock cassandra data for application developers using graphQL
Unlock cassandra data for application developers using graphQLUnlock cassandra data for application developers using graphQL
Unlock cassandra data for application developers using graphQL
Cédrick Lunven
 
Why Airflow? & What's new in Airflow 2.3?
Why Airflow? & What's new in Airflow 2.3?Why Airflow? & What's new in Airflow 2.3?
Why Airflow? & What's new in Airflow 2.3?
Kaxil Naik
 
Explore Advanced CA Release Automation Configuration Topics
Explore Advanced CA Release Automation Configuration TopicsExplore Advanced CA Release Automation Configuration Topics
Explore Advanced CA Release Automation Configuration Topics
CA Technologies
 
Oracle Drivers configuration for High Availability, is it a developer's job?
Oracle Drivers configuration for High Availability, is it a developer's job?Oracle Drivers configuration for High Availability, is it a developer's job?
Oracle Drivers configuration for High Availability, is it a developer's job?
Ludovico Caldara
 
Starschema Products
Starschema ProductsStarschema Products
Starschema Products
Endre Adam
 
What's New in Oracle SQL Developer for 2018
What's New in Oracle SQL Developer for 2018What's New in Oracle SQL Developer for 2018
What's New in Oracle SQL Developer for 2018
Jeff Smith
 
An oss api layer for your cassandra
An oss api layer for your cassandraAn oss api layer for your cassandra
An oss api layer for your cassandra
Cédrick Lunven
 
HP Helion European Webinar Series ,Webinar #3
HP Helion European Webinar Series ,Webinar #3 HP Helion European Webinar Series ,Webinar #3
HP Helion European Webinar Series ,Webinar #3
BeMyApp
 
Overview SQL Server 2019
Overview SQL Server 2019Overview SQL Server 2019
Overview SQL Server 2019
Juan Fabian
 
DevCon5 (July 2014) - Acision SDK
DevCon5 (July 2014) - Acision SDKDevCon5 (July 2014) - Acision SDK
DevCon5 (July 2014) - Acision SDK
Crocodile WebRTC SDK and Cloud Signalling Network
 

Similar to Select Star: Unified Batch & Streaming with Flink SQL & Pulsar (20)

Select Star: Flink SQL for Pulsar Folks - Pulsar Summit NA 2021
Select Star: Flink SQL for Pulsar Folks - Pulsar Summit NA 2021Select Star: Flink SQL for Pulsar Folks - Pulsar Summit NA 2021
Select Star: Flink SQL for Pulsar Folks - Pulsar Summit NA 2021
 
Don't Cross the Streams! (or do, we got you)
Don't Cross the Streams! (or do, we got you)Don't Cross the Streams! (or do, we got you)
Don't Cross the Streams! (or do, we got you)
 
Better, Faster, Stronger Streaming: Your First Dive into Flink SQL
Better, Faster, Stronger Streaming: Your First Dive into Flink SQLBetter, Faster, Stronger Streaming: Your First Dive into Flink SQL
Better, Faster, Stronger Streaming: Your First Dive into Flink SQL
 
Flink sql for continuous sql etl apps & Apache NiFi devops
Flink sql for continuous sql etl apps & Apache NiFi devopsFlink sql for continuous sql etl apps & Apache NiFi devops
Flink sql for continuous sql etl apps & Apache NiFi devops
 
PartnerSkillUp_Enable a Streaming CDC Solution
PartnerSkillUp_Enable a Streaming CDC SolutionPartnerSkillUp_Enable a Streaming CDC Solution
PartnerSkillUp_Enable a Streaming CDC Solution
 
Google Cloud Dataflow
Google Cloud DataflowGoogle Cloud Dataflow
Google Cloud Dataflow
 
ApacheCon 2020 - Flink SQL in 2020: Time to show off!
ApacheCon 2020 - Flink SQL in 2020: Time to show off!ApacheCon 2020 - Flink SQL in 2020: Time to show off!
ApacheCon 2020 - Flink SQL in 2020: Time to show off!
 
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
 
What's new for Apache Flink's Table & SQL APIs?
What's new for Apache Flink's Table & SQL APIs?What's new for Apache Flink's Table & SQL APIs?
What's new for Apache Flink's Table & SQL APIs?
 
Real time Messages at Scale with Apache Kafka and Couchbase
Real time Messages at Scale with Apache Kafka and CouchbaseReal time Messages at Scale with Apache Kafka and Couchbase
Real time Messages at Scale with Apache Kafka and Couchbase
 
Unlock cassandra data for application developers using graphQL
Unlock cassandra data for application developers using graphQLUnlock cassandra data for application developers using graphQL
Unlock cassandra data for application developers using graphQL
 
Why Airflow? & What's new in Airflow 2.3?
Why Airflow? & What's new in Airflow 2.3?Why Airflow? & What's new in Airflow 2.3?
Why Airflow? & What's new in Airflow 2.3?
 
Explore Advanced CA Release Automation Configuration Topics
Explore Advanced CA Release Automation Configuration TopicsExplore Advanced CA Release Automation Configuration Topics
Explore Advanced CA Release Automation Configuration Topics
 
Oracle Drivers configuration for High Availability, is it a developer's job?
Oracle Drivers configuration for High Availability, is it a developer's job?Oracle Drivers configuration for High Availability, is it a developer's job?
Oracle Drivers configuration for High Availability, is it a developer's job?
 
Starschema Products
Starschema ProductsStarschema Products
Starschema Products
 
What's New in Oracle SQL Developer for 2018
What's New in Oracle SQL Developer for 2018What's New in Oracle SQL Developer for 2018
What's New in Oracle SQL Developer for 2018
 
An oss api layer for your cassandra
An oss api layer for your cassandraAn oss api layer for your cassandra
An oss api layer for your cassandra
 
HP Helion European Webinar Series ,Webinar #3
HP Helion European Webinar Series ,Webinar #3 HP Helion European Webinar Series ,Webinar #3
HP Helion European Webinar Series ,Webinar #3
 
Overview SQL Server 2019
Overview SQL Server 2019Overview SQL Server 2019
Overview SQL Server 2019
 
DevCon5 (July 2014) - Acision SDK
DevCon5 (July 2014) - Acision SDKDevCon5 (July 2014) - Acision SDK
DevCon5 (July 2014) - Acision SDK
 

Recently uploaded

8 Best Automated Android App Testing Tool and Framework in 2024.pdf
8 Best Automated Android App Testing Tool and Framework in 2024.pdf8 Best Automated Android App Testing Tool and Framework in 2024.pdf
8 Best Automated Android App Testing Tool and Framework in 2024.pdf
kalichargn70th171
 
Preparing Non - Technical Founders for Engaging a Tech Agency
Preparing Non - Technical Founders for Engaging  a  Tech AgencyPreparing Non - Technical Founders for Engaging  a  Tech Agency
Preparing Non - Technical Founders for Engaging a Tech Agency
ISH Technologies
 
Fundamentals of Programming and Language Processors
Fundamentals of Programming and Language ProcessorsFundamentals of Programming and Language Processors
Fundamentals of Programming and Language Processors
Rakesh Kumar R
 
Microservice Teams - How the cloud changes the way we work
Microservice Teams - How the cloud changes the way we workMicroservice Teams - How the cloud changes the way we work
Microservice Teams - How the cloud changes the way we work
Sven Peters
 
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
gapen1
 
KuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
KuberTENes Birthday Bash Guadalajara - Introducción a Argo CDKuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
KuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
rodomar2
 
GreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-JurisicGreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-Jurisic
Green Software Development
 
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
XfilesPro
 
Project Management: The Role of Project Dashboards.pdf
Project Management: The Role of Project Dashboards.pdfProject Management: The Role of Project Dashboards.pdf
Project Management: The Role of Project Dashboards.pdf
Karya Keeper
 
Using Query Store in Azure PostgreSQL to Understand Query Performance
Using Query Store in Azure PostgreSQL to Understand Query PerformanceUsing Query Store in Azure PostgreSQL to Understand Query Performance
Using Query Store in Azure PostgreSQL to Understand Query Performance
Grant Fritchey
 
E-commerce Development Services- Hornet Dynamics
E-commerce Development Services- Hornet DynamicsE-commerce Development Services- Hornet Dynamics
E-commerce Development Services- Hornet Dynamics
Hornet Dynamics
 
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian CompaniesE-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
Quickdice ERP
 
Mobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona InfotechMobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona Infotech
Drona Infotech
 
What next after learning python programming basics
What next after learning python programming basicsWhat next after learning python programming basics
What next after learning python programming basics
Rakesh Kumar R
 
Enums On Steroids - let's look at sealed classes !
Enums On Steroids - let's look at sealed classes !Enums On Steroids - let's look at sealed classes !
Enums On Steroids - let's look at sealed classes !
Marcin Chrost
 
J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...
J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...
J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...
Bert Jan Schrijver
 
Measures in SQL (SIGMOD 2024, Santiago, Chile)
Measures in SQL (SIGMOD 2024, Santiago, Chile)Measures in SQL (SIGMOD 2024, Santiago, Chile)
Measures in SQL (SIGMOD 2024, Santiago, Chile)
Julian Hyde
 
fiscal year variant fiscal year variant.
fiscal year variant fiscal year variant.fiscal year variant fiscal year variant.
fiscal year variant fiscal year variant.
AnkitaPandya11
 
All you need to know about Spring Boot and GraalVM
All you need to know about Spring Boot and GraalVMAll you need to know about Spring Boot and GraalVM
All you need to know about Spring Boot and GraalVM
Alina Yurenko
 
14 th Edition of International conference on computer vision
14 th Edition of International conference on computer vision14 th Edition of International conference on computer vision
14 th Edition of International conference on computer vision
ShulagnaSarkar2
 

Recently uploaded (20)

8 Best Automated Android App Testing Tool and Framework in 2024.pdf
8 Best Automated Android App Testing Tool and Framework in 2024.pdf8 Best Automated Android App Testing Tool and Framework in 2024.pdf
8 Best Automated Android App Testing Tool and Framework in 2024.pdf
 
Preparing Non - Technical Founders for Engaging a Tech Agency
Preparing Non - Technical Founders for Engaging  a  Tech AgencyPreparing Non - Technical Founders for Engaging  a  Tech Agency
Preparing Non - Technical Founders for Engaging a Tech Agency
 
Fundamentals of Programming and Language Processors
Fundamentals of Programming and Language ProcessorsFundamentals of Programming and Language Processors
Fundamentals of Programming and Language Processors
 
Microservice Teams - How the cloud changes the way we work
Microservice Teams - How the cloud changes the way we workMicroservice Teams - How the cloud changes the way we work
Microservice Teams - How the cloud changes the way we work
 
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
 
KuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
KuberTENes Birthday Bash Guadalajara - Introducción a Argo CDKuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
KuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
 
GreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-JurisicGreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-Jurisic
 
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
 
Project Management: The Role of Project Dashboards.pdf
Project Management: The Role of Project Dashboards.pdfProject Management: The Role of Project Dashboards.pdf
Project Management: The Role of Project Dashboards.pdf
 
Using Query Store in Azure PostgreSQL to Understand Query Performance
Using Query Store in Azure PostgreSQL to Understand Query PerformanceUsing Query Store in Azure PostgreSQL to Understand Query Performance
Using Query Store in Azure PostgreSQL to Understand Query Performance
 
E-commerce Development Services- Hornet Dynamics
E-commerce Development Services- Hornet DynamicsE-commerce Development Services- Hornet Dynamics
E-commerce Development Services- Hornet Dynamics
 
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian CompaniesE-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
 
Mobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona InfotechMobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona Infotech
 
What next after learning python programming basics
What next after learning python programming basicsWhat next after learning python programming basics
What next after learning python programming basics
 
Enums On Steroids - let's look at sealed classes !
Enums On Steroids - let's look at sealed classes !Enums On Steroids - let's look at sealed classes !
Enums On Steroids - let's look at sealed classes !
 
J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...
J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...
J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...
 
Measures in SQL (SIGMOD 2024, Santiago, Chile)
Measures in SQL (SIGMOD 2024, Santiago, Chile)Measures in SQL (SIGMOD 2024, Santiago, Chile)
Measures in SQL (SIGMOD 2024, Santiago, Chile)
 
fiscal year variant fiscal year variant.
fiscal year variant fiscal year variant.fiscal year variant fiscal year variant.
fiscal year variant fiscal year variant.
 
All you need to know about Spring Boot and GraalVM
All you need to know about Spring Boot and GraalVMAll you need to know about Spring Boot and GraalVM
All you need to know about Spring Boot and GraalVM
 
14 th Edition of International conference on computer vision
14 th Edition of International conference on computer vision14 th Edition of International conference on computer vision
14 th Edition of International conference on computer vision
 

Select Star: Unified Batch & Streaming with Flink SQL & Pulsar

  • 2. © 2020 Ververica 2 2 ● Caito Scherr Introduction
  • 3. © 2020 Ververica 3 3 ● Caito Scherr Introduction 3 ● Caito Scherr ● Developer Advocate
  • 4. © 2020 Ververica 4 ● Caito Scherr ● Developer Advocate ● Ververica, GmbH Introduction
  • 5. © 2020 Ververica 5 ● Caito Scherr ● Developer Advocate ● Ververica, GmbH ● Portland, OR, USA Introduction
  • 7. © 2020 Ververica 7 Demo credit: Marta Paes Introduction
  • 8. © 2020 Ververica 8 Agenda ● Pulsar + Flink ● Where SQL comes in ● Demo: Pulsar + Flink SQL
  • 9. © 2020 Ververica 9 ● Pulsar + Flink ● Where SQL comes in ● Demo: Pulsar + Flink SQL Agenda
  • 10. © 2020 Ververica 10 Agenda ● Pulsar + Flink ● Where SQL comes in ● Demo: Pulsar + Flink SQL
  • 11. © 2020 Ververica 11 >> What is Flink? Pulsar + Flink ● Stateful ● Stream processing engine ● Unified batch & streaming
  • 12. © 2020 Ververica 12 >> What is Flink? Pulsar + Flink
  • 13. © 2020 Ververica 13 >> What is Flink? Pulsar + Flink
  • 14. © 2020 Ververica 14 >> Why Pulsar + Flink? Pulsar + Flink “Batch as a special case of streaming” “Stream as a unified view on data”
  • 15. © 2020 Ververica 15 >> Pulsar: Unified Storage ● Pub/Sub messaging layer (Streaming) ● Durable storage layer (Batch) Pulsar + Flink
  • 16. © 2020 Ververica 16 now bounded query unbounded query past future bounded query start of the stream unbounded query >> Flink: Unified Processing ● Reuse code and logic ● Consistent semantics ● Simplify operations ● Mix historic and real-time ● Pub/Sub messaging layer (Stream) ● Durable storage layer (Batch) Pulsar + Flink
  • 17. © 2020 Ververica 17 Unified Processing Engine (Batch / Streaming) Unified Storage (Segments / Pub/Sub) >> A Unified Data Stack Pulsar + Flink
  • 18. © 2020 Ververica 18 Flink 1.6+ 2018 Streaming Source/Sink Connectors Table Sink Connector >> Pulsar + Flink History Pulsar + Flink
  • 19. © 2020 Ververica 19 Flink 1.6+ 2018 Streaming Source/Sink Connectors Table Sink Connector >> Pulsar + Flink History Pulsar + Flink Flink 1.9+ Pulsar Schema + Flink Catalog Table API/SQL as 1st class citizens Exactly-once Source At-least once Sink
  • 20. © 2020 Ververica 20 Flink 1.6+ 2018 Streaming Source/Sink Connectors Table Sink Connector >> Pulsar + Flink History Pulsar + Flink Flink 1.9+ Pulsar Schema + Flink Catalog Table API/SQL as 1st class citizens Exactly-once Source At-least once Sink Flink 1.12 Upserts DDL Computed Columns, Watermarks. Metadata End-to-end Exactly-once Key-shared Subscription Model
  • 21. © 2020 Ververica 21 Flink Runtime Stateful Computations over Data Streams Stateful Stream Processing Streams, State, Time Event-Driven Applications Stateful Functions Streaming Analytics & ML SQL, PyFlink, Tables >> Why Flink SQL? Pulsar + Flink
  • 22. © 2020 Ververica 22 >> Why Flink SQL? ● Focus on business logic, not implementation ● Mixed workloads (batch + streaming) ● Maximize developer speed and autonomy ML Feature Generation Unified Online/Offline Model Training E2E Streaming Analytics Pipelines Pulsar + Flink
  • 23. © 2020 Ververica 23 user cnt Mary 2 Bob 1 SELECT user_id, COUNT(url) AS cnt FROM clicks GROUP BY user_id; Take a snapshot when the query starts A final result is produced A row that was added after the query was started is not considered user cTime url Mary 12:00:00 https://… Bob 12:00:00 https://… Mary 12:00:02 https://… Liz 12:00:03 https://… The query terminates Where SQL Fits In >> A Regular SQL Engine
  • 24. © 2020 Ververica 24 user cTime url user cnt SELECT user_id, COUNT(url) AS cnt FROM clicks GROUP BY user_id; Mary 12:00:00 https://… Bob 12:00:00 https://… Mary 12:00:02 https://… Liz 12:00:03 https://… Bob 1 Liz 1 Mary 1 Mary 2 Ingest all changes as they happen Continuously update the result The result is identical to the one-time query (at this point) Where SQL Fits In >> A Streaming SQL Engine
  • 25. © 2020 Ververica 25 ● Standard SQL syntax and semantics (i.e. not a “SQL-flavor”) ● Unified APIs for batch and streaming ● Support for advanced time handling and operations (e.g. CDC, pattern matching) UDF Support Python Java Scala Execution TPC-DS Coverage Batch Streaming + Formats Native Connectors Apache Kafka Elasticsearch FileSystems JDBC HBase + Kinesis Metastore Postgres (JDBC) Data Catalogs Debezium Where SQL Fits In >> Flink SQL In A Nutshell
  • 26. © 2020 Ververica 26 >> 1a. Twitter Firehose Demo
  • 28. © 2020 Ververica 28 Demo >> 2. SQL Client + Pulsar CREATE CATALOG pulsar WITH ( 'type' = 'pulsar', 'service-url' = 'pulsar://pulsar:6650', 'admin-url' = 'http://pulsar:8080', 'format' = 'json' ); Catalog DDL
  • 29. © 2020 Ververica 29 Not cool. 👹 Demo
  • 30. © 2020 Ververica 30 Demo CREATE TABLE pulsar_tweets ( publishTime TIMESTAMP(3) METADATA, WATERMARK FOR publishTime AS publishTime - INTERVAL '5' SECOND ) WITH ( 'connector' = 'pulsar', 'topic' = 'persistent://public/default/tweets', 'value.format' = 'json', 'service-url' = 'pulsar://pulsar:6650', 'admin-url' = 'http://pulsar:8080', 'scan.startup.mode' = 'earliest-offset' ) LIKE tweets; Derive schema from the original topic Define the source connector (Pulsar) Read and use Pulsar message metadata >> 3. Get relevant timestamp
  • 31. © 2020 Ververica 31 Demo >> 4. Windowed aggregation CREATE TABLE pulsar_tweets_agg ( tmstmp TIMESTAMP(3), tweet_cnt BIGINT ) WITH ( 'connector'='pulsar', 'topic'='persistent://public/default/tweets_agg', 'value.format'='json', 'service-url'='pulsar://pulsar:6650', 'admin-url'='http://pulsar:8080' ); Sink Table DDL INSERT INTO pulsar_tweets_agg SELECT TUMBLE_START(publishTime, INTERVAL '10' SECOND) AS wStart, COUNT(id) AS tweet_cnt FROM pulsar_tweets GROUP BY TUMBLE(publishTime, INTERVAL '10' SECOND); Continuous SQL Query
  • 32. © 2020 Ververica 32 Demo >> 5. Tweet count in windows
  • 33. © 2020 Ververica 33 What Next? >> Flink SQL Cookbook
  • 34. © 2020 Ververica Resources ● Flink Ahead: What Comes After Batch & Streaming: https://youtu.be/h5OYmy9Yx7Y ● Apache Pulsar as one Storage System for Real Time & Historical Data Analysis: https://medium.com/streamnative/apache-pulsar-as-one-storage-455222c59017 ● Flink Table API & SQL: https://ci.apache.org/projects/flink/flink-docs-master/dev/table/sql/queries.html#operatio ns ● Flink SQL Cookbook: https://github.com/ververica/flink-sql-cookbook ● When Flink & Pulsar Come Together: https://flink.apache.org/2019/05/03/pulsar-flink.html ● How to Query Pulsar Streams in Flink: https://flink.apache.org/news/2019/11/25/query-pulsar-streams-using-apache-flink.html ● What’s New in the Flink/Pulsar Connector: ● https://flink.apache.org/2021/01/07/pulsar-flink-connector-270.html ● Marta’s Demo: https://github.com/morsapaes/flink-sql-pulsar 34 @Caito_200_OK
  • 35. © 2020 Ververica ● Pulsar Conference staff!! ● Marta Paes 35 Thank You! @Caito_200_OK Scan here for links & resources