SlideShare a Scribd company logo
1 of 53
Download to read offline
Continuous SQL with Kafka and
Flink
Tim Spann
Principal Developer Advocate
February 20, 2024
https://www.meetup.com/dba-fundamentals-group/events/296855261/
3
Tim Spann
Twitter: @PaasDev // Blog: datainmotion.dev
Principal Developer Advocate.
Princeton Future of Data Meetup.
ex-Pivotal, ex-Hortonworks, ex-StreamNative, ex-PwC, ex-HPE
https://medium.com/@tspann
https://github.com/tspannhw
4
FLaNK Stack Weekly by Tim Spann
This week in Apache NiFi, Apache Flink,
Apache Kafka, ML, AI, Apache Spark, Apache
Iceberg, Python, Java and Open Source
friends.
https://bit.ly/32dAJft
https://www.meetup.com/futureofdata-
princeton/
© 2023 Cloudera, Inc. All rights reserved. 5
Future of Data - NYC + NJ + Philly + Virtual
@PaasDev
https://www.meetup.com/futureofdata-princeton/
From Big Data to AI to Streaming to Containers to
Cloud to Analytics to Cloud Storage to Fast Data to
Machine Learning to Microservices to ...
6
AGENDA
Introduction
Overview
Streaming Projects
Streaming Analytics
Demos
Resources
Q&A
FLANK
© 2023 Cloudera, Inc. All rights reserved. 8
BUILDING REAL-TIME REQUIRES A TEAM
© 2023 Cloudera, Inc. All rights reserved. 9
Already using Spark? Need NiFi? Need Flink?
Want unified Batch/Stream?
Want highest Throughput?
Don’t need low latency?
Large files?
Scheduled batches?
Replacing Sqoop, ETL
Simple JDBC queries?
Transform individual records?
Want easy development?
Lots of small files, events, records, rows?
Continuous stream of rows
Support many different sources
Need Microservices, Batch and Stream?
Want high Throughput?
Want Low Latency?
Want Advanced Windowing and State?
Happy with a New Solution that is
best-in-class?
Spark, NiFi, Flink? Which engine to choose?
10
Live Q&A
Travel Advisories
Weather Reports
Documents
Social Media
Databases
Transactions
Public Data Feeds
S3 / Files
Logs
ATM Data
Live Chat
…
HYBRID CLOUD
INTERACT
COLLECT STORE
ENRICH, REPORT
Distribute
Collect
Report
REPORT
Visualize
Report, Automate
AI BASED ENHANCEMENTS
Predict, Automate
VECTOR DATABASE
LLM
Machine
Learning
Data
Visualization
Data Flow
Data
Warehouse
SQL
Stream Builder
Data
Visualization
Input Sentences
Generated Text
Timestamp
Input Sentence
Timestamps
Enrichments
Messaging
Broker
Real-time alerting
Real-time alerting
Aggregations
EXAMPLES
12
Analytics-in-Stream
Data Sources Streaming Storage
Substrate
Cloudera Stream Processing
Kafka + NiFi enables
real-time ingestion into
lakes / analytics services
Data Distribution
Service
Cloudera DataFlow
Warehouses & Operational DB
Data Lakes & Lake Houses
Data-At-Rest Analytics
Data Apps Powered by
Streaming Insights and used
by other Analytics Services
Kafka + Flink
enables streaming
analytics
Cloudera Stream Processing
Streaming
Analytics
Low Latency
Data Products
Data-In-Motion Streaming Analytics
Cloudera Edge Flow
Edge Ingest
13
Flink
Connectors
Streaming Data Pipelines with Cloudera Data Platform
Edge
devices
Cloud DB’s
SaaS tools
Change Data
Capture
On prem
apps/DB’s
Streams
Any
destination
Custom
Kafka
producers
Custom
Kafka
consumers
Operational
Applications
Analytical
Applications
STREAMING DATA MOVEMENT & PROCESSING STREAMING DATA PROCESSING & ANALYTICS
16
APACHE KAFKA
© 2023 Cloudera, Inc. All rights reserved. 20
What is Can You Do With Apache Kafka?
Web site activity: track page views, searches, etc. in real time
Events & log aggregation: particularly in distributed systems where messages
come from multiple sources
Monitoring and metrics: aggregate statistics from distributed applications and
build a dashboard application
Stream processing: process raw data, clean it up, and forward it on to another
topic or messaging system
Real-time data ingestion: fast processing of a very large volume of messages
© 2019 Cloudera, Inc. All rights reserved. 21
STREAMS MESSAGING WITH KAFKA
• Highly reliable distributed messaging system.
• Decouple applications, enables many-to-many
patterns.
• Publish-Subscribe semantics.
• Horizontal scalability.
• Efficient implementation to operate at speed with big
data volumes.
• Organized by topic to support several use cases.
APACHE FLINK
23
CONTINUOUS SQL
● SSB is a Continuous SQL engine
● It’s SQL, but a slightly different mental model, but with big implications
Traditional Parse/Execute/Fetch model Continuous SQL Model
Hint: The query is boundless and never finishes, and time matters
AKA: SELECT * FROM foo WHERE 1=0 -- will run forever
24
Flink SQL
-- specify Kafka partition key on output
SELECT foo AS _eventKey FROM sensors
-- use event time timestamp from kafka
-- exactly once compatible
SELECT eventTimestamp FROM sensors
-- nested structures access
SELECT foo.’bar’ FROM table; -- must quote nested
column
-- timestamps
SELECT * FROM payments
WHERE eventTimestamp > CURRENT_TIMESTAMP-interval
'10' second;
-- unnest
SELECT b.*, u.*
FROM bgp_avro b,
UNNEST(b.path) AS u(pathitem)
-- aggregations and windows
SELECT card,
MAX(amount) as theamount,
TUMBLE_END(eventTimestamp, interval '5' minute) as
ts
FROM payments
WHERE lat IS NOT NULL
AND lon IS NOT NULL
GROUP BY card,
TUMBLE(eventTimestamp, interval '5' minute)
HAVING COUNT(*) > 4 -- >4==fraud
-- try to do this ksql!
SELECT us_west.user_score+ap_south.user_score
FROM kafka_in_zone_us_west us_west
FULL OUTER JOIN kafka_in_zone_ap_south ap_south
ON us_west.user_id = ap_south.user_id;
Key Takeaway: Rich SQL grammar with advanced time and aggregation tools
25
© 2022 Cloudera, Inc. All rights reserved.
SQL STREAM BUILDER (SSB)
SQL STREAM BUILDER allows
developers, analysts, and data
scientists to write streaming
applications with industry
standard SQL.
No Java or Scala code
development required.
Simplifies access to data in Kafka
& Flink. Connectors to batch data in
HDFS, Kudu, Hive, S3, JDBC, CDC
and more
Enrich streaming data with batch
data in a single tool
Democratize access to real-time data with just SQL
26
SSB MATERIALIZED VIEWS
Key Takeaway; MV’s allow data scientist, analyst and developers consume data from the firehose
© 2019 Cloudera, Inc. All rights reserved. 27
ICEBERG INTEGRATION
Robust Next Generation Architecture for Data Driven Business
Unified Processing Engine Massive Open table format
Iceberg Support for Flink APIs through SSB
• Maximally open
• Maximally flexible
• Ultra high performance for MASSIVE data
DEMO AND CODE
29
Continuous SQL
select max(alt_baro) as MaxAltitudeFeet, min(alt_baro) as MinAltitudeFeet, avg(alt_baro) as AvgAltitudeFeet,
max(alt_geom) as MaxGAltitudeFeet, min(alt_geom) as MinGAltitudeFeet, avg(alt_geom) as AvgGAltitudeFeet,
max(gs) as MaxGroundSpeed, min(gs) as MinGroundSpeed, avg(gs) as AvgGroundSpeed,
count(alt_baro) as RowCount,
hex as ICAO, flight as IDENT
from `sr1`.`default_database`.`adsb`
group by flight, hex;
select transcom.title, transcom.description, mta.VehicleRef,
DISTANCE_BETWEEN(CAST(transcom.latitude as STRING), CAST(transcom.latitude as STRING), mta.VehicleLocationLatitude, mta.VehicleLocationLongitude) as miles,
mta.StopPointName, mta.Bearing, mta.DestinationName, mta.ExpectedArrivalTime, mta.VehicleLocationLatitude, mta.VehicleLocationLongitude,
mta.ArrivalProximityText, mta.DistanceFromStop, mta.AimedArrivalTime, mta.`Date`, mta.ts, mta.uuid, mta.EstimatedPassengerCapacity, mta.EstimatedPassengerCount
from `schemareg1`.`default_database`.`mta` /*+ OPTIONS('scan.startup.mode' = 'earliest-offset') */ mta
FULL OUTER JOIN `schemareg1`.`default_database`.`transcom` /*+ OPTIONS('scan.startup.mode' = 'earliest-offset') */ transcom
ON (transcom.latitude >= CAST(mta.VehicleLocationLatitude as float) - 0.3)
AND (transcom.longitude >= CAST(mta.VehicleLocationLongitude as float) - 0.3)
AND (transcom.latitude <= CAST(mta.VehicleLocationLatitude as float) + 0.3)
AND (transcom.longitude <= CAST(mta.VehicleLocationLongitude as float) + 0.3)
WHERE mta.VehicleRef is not null
AND transcom.title is not null
AND DISTANCE_BETWEEN(CAST(transcom.latitude as STRING), CAST(transcom.latitude as STRING), mta.VehicleLocationLatitude, mta.VehicleLocationLongitude) <= 120
30
Real-time observability pipeline
Minfi agents
Raw Logs
Cloudera Data Flow
Cloudera Data
Lakehouse
Triaging ML Models
Threat Hunting
Response and
Investigation
UEBA/Fraud
Detection
Reports
Auto Action
Cloudera Streaming
Analytics
Cybersec
Toolkits
Parse, Triage, Profile
Cloudera Streams
Processing
Kafka
SQL Stream
Builder
SPLUNK / SIEM
/EXTERNAL
Cloudera Machine Learning
Collect Route/Filter/
Transform
Prepare/Analyze/
Alert
33
Data in Motion: Overview e Novidades do NiFi, Kafka e Flink
Apresentador: Tim Spann - Principal DIM Specialist and Developer Advocate
https://medium.com/cloudera-inc/transit-in-sao-paulo-brasil-flank-style-eaec6753cc63
34
35
36
CDC ENGINE SELECTION
HOW TO DO IT?
© 2023 Cloudera, Inc. All rights reserved. 38
Already using Kafka? Already using NiFi? Need for Fast Flink?
Simple setup for many tables
Want metadata augmented data
Don’t need low latency?
Visual monitoring
Easy manual scaling
Easy to combine with NiFi
Debezium
Simple JDBC queries?
Transform individual records?
Want easy development with UI?
Lots of small files, events, records, rows?
Continuous stream of rows
Support many different sources
Debezium coming
Strong control of table and joins
Want high Throughput?
Want Low Latency?
Want Advanced Windowing and State?
Automatic records immediately
Pure SQL
Debezium
Kafka Connect, NiFi, Flink? Which engine to choose? Or All 3?
CDC ARCHITECTURE - Using FLaNK to pull the data out of anything in near-real time
INGEST PREPARE PUBLISH
DATA SOURCES
Internal Users
(After Sales)
External
Systems
ENTERPRISE
LAKEHOUSE
CAPABILITY VIEW
INGESTION
ENTERPRISE DATA
MESSAGE HUB
STORAGE
BATCH
MANAGEMENT
STREAM
CONSUMPTION
Closed Loop
Systems
SQL Stream Builder
Machine Learning
Data Visualization
Workload Manager
watsonx.data
40
Data Distribution as a Universal, Hybrid, Multi-Cloud Data Service
Universal Data Distribution Service
(Ingest, Transform, Deliver)
Ingest
Processors
Ingest Gateway
Router, Filter &
Transform Processors
Destination
Processors
Cloud Business Process Services*
Log Data Sources
Laptops /
Servers Security
Agents
IOT Devices App Logs
Mobile Apps
Cloud Data Analytics/ Service *
On-Prem Data Sources Cloud Warehouse
(Cloudera DW)
Big Data Cloud Services
Multi-Cloud Data Distribution Service that Solves the First & Last Mile Problem for the Modern Data Stack
CDC with SQL Stream Builder
(Flink SQL)
© 2023 Cloudera, Inc. All rights reserved. 42
Streaming CDC with Cloudera SQL Stream Builder (Flink SQL)
https://github.com/tspannhw/FLaNK-CDC/blob/main/flinkcdc.MD
© 2023 Cloudera, Inc. All rights reserved. 43
https://docs.cloudera.com/csa/1.10.0/how-to-ssb/topics/csa-ssb-cdc-connectors.html
CDC with Debezium and Flink
SQL Stream Builder with Flink SQL
© 2023 Cloudera, Inc. All rights reserved. 44
CDC with Debezium and Flink
SQL Stream Builder with Flink SQL
© 2023 Cloudera, Inc. All rights reserved. 45
© 2023 Cloudera, Inc. All rights reserved. 46
CREATE TABLE `postgres_cdc_newjerseybus` (
`title` STRING,
`description` STRING,
`link` STRING,
`guid` STRING,
`advisoryAlert` STRING,
`pubDate` STRING,
`ts` STRING,
`companyname` STRING,
`uuid` STRING,
`servicename` STRING
) WITH (
'connector' = 'postgres-cdc',
'database-name' = 'tspann',
'hostname' = '192.168.1.153',
'password' = 'tspann',
'decoding.plugin.name' = 'pgoutput',
'schema-name' = 'public',
'table-name' = 'newjerseybus',
'username' = 'tspann',
'port' = '5432'
);
Flink SQL Tables - Debezium CDC From Database Tables
© 2023 Cloudera, Inc. All rights reserved. 47
Flink SQL Tables - Upsert to Kafka Topics
CREATE TABLE `upsert_kafka_newjerseybus` (
`title` String,
`description` String,
`link` String,
`guid` String,
`advisoryAlert` String,
`pubDate` String,
`ts` String,
`companyname` String,
`uuid` String,
`servicename` String,
`eventTimestamp` TIMESTAMP(3),
WATERMARK FOR `eventTimestamp` AS `eventTimestamp` - INTERVAL '5' SECOND,
PRIMARY KEY (uuid) NOT ENFORCED
) WITH (
'connector' = 'upsert-kafka',
'topic' = 'kafka_newjerseybus',
'properties.bootstrap.servers' = 'kafka:9092',
'key.format' = 'json',
'value.format' = 'json'
);
RESOURCES/WRAP-UP
https://github.com/tspannhw/FLaNK-Transit
SELECT n.speed, n.travel_time, n.borough, n.link_name, n.link_points,
n.latitude, n.longitude, DISTANCE_BETWEEN(CAST(t.latitude as STRING),
CAST(t.latitude as STRING),
m.VehicleLocationLatitude, m.VehicleLocationLongitude) as miles,
t.title, t.`description`, t.pubDate, t.latitude, t.longitude,
m.VehicleLocationLatitude, m.VehicleLocationLongitude,
m.StopPointRef, m.VehicleRef,
m.ProgressRate, m.ExpectedDepartureTime, m.StopPoint,
m.VisitNumber, m.DataFrameRef, m.StopPointName,
m.Bearing, m.OriginAimedDepartureTime, m.OperatorRef,
m.DestinationName, m.ExpectedArrivalTime, m.BlockRef,
m.LineRef, m.DirectionRef, m.ArrivalProximityText,
m.DistanceFromStop, m.EstimatedPassengerCapacity,
m.AimedArrivalTime, m.PublishedLineName,
m.ProgressStatus, m.DestinationRef, m.EstimatedPassengerCount,
m.OriginRef, m.NumberOfStopsAway, m.ts
FROM jsonmta /*+ OPTIONS('scan.startup.mode' = 'earliest-offset') */ m
FULL OUTER JOIN jsontranscom /*+ OPTIONS('scan.startup.mode' = 'earliest-offset') */ t
ON (t.latitude >= CAST(m.VehicleLocationLatitude as float) - 0.3)
AND (t.longitude >= CAST(m.VehicleLocationLongitude as float) - 0.3)
AND (t.latitude <= CAST(m.VehicleLocationLatitude as float) + 0.3)
AND (t.longitude <= CAST(m.VehicleLocationLongitude as float) + 0.3)
FULL OUTER JOIN nytrafficspeed /*+ OPTIONS('scan.startup.mode' = 'earliest-offset') */ n
ON (n.latitude >= CAST(m.VehicleLocationLatitude as float) - 0.3)
AND (n.longitude >= CAST(m.VehicleLocationLongitude as float) - 0.3)
AND (n.latitude <= CAST(m.VehicleLocationLatitude as float) + 0.3)
AND (n.longitude <= CAST(m.VehicleLocationLongitude as float) + 0.3)
WHERE m.VehicleRef is not null
AND t.title is not null
https://medium.com/@tspann/cdc-not-cat-data-capture-e43713879c03
FLaNK for Halifax Canada Transit —
NiFi, Kafka, Flink, SQL, GTFS-RT | by
Tim Spann | Cloudera | Dec, 2023 |
Medium
Never Get Lost in the Stream.
NiFi-Kafka-Flink for getting to work… |
by Tim Spann | Cloudera | Dec, 2023 |
Medium
Iteration 1: Building a System to
Consume All the Real-Time Transit
Data in the World At Once | by Tim
Spann | Cloudera | Medium
Watching Airport Traffic in Real-Time
| by Tim Spann | Cloudera | Medium
52
Resources
https://medium.com/@tspann/llm-pipelines-with-pinecone-and-hu
ggingface-with-python-and-apache-nifi-a96c20be93b7
53
TH N Y U

More Related Content

What's hot

The Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and ContainersThe Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and ContainersSATOSHI TAGOMORI
 
Quarkus - a next-generation Kubernetes Native Java framework
Quarkus - a next-generation Kubernetes Native Java frameworkQuarkus - a next-generation Kubernetes Native Java framework
Quarkus - a next-generation Kubernetes Native Java frameworkSVDevOps
 
Spring boot Under Da Hood
Spring boot Under Da HoodSpring boot Under Da Hood
Spring boot Under Da HoodMichel Schudel
 
Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...
Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...
Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...Timothy Spann
 
SLF4J (Simple Logging Facade for Java)
SLF4J (Simple Logging Facade for Java)SLF4J (Simple Logging Facade for Java)
SLF4J (Simple Logging Facade for Java)Guo Albert
 
Real Life Clean Architecture
Real Life Clean ArchitectureReal Life Clean Architecture
Real Life Clean ArchitectureMattia Battiston
 
Building Modern APIs with GraphQL
Building Modern APIs with GraphQLBuilding Modern APIs with GraphQL
Building Modern APIs with GraphQLAmazon Web Services
 
Secrets of Performance Tuning Java on Kubernetes
Secrets of Performance Tuning Java on KubernetesSecrets of Performance Tuning Java on Kubernetes
Secrets of Performance Tuning Java on KubernetesBruno Borges
 
FIWARE: Managing Context Information at large scale
FIWARE: Managing Context Information at large scaleFIWARE: Managing Context Information at large scale
FIWARE: Managing Context Information at large scaleFermin Galan
 
Getting Started with Apache Spark on Kubernetes
Getting Started with Apache Spark on KubernetesGetting Started with Apache Spark on Kubernetes
Getting Started with Apache Spark on KubernetesDatabricks
 
Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...Flink Forward
 
Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explainedconfluent
 
How to test infrastructure code: automated testing for Terraform, Kubernetes,...
How to test infrastructure code: automated testing for Terraform, Kubernetes,...How to test infrastructure code: automated testing for Terraform, Kubernetes,...
How to test infrastructure code: automated testing for Terraform, Kubernetes,...Yevgeniy Brikman
 
Graphql presentation
Graphql presentationGraphql presentation
Graphql presentationVibhor Grover
 
Discover Quarkus and GraalVM
Discover Quarkus and GraalVMDiscover Quarkus and GraalVM
Discover Quarkus and GraalVMRomain Schlick
 
YugaByte DB Internals - Storage Engine and Transactions
YugaByte DB Internals - Storage Engine and Transactions YugaByte DB Internals - Storage Engine and Transactions
YugaByte DB Internals - Storage Engine and Transactions Yugabyte
 

What's hot (20)

The Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and ContainersThe Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and Containers
 
Quarkus - a next-generation Kubernetes Native Java framework
Quarkus - a next-generation Kubernetes Native Java frameworkQuarkus - a next-generation Kubernetes Native Java framework
Quarkus - a next-generation Kubernetes Native Java framework
 
Spring boot Under Da Hood
Spring boot Under Da HoodSpring boot Under Da Hood
Spring boot Under Da Hood
 
Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...
Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...
Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...
 
SLF4J (Simple Logging Facade for Java)
SLF4J (Simple Logging Facade for Java)SLF4J (Simple Logging Facade for Java)
SLF4J (Simple Logging Facade for Java)
 
Real Life Clean Architecture
Real Life Clean ArchitectureReal Life Clean Architecture
Real Life Clean Architecture
 
Building Modern APIs with GraphQL
Building Modern APIs with GraphQLBuilding Modern APIs with GraphQL
Building Modern APIs with GraphQL
 
Secrets of Performance Tuning Java on Kubernetes
Secrets of Performance Tuning Java on KubernetesSecrets of Performance Tuning Java on Kubernetes
Secrets of Performance Tuning Java on Kubernetes
 
FIWARE: Managing Context Information at large scale
FIWARE: Managing Context Information at large scaleFIWARE: Managing Context Information at large scale
FIWARE: Managing Context Information at large scale
 
Getting Started with Apache Spark on Kubernetes
Getting Started with Apache Spark on KubernetesGetting Started with Apache Spark on Kubernetes
Getting Started with Apache Spark on Kubernetes
 
Introduction to GraphQL
Introduction to GraphQLIntroduction to GraphQL
Introduction to GraphQL
 
Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...
 
Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explained
 
How to test infrastructure code: automated testing for Terraform, Kubernetes,...
How to test infrastructure code: automated testing for Terraform, Kubernetes,...How to test infrastructure code: automated testing for Terraform, Kubernetes,...
How to test infrastructure code: automated testing for Terraform, Kubernetes,...
 
Graphql presentation
Graphql presentationGraphql presentation
Graphql presentation
 
Spring Security 5
Spring Security 5Spring Security 5
Spring Security 5
 
FLiP Into Trino
FLiP Into TrinoFLiP Into Trino
FLiP Into Trino
 
Discover Quarkus and GraalVM
Discover Quarkus and GraalVMDiscover Quarkus and GraalVM
Discover Quarkus and GraalVM
 
YugaByte DB Internals - Storage Engine and Transactions
YugaByte DB Internals - Storage Engine and Transactions YugaByte DB Internals - Storage Engine and Transactions
YugaByte DB Internals - Storage Engine and Transactions
 
Flink vs. Spark
Flink vs. SparkFlink vs. Spark
Flink vs. Spark
 

Similar to DBA Fundamentals Group: Continuous SQL with Kafka and Flink

Meetup: Streaming Data Pipeline Development
Meetup:  Streaming Data Pipeline DevelopmentMeetup:  Streaming Data Pipeline Development
Meetup: Streaming Data Pipeline DevelopmentTimothy Spann
 
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...Timothy Spann
 
big data fest building modern data streaming apps
big data fest building modern data streaming appsbig data fest building modern data streaming apps
big data fest building modern data streaming appsTimothy Spann
 
BigDataFest_ Building Modern Data Streaming Apps
BigDataFest_  Building Modern Data Streaming AppsBigDataFest_  Building Modern Data Streaming Apps
BigDataFest_ Building Modern Data Streaming Appsssuser73434e
 
JConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and FlinkJConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and FlinkTimothy Spann
 
RTAS 2023: Building a Real-Time IoT Application
RTAS 2023:  Building a Real-Time IoT ApplicationRTAS 2023:  Building a Real-Time IoT Application
RTAS 2023: Building a Real-Time IoT ApplicationTimothy Spann
 
Unconference Round Table Notes
Unconference Round Table NotesUnconference Round Table Notes
Unconference Round Table NotesTimothy Spann
 
BigDataFest Building Modern Data Streaming Apps
BigDataFest  Building Modern Data Streaming AppsBigDataFest  Building Modern Data Streaming Apps
BigDataFest Building Modern Data Streaming Appsssuser73434e
 
GSJUG: Mastering Data Streaming Pipelines 09May2023
GSJUG: Mastering Data Streaming Pipelines 09May2023GSJUG: Mastering Data Streaming Pipelines 09May2023
GSJUG: Mastering Data Streaming Pipelines 09May2023Timothy Spann
 
PartnerSkillUp_Enable a Streaming CDC Solution
PartnerSkillUp_Enable a Streaming CDC SolutionPartnerSkillUp_Enable a Streaming CDC Solution
PartnerSkillUp_Enable a Streaming CDC SolutionTimothy Spann
 
Building Real-time Pipelines with FLaNK_ A Case Study with Transit Data
Building Real-time Pipelines with FLaNK_ A Case Study with Transit DataBuilding Real-time Pipelines with FLaNK_ A Case Study with Transit Data
Building Real-time Pipelines with FLaNK_ A Case Study with Transit DataTimothy Spann
 
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...Timothy Spann
 
Introduction to apache kafka, confluent and why they matter
Introduction to apache kafka, confluent and why they matterIntroduction to apache kafka, confluent and why they matter
Introduction to apache kafka, confluent and why they matterPaolo Castagna
 
Meetup Streaming Data Pipeline Development
Meetup Streaming Data Pipeline DevelopmentMeetup Streaming Data Pipeline Development
Meetup Streaming Data Pipeline DevelopmentTimothy Spann
 
Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023
Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023
Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023ssuser73434e
 
Building Real-Time Travel Alerts
Building Real-Time Travel AlertsBuilding Real-Time Travel Alerts
Building Real-Time Travel AlertsTimothy Spann
 
OSSNA Building Modern Data Streaming Apps
OSSNA Building Modern Data Streaming AppsOSSNA Building Modern Data Streaming Apps
OSSNA Building Modern Data Streaming AppsTimothy Spann
 
NoLambda: Combining Streaming, Ad-Hoc, Machine Learning and Batch Analysis
NoLambda: Combining Streaming, Ad-Hoc, Machine Learning and Batch AnalysisNoLambda: Combining Streaming, Ad-Hoc, Machine Learning and Batch Analysis
NoLambda: Combining Streaming, Ad-Hoc, Machine Learning and Batch AnalysisHelena Edelson
 
The Never Landing Stream with HTAP and Streaming
The Never Landing Stream with HTAP and StreamingThe Never Landing Stream with HTAP and Streaming
The Never Landing Stream with HTAP and StreamingTimothy Spann
 
CoC23_Utilizing Real-Time Transit Data for Travel Optimization
CoC23_Utilizing Real-Time Transit Data for Travel OptimizationCoC23_Utilizing Real-Time Transit Data for Travel Optimization
CoC23_Utilizing Real-Time Transit Data for Travel OptimizationTimothy Spann
 

Similar to DBA Fundamentals Group: Continuous SQL with Kafka and Flink (20)

Meetup: Streaming Data Pipeline Development
Meetup:  Streaming Data Pipeline DevelopmentMeetup:  Streaming Data Pipeline Development
Meetup: Streaming Data Pipeline Development
 
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
 
big data fest building modern data streaming apps
big data fest building modern data streaming appsbig data fest building modern data streaming apps
big data fest building modern data streaming apps
 
BigDataFest_ Building Modern Data Streaming Apps
BigDataFest_  Building Modern Data Streaming AppsBigDataFest_  Building Modern Data Streaming Apps
BigDataFest_ Building Modern Data Streaming Apps
 
JConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and FlinkJConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and Flink
 
RTAS 2023: Building a Real-Time IoT Application
RTAS 2023:  Building a Real-Time IoT ApplicationRTAS 2023:  Building a Real-Time IoT Application
RTAS 2023: Building a Real-Time IoT Application
 
Unconference Round Table Notes
Unconference Round Table NotesUnconference Round Table Notes
Unconference Round Table Notes
 
BigDataFest Building Modern Data Streaming Apps
BigDataFest  Building Modern Data Streaming AppsBigDataFest  Building Modern Data Streaming Apps
BigDataFest Building Modern Data Streaming Apps
 
GSJUG: Mastering Data Streaming Pipelines 09May2023
GSJUG: Mastering Data Streaming Pipelines 09May2023GSJUG: Mastering Data Streaming Pipelines 09May2023
GSJUG: Mastering Data Streaming Pipelines 09May2023
 
PartnerSkillUp_Enable a Streaming CDC Solution
PartnerSkillUp_Enable a Streaming CDC SolutionPartnerSkillUp_Enable a Streaming CDC Solution
PartnerSkillUp_Enable a Streaming CDC Solution
 
Building Real-time Pipelines with FLaNK_ A Case Study with Transit Data
Building Real-time Pipelines with FLaNK_ A Case Study with Transit DataBuilding Real-time Pipelines with FLaNK_ A Case Study with Transit Data
Building Real-time Pipelines with FLaNK_ A Case Study with Transit Data
 
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
 
Introduction to apache kafka, confluent and why they matter
Introduction to apache kafka, confluent and why they matterIntroduction to apache kafka, confluent and why they matter
Introduction to apache kafka, confluent and why they matter
 
Meetup Streaming Data Pipeline Development
Meetup Streaming Data Pipeline DevelopmentMeetup Streaming Data Pipeline Development
Meetup Streaming Data Pipeline Development
 
Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023
Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023
Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023
 
Building Real-Time Travel Alerts
Building Real-Time Travel AlertsBuilding Real-Time Travel Alerts
Building Real-Time Travel Alerts
 
OSSNA Building Modern Data Streaming Apps
OSSNA Building Modern Data Streaming AppsOSSNA Building Modern Data Streaming Apps
OSSNA Building Modern Data Streaming Apps
 
NoLambda: Combining Streaming, Ad-Hoc, Machine Learning and Batch Analysis
NoLambda: Combining Streaming, Ad-Hoc, Machine Learning and Batch AnalysisNoLambda: Combining Streaming, Ad-Hoc, Machine Learning and Batch Analysis
NoLambda: Combining Streaming, Ad-Hoc, Machine Learning and Batch Analysis
 
The Never Landing Stream with HTAP and Streaming
The Never Landing Stream with HTAP and StreamingThe Never Landing Stream with HTAP and Streaming
The Never Landing Stream with HTAP and Streaming
 
CoC23_Utilizing Real-Time Transit Data for Travel Optimization
CoC23_Utilizing Real-Time Transit Data for Travel OptimizationCoC23_Utilizing Real-Time Transit Data for Travel Optimization
CoC23_Utilizing Real-Time Transit Data for Travel Optimization
 

More from Timothy Spann

April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024Timothy Spann
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max PrincetonTimothy Spann
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesTimothy Spann
 
2024 XTREMEJ_ Building Real-time Pipelines with FLaNK_ A Case Study with Tra...
2024 XTREMEJ_  Building Real-time Pipelines with FLaNK_ A Case Study with Tra...2024 XTREMEJ_  Building Real-time Pipelines with FLaNK_ A Case Study with Tra...
2024 XTREMEJ_ Building Real-time Pipelines with FLaNK_ A Case Study with Tra...Timothy Spann
 
28March2024-Codeless-Generative-AI-Pipelines
28March2024-Codeless-Generative-AI-Pipelines28March2024-Codeless-Generative-AI-Pipelines
28March2024-Codeless-Generative-AI-PipelinesTimothy Spann
 
TCFPro24 Building Real-Time Generative AI Pipelines
TCFPro24 Building Real-Time Generative AI PipelinesTCFPro24 Building Real-Time Generative AI Pipelines
TCFPro24 Building Real-Time Generative AI PipelinesTimothy Spann
 
2024 Build Generative AI for Non-Profits
2024 Build Generative AI for Non-Profits2024 Build Generative AI for Non-Profits
2024 Build Generative AI for Non-ProfitsTimothy Spann
 
Conf42-Python-Building Apache NiFi 2.0 Python Processors
Conf42-Python-Building Apache NiFi 2.0 Python ProcessorsConf42-Python-Building Apache NiFi 2.0 Python Processors
Conf42-Python-Building Apache NiFi 2.0 Python ProcessorsTimothy Spann
 
Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...
Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...
Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...Timothy Spann
 
2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines
2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines
2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI PipelinesTimothy Spann
 
NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...
NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...
NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...Timothy Spann
 
OSACon 2023_ Unlocking Financial Data with Real-Time Pipelines
OSACon 2023_ Unlocking Financial Data with Real-Time PipelinesOSACon 2023_ Unlocking Financial Data with Real-Time Pipelines
OSACon 2023_ Unlocking Financial Data with Real-Time PipelinesTimothy Spann
 
[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines
[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines
[EN]DSS23_tspann_Integrating LLM with Streaming Data PipelinesTimothy Spann
 
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines Demo
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines DemoEvolve 2023 NYC - Integrating AI Into Realtime Data Pipelines Demo
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines DemoTimothy Spann
 
AIDevWorldApacheNiFi101
AIDevWorldApacheNiFi101AIDevWorldApacheNiFi101
AIDevWorldApacheNiFi101Timothy Spann
 
26Oct2023_Adding Generative AI to Real-Time Streaming Pipelines_ NYC Meetup
26Oct2023_Adding Generative AI to Real-Time Streaming Pipelines_ NYC Meetup26Oct2023_Adding Generative AI to Real-Time Streaming Pipelines_ NYC Meetup
26Oct2023_Adding Generative AI to Real-Time Streaming Pipelines_ NYC MeetupTimothy Spann
 
CoC23_ Looking at the New Features of Apache NiFi
CoC23_ Looking at the New Features of Apache NiFiCoC23_ Looking at the New Features of Apache NiFi
CoC23_ Looking at the New Features of Apache NiFiTimothy Spann
 
CoC23_ Let’s Monitor The Conditions at the Conference
CoC23_ Let’s Monitor The Conditions at the ConferenceCoC23_ Let’s Monitor The Conditions at the Conference
CoC23_ Let’s Monitor The Conditions at the ConferenceTimothy Spann
 
OSSFinance_UnlockingFinancialDatawithReal-TimePipelines.pdf
OSSFinance_UnlockingFinancialDatawithReal-TimePipelines.pdfOSSFinance_UnlockingFinancialDatawithReal-TimePipelines.pdf
OSSFinance_UnlockingFinancialDatawithReal-TimePipelines.pdfTimothy Spann
 
Meetup - Brasil - Data In Motion - 2023 September 19
Meetup - Brasil - Data In Motion - 2023 September 19Meetup - Brasil - Data In Motion - 2023 September 19
Meetup - Brasil - Data In Motion - 2023 September 19Timothy Spann
 

More from Timothy Spann (20)

April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max Princeton
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
 
2024 XTREMEJ_ Building Real-time Pipelines with FLaNK_ A Case Study with Tra...
2024 XTREMEJ_  Building Real-time Pipelines with FLaNK_ A Case Study with Tra...2024 XTREMEJ_  Building Real-time Pipelines with FLaNK_ A Case Study with Tra...
2024 XTREMEJ_ Building Real-time Pipelines with FLaNK_ A Case Study with Tra...
 
28March2024-Codeless-Generative-AI-Pipelines
28March2024-Codeless-Generative-AI-Pipelines28March2024-Codeless-Generative-AI-Pipelines
28March2024-Codeless-Generative-AI-Pipelines
 
TCFPro24 Building Real-Time Generative AI Pipelines
TCFPro24 Building Real-Time Generative AI PipelinesTCFPro24 Building Real-Time Generative AI Pipelines
TCFPro24 Building Real-Time Generative AI Pipelines
 
2024 Build Generative AI for Non-Profits
2024 Build Generative AI for Non-Profits2024 Build Generative AI for Non-Profits
2024 Build Generative AI for Non-Profits
 
Conf42-Python-Building Apache NiFi 2.0 Python Processors
Conf42-Python-Building Apache NiFi 2.0 Python ProcessorsConf42-Python-Building Apache NiFi 2.0 Python Processors
Conf42-Python-Building Apache NiFi 2.0 Python Processors
 
Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...
Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...
Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...
 
2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines
2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines
2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines
 
NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...
NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...
NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...
 
OSACon 2023_ Unlocking Financial Data with Real-Time Pipelines
OSACon 2023_ Unlocking Financial Data with Real-Time PipelinesOSACon 2023_ Unlocking Financial Data with Real-Time Pipelines
OSACon 2023_ Unlocking Financial Data with Real-Time Pipelines
 
[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines
[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines
[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines
 
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines Demo
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines DemoEvolve 2023 NYC - Integrating AI Into Realtime Data Pipelines Demo
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines Demo
 
AIDevWorldApacheNiFi101
AIDevWorldApacheNiFi101AIDevWorldApacheNiFi101
AIDevWorldApacheNiFi101
 
26Oct2023_Adding Generative AI to Real-Time Streaming Pipelines_ NYC Meetup
26Oct2023_Adding Generative AI to Real-Time Streaming Pipelines_ NYC Meetup26Oct2023_Adding Generative AI to Real-Time Streaming Pipelines_ NYC Meetup
26Oct2023_Adding Generative AI to Real-Time Streaming Pipelines_ NYC Meetup
 
CoC23_ Looking at the New Features of Apache NiFi
CoC23_ Looking at the New Features of Apache NiFiCoC23_ Looking at the New Features of Apache NiFi
CoC23_ Looking at the New Features of Apache NiFi
 
CoC23_ Let’s Monitor The Conditions at the Conference
CoC23_ Let’s Monitor The Conditions at the ConferenceCoC23_ Let’s Monitor The Conditions at the Conference
CoC23_ Let’s Monitor The Conditions at the Conference
 
OSSFinance_UnlockingFinancialDatawithReal-TimePipelines.pdf
OSSFinance_UnlockingFinancialDatawithReal-TimePipelines.pdfOSSFinance_UnlockingFinancialDatawithReal-TimePipelines.pdf
OSSFinance_UnlockingFinancialDatawithReal-TimePipelines.pdf
 
Meetup - Brasil - Data In Motion - 2023 September 19
Meetup - Brasil - Data In Motion - 2023 September 19Meetup - Brasil - Data In Motion - 2023 September 19
Meetup - Brasil - Data In Motion - 2023 September 19
 

Recently uploaded

Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...OnePlan Solutions
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...confluent
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanyChristoph Pohl
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfAlina Yurenko
 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Cizo Technology Services
 
Lecture # 8 software design and architecture (SDA).ppt
Lecture # 8 software design and architecture (SDA).pptLecture # 8 software design and architecture (SDA).ppt
Lecture # 8 software design and architecture (SDA).pptesrabilgic2
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odishasmiwainfosol
 
Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commercemanigoyal112
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringHironori Washizaki
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsAhmed Mohamed
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationBradBedford3
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfMarharyta Nedzelska
 
Post Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on IdentityPost Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on Identityteam-WIBU
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作qr0udbr0
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecturerahul_net
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalLionel Briand
 
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxUI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxAndreas Kunz
 
Comparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfComparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfDrew Moseley
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEEVICTOR MAESTRE RAMIREZ
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Natan Silnitsky
 

Recently uploaded (20)

Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
 
Lecture # 8 software design and architecture (SDA).ppt
Lecture # 8 software design and architecture (SDA).pptLecture # 8 software design and architecture (SDA).ppt
Lecture # 8 software design and architecture (SDA).ppt
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
 
Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commerce
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their Engineering
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML Diagrams
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion Application
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdf
 
Post Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on IdentityPost Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on Identity
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecture
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive Goal
 
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxUI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
 
Comparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfComparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdf
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEE
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
 

DBA Fundamentals Group: Continuous SQL with Kafka and Flink

  • 1. Continuous SQL with Kafka and Flink Tim Spann Principal Developer Advocate February 20, 2024 https://www.meetup.com/dba-fundamentals-group/events/296855261/
  • 2.
  • 3. 3 Tim Spann Twitter: @PaasDev // Blog: datainmotion.dev Principal Developer Advocate. Princeton Future of Data Meetup. ex-Pivotal, ex-Hortonworks, ex-StreamNative, ex-PwC, ex-HPE https://medium.com/@tspann https://github.com/tspannhw
  • 4. 4 FLaNK Stack Weekly by Tim Spann This week in Apache NiFi, Apache Flink, Apache Kafka, ML, AI, Apache Spark, Apache Iceberg, Python, Java and Open Source friends. https://bit.ly/32dAJft https://www.meetup.com/futureofdata- princeton/
  • 5. © 2023 Cloudera, Inc. All rights reserved. 5 Future of Data - NYC + NJ + Philly + Virtual @PaasDev https://www.meetup.com/futureofdata-princeton/ From Big Data to AI to Streaming to Containers to Cloud to Analytics to Cloud Storage to Fast Data to Machine Learning to Microservices to ...
  • 8. © 2023 Cloudera, Inc. All rights reserved. 8 BUILDING REAL-TIME REQUIRES A TEAM
  • 9. © 2023 Cloudera, Inc. All rights reserved. 9 Already using Spark? Need NiFi? Need Flink? Want unified Batch/Stream? Want highest Throughput? Don’t need low latency? Large files? Scheduled batches? Replacing Sqoop, ETL Simple JDBC queries? Transform individual records? Want easy development? Lots of small files, events, records, rows? Continuous stream of rows Support many different sources Need Microservices, Batch and Stream? Want high Throughput? Want Low Latency? Want Advanced Windowing and State? Happy with a New Solution that is best-in-class? Spark, NiFi, Flink? Which engine to choose?
  • 10. 10 Live Q&A Travel Advisories Weather Reports Documents Social Media Databases Transactions Public Data Feeds S3 / Files Logs ATM Data Live Chat … HYBRID CLOUD INTERACT COLLECT STORE ENRICH, REPORT Distribute Collect Report REPORT Visualize Report, Automate AI BASED ENHANCEMENTS Predict, Automate VECTOR DATABASE LLM Machine Learning Data Visualization Data Flow Data Warehouse SQL Stream Builder Data Visualization Input Sentences Generated Text Timestamp Input Sentence Timestamps Enrichments Messaging Broker Real-time alerting Real-time alerting Aggregations
  • 12. 12 Analytics-in-Stream Data Sources Streaming Storage Substrate Cloudera Stream Processing Kafka + NiFi enables real-time ingestion into lakes / analytics services Data Distribution Service Cloudera DataFlow Warehouses & Operational DB Data Lakes & Lake Houses Data-At-Rest Analytics Data Apps Powered by Streaming Insights and used by other Analytics Services Kafka + Flink enables streaming analytics Cloudera Stream Processing Streaming Analytics Low Latency Data Products Data-In-Motion Streaming Analytics Cloudera Edge Flow Edge Ingest
  • 13. 13 Flink Connectors Streaming Data Pipelines with Cloudera Data Platform Edge devices Cloud DB’s SaaS tools Change Data Capture On prem apps/DB’s Streams Any destination Custom Kafka producers Custom Kafka consumers Operational Applications Analytical Applications STREAMING DATA MOVEMENT & PROCESSING STREAMING DATA PROCESSING & ANALYTICS
  • 14.
  • 15.
  • 16. 16
  • 17.
  • 18.
  • 20. © 2023 Cloudera, Inc. All rights reserved. 20 What is Can You Do With Apache Kafka? Web site activity: track page views, searches, etc. in real time Events & log aggregation: particularly in distributed systems where messages come from multiple sources Monitoring and metrics: aggregate statistics from distributed applications and build a dashboard application Stream processing: process raw data, clean it up, and forward it on to another topic or messaging system Real-time data ingestion: fast processing of a very large volume of messages
  • 21. © 2019 Cloudera, Inc. All rights reserved. 21 STREAMS MESSAGING WITH KAFKA • Highly reliable distributed messaging system. • Decouple applications, enables many-to-many patterns. • Publish-Subscribe semantics. • Horizontal scalability. • Efficient implementation to operate at speed with big data volumes. • Organized by topic to support several use cases.
  • 23. 23 CONTINUOUS SQL ● SSB is a Continuous SQL engine ● It’s SQL, but a slightly different mental model, but with big implications Traditional Parse/Execute/Fetch model Continuous SQL Model Hint: The query is boundless and never finishes, and time matters AKA: SELECT * FROM foo WHERE 1=0 -- will run forever
  • 24. 24 Flink SQL -- specify Kafka partition key on output SELECT foo AS _eventKey FROM sensors -- use event time timestamp from kafka -- exactly once compatible SELECT eventTimestamp FROM sensors -- nested structures access SELECT foo.’bar’ FROM table; -- must quote nested column -- timestamps SELECT * FROM payments WHERE eventTimestamp > CURRENT_TIMESTAMP-interval '10' second; -- unnest SELECT b.*, u.* FROM bgp_avro b, UNNEST(b.path) AS u(pathitem) -- aggregations and windows SELECT card, MAX(amount) as theamount, TUMBLE_END(eventTimestamp, interval '5' minute) as ts FROM payments WHERE lat IS NOT NULL AND lon IS NOT NULL GROUP BY card, TUMBLE(eventTimestamp, interval '5' minute) HAVING COUNT(*) > 4 -- >4==fraud -- try to do this ksql! SELECT us_west.user_score+ap_south.user_score FROM kafka_in_zone_us_west us_west FULL OUTER JOIN kafka_in_zone_ap_south ap_south ON us_west.user_id = ap_south.user_id; Key Takeaway: Rich SQL grammar with advanced time and aggregation tools
  • 25. 25 © 2022 Cloudera, Inc. All rights reserved. SQL STREAM BUILDER (SSB) SQL STREAM BUILDER allows developers, analysts, and data scientists to write streaming applications with industry standard SQL. No Java or Scala code development required. Simplifies access to data in Kafka & Flink. Connectors to batch data in HDFS, Kudu, Hive, S3, JDBC, CDC and more Enrich streaming data with batch data in a single tool Democratize access to real-time data with just SQL
  • 26. 26 SSB MATERIALIZED VIEWS Key Takeaway; MV’s allow data scientist, analyst and developers consume data from the firehose
  • 27. © 2019 Cloudera, Inc. All rights reserved. 27 ICEBERG INTEGRATION Robust Next Generation Architecture for Data Driven Business Unified Processing Engine Massive Open table format Iceberg Support for Flink APIs through SSB • Maximally open • Maximally flexible • Ultra high performance for MASSIVE data
  • 29. 29 Continuous SQL select max(alt_baro) as MaxAltitudeFeet, min(alt_baro) as MinAltitudeFeet, avg(alt_baro) as AvgAltitudeFeet, max(alt_geom) as MaxGAltitudeFeet, min(alt_geom) as MinGAltitudeFeet, avg(alt_geom) as AvgGAltitudeFeet, max(gs) as MaxGroundSpeed, min(gs) as MinGroundSpeed, avg(gs) as AvgGroundSpeed, count(alt_baro) as RowCount, hex as ICAO, flight as IDENT from `sr1`.`default_database`.`adsb` group by flight, hex; select transcom.title, transcom.description, mta.VehicleRef, DISTANCE_BETWEEN(CAST(transcom.latitude as STRING), CAST(transcom.latitude as STRING), mta.VehicleLocationLatitude, mta.VehicleLocationLongitude) as miles, mta.StopPointName, mta.Bearing, mta.DestinationName, mta.ExpectedArrivalTime, mta.VehicleLocationLatitude, mta.VehicleLocationLongitude, mta.ArrivalProximityText, mta.DistanceFromStop, mta.AimedArrivalTime, mta.`Date`, mta.ts, mta.uuid, mta.EstimatedPassengerCapacity, mta.EstimatedPassengerCount from `schemareg1`.`default_database`.`mta` /*+ OPTIONS('scan.startup.mode' = 'earliest-offset') */ mta FULL OUTER JOIN `schemareg1`.`default_database`.`transcom` /*+ OPTIONS('scan.startup.mode' = 'earliest-offset') */ transcom ON (transcom.latitude >= CAST(mta.VehicleLocationLatitude as float) - 0.3) AND (transcom.longitude >= CAST(mta.VehicleLocationLongitude as float) - 0.3) AND (transcom.latitude <= CAST(mta.VehicleLocationLatitude as float) + 0.3) AND (transcom.longitude <= CAST(mta.VehicleLocationLongitude as float) + 0.3) WHERE mta.VehicleRef is not null AND transcom.title is not null AND DISTANCE_BETWEEN(CAST(transcom.latitude as STRING), CAST(transcom.latitude as STRING), mta.VehicleLocationLatitude, mta.VehicleLocationLongitude) <= 120
  • 30. 30 Real-time observability pipeline Minfi agents Raw Logs Cloudera Data Flow Cloudera Data Lakehouse Triaging ML Models Threat Hunting Response and Investigation UEBA/Fraud Detection Reports Auto Action Cloudera Streaming Analytics Cybersec Toolkits Parse, Triage, Profile Cloudera Streams Processing Kafka SQL Stream Builder SPLUNK / SIEM /EXTERNAL Cloudera Machine Learning Collect Route/Filter/ Transform Prepare/Analyze/ Alert
  • 31.
  • 32.
  • 33. 33 Data in Motion: Overview e Novidades do NiFi, Kafka e Flink Apresentador: Tim Spann - Principal DIM Specialist and Developer Advocate https://medium.com/cloudera-inc/transit-in-sao-paulo-brasil-flank-style-eaec6753cc63
  • 34. 34
  • 35. 35
  • 36. 36
  • 38. © 2023 Cloudera, Inc. All rights reserved. 38 Already using Kafka? Already using NiFi? Need for Fast Flink? Simple setup for many tables Want metadata augmented data Don’t need low latency? Visual monitoring Easy manual scaling Easy to combine with NiFi Debezium Simple JDBC queries? Transform individual records? Want easy development with UI? Lots of small files, events, records, rows? Continuous stream of rows Support many different sources Debezium coming Strong control of table and joins Want high Throughput? Want Low Latency? Want Advanced Windowing and State? Automatic records immediately Pure SQL Debezium Kafka Connect, NiFi, Flink? Which engine to choose? Or All 3?
  • 39. CDC ARCHITECTURE - Using FLaNK to pull the data out of anything in near-real time INGEST PREPARE PUBLISH DATA SOURCES Internal Users (After Sales) External Systems ENTERPRISE LAKEHOUSE CAPABILITY VIEW INGESTION ENTERPRISE DATA MESSAGE HUB STORAGE BATCH MANAGEMENT STREAM CONSUMPTION Closed Loop Systems SQL Stream Builder Machine Learning Data Visualization Workload Manager watsonx.data
  • 40. 40 Data Distribution as a Universal, Hybrid, Multi-Cloud Data Service Universal Data Distribution Service (Ingest, Transform, Deliver) Ingest Processors Ingest Gateway Router, Filter & Transform Processors Destination Processors Cloud Business Process Services* Log Data Sources Laptops / Servers Security Agents IOT Devices App Logs Mobile Apps Cloud Data Analytics/ Service * On-Prem Data Sources Cloud Warehouse (Cloudera DW) Big Data Cloud Services Multi-Cloud Data Distribution Service that Solves the First & Last Mile Problem for the Modern Data Stack
  • 41. CDC with SQL Stream Builder (Flink SQL)
  • 42. © 2023 Cloudera, Inc. All rights reserved. 42 Streaming CDC with Cloudera SQL Stream Builder (Flink SQL) https://github.com/tspannhw/FLaNK-CDC/blob/main/flinkcdc.MD
  • 43. © 2023 Cloudera, Inc. All rights reserved. 43 https://docs.cloudera.com/csa/1.10.0/how-to-ssb/topics/csa-ssb-cdc-connectors.html CDC with Debezium and Flink SQL Stream Builder with Flink SQL
  • 44. © 2023 Cloudera, Inc. All rights reserved. 44 CDC with Debezium and Flink SQL Stream Builder with Flink SQL
  • 45. © 2023 Cloudera, Inc. All rights reserved. 45
  • 46. © 2023 Cloudera, Inc. All rights reserved. 46 CREATE TABLE `postgres_cdc_newjerseybus` ( `title` STRING, `description` STRING, `link` STRING, `guid` STRING, `advisoryAlert` STRING, `pubDate` STRING, `ts` STRING, `companyname` STRING, `uuid` STRING, `servicename` STRING ) WITH ( 'connector' = 'postgres-cdc', 'database-name' = 'tspann', 'hostname' = '192.168.1.153', 'password' = 'tspann', 'decoding.plugin.name' = 'pgoutput', 'schema-name' = 'public', 'table-name' = 'newjerseybus', 'username' = 'tspann', 'port' = '5432' ); Flink SQL Tables - Debezium CDC From Database Tables
  • 47. © 2023 Cloudera, Inc. All rights reserved. 47 Flink SQL Tables - Upsert to Kafka Topics CREATE TABLE `upsert_kafka_newjerseybus` ( `title` String, `description` String, `link` String, `guid` String, `advisoryAlert` String, `pubDate` String, `ts` String, `companyname` String, `uuid` String, `servicename` String, `eventTimestamp` TIMESTAMP(3), WATERMARK FOR `eventTimestamp` AS `eventTimestamp` - INTERVAL '5' SECOND, PRIMARY KEY (uuid) NOT ENFORCED ) WITH ( 'connector' = 'upsert-kafka', 'topic' = 'kafka_newjerseybus', 'properties.bootstrap.servers' = 'kafka:9092', 'key.format' = 'json', 'value.format' = 'json' );
  • 49. https://github.com/tspannhw/FLaNK-Transit SELECT n.speed, n.travel_time, n.borough, n.link_name, n.link_points, n.latitude, n.longitude, DISTANCE_BETWEEN(CAST(t.latitude as STRING), CAST(t.latitude as STRING), m.VehicleLocationLatitude, m.VehicleLocationLongitude) as miles, t.title, t.`description`, t.pubDate, t.latitude, t.longitude, m.VehicleLocationLatitude, m.VehicleLocationLongitude, m.StopPointRef, m.VehicleRef, m.ProgressRate, m.ExpectedDepartureTime, m.StopPoint, m.VisitNumber, m.DataFrameRef, m.StopPointName, m.Bearing, m.OriginAimedDepartureTime, m.OperatorRef, m.DestinationName, m.ExpectedArrivalTime, m.BlockRef, m.LineRef, m.DirectionRef, m.ArrivalProximityText, m.DistanceFromStop, m.EstimatedPassengerCapacity, m.AimedArrivalTime, m.PublishedLineName, m.ProgressStatus, m.DestinationRef, m.EstimatedPassengerCount, m.OriginRef, m.NumberOfStopsAway, m.ts FROM jsonmta /*+ OPTIONS('scan.startup.mode' = 'earliest-offset') */ m FULL OUTER JOIN jsontranscom /*+ OPTIONS('scan.startup.mode' = 'earliest-offset') */ t ON (t.latitude >= CAST(m.VehicleLocationLatitude as float) - 0.3) AND (t.longitude >= CAST(m.VehicleLocationLongitude as float) - 0.3) AND (t.latitude <= CAST(m.VehicleLocationLatitude as float) + 0.3) AND (t.longitude <= CAST(m.VehicleLocationLongitude as float) + 0.3) FULL OUTER JOIN nytrafficspeed /*+ OPTIONS('scan.startup.mode' = 'earliest-offset') */ n ON (n.latitude >= CAST(m.VehicleLocationLatitude as float) - 0.3) AND (n.longitude >= CAST(m.VehicleLocationLongitude as float) - 0.3) AND (n.latitude <= CAST(m.VehicleLocationLatitude as float) + 0.3) AND (n.longitude <= CAST(m.VehicleLocationLongitude as float) + 0.3) WHERE m.VehicleRef is not null AND t.title is not null
  • 51. FLaNK for Halifax Canada Transit — NiFi, Kafka, Flink, SQL, GTFS-RT | by Tim Spann | Cloudera | Dec, 2023 | Medium Never Get Lost in the Stream. NiFi-Kafka-Flink for getting to work… | by Tim Spann | Cloudera | Dec, 2023 | Medium Iteration 1: Building a System to Consume All the Real-Time Transit Data in the World At Once | by Tim Spann | Cloudera | Medium Watching Airport Traffic in Real-Time | by Tim Spann | Cloudera | Medium