How to Build Streaming Apps
with Confluent
Mauro Vocale, Senior Solutions Engineer
Stefano Linguerri, Solutions Engineer
24/May/2023
Copyright 2020, Confluent, Inc. All rights reserved. This document may not be reproduced in any manner without the express written permission of Confluent, Inc.
Today’s Agenda
10:00 AM - 10:30 AM
Streams Processing / ksqlDB Overview
Mauro Vocale, Senior Solutions Engineer
10:30 AM - 11:15 AM
Interactive Streams Lab*
Stefano Linguerri, Solutions Engineer
11:15 AM - 11:30 AM
Q&A and Next Steps
Workshop Tips & Help:
1. Check the ‘Chat’ window during
the session for instructions
[icon located at the bottom of the
Zoom toolbar]
2. For any technical issues, click
the ‘Raise Hand’ button or post in
the ‘Chat’ window
[a Confluent team member will
assist you]
*Please note the lab will be conducted on Confluent Cloud. However, the ksqlDB concepts are still relevant to
all Confluent Platform customers.
Objectives for today
➔ Show you how Apache Kafka and Confluent enable you to build
streaming apps
➔ Refresh on Kafka and Kafka Streams
➔ Learn the basics of:
◆ Event driven / microservices architecture
◆ Building streaming applications
➔ Overview of typical use cases for stream processing.
➔ Hands-on: build a streaming application yourself
Part 1
Streams Processing / ksqlDB Overview
DB centric data architecture
What could possibly go wrong?
At the heart of every software application is data.
Databases
Databases are fundamentally incomplete.
Databases are not designed for real-time applications.
[Diagram: producers send upserts/deletes to databases; consumers read from them through batch processing (by time or event)]
The foundational assumption of every database: Data-at-rest
Databases bring point-in-time queries to stored data.
This leads to a Giant Mess in Data Architecture.
[Diagram labels: Line of Business 01, Line of Business 02, Public Cloud]
Data in motion:
Ubiquitous real-time data and
continuous real-time processing
A New Paradigm is Required for Data in Motion:
Continuously processing evolving streams of data in real-time
[Diagram: real-time events (a sale, a shipment, a trade, a customer experience) form real-time event streams that feed rich front-end customer experiences and real-time backend operations]
Apache Kafka
What is Apache Kafka?
Kafka is a distributed Event Streaming Platform (append-only / immutable commit log):
[Diagram: a Kafka topic (partition) as a log of events numbered 1 to 8; the producer writes (append-only) at the end, consumers read (seek and scan)]
Anatomy of an event: headers, timestamp, key, value
● Publish/subscribe to a stream of events
● Data stays persisted after it is read
● Supports transactions
● Highly scalable, high throughput.
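The append-only log idea above can be sketched in a few lines of plain Java. This is only an in-memory illustration, not the Kafka client API: the class `PartitionLogSketch` and its methods `append` and `readFrom` are hypothetical names invented for this example, and offsets start at 0 here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory stand-in for one topic partition: an append-only list of events. */
final class PartitionLogSketch {
    private final List<String> events = new ArrayList<>();

    /** Producer side: append at the end and return the new event's offset. */
    long append(String event) {
        events.add(event);
        return events.size() - 1;
    }

    /** Consumer side: seek to an offset and scan forward; nothing is removed by reading. */
    List<String> readFrom(long offset) {
        int start = (int) Math.max(0, Math.min(offset, events.size()));
        return List.copyOf(events.subList(start, events.size()));
    }

    public static void main(String[] args) {
        PartitionLogSketch log = new PartitionLogSketch();
        log.append("sale");
        log.append("shipment");
        log.append("trade");

        // Each consumer keeps its own position, so reads are independent of each other.
        System.out.println(log.readFrom(0)); // consumer A replays everything
        System.out.println(log.readFrom(2)); // consumer B starts at offset 2
        System.out.println(log.readFrom(0)); // the data is still there after being read
    }
}
```

Because reading never deletes anything, any number of consumers can replay the same events, which is the "persisted data after read" property on the slide.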
Producers and Consumers
[Diagram: producer apps #1 to #4 write to Topics A to D in the Kafka cluster; consumer apps #1 to #3 read from them]
Public Service Announcement: Message vs Event
Message
Raw data produced by a service to be consumed or stored elsewhere. The publisher of the message has an expectation about how the consumer handles the message.
Event
Lightweight notification of a condition or a state change. The publisher of the event has no expectation about how the event is handled. The consumer of the event decides what to do with it. Events can be discrete units or part of a series.
What are Kafka Connect and Kafka Streams?
[Diagram: a producer and a connector feed customer data, orders, and shipments into the Kafka cluster; a Kafka Streams app turns them into enriched/transformed data (statefully) for a consumer; both the connector and the Kafka Streams app are marked "Transactions!"]
Kafka Connect and Kafka Streams APIs
Kafka Connect API
● Reliable and scalable integration of Kafka with
your other data systems
● Confluent offers 120+ pre-built connectors for
popular sources and sinks
Kafka Streams API
● Write standard Java applications & microservices
to process your data in real-time
● Built on top of Producer / Consumer APIs
[Diagram labels: Orders, Customers, Stream Processing]
Stream processing by analogy
Producer /
Connect API (Source)
Stream Processing
Consumer /
Connect API (Sink)
$ cat < in.txt | grep "ksql" | tr a-z A-Z > out.txt
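The same analogy can be sketched in plain Java, with a list standing in for the source, a filter and a map standing in for the processing steps, and standard output standing in for the sink. The class name `PipelineAnalogySketch` and the sample lines are made up for illustration, and ASCII input is assumed so that upper-casing matches `tr a-z A-Z`.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

/** Hypothetical sketch of the shell pipeline: source -> processing -> sink. */
final class PipelineAnalogySketch {
    /** Equivalent of: grep "ksql" | tr a-z A-Z (ASCII input assumed). */
    static List<String> process(List<String> lines) {
        return lines.stream()
                .filter(line -> line.contains("ksql"))       // grep "ksql"
                .map(line -> line.toUpperCase(Locale.ROOT))  // tr a-z A-Z
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Stands in for in.txt (the source side of the analogy).
        List<String> in = List.of("ksqlDB is here", "plain kafka", "try ksql today");
        // Stands in for > out.txt (the sink side of the analogy).
        process(in).forEach(System.out::println);
    }
}
```

The difference from the shell version is that a real stream processor never reaches the end of its input: the filter and map steps keep running as new events arrive.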
Who is Confluent?
Public Service Announcement: Who developed Apache Kafka?
Apache Kafka is an open source project developed and maintained under the Apache Software Foundation.
Confluent was founded by the original creators of Apache Kafka.
Open Source Apache Kafka
Value added by Confluent
What only Confluent can offer:
● Apache Kafka re-architected for the cloud
● Cloud-native, Complete and Everywhere
● Rich product capabilities and unparalleled
expertise
● Multiple networking/interconnectivity options
● Elastic scale (up and down)
● Tiered storage
● Enterprise-grade security/regulatory features
● Fully integrated Stream Governance
● High availability
● Expertise and vision to support mission critical
use cases
● Full DevOps automation (REST, CLI, Terraform
and AsyncAPI)
● Multi-language libraries: Java, Python, C++,
.NET, Go, JavaScript, etc.
Which leads to:
● Lowest TCO (40% savings on average)
● Fastest Time-to-Market (70% on average)
Confluent Cloud vs Open Source Apache Kafka
[Diagram labels: Data Governance; Hybrid and multi cloud; Enterprise grade security; Schema linking; Cluster linking; Support/SLA; Kafka expertise; Access Control & Audit Logs; Terraform Provider; Scaling & auto data balancing; Latest Version & patches; 120+ Connectors; Fully managed]
Stream processing with ksqlDB
ksqlDB
The easiest way to process event streams in real-time
● Runs everywhere
● Clustering done for you
● Exactly-once processing
● Event-time processing
● Windowing & aggregations
● Flexible sizing
ksqlDB in a nutshell
● ksqlDB is built on top of Kafka Streams
● Open source event streaming database under the Confluent Community License
● Streams and tables are the two primary abstractions; they are referred to as collections
● There are two ways of creating collections in ksqlDB:
   ● directly from Kafka topics (source collections)
   ● derived from other streams and tables (derived collections)
● The direct inputs/outputs will always be Kafka topics
● ksqlDB does not query Kafka topics, only collections
To create a source collection:
CREATE STREAM users
  (USER_ID INT KEY, USERNAME VARCHAR)
  WITH (KAFKA_TOPIC='users', VALUE_FORMAT='AVRO');
Merge/Joins:
● Table-Table
● Table-Stream
● Stream-Table
● Stream-Stream
ksqlDB’s collections
Streams
● Streams are immutable, append-only sequences of events
● Useful for representing a historical series of events
● “Data in motion”
Tables
● Tables are mutable collections of events
● Represent the latest version of each value per key
● Serve queries to applications
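The relationship between the two collection types can be sketched in plain Java: a stream keeps every event, while a table keeps only the latest value per key. This is an in-memory illustration rather than ksqlDB itself; the class `StreamTableSketch`, the method `toTable`, and the sample users and cities are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: a table is what you get by folding a stream per key. */
final class StreamTableSketch {
    /** Fold a stream (full history of key/value events) into a table (latest value per key). */
    static Map<String, String> toTable(List<Map.Entry<String, String>> stream) {
        Map<String, String> table = new LinkedHashMap<>();
        for (Map.Entry<String, String> event : stream) {
            table.put(event.getKey(), event.getValue()); // later events overwrite earlier ones
        }
        return table;
    }

    public static void main(String[] args) {
        // Stream: an immutable, ordered history of events (user -> city).
        List<Map.Entry<String, String>> stream = List.of(
                Map.entry("alice", "Rome"),
                Map.entry("bob", "Milan"),
                Map.entry("alice", "Turin"));

        System.out.println("stream: " + stream);          // all three events, in order
        System.out.println("table:  " + toTable(stream)); // one row per key
    }
}
```

In this sketch the stream still holds both of alice's events, while the table only answers "where is alice now?", which is why tables are the collection that serves lookups to applications.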
Tradeoffs between stream processing tools
[Diagram: a spectrum from flexibility to simplicity: Kafka Producer & Consumer, then Kafka Streams API, then ksqlDB]
Tradeoffs between stream processing tools
Kafka Producer &
Consumer
Kafka
Streams API
ksqlDB
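// Kafka Producer & Consumer: count values per key by hand, with manual state and retries.
// stateStore, StateStoreException, RetryUtils and MAX_RETRIES are illustrative placeholders.
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;

ConsumerRecords<String, Integer> records = consumer.poll(Duration.ofMillis(100));
Map<String, Integer> counts = new HashMap<>();
for (ConsumerRecord<String, Integer> record : records) {
    // Sum the values of this batch per key; merge() handles keys seen for the first time.
    counts.merge(record.key(), record.value(), Integer::sum);
}
for (Map.Entry<String, Integer> entry : counts.entrySet()) {
    int attempts = 0;
    while (attempts++ < MAX_RETRIES) {
        try {
            // Add the batch total to the count already held in the state store.
            int stateCount = stateStore.getValue(entry.getKey());
            stateStore.setValue(entry.getKey(), entry.getValue() + stateCount);
            break;
        } catch (StateStoreException e) {
            RetryUtils.backoff(attempts);
        }
    }
}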
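// Kafka Streams API: the same kind of count as a topology; state and retries are handled by the library.
builder
    .stream("users", Consumed.with(Serdes.String(), Serdes.String()))
    // Re-key by value and state the serdes explicitly for the repartition step.
    .groupBy((key, value) -> value, Grouped.with(Serdes.String(), Serdes.String()))
    .count()
    .toStream()
    .to("counts", Produced.with(Serdes.String(), Serdes.Long()));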
-- ksqlDB: the same count as one statement. An aggregation produces a table, so CREATE TABLE is used.
CREATE TABLE counts AS
  SELECT username, COUNT(*) AS counter
  FROM users
  GROUP BY username
  EMIT CHANGES;
Runtime Environments
Kafka Cluster
ksqlDB Cluster
Kafka Connect
Cluster
Compute*
Storage*
(*) Each cluster can scale independently
Sample use case: Streaming ETL
Manipulate events in-flight and connect data sources and sinks in real-time rather than through batch solutions
CREATE STREAM clicks_with_city AS
SELECT c.*, u.city
FROM clickstream c
LEFT JOIN users u ON c.user_id = u.id;
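What the join above does per event can be sketched in plain Java: every click is looked up against the latest known row of the users table, and clicks with no matching user are kept. This is an in-memory illustration, not ksqlDB; the class `EnrichmentJoinSketch`, the method `enrich`, the comma-separated output, and the sample data are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a stream-table LEFT JOIN used for enrichment. */
final class EnrichmentJoinSketch {
    /**
     * Left-join each click (userId -> page) against the users table (userId -> city).
     * Clicks from unknown users are kept with a null city, as in a LEFT JOIN.
     */
    static List<String> enrich(List<Map.Entry<Integer, String>> clicks, Map<Integer, String> userCity) {
        List<String> enriched = new ArrayList<>();
        for (Map.Entry<Integer, String> click : clicks) {
            String city = userCity.get(click.getKey()); // table lookup by key; null if absent
            enriched.add(click.getKey() + "," + click.getValue() + "," + city);
        }
        return enriched;
    }

    public static void main(String[] args) {
        Map<Integer, String> users = Map.of(1, "Rome", 2, "Milan");
        List<Map.Entry<Integer, String>> clickstream = List.of(
                Map.entry(1, "/home"),
                Map.entry(9, "/cart"),   // no matching user
                Map.entry(2, "/checkout"));

        enrich(clickstream, users).forEach(System.out::println);
    }
}
```

The stream drives the output: one enriched row is emitted per click, and the table side only supplies the extra column.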
Sample use case: Anomaly Detection
Identify anomalies in streaming data, such as potentially fraudulent transactions, in only a few milliseconds
CREATE TABLE possible_fraud AS
SELECT card_number, count(*)
FROM authorization_attempts
WINDOW TUMBLING (SIZE 5 SECONDS)
GROUP BY card_number
HAVING count(*) > 3;
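The windowed aggregation above can be sketched in plain Java: each attempt is assigned to a fixed 5-second bucket by its event time, counted per card within that bucket, and only counts above the threshold are kept. This is a batch-style, in-memory illustration rather than ksqlDB; the class `TumblingWindowSketch`, the method `possibleFraud`, the `windowStart/card` key format, and the sample timestamps are assumptions.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of a tumbling-window count with a HAVING-style filter. */
final class TumblingWindowSketch {
    static final long WINDOW_MS = 5_000; // mirrors SIZE 5 SECONDS

    /**
     * Count attempts per (window start, card number) and keep only counts above the threshold.
     * Each attempt is an entry of event-time millis -> card number.
     */
    static Map<String, Integer> possibleFraud(List<Map.Entry<Long, String>> attempts, int threshold) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<Long, String> attempt : attempts) {
            // Tumbling windows are fixed-size and non-overlapping, so each event falls in exactly one.
            long windowStart = (attempt.getKey() / WINDOW_MS) * WINDOW_MS;
            counts.merge(windowStart + "/" + attempt.getValue(), 1, Integer::sum);
        }
        counts.values().removeIf(count -> count <= threshold); // HAVING count(*) > threshold
        return counts;
    }

    public static void main(String[] args) {
        List<Map.Entry<Long, String>> attempts = List.of(
                Map.entry(1_000L, "4111"), Map.entry(2_000L, "4111"),
                Map.entry(3_000L, "4111"), Map.entry(4_000L, "4111"),
                Map.entry(4_500L, "5500"), Map.entry(6_000L, "4111"));

        System.out.println(possibleFraud(attempts, 3));
    }
}
```

In the sample data only card 4111 in the first window exceeds three attempts; its fifth attempt lands in the next window and starts a fresh count, which is the defining behaviour of a tumbling window.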
Sample use case: Real-time Monitoring
Monitor logs, sensor and IoT data, and more, and take actions on events instantly
CREATE TABLE error_counts AS
SELECT error_code, count(*)
FROM monitoring_stream
WINDOW TUMBLING (SIZE 1 MINUTE)
WHERE type = 'ERROR'
GROUP BY error_code;
What else is ksqlDB a good fit for?
● Materialized caches
● Event-driven microservices
● Redaction
● Event Deduplication
● Event Re-ordering
● Sessionization
● Validation
What is ksqlDB NOT a great fit for?
● Kafka may only have events over a limited span of time, depending on retention settings
● No secondary indexes for random lookups
● Aggregating over events that never happened
Part 2
Interactive Streams Lab
Find the lab instructions, with a step-by-step guide to reproduce this workshop, here:
https://drive.google.com/file/d/14HlF63xPeb0LP6NUDuzTeDhAT_hI1Gw3
Register for Confluent Cloud
What will we do in this workshop: Use Case
Identify unhappy customers
● From streaming data generated in the mobile app or on the website
● Data enrichment with customer master data (in a MySQL database)
➢ So that we can do something about their unhappiness and not lose them!
Demo diagram
Find more tutorials,
certifications and technical
information here:
developer.confluent.io

More Related Content

Similar to How to Build Streaming Apps with Confluent II

Citi Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging ModernizationCiti Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging Modernization
confluent
 
Confluent Messaging Modernization Forum
Confluent Messaging Modernization ForumConfluent Messaging Modernization Forum
Confluent Messaging Modernization Forum
confluent
 
SpringOne Tour Denver - Spring Boot & Spring Cloud on Pivotal Application Ser...
SpringOne Tour Denver - Spring Boot & Spring Cloud on Pivotal Application Ser...SpringOne Tour Denver - Spring Boot & Spring Cloud on Pivotal Application Ser...
SpringOne Tour Denver - Spring Boot & Spring Cloud on Pivotal Application Ser...
VMware Tanzu
 
Au delà des brokers, un tour de l’environnement Kafka | Florent Ramière
Au delà des brokers, un tour de l’environnement Kafka | Florent RamièreAu delà des brokers, un tour de l’environnement Kafka | Florent Ramière
Au delà des brokers, un tour de l’environnement Kafka | Florent Ramière
confluent
 
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
confluent
 
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB)
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB)Apache Kafka vs. Integration Middleware (MQ, ETL, ESB)
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB)
Kai Wähner
 
Confluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with SynthesisConfluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with Synthesis
confluent
 
Confluent Platform 5.5 + Apache Kafka 2.5 => New Features (JSON Schema, Proto...
Confluent Platform 5.5 + Apache Kafka 2.5 => New Features (JSON Schema, Proto...Confluent Platform 5.5 + Apache Kafka 2.5 => New Features (JSON Schema, Proto...
Confluent Platform 5.5 + Apache Kafka 2.5 => New Features (JSON Schema, Proto...
Kai Wähner
 
Fast Data – Fast Cars: Wie Apache Kafka die Datenwelt revolutioniert
Fast Data – Fast Cars: Wie Apache Kafka die Datenwelt revolutioniertFast Data – Fast Cars: Wie Apache Kafka die Datenwelt revolutioniert
Fast Data – Fast Cars: Wie Apache Kafka die Datenwelt revolutioniert
confluent
 
Kafka Vienna Meetup 020719
Kafka Vienna Meetup 020719Kafka Vienna Meetup 020719
Kafka Vienna Meetup 020719
Patrik Kleindl
 
Apache Kafka vs. Traditional Middleware (Kai Waehner, Confluent) Frankfurt 20...
Apache Kafka vs. Traditional Middleware (Kai Waehner, Confluent) Frankfurt 20...Apache Kafka vs. Traditional Middleware (Kai Waehner, Confluent) Frankfurt 20...
Apache Kafka vs. Traditional Middleware (Kai Waehner, Confluent) Frankfurt 20...
confluent
 
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB) - Friends, Enemies or ...
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB) - Friends, Enemies or ...Apache Kafka vs. Integration Middleware (MQ, ETL, ESB) - Friends, Enemies or ...
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB) - Friends, Enemies or ...
confluent
 
Unlock value with Confluent and AWS.pptx
Unlock value with Confluent and AWS.pptxUnlock value with Confluent and AWS.pptx
Unlock value with Confluent and AWS.pptx
Ahmed791434
 
Confluent Partner Tech Talk with Reply
Confluent Partner Tech Talk with ReplyConfluent Partner Tech Talk with Reply
Confluent Partner Tech Talk with Reply
confluent
 
App modernization on AWS with Apache Kafka and Confluent Cloud
App modernization on AWS with Apache Kafka and Confluent CloudApp modernization on AWS with Apache Kafka and Confluent Cloud
App modernization on AWS with Apache Kafka and Confluent Cloud
Kai Wähner
 
Real-time processing of large amounts of data
Real-time processing of large amounts of dataReal-time processing of large amounts of data
Real-time processing of large amounts of data
confluent
 
Beyond the Brokers: A Tour of the Kafka Ecosystem
Beyond the Brokers: A Tour of the Kafka EcosystemBeyond the Brokers: A Tour of the Kafka Ecosystem
Beyond the Brokers: A Tour of the Kafka Ecosystem
confluent
 
Beyond the brokers - A tour of the Kafka ecosystem
Beyond the brokers - A tour of the Kafka ecosystemBeyond the brokers - A tour of the Kafka ecosystem
Beyond the brokers - A tour of the Kafka ecosystem
Damien Gasparina
 
Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google ...
Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google ...Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google ...
Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google ...
HostedbyConfluent
 
Beyond the brokers - Un tour de l'écosystème Kafka
Beyond the brokers - Un tour de l'écosystème KafkaBeyond the brokers - Un tour de l'écosystème Kafka
Beyond the brokers - Un tour de l'écosystème Kafka
Florent Ramiere
 

Similar to How to Build Streaming Apps with Confluent II (20)

Citi Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging ModernizationCiti Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging Modernization
 
Confluent Messaging Modernization Forum
Confluent Messaging Modernization ForumConfluent Messaging Modernization Forum
Confluent Messaging Modernization Forum
 
SpringOne Tour Denver - Spring Boot & Spring Cloud on Pivotal Application Ser...
SpringOne Tour Denver - Spring Boot & Spring Cloud on Pivotal Application Ser...SpringOne Tour Denver - Spring Boot & Spring Cloud on Pivotal Application Ser...
SpringOne Tour Denver - Spring Boot & Spring Cloud on Pivotal Application Ser...
 
Au delà des brokers, un tour de l’environnement Kafka | Florent Ramière
Au delà des brokers, un tour de l’environnement Kafka | Florent RamièreAu delà des brokers, un tour de l’environnement Kafka | Florent Ramière
Au delà des brokers, un tour de l’environnement Kafka | Florent Ramière
 
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
 
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB)
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB)Apache Kafka vs. Integration Middleware (MQ, ETL, ESB)
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB)
 
Confluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with SynthesisConfluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with Synthesis
 
Confluent Platform 5.5 + Apache Kafka 2.5 => New Features (JSON Schema, Proto...
Confluent Platform 5.5 + Apache Kafka 2.5 => New Features (JSON Schema, Proto...Confluent Platform 5.5 + Apache Kafka 2.5 => New Features (JSON Schema, Proto...
Confluent Platform 5.5 + Apache Kafka 2.5 => New Features (JSON Schema, Proto...
 
Fast Data – Fast Cars: Wie Apache Kafka die Datenwelt revolutioniert
Fast Data – Fast Cars: Wie Apache Kafka die Datenwelt revolutioniertFast Data – Fast Cars: Wie Apache Kafka die Datenwelt revolutioniert
Fast Data – Fast Cars: Wie Apache Kafka die Datenwelt revolutioniert
 
Kafka Vienna Meetup 020719
Kafka Vienna Meetup 020719Kafka Vienna Meetup 020719
Kafka Vienna Meetup 020719
 
Apache Kafka vs. Traditional Middleware (Kai Waehner, Confluent) Frankfurt 20...
Apache Kafka vs. Traditional Middleware (Kai Waehner, Confluent) Frankfurt 20...Apache Kafka vs. Traditional Middleware (Kai Waehner, Confluent) Frankfurt 20...
Apache Kafka vs. Traditional Middleware (Kai Waehner, Confluent) Frankfurt 20...
 
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB) - Friends, Enemies or ...
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB) - Friends, Enemies or ...Apache Kafka vs. Integration Middleware (MQ, ETL, ESB) - Friends, Enemies or ...
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB) - Friends, Enemies or ...
 
Unlock value with Confluent and AWS.pptx
Unlock value with Confluent and AWS.pptxUnlock value with Confluent and AWS.pptx
Unlock value with Confluent and AWS.pptx
 
Confluent Partner Tech Talk with Reply
Confluent Partner Tech Talk with ReplyConfluent Partner Tech Talk with Reply
Confluent Partner Tech Talk with Reply
 
App modernization on AWS with Apache Kafka and Confluent Cloud
App modernization on AWS with Apache Kafka and Confluent CloudApp modernization on AWS with Apache Kafka and Confluent Cloud
App modernization on AWS with Apache Kafka and Confluent Cloud
 
Real-time processing of large amounts of data
Real-time processing of large amounts of dataReal-time processing of large amounts of data
Real-time processing of large amounts of data
 
Beyond the Brokers: A Tour of the Kafka Ecosystem
Beyond the Brokers: A Tour of the Kafka EcosystemBeyond the Brokers: A Tour of the Kafka Ecosystem
Beyond the Brokers: A Tour of the Kafka Ecosystem
 
Beyond the brokers - A tour of the Kafka ecosystem
Beyond the brokers - A tour of the Kafka ecosystemBeyond the brokers - A tour of the Kafka ecosystem
Beyond the brokers - A tour of the Kafka ecosystem
 
Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google ...
Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google ...Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google ...
Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google ...
 
Beyond the brokers - Un tour de l'écosystème Kafka
Beyond the brokers - Un tour de l'écosystème KafkaBeyond the brokers - Un tour de l'écosystème Kafka
Beyond the brokers - Un tour de l'écosystème Kafka
 

More from confluent

Speed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in MinutesSpeed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in Minutes
confluent
 
Evolving Data Governance for the Real-time Streaming and AI Era
Evolving Data Governance for the Real-time Streaming and AI EraEvolving Data Governance for the Real-time Streaming and AI Era
Evolving Data Governance for the Real-time Streaming and AI Era
confluent
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
confluent
 
Workshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con FlinkWorkshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con Flink
confluent
 
AWS Immersion Day Mapfre - Confluent
AWS Immersion Day Mapfre   -   ConfluentAWS Immersion Day Mapfre   -   Confluent
AWS Immersion Day Mapfre - Confluent
confluent
 
Eventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalkEventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalk
confluent
 
Q&A with Confluent Experts: Navigating Networking in Confluent Cloud
Q&A with Confluent Experts: Navigating Networking in Confluent CloudQ&A with Confluent Experts: Navigating Networking in Confluent Cloud
Q&A with Confluent Experts: Navigating Networking in Confluent Cloud
confluent
 
Citi TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep DiveCiti TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep Dive
confluent
 
Build real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with ConfluentBuild real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with Confluent
confluent
 
Q&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service MeshQ&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service Mesh
confluent
 
Citi Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka MicroservicesCiti Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka Microservices
confluent
 
Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3
confluent
 
Citi Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time dataCiti Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time data
confluent
 
Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2
confluent
 
Data In Motion Paris 2023
Data In Motion Paris 2023Data In Motion Paris 2023
Data In Motion Paris 2023
confluent
 
The Future of Application Development - API Days - Melbourne 2023
The Future of Application Development - API Days - Melbourne 2023The Future of Application Development - API Days - Melbourne 2023
The Future of Application Development - API Days - Melbourne 2023
confluent
 
The Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data StreamsThe Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data Streams
confluent
 
The Journey to Data Mesh with Confluent
The Journey to Data Mesh with ConfluentThe Journey to Data Mesh with Confluent
The Journey to Data Mesh with Confluent
confluent
 
Citi Tech Talk: Monitoring and Performance
Citi Tech Talk: Monitoring and PerformanceCiti Tech Talk: Monitoring and Performance
Citi Tech Talk: Monitoring and Performance
confluent
 
Citi Tech Talk Disaster Recovery Solutions Deep Dive
Citi Tech Talk  Disaster Recovery Solutions Deep DiveCiti Tech Talk  Disaster Recovery Solutions Deep Dive
Citi Tech Talk Disaster Recovery Solutions Deep Dive
confluent
 

More from confluent (20)

Speed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in MinutesSpeed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in Minutes
 
Evolving Data Governance for the Real-time Streaming and AI Era
Evolving Data Governance for the Real-time Streaming and AI EraEvolving Data Governance for the Real-time Streaming and AI Era
Evolving Data Governance for the Real-time Streaming and AI Era
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
 
Workshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con FlinkWorkshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con Flink
 
AWS Immersion Day Mapfre - Confluent
AWS Immersion Day Mapfre   -   ConfluentAWS Immersion Day Mapfre   -   Confluent
AWS Immersion Day Mapfre - Confluent
 
Eventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalkEventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalk
 
Q&A with Confluent Experts: Navigating Networking in Confluent Cloud
Q&A with Confluent Experts: Navigating Networking in Confluent CloudQ&A with Confluent Experts: Navigating Networking in Confluent Cloud
Q&A with Confluent Experts: Navigating Networking in Confluent Cloud
 
Citi TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep DiveCiti TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep Dive
 
Build real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with ConfluentBuild real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with Confluent
 
Q&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service MeshQ&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service Mesh
 
Citi Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka MicroservicesCiti Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka Microservices
 
Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3
 
Citi Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time dataCiti Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time data
 
Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2
 
Data In Motion Paris 2023
Data In Motion Paris 2023Data In Motion Paris 2023
Data In Motion Paris 2023
 
The Future of Application Development - API Days - Melbourne 2023
The Future of Application Development - API Days - Melbourne 2023The Future of Application Development - API Days - Melbourne 2023
The Future of Application Development - API Days - Melbourne 2023
 
The Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data StreamsThe Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data Streams
 
The Journey to Data Mesh with Confluent
The Journey to Data Mesh with ConfluentThe Journey to Data Mesh with Confluent
The Journey to Data Mesh with Confluent
 
Citi Tech Talk: Monitoring and Performance
Citi Tech Talk: Monitoring and PerformanceCiti Tech Talk: Monitoring and Performance
Citi Tech Talk: Monitoring and Performance
 
Citi Tech Talk Disaster Recovery Solutions Deep Dive
Citi Tech Talk  Disaster Recovery Solutions Deep DiveCiti Tech Talk  Disaster Recovery Solutions Deep Dive
Citi Tech Talk Disaster Recovery Solutions Deep Dive
 

How to Build Streaming Apps with Confluent II

  • 1. How to Build Streaming Apps with Confluent. Mauro Vocale, Senior Solutions Engineer; Stefano Linguerri, Solutions Engineer. 24 May 2023
  • 2. Copyright 2020, Confluent, Inc. All rights reserved. This document may not be reproduced in any manner without the express written permission of Confluent, Inc. Today’s Agenda: 10:00 AM - 10:30 AM: Streams Processing / ksqlDB Overview (Mauro Vocale, Senior Solutions Engineer); 10:30 AM - 11:15 AM: Interactive Streams Lab* (Stefano Linguerri, Solutions Engineer); 11:15 AM - 11:30 AM: Q&A and Next Steps. Workshop Tips & Help: 1. Check the ‘Chat’ window during the session for instructions [icon located at the bottom of the Zoom toolbar]. 2. For any technical issues, click the ‘Raise Hand’ button or post in the ‘Chat’ window [a Confluent team member will assist you]. *Please note the lab will be conducted on Confluent Cloud. However, the ksqlDB concepts will still be relevant to all Confluent Platform customers.
  • 3. Objectives for today: ➔ Show you how Apache Kafka and Confluent enable you to build streaming apps ➔ Refresh on Kafka and Kafka Streams ➔ Learn the basics of: ◆ Event-driven / microservices architecture ◆ Building streaming applications ➔ Overview of typical use cases for stream processing ➔ Hands-on: build a streaming application yourself
  • 4. Part 1: Streams Processing / ksqlDB Overview
  • 5. DB-centric data architecture: what could possibly go wrong?
  • 6. At the heart of every software application is data. Databases
  • 7. Databases are fundamentally incomplete. Databases are not designed for real-time applications. The foundational assumption of every database: data at rest. Diagram labels: Producer (upserts/deletes), Databases, Consumer (batch processing, by time or event).
  • 8. Databases bring point-in-time queries to stored data. This leads to a Giant Mess in Data Architecture. Diagram labels: Line of Business 01, Line of Business 02, Public Cloud.
  • 9. Data in motion: Ubiquitous real-time data and continuous real-time processing
  • 10. A New Paradigm is Required for Data in Motion: continuously processing evolving streams of data in real time. Diagram labels: Real-time Events (a sale, a shipment, a trade, a customer experience), Real-time Event Streams, Rich front-end customer experiences, Real-time backend operations.
  • 12. What is Apache Kafka? Kafka is a distributed Event Streaming Platform (append-only / immutable commit log). Diagram: a producer writes (append-only) to a Kafka topic partition holding events 1 to 8; consumers read (seek and scan). Anatomy of an event: Headers, Timestamp, Key, Value. ● Publish/subscribe to a stream of events ● Data stays persisted after it is read ● Supports transactions ● Highly scalable, high throughput.
  • 13. Producers and Consumers. Diagram: producers (App #1 to App #4) write to a Kafka Cluster with Topic A, Topic B, Topic C and Topic D; consumers (App #1 to App #3) read from it.
  • 14. Public Service Announcement: Message vs Event. Message: raw data produced by a service to be consumed or stored elsewhere. The publisher of the message has an expectation about how the consumer handles the message. Event: lightweight notification of a condition or a state change. The publisher of the event has no expectation about how the event is handled. The consumer of the event decides what to do with it. Events can be discrete units or part of a series.
  • 15. What are Kafka Connect and Kafka Streams? Diagram labels: Producer, Connector, Kafka Cluster (Customer data, Order, Shipment), Kafka Streams App, enriched/transformed data (statefully), Consumer, Transactions!
  • 16. Kafka Connect and Kafka Streams APIs. Kafka Connect API: ● Reliable and scalable integration of Kafka with your other data systems ● Confluent offers 120+ pre-built connectors for popular sources and sinks. Kafka Streams API: ● Write standard Java applications & microservices to process your data in real time ● Built on top of Producer / Consumer APIs. Diagram labels: Orders, Customers, STREAM PROCESSING.
  • 17. Stream processing by analogy: $ cat < in.txt | grep "ksql" | tr a-z A-Z > out.txt. Reading in.txt corresponds to the Producer / Connect API (Source), the grep and tr stages to Stream Processing, and writing out.txt to the Consumer / Connect API (Sink).
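The shell analogy on slide 17 can also be sketched as a chain of Python generators. This is only an illustration of the source / processing / sink split, not Kafka code: the function names (source, keep_matching, to_upper, sink, run_pipeline) are hypothetical, and an in-memory list stands in for in.txt and out.txt.

```python
from typing import Iterable, Iterator, List


def source(lines: Iterable[str]) -> Iterator[str]:
    """Stands in for `cat < in.txt` (producer / source connector)."""
    for line in lines:
        yield line


def keep_matching(events: Iterable[str], needle: str) -> Iterator[str]:
    """Stands in for `grep "ksql"`: a stateless filter."""
    return (event for event in events if needle in event)


def to_upper(events: Iterable[str]) -> Iterator[str]:
    """Stands in for `tr a-z A-Z`: a stateless map.
    Note: str.upper() also upper-cases non-ASCII letters, unlike tr a-z A-Z."""
    return (event.upper() for event in events)


def sink(events: Iterable[str]) -> List[str]:
    """Stands in for `> out.txt` (consumer / sink connector)."""
    return list(events)


def run_pipeline(lines: Iterable[str], needle: str = "ksql") -> List[str]:
    # Each stage pulls one event at a time from the previous stage,
    # so nothing is buffered until the sink collects the results.
    return sink(to_upper(keep_matching(source(lines), needle)))
```

For example, run_pipeline(["ksqldb rocks", "nothing here", "try ksql"]) keeps the two lines containing "ksql" and upper-cases them.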
  • 19. Public Service Announcement: Who developed Apache Kafka? Apache Kafka is an open source project developed and maintained by the Apache Software Foundation. Confluent was founded by the original creators of Apache Kafka.
  • 20. Confluent Cloud vs Open Source Apache Kafka. What only Confluent can offer: ● Apache Kafka re-architected for the cloud ● Cloud-native, Complete and Everywhere ● Rich product capabilities and unparalleled expertise ● Multiple networking/interconnectivity options ● Elastic scale (up and down) ● Tiered storage ● Enterprise-grade security/regulatory features ● Fully integrated Stream Governance ● High availability ● Expertise and vision to support mission critical use cases ● Full DevOps automation (REST, CLI, Terraform and AsyncAPI) ● Multi-language libraries: Java, Python, C++, .NET, Go, JavaScript, etc. Which leads to: ● Lowest TCO (40% savings on average) ● Fastest Time-to-Market (70% on average). Diagram labels (value added by Confluent around Open Source Apache Kafka): Data Governance, Hybrid and multi cloud, Enterprise grade security, Schema linking, Cluster linking, Support/SLA, Kafka expertise, Access Control & Audit Logs, Terraform Provider, Scaling & auto data balancing, Latest Version & patches, 120+ Connectors, Fully managed.
  • 22. ksqlDB: The easiest way to process event streams in real time. ● Runs everywhere ● Clustering done for you ● Exactly-once processing ● Event-time processing ● Windowing & aggregations ● Flexible sizing
  • 23. ksqlDB in a nutshell. Streams and tables are the two primary abstractions; they are referred to as collections. ksqlDB is built on top of Kafka Streams. Open source event streaming database under the Confluent Community License. There are two ways of creating collections in ksqlDB: ● directly from Kafka topics (source collections) ● derived from other streams and tables (derived collections). The direct inputs/outputs will always be Kafka topics. ksqlDB does not query Kafka topics, only collections. To create a source collection: CREATE STREAM users (USER_ID INT KEY, USERNAME VARCHAR) WITH (KAFKA_TOPIC='users', VALUE_FORMAT='AVRO'); Merge/Joins: ● Table-Table ● Table-Stream ● Stream-Table ● Stream-Stream
  • 24. ksqlDB’s collections. Streams: ● Streams are immutable, append-only sequences of events ● Useful for representing a historical series of events ● “Data in motion”. Tables: ● Tables are mutable collections of events ● Represent the latest version of each value per key ● Serve queries to applications.
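The stream/table distinction on slide 24 can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions: a list of (key, value) tuples stands in for a stream, a dict stands in for a table, and the helper name materialize is hypothetical rather than a ksqlDB or Kafka Streams API.

```python
from typing import Dict, Iterable, Tuple


def materialize(stream: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Fold an append-only stream of (key, value) events into a table
    that holds only the latest value seen for each key."""
    table: Dict[str, str] = {}
    for key, value in stream:
        # A later event for the same key overwrites the earlier one;
        # the stream itself is never modified.
        table[key] = value
    return table


# The stream keeps the full history of events ...
events = [("alice", "Rome"), ("bob", "Milan"), ("alice", "Turin")]
# ... while the table keeps only the current state per key.
current = materialize(events)
```

Here the stream still contains all three events, while the table contains two rows: alice's latest city and bob's.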
  • 25. Tradeoffs between stream processing tools. Diagram: Kafka Producer & Consumer, Kafka Streams API and ksqlDB placed along an axis running from Flexibility to Simplicity.
  • 26. Tradeoffs between stream processing tools: a per-key count written with each tool.
    Kafka Producer & Consumer:
      // Assumes consumer is a KafkaConsumer<String, Integer>. stateStore, StateStoreException,
      // RetryUtils and MAX_RETRIES are application-specific helpers from the slide, not Kafka APIs.
      ConsumerRecords<String, Integer> records = consumer.poll(Duration.ofMillis(100));
      Map<String, Integer> counts = new HashMap<>();
      for (ConsumerRecord<String, Integer> record : records) {
        // Sum the values of this batch per key.
        counts.merge(record.key(), record.value(), Integer::sum);
      }
      for (Map.Entry<String, Integer> entry : counts.entrySet()) {
        int attempts = 0;
        while (attempts++ < MAX_RETRIES) {
          try {
            // Add the batch total to the previously stored count.
            int stateCount = stateStore.getValue(entry.getKey());
            stateStore.setValue(entry.getKey(), entry.getValue() + stateCount);
            break;
          } catch (StateStoreException e) {
            RetryUtils.backoff(attempts);
          }
        }
      }
    Kafka Streams API:
      builder
        .stream("users", Consumed.with(Serdes.String(), Serdes.String()))
        .groupBy((key, value) -> value)
        .count()
        .toStream()
        .to("counts", Produced.with(Serdes.String(), Serdes.Long()));
    ksqlDB (an aggregation with GROUP BY produces a table, so this is a CREATE TABLE):
      CREATE TABLE counts AS
        SELECT username, COUNT(*) AS counter
        FROM users
        GROUP BY username
        EMIT CHANGES;
  • 27. Runtime Environments: Kafka Cluster, ksqlDB Cluster, Kafka Connect Cluster. Diagram labels: Compute*, Storage*. (*) Each cluster can scale independently.
  • 28. Sample use case: Streaming ETL. Manipulate events in-flight and connect data sources and sinks in real time rather than through batch solutions. CREATE STREAM clicks_with_city AS SELECT c.*, u.city FROM clickstream c LEFT JOIN users u ON c.user_id = u.id;
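To make the LEFT JOIN on slide 28 concrete, here is a rough Python sketch of a stream-table enrichment. It is a simplification, assuming the users table is a plain dict that does not change while clicks arrive (in ksqlDB the table is updated continuously); the function name enrich_clicks and the sample field names other than user_id and city are hypothetical.

```python
from typing import Dict, Iterable, Iterator, Optional


def enrich_clicks(
    clicks: Iterable[dict], users: Dict[int, dict]
) -> Iterator[dict]:
    """Left-join each click event with the users table on user_id.
    Clicks without a matching user are kept, with city set to None."""
    for click in clicks:
        user: Optional[dict] = users.get(click["user_id"])
        yield {**click, "city": user["city"] if user else None}


users_table = {1: {"city": "Rome"}, 2: {"city": "Milan"}}
clickstream = [
    {"user_id": 1, "url": "/home"},
    {"user_id": 3, "url": "/cart"},  # no matching user: city stays None
]
clicks_with_city = list(enrich_clicks(clickstream, users_table))
```

Because it is a left join, the click from the unknown user 3 is still emitted, just without a city.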
  • 29. Sample use case: Anomaly Detection. Identify anomalies in streaming data, such as potentially fraudulent transactions, in only a few milliseconds. CREATE TABLE possible_fraud AS SELECT card_number, count(*) FROM authorization_attempts WINDOW TUMBLING (SIZE 5 SECONDS) GROUP BY card_number HAVING count(*) > 3;
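The tumbling-window logic behind the possible_fraud query on slide 29 can be sketched in plain Python. This is an illustrative batch computation, not how ksqlDB executes the query: it assumes each attempt is a (card_number, timestamp in milliseconds) tuple, and the names possible_fraud, size_ms and threshold are hypothetical parameters chosen to mirror SIZE 5 SECONDS and HAVING count(*) > 3.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def possible_fraud(
    attempts: Iterable[Tuple[str, int]],
    size_ms: int = 5000,
    threshold: int = 3,
) -> Dict[Tuple[str, int], int]:
    """Count attempts per card within fixed, non-overlapping windows and
    keep only the (card, window_start) pairs whose count exceeds threshold."""
    counts: Counter = Counter()
    for card_number, ts_ms in attempts:
        # A tumbling window is identified by its start: the timestamp
        # rounded down to a multiple of the window size.
        window_start = ts_ms - (ts_ms % size_ms)
        counts[(card_number, window_start)] += 1
    return {key: n for key, n in counts.items() if n > threshold}


attempts = [
    ("4111", 0), ("4111", 1000), ("4111", 2000), ("4111", 4999),
    ("4111", 5000),  # falls into the next window
    ("4222", 100),
]
flagged = possible_fraud(attempts)
```

In this sample, card 4111 has four attempts in the window starting at 0 ms, so only that (card, window) pair is flagged; the fifth attempt at 5000 ms belongs to the next window.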
  • 30. Sample use case: Real-time Monitoring. Monitor logs, sensor and IoT data, and more, and take actions on events instantly. CREATE TABLE error_counts AS SELECT error_code, count(*) FROM monitoring_stream WINDOW TUMBLING (SIZE 1 MINUTE) WHERE type = 'ERROR' GROUP BY error_code;
  • 31. What else is ksqlDB a good fit for? ● Materialized caches ● Event-driven microservices ● Redaction ● Event Deduplication ● Event Re-ordering ● Sessionization ● Validation
  • 32. What is ksqlDB NOT a great fit for? ● Kafka may only have events over a limited span of time depending on retention settings ● No secondary indexes for random lookups ● Aggregate events that never happened.
  • 34. Workshop Tips & Help: 1. Check the ‘Chat’ window during the session for instructions [icon located at the bottom of the Zoom toolbar]. 2. For any technical issues, click the ‘Raise Hand’ button or post in the ‘Chat’ window [a Confluent team member will assist you].
  • 35.
  • 36. Find lab instructions with a step-by-step guide to reproduce this workshop: https://drive.google.com/file/d/14HlF63xPeb0LP6NUDuzTeDhAT_hI1Gw3
  • 37. Register for Confluent Cloud
  • 38. What will we do in this workshop: Use Case. Identify unhappy customers: ● From streaming data generated in the mobile app or on the website ● Data enrichment with customer master data (in a MySQL database) ➢ to be able to do something about their unhappiness and not lose them!
  • 39. Demo diagram
  • 40. Find more tutorials, certifications and technical information here: developer.confluent.io