Crossing the Streams: Rethinking Stream Processing with Kafka Streams and KSQL
DEVNEXUS 2019 | @tlberglund | @gamussa
@gamussa | @tlberglund | #DEVnexus
https://cnfl.io/streams-movie-demo
Raffle, yeah 🚀
Follow @gamussa @tlberglund
📸 🖼 👬
Tag @gamussa @tlberglund with #devnexus
Preface
Streaming is the toolset for dealing with events as they move!
Serving Layer (Microservices, Elastic, etc.)
Java Apps with Kafka Streams or KSQL
Continuous Computation
High Throughput Streaming platform
API based clustering
Apache Kafka
Event Streaming Platform 101
Event Streaming Platform Architecture
[Diagram components: Kafka Brokers, Zookeeper Nodes, Kafka Connect, Schema Registry, REST Proxy, Application (via Load Balancer *), Application (native client library), Application (Kafka Streams), KSQL (Kafka Streams)]
The log is a simple idea
Messages are added at the end of the log
Old New
Consumers have a position all of their own
Old New
Robin is here (Scan)
Viktor is here (Scan)
Ricardo is here (Scan)
Only Sequential Access
Old New
Read to offset & scan
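The log slides above can be sketched in plain Java. This is a toy in-memory model, not Kafka's storage format or client API: the class `AppendOnlyLog` and its `append`, `seek`, and `poll` methods are names assumed for illustration. It shows the three ideas from the slides: messages are only added at the end, every reader keeps a position of its own, and reading means going to an offset and scanning forward.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy append-only log (hypothetical class, for illustration only). */
final class AppendOnlyLog {
    private final List<String> messages = new ArrayList<>();
    // Each reader keeps its own position: the offset of the next message it will read.
    private final Map<String, Integer> positions = new HashMap<>();

    /** Appends at the end of the log and returns the new message's offset. */
    int append(String message) {
        messages.add(message);
        return messages.size() - 1;
    }

    /** Moves a reader to an offset; reading resumes sequentially from there. */
    void seek(String reader, int offset) {
        if (offset < 0 || offset > messages.size()) {
            throw new IllegalArgumentException("offset out of range: " + offset);
        }
        positions.put(reader, offset);
    }

    /** Scans forward from the reader's position, returning at most max messages. */
    List<String> poll(String reader, int max) {
        int from = positions.getOrDefault(reader, 0);
        int to = Math.min(messages.size(), from + max);
        List<String> batch = new ArrayList<>(messages.subList(from, to));
        positions.put(reader, to); // only this reader's position advances
        return batch;
    }

    public static void main(String[] args) {
        AppendOnlyLog log = new AppendOnlyLog();
        for (String m : List.of("m0", "m1", "m2", "m3")) {
            log.append(m);
        }
        log.seek("Viktor", 2); // Viktor starts further along the log than Robin
        System.out.println("Robin:  " + log.poll("Robin", 3));
        System.out.println("Viktor: " + log.poll("Viktor", 3));
        System.out.println("Robin:  " + log.poll("Robin", 3));
    }
}
```

Robin and Viktor read the same log at different positions, and neither read removes anything or moves the other reader.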
Shard data to get scalability
Producer (1) Producer (2) Producer (3)
Cluster of machines
Partitions live on different machines
Messages are sent to different partitions
CONSUMER GROUP COORDINATOR
CONSUMERS
CONSUMER GROUP
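How a message ends up in "different partitions" can be sketched as a hash of the message key modulo the partition count. This is a minimal sketch under stated assumptions: `KeyPartitioner` is a hypothetical class, and `Arrays.hashCode` stands in for the hash function. To my knowledge the Kafka Java client's default partitioner uses a murmur2 hash of the serialized key, so the partition numbers printed here will not match a real cluster.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Maps a message key to a partition (hypothetical class, for illustration only). */
final class KeyPartitioner {
    private final int numPartitions;

    KeyPartitioner(int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        this.numPartitions = numPartitions;
    }

    /** The same key always maps to the same partition, which preserves per-key ordering. */
    int partitionFor(String key) {
        // Assumption: Arrays.hashCode is a stand-in for the real hash function.
        int hash = Arrays.hashCode(key.getBytes(StandardCharsets.UTF_8));
        return Math.floorMod(hash, numPartitions); // floorMod keeps the result non-negative
    }

    public static void main(String[] args) {
        KeyPartitioner partitioner = new KeyPartitioner(3);
        for (String key : new String[] {"Alice", "Bob", "Alice"}) {
            System.out.println(key + " -> partition " + partitioner.partitionFor(key));
        }
    }
}
```

Because the mapping depends only on the key and the partition count, many producers can write to the same topic without coordinating with each other.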
Linearly Scalable Architecture
Single topic:
- Many producer machines
- Many consumer machines
- Many broker machines
No Bottleneck!!
Producers
Consumers
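On the consumer side, scaling works because the partitions of a topic are divided among the members of a consumer group, with each partition read by one member at a time. The sketch below uses a simple round-robin split; `GroupAssignment.assign` is a hypothetical helper, and Kafka's real assignment strategies are configurable and more involved.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Round-robin split of partitions across group members (hypothetical helper). */
final class GroupAssignment {
    private GroupAssignment() {}

    /** Returns, for every consumer, the list of partition numbers it owns. */
    static Map<String, List<Integer>> assign(List<String> consumers, int numPartitions) {
        if (consumers.isEmpty()) {
            throw new IllegalArgumentException("need at least one consumer");
        }
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>());
        }
        for (int partition = 0; partition < numPartitions; partition++) {
            // Each partition gets exactly one owner within the group.
            String owner = consumers.get(partition % consumers.size());
            assignment.get(owner).add(partition);
        }
        return assignment;
    }

    public static void main(String[] args) {
        System.out.println(assign(List.of("c1", "c2"), 3));
        System.out.println(assign(List.of("c1", "c2", "c3"), 2));
    }
}
```

The second call shows the limit of this model: with more consumers than partitions, the extra consumer receives nothing, so the partition count caps the useful parallelism of a group.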
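// imports needed by the snippets on these slides
import java.time.Duration;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

// in-memory store, not persistent
Map<String, Integer> groupByCounts = new HashMap<>();
try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProperties());
     KafkaProducer<String, Integer> producer = new KafkaProducer<>(producerProperties())) {
  consumer.subscribe(Arrays.asList("A", "B"));
  while (true) { // consumer poll loop
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
    for (ConsumerRecord<String, String> record : records) {
      String key = record.key();
      Integer count = groupByCounts.get(key);
      if (count == null) {
        count = 0;
      }
      count += 1;
      groupByCounts.put(key, count);
    }
    // the loop body continues with the periodic send and commit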
while (true) { // consumer poll loop
  ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
  for (ConsumerRecord<String, String> record : records) {
    String key = record.key();
    Integer count = groupByCounts.get(key);
    if (count == null) {
      count = 0;
    }
    count += 1; // actually doing something useful
    groupByCounts.put(key, count);
  }
}
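    // continued: still inside the poll loop, after the for loop over records
    // (counter and sendInterval are declared outside the lines shown on the slides)
    if (counter++ % sendInterval == 0) {
      for (Map.Entry<String, Integer> groupedEntry : groupByCounts.entrySet()) {
        ProducerRecord<String, Integer> producerRecord =
            new ProducerRecord<>("group-by-counts", groupedEntry.getKey(), groupedEntry.getValue());
        producer.send(producerRecord);
      }
      consumer.commitSync();
    }
  } // end of poll loop
} // end of try-with-resources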
https://twitter.com/monitoring_king/status/1048264580743479296
LET’S TALK ABOUT THIS FRAMEWORK OF YOURS.
I THINK IT’S GOOD, EXCEPT IT SUCKS
SO LET ME SHOW KAFKA STREAMS
THAT WAY IT MIGHT BE REALLY GOOD
Talk is cheap!
Show me code!
What every framework wants to be when it grows up
Scalable, Elastic, Fault-tolerant, Stateful, Distributed
The KAFKA STREAMS API is a JAVA API to BUILD REAL-TIME APPLICATIONS
App (Streams API)
Not running inside brokers!
Brokers? Nope!
App (Streams API)
App (Streams API)
App (Streams API)
Same app, many instances
Before
Dashboard
Processing Cluster
Your Job
Shared Database
After
Dashboard
App (Streams API)
This means you can DEPLOY your app ANYWHERE using WHATEVER TECHNOLOGY YOU WANT
So many places to run your app!
...and many more...
Things Kafka Streams Does
- Open Source
- Elastic, Scalable, Fault-tolerant
- Supports Streams and Tables
- Runs Everywhere
- Exactly-Once Processing
- Event-Time Processing
- Kafka Security Integration
- Powerful Processing incl. Filters, Transforms, Joins, Aggregations, Windowing
- Enterprise Support
Talk is cheap!
Show me code!
Do you think that’s a table you are querying?
The Stream-Table Duality
time →
Stream (payments): Alice + €50 | Bob + €18 | Alice + €25 | Alice - €60
Table (balance): Alice €50 | Alice €50, Bob €18 | Alice €75, Bob €18 | Alice €15, Bob €18
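The duality on this slide can be reproduced with a few lines of plain Java: the table is what you get by applying the stream's events in order, and each table state is a snapshot of that running aggregation. This is a sketch that does not use the Kafka Streams API; `StreamTableDuality`, `Payment`, and `balancesOverTime` are names assumed for illustration, and the amounts are the ones shown on the slide.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Folds a stream of payments into a table of balances (hypothetical names). */
final class StreamTableDuality {
    /** One event in the payments stream: a signed amount in EUR for an account. */
    static final class Payment {
        final String account;
        final int amountEur;

        Payment(String account, int amountEur) {
            this.account = account;
            this.amountEur = amountEur;
        }
    }

    /** Applies each payment in order and snapshots the balance table after every event. */
    static List<Map<String, Integer>> balancesOverTime(List<Payment> stream) {
        Map<String, Integer> table = new LinkedHashMap<>();
        List<Map<String, Integer>> snapshots = new ArrayList<>();
        for (Payment payment : stream) {
            table.merge(payment.account, payment.amountEur, Integer::sum);
            snapshots.add(new LinkedHashMap<>(table)); // copy, so later events don't alter it
        }
        return snapshots;
    }

    public static void main(String[] args) {
        List<Payment> payments = List.of(
            new Payment("Alice", 50),
            new Payment("Bob", 18),
            new Payment("Alice", 25),
            new Payment("Alice", -60));
        // prints {Alice=50}, {Alice=50, Bob=18}, {Alice=75, Bob=18}, {Alice=15, Bob=18}
        balancesOverTime(payments).forEach(System.out::println);
    }
}
```

Reading it the other way round, the sequence of changes to the table is itself a stream, which is why the same data can be treated as either.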
Talk is cheap!
Show me code!
What’s next?
https://twitter.com/IDispose/status/1048602857191170054
Lower the bar to enter the world of streaming
[Chart: User Population vs. Coding Sophistication]
- Core developers who use Java/Scala
- Core developers who don’t use Java/Scala
- Data engineers, architects, DevOps/SRE
- BI analysts
streams
KSQL #FTW
1 UI
2 CLI (ksql>)
3 REST (POST /query)
4 Headless
Interaction with Kafka
Kafka (data)
KSQL (processing): does not run on Kafka brokers
JVM application with Kafka Streams (processing): does not run on Kafka brokers
Standing on the shoulders of Streaming Giants
KSQL
Powered by Kafka Streams
Powered by Producer, Consumer APIs
Ease of use / Flexibility
One last thing…
https://kafka-summit.org
Gamov30
Thanks!
@gamussa viktor@confluent.io
@tlberglund tim@confluent.io
We are hiring!
https://www.confluent.io/careers/

More Related Content

What's hot

Solutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache KafkaSolutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Guido Schmutz
 
ApacheCon2019 Talk: Improving the Observability of Cassandra, Kafka and Kuber...
ApacheCon2019 Talk: Improving the Observability of Cassandra, Kafka and Kuber...ApacheCon2019 Talk: Improving the Observability of Cassandra, Kafka and Kuber...
ApacheCon2019 Talk: Improving the Observability of Cassandra, Kafka and Kuber...
Paul Brebner
 
Rethinking Stream Processing with Apache Kafka: Applications vs. Clusters, St...
Rethinking Stream Processing with Apache Kafka: Applications vs. Clusters, St...Rethinking Stream Processing with Apache Kafka: Applications vs. Clusters, St...
Rethinking Stream Processing with Apache Kafka: Applications vs. Clusters, St...
Michael Noll
 

What's hot (20)

KSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for KafkaKSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for Kafka
 
Building Stream Processing Applications with Apache Kafka Using KSQL (Robin M...
Building Stream Processing Applications with Apache Kafka Using KSQL (Robin M...Building Stream Processing Applications with Apache Kafka Using KSQL (Robin M...
Building Stream Processing Applications with Apache Kafka Using KSQL (Robin M...
 
Temporal-Joins in Kafka Streams and ksqlDB | Matthias Sax, Confluent
Temporal-Joins in Kafka Streams and ksqlDB | Matthias Sax, ConfluentTemporal-Joins in Kafka Streams and ksqlDB | Matthias Sax, Confluent
Temporal-Joins in Kafka Streams and ksqlDB | Matthias Sax, Confluent
 
Streaming ETL with Apache Kafka and KSQL
Streaming ETL with Apache Kafka and KSQLStreaming ETL with Apache Kafka and KSQL
Streaming ETL with Apache Kafka and KSQL
 
KSQL and Kafka Streams – When to Use Which, and When to Use Both
KSQL and Kafka Streams – When to Use Which, and When to Use BothKSQL and Kafka Streams – When to Use Which, and When to Use Both
KSQL and Kafka Streams – When to Use Which, and When to Use Both
 
UDF/UDAF: the extensibility framework for KSQL (Hojjat Jafapour, Confluent) K...
UDF/UDAF: the extensibility framework for KSQL (Hojjat Jafapour, Confluent) K...UDF/UDAF: the extensibility framework for KSQL (Hojjat Jafapour, Confluent) K...
UDF/UDAF: the extensibility framework for KSQL (Hojjat Jafapour, Confluent) K...
 
Data Driven Enterprise with Apache Kafka
Data Driven Enterprise with Apache KafkaData Driven Enterprise with Apache Kafka
Data Driven Enterprise with Apache Kafka
 
Solutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache KafkaSolutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
 
Introduction to KSQL: Streaming SQL for Apache Kafka®
Introduction to KSQL: Streaming SQL for Apache Kafka®Introduction to KSQL: Streaming SQL for Apache Kafka®
Introduction to KSQL: Streaming SQL for Apache Kafka®
 
Kafka Summit NYC 2017 Hanging Out with Your Past Self in VR
Kafka Summit NYC 2017 Hanging Out with Your Past Self in VRKafka Summit NYC 2017 Hanging Out with Your Past Self in VR
Kafka Summit NYC 2017 Hanging Out with Your Past Self in VR
 
ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!
 
Kafka streams - From pub/sub to a complete stream processing platform
Kafka streams - From pub/sub to a complete stream processing platformKafka streams - From pub/sub to a complete stream processing platform
Kafka streams - From pub/sub to a complete stream processing platform
 
Steps to Building a Streaming ETL Pipeline with Apache Kafka® and KSQL
Steps to Building a Streaming ETL Pipeline with Apache Kafka® and KSQLSteps to Building a Streaming ETL Pipeline with Apache Kafka® and KSQL
Steps to Building a Streaming ETL Pipeline with Apache Kafka® and KSQL
 
ApacheCon2019 Talk: Improving the Observability of Cassandra, Kafka and Kuber...
ApacheCon2019 Talk: Improving the Observability of Cassandra, Kafka and Kuber...ApacheCon2019 Talk: Improving the Observability of Cassandra, Kafka and Kuber...
ApacheCon2019 Talk: Improving the Observability of Cassandra, Kafka and Kuber...
 
Rethinking Stream Processing with Apache Kafka: Applications vs. Clusters, St...
Rethinking Stream Processing with Apache Kafka: Applications vs. Clusters, St...Rethinking Stream Processing with Apache Kafka: Applications vs. Clusters, St...
Rethinking Stream Processing with Apache Kafka: Applications vs. Clusters, St...
 
Deploying and Operating KSQL
Deploying and Operating KSQLDeploying and Operating KSQL
Deploying and Operating KSQL
 
Productizing Structured Streaming Jobs
Productizing Structured Streaming JobsProductizing Structured Streaming Jobs
Productizing Structured Streaming Jobs
 
Kafka Summit NYC 2017 - Singe Message Transforms are not the Transformations ...
Kafka Summit NYC 2017 - Singe Message Transforms are not the Transformations ...Kafka Summit NYC 2017 - Singe Message Transforms are not the Transformations ...
Kafka Summit NYC 2017 - Singe Message Transforms are not the Transformations ...
 
Getting Started with Confluent Schema Registry
Getting Started with Confluent Schema RegistryGetting Started with Confluent Schema Registry
Getting Started with Confluent Schema Registry
 
From Zero to Hero with Kafka Connect (Robin Moffat, Confluent) Kafka Summit L...
From Zero to Hero with Kafka Connect (Robin Moffat, Confluent) Kafka Summit L...From Zero to Hero with Kafka Connect (Robin Moffat, Confluent) Kafka Summit L...
From Zero to Hero with Kafka Connect (Robin Moffat, Confluent) Kafka Summit L...
 

Similar to Crossing the Streams: Rethinking Stream Processing with Kafka Streams and KSQL

Timely Year Two: Lessons Learned Building a Scalable Metrics Analytic System
Timely Year Two: Lessons Learned Building a Scalable Metrics Analytic SystemTimely Year Two: Lessons Learned Building a Scalable Metrics Analytic System
Timely Year Two: Lessons Learned Building a Scalable Metrics Analytic System
Accumulo Summit
 

Similar to Crossing the Streams: Rethinking Stream Processing with Kafka Streams and KSQL (20)

Crossing the Streams: Event Streaming with Kafka Streams
Crossing the Streams: Event Streaming with Kafka StreamsCrossing the Streams: Event Streaming with Kafka Streams
Crossing the Streams: Event Streaming with Kafka Streams
 
Crossing the streams viktor gamov
Crossing the streams viktor gamovCrossing the streams viktor gamov
Crossing the streams viktor gamov
 
What is Apache Kafka®?
What is Apache Kafka®?What is Apache Kafka®?
What is Apache Kafka®?
 
Everything-as-code – Polyglotte Entwicklung in der Praxis
Everything-as-code – Polyglotte Entwicklung in der PraxisEverything-as-code – Polyglotte Entwicklung in der Praxis
Everything-as-code – Polyglotte Entwicklung in der Praxis
 
Semplificare l'observability per progetti Serverless
Semplificare l'observability per progetti ServerlessSemplificare l'observability per progetti Serverless
Semplificare l'observability per progetti Serverless
 
Tsar tech talk
Tsar tech talkTsar tech talk
Tsar tech talk
 
TSAR (TimeSeries AggregatoR) Tech Talk
TSAR (TimeSeries AggregatoR) Tech TalkTSAR (TimeSeries AggregatoR) Tech Talk
TSAR (TimeSeries AggregatoR) Tech Talk
 
Timely Year Two: Lessons Learned Building a Scalable Metrics Analytic System
Timely Year Two: Lessons Learned Building a Scalable Metrics Analytic SystemTimely Year Two: Lessons Learned Building a Scalable Metrics Analytic System
Timely Year Two: Lessons Learned Building a Scalable Metrics Analytic System
 
Reactive programming with RxJS - Taiwan
Reactive programming with RxJS - TaiwanReactive programming with RxJS - Taiwan
Reactive programming with RxJS - Taiwan
 
MongoDB.local Berlin: Building a GraphQL API with MongoDB, Prisma and Typescript
MongoDB.local Berlin: Building a GraphQL API with MongoDB, Prisma and TypescriptMongoDB.local Berlin: Building a GraphQL API with MongoDB, Prisma and Typescript
MongoDB.local Berlin: Building a GraphQL API with MongoDB, Prisma and Typescript
 
Timeseries - data visualization in Grafana
Timeseries - data visualization in GrafanaTimeseries - data visualization in Grafana
Timeseries - data visualization in Grafana
 
How I Learned to Stop Worrying and Love the Cloud - Wesley Beary, Engine Yard
How I Learned to Stop Worrying and Love the Cloud - Wesley Beary, Engine YardHow I Learned to Stop Worrying and Love the Cloud - Wesley Beary, Engine Yard
How I Learned to Stop Worrying and Love the Cloud - Wesley Beary, Engine Yard
 
Apache Kafka and ksqlDB in Action: Let's Build a Streaming Data Pipeline! (Ro...
Apache Kafka and ksqlDB in Action: Let's Build a Streaming Data Pipeline! (Ro...Apache Kafka and ksqlDB in Action: Let's Build a Streaming Data Pipeline! (Ro...
Apache Kafka and ksqlDB in Action: Let's Build a Streaming Data Pipeline! (Ro...
 
Shooting the Rapids
Shooting the RapidsShooting the Rapids
Shooting the Rapids
 
An Architect's guide to real time big data systems
An Architect's guide to real time big data systemsAn Architect's guide to real time big data systems
An Architect's guide to real time big data systems
 
Managing GraphQL servers with AWS Fargate & Prisma Cloud
Managing GraphQL servers  with AWS Fargate & Prisma CloudManaging GraphQL servers  with AWS Fargate & Prisma Cloud
Managing GraphQL servers with AWS Fargate & Prisma Cloud
 
Everything-as-code: DevOps und Continuous Delivery aus Sicht des Entwicklers....
Everything-as-code: DevOps und Continuous Delivery aus Sicht des Entwicklers....Everything-as-code: DevOps und Continuous Delivery aus Sicht des Entwicklers....
Everything-as-code: DevOps und Continuous Delivery aus Sicht des Entwicklers....
 
Everything-as-code: DevOps und Continuous Delivery aus Sicht des Entwicklers.
Everything-as-code: DevOps und Continuous Delivery aus Sicht des Entwicklers.Everything-as-code: DevOps und Continuous Delivery aus Sicht des Entwicklers.
Everything-as-code: DevOps und Continuous Delivery aus Sicht des Entwicklers.
 
Dataservices: Processing (Big) Data the Microservice Way
Dataservices: Processing (Big) Data the Microservice WayDataservices: Processing (Big) Data the Microservice Way
Dataservices: Processing (Big) Data the Microservice Way
 
Lambdas puzzler - Peter Lawrey
Lambdas puzzler - Peter LawreyLambdas puzzler - Peter Lawrey
Lambdas puzzler - Peter Lawrey
 

More from confluent

More from confluent (20)

Speed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in MinutesSpeed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in Minutes
 
Evolving Data Governance for the Real-time Streaming and AI Era
Evolving Data Governance for the Real-time Streaming and AI EraEvolving Data Governance for the Real-time Streaming and AI Era
Evolving Data Governance for the Real-time Streaming and AI Era
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
 
Santander Stream Processing with Apache Flink
Santander Stream Processing with Apache FlinkSantander Stream Processing with Apache Flink
Santander Stream Processing with Apache Flink
 
Unlocking the Power of IoT: A comprehensive approach to real-time insights
Unlocking the Power of IoT: A comprehensive approach to real-time insightsUnlocking the Power of IoT: A comprehensive approach to real-time insights
Unlocking the Power of IoT: A comprehensive approach to real-time insights
 
Workshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con FlinkWorkshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con Flink
 
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
 
AWS Immersion Day Mapfre - Confluent
AWS Immersion Day Mapfre   -   ConfluentAWS Immersion Day Mapfre   -   Confluent
AWS Immersion Day Mapfre - Confluent
 
Eventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalkEventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalk
 
Q&A with Confluent Experts: Navigating Networking in Confluent Cloud
Q&A with Confluent Experts: Navigating Networking in Confluent CloudQ&A with Confluent Experts: Navigating Networking in Confluent Cloud
Q&A with Confluent Experts: Navigating Networking in Confluent Cloud
 
Citi TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep DiveCiti TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep Dive
 
Build real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with ConfluentBuild real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with Confluent
 
Q&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service MeshQ&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service Mesh
 
Citi Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka MicroservicesCiti Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka Microservices
 
Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3
 
Citi Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging ModernizationCiti Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging Modernization
 
Citi Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time dataCiti Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time data
 
Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2
 
Data In Motion Paris 2023
Data In Motion Paris 2023Data In Motion Paris 2023
Data In Motion Paris 2023
 
Confluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with SynthesisConfluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with Synthesis
 

Recently uploaded

Recently uploaded (20)

How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdfHow Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
 
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdfLinux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
 
Top 10 Symfony Development Companies 2024
Top 10 Symfony Development Companies 2024Top 10 Symfony Development Companies 2024
Top 10 Symfony Development Companies 2024
 
The Metaverse: Are We There Yet?
The  Metaverse:    Are   We  There  Yet?The  Metaverse:    Are   We  There  Yet?
The Metaverse: Are We There Yet?
 
Optimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through ObservabilityOptimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through Observability
 
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
 
WebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceWebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM Performance
 
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
 
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya HalderCustom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
 
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
 
Where to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdfWhere to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdf
 
How we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdfHow we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdf
 
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdfSimplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
 
Extensible Python: Robustness through Addition - PyCon 2024
Extensible Python: Robustness through Addition - PyCon 2024Extensible Python: Robustness through Addition - PyCon 2024
Extensible Python: Robustness through Addition - PyCon 2024
 
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
 
AI revolution and Salesforce, Jiří Karpíšek
AI revolution and Salesforce, Jiří KarpíšekAI revolution and Salesforce, Jiří Karpíšek
AI revolution and Salesforce, Jiří Karpíšek
 
Syngulon - Selection technology May 2024.pdf
Syngulon - Selection technology May 2024.pdfSyngulon - Selection technology May 2024.pdf
Syngulon - Selection technology May 2024.pdf
 
What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024
 
IESVE for Early Stage Design and Planning
IESVE for Early Stage Design and PlanningIESVE for Early Stage Design and Planning
IESVE for Early Stage Design and Planning
 
Designing for Hardware Accessibility at Comcast
Designing for Hardware Accessibility at ComcastDesigning for Hardware Accessibility at Comcast
Designing for Hardware Accessibility at Comcast
 

Crossing the Streams: Rethinking Stream Processing with Kafka Streams and KSQL

  • 1. Crossing the Streams: Rethinking Stream Processing with Kafka Streams and KSQL DEVNEXUS 2019 | @tlberglund | @gamussa
  • 4. Raffle, yeah 🚀 Follow @gamussa @tlberglund 📸 🖼 👬 Tag @gamussa @tlberglund With #devnexus
  • 6. @gamussa | @tlberglund | #DEVnexus Streaming 
 is the toolset for dealing with events 
 as they move!
  • 7. @ @gamussa | @tlberglund | #DEVnexus Serving Layer (Microservices, Elastic, etc.) Java Apps with Kafka Streams or KSQL Continuous Computation High Throughput Streaming platform API based clustering
  • 9. @@gamussa | @tlberglund | #DEVnexus Event Streaming Platform Architecture Kafka Connect Zookeeper Nodes Schema RegistryREST Proxy Application Load Balancer * Application Native Client library Application Kafka Streams Kafka Brokers KSQL Kafka Streams
  • 10. @gamussa | @tlberglund | #DEVnexus The log is a simple idea Messages are added at the end of the log Old New
  • 11. @gamussa | @tlberglund | #DEVnexus The log is a simple idea Messages are added at the end of the log Old New
  • 12. @gamussa | @tlberglund | #DEVnexus Consumers have a position all of their own Old New Robin is here Scan Viktor is here Scan Ricardo is here Scan
  • 13. @gamussa | @tlberglund | #DEVnexus Consumers have a position all of their own Old New Robin is here Scan Viktor is here Scan Ricardo is here Scan
  • 14. @gamussa | @tlberglund | #DEVnexus Consumers have a position all of their own Old New Robin is here Scan Viktor is here Scan Ricardo is here Scan
  • 15. @gamussa | @tlberglund | #DEVnexus Only Sequential Access Old New Read to offset & scan
  • 16. @gamussa | @tlberglund | #DEVnexus Shard data to get scalability Producer (1) Producer (2) Producer (3) Cluster of machines Partitions live on different machines Messages are sent to different partitions
  • 18. @gamussa | @tlberglund | #DEVnexus Linearly Scalable Architecture Single topic: - Many producers machines - Many consumer machines - Many Broker machines No Bottleneck!! Producers Consumers
  • 19. @ @gamussa | @tlberglund | #DEVnexus // in-memory store, not persistent Map<String, Integer> groupByCounts = new HashMap <>(); try (KafkaConsumer<String, String> consumer = new KafkaConsumer <>(consumerProperties()); KafkaProducer<String, Integer> producer = new KafkaProducer <>(producerProperties())) { consumer.subscribe(Arrays.asList("A", "B")); while (true) { // consumer poll loop ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5)); for (ConsumerRecord<String, String> record : records) { String key = record.key(); Integer count = groupByCounts.get(key); if (count == null) { count = 0; } count += 1; groupByCounts.put(key, count); } }
  • 24. while (true) { // consumer poll loop
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
        for (ConsumerRecord<String, String> record : records) {
          String key = record.key();
          Integer count = groupByCounts.get(key);
          if (count == null) {
            count = 0;
          }
          count += 1; // actually doing something useful
          groupByCounts.put(key, count);
        }
      }
  • 25. if (counter++ % sendInterval == 0) {
            for (Map.Entry<String, Integer> groupedEntry : groupByCounts.entrySet()) {
              ProducerRecord<String, Integer> producerRecord =
                  new ProducerRecord<>("group-by-counts", groupedEntry.getKey(), groupedEntry.getValue());
              producer.send(producerRecord);
            }
            consumer.commitSync();
          }
        }
      }
  • 29. LET’S TALK ABOUT THIS FRAMEWORK OF YOURS. I THINK IT’S GOOD, EXCEPT IT SUCKS
  • 30. SO LET ME SHOW KAFKA STREAMS. THAT WAY IT MIGHT BE REALLY GOOD
  • 32. Every framework wants to be, when it grows up: Scalable, Elastic, Fault-tolerant, Stateful, Distributed
  • 35. The KAFKA STREAMS API is a JAVA API to BUILD REAL-TIME APPLICATIONS
  • 36. Not running inside brokers! [Diagram: App with Streams API]
  • 37. Brokers? Nope! Same app, many instances. [Diagram: three App instances, each with Streams API]
  • 38. Before: [Diagram: Dashboard, Processing Cluster, Your Job, Shared Database]
  • 39. After: [Diagram: Dashboard, App with Streams API]
  • 40. This means you can DEPLOY your app ANYWHERE using WHATEVER TECHNOLOGY YOU WANT
  • 41. So many places to run your app! ...and many more...
  • 42. Things Kafka Streams Does: Open Source; Elastic, Scalable, Fault-tolerant; Supports Streams and Tables; Runs Everywhere; Exactly-Once Processing; Event-Time Processing; Kafka Security Integration; Powerful Processing incl. Filters, Transforms, Joins, Aggregations, Windowing; Enterprise Support
  • 46. Do you think that’s a table you are querying?
  • 47. The Stream-Table Duality. Stream (payments), in time order: Alice + €50, Bob + €18, Alice + €25, Alice – €60. Table (balance) after each payment: {Alice €50}; {Alice €50, Bob €18}; {Alice €75, Bob €18}; {Alice €15, Bob €18}.
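The duality on this slide can be replayed in plain Java: folding the payments stream, event by event, yields the balance table, and each intermediate snapshot is one state of that table. This is a sketch under stated assumptions, not Kafka Streams code; `StreamTableDuality`, `Payment`, and `replay` are hypothetical names, and amounts are whole euros.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy illustration of the stream-table duality (hypothetical classes, no Kafka involved).
class StreamTableDuality {
    /** One event in the payments stream: a signed change to an account, in euros. */
    static final class Payment {
        final String account;
        final int amount;

        Payment(String account, int amount) {
            this.account = account;
            this.amount = amount;
        }
    }

    /** Applies the stream in order and returns the table's state after each event. */
    static List<Map<String, Integer>> replay(List<Payment> payments) {
        List<Map<String, Integer>> snapshots = new ArrayList<>();
        Map<String, Integer> balance = new LinkedHashMap<>();
        for (Payment p : payments) {
            // Upsert: add the payment to the account's current balance (0 if absent).
            balance.merge(p.account, p.amount, Integer::sum);
            snapshots.add(new LinkedHashMap<>(balance));
        }
        return snapshots;
    }

    public static void main(String[] args) {
        List<Payment> payments = Arrays.asList(
                new Payment("Alice", 50),
                new Payment("Bob", 18),
                new Payment("Alice", 25),
                new Payment("Alice", -60));
        for (Map<String, Integer> snapshot : replay(payments)) {
            System.out.println(snapshot);
        }
    }
}
```

Going the other way, the sequence of changes between successive snapshots is itself a stream, which is the second half of the duality.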
  • 51. Lower the bar to enter the world of streaming. [Chart: User Population vs. Coding Sophistication; label: streams] Core developers who use Java/Scala; Core developers who don’t use Java/Scala; Data engineers, architects, DevOps/SRE; BI analysts
  • 52. KSQL #FTW: 1 UI; 2 CLI (ksql>); 3 REST (POST /query); 4 Headless
  • 53. Interaction with Kafka: Kafka (data); KSQL (processing), does not run on Kafka brokers; JVM application with Kafka Streams (processing), does not run on Kafka brokers
  • 54. Standing on the shoulders of Streaming Giants: KSQL is powered by Kafka Streams, which is powered by the Producer and Consumer APIs. [Diagram axes: Ease of use, Flexibility]
  • 56. https://kafka-summit.org Gamov30
  • 57. Thanks! @gamussa viktor@confluent.io, @tlberglund tim@confluent.io. We are hiring! https://www.confluent.io/careers/