SlideShare a Scribd company logo
1 of 23
PRESENTATION TITLE ON ONE LINE
AND ON TWO LINES
First and last name
Position, company
Streaming ETL in Kafka
for Everyone with KSQL
Software Engineer, Confluent Inc.
Hojjat Jafarpour
PRESENTATION TITLE ON ONE LINE
AND ON TWO LINES
First and last name
Position, company
Hojjat Jafarpour
2
Software Engineer at Confluent
○Starter KSQL project at Confluent
Previously at Tidemark, Quantcast, Informatica and
NEC Labs
PhD in Computer Science from UC Irvine
○Data management, pub/sub and streaming
hojjat@confluent.io
@hojjat
PRESENTATION TITLE ON ONE LINE
AND ON TWO LINES
First and last name
Position, company
Streaming ETL, with Apache Kafka and Confluent
Platform
3
PRESENTATION TITLE ON ONE LINE
AND ON TWO LINES
First and last name
Position, company
4
PRESENTATION TITLE ON ONE LINE
AND ON TWO LINES
First and last name
Position, company
5
PRESENTATION TITLE ON ONE LINE
AND ON TWO LINES
First and last name
Position, company
6
PRESENTATION TITLE ON ONE LINE
AND ON TWO LINES
First and last name
Position, company
7
PRESENTATION TITLE ON ONE LINE
AND ON TWO LINES
First and last name
Position, company
Kafka Connect : Stream data in and out of Kafka
8
Amazon
S3
PRESENTATION TITLE ON ONE LINE
AND ON TWO LINES
First and last name
Position, company
Single Message Transform (SMT)
9
▪ Modify events before storing in
Kafka:
o Mask/drop sensitive information
o Set partitioning key
o Store lineage
▪ Modify events going out of
Kafka:
o Route high priority events to faster
data stores
o Direct events to different
Elasticsearch indexes
o Cast data types to match destination
PRESENTATION TITLE ON ONE LINE
AND ON TWO LINES
First and last name
Position, company
10
But I need to
join…aggregate…filter
…
PRESENTATION TITLE ON ONE LINE
AND ON TWO LINES
First and last name
Position, company
KSQL from Confluent
11
A Developer Preview of
KSQL
An Open Source Streaming SQL
Engine for Apache KafkaTM
PRESENTATION TITLE ON ONE LINE
AND ON TWO LINES
First and last name
Position, company
KSQL: a Streaming SQL Engine for Apache Kafka™ from
Confluent
▪ Enables stream processing with zero coding required
▪ The simplest way to process streams of data in real-time
▪ Powered by Kafka: scalable, distributed, battle-tested
▪ All you need is Kafka–No complex deployments of bespoke
systems for stream processing
12
Ksql>
PRESENTATION TITLE ON ONE LINE
AND ON TWO LINES
First and last name
Position, company
KSQL: the Simplest Way to Do Stream Processing
CREATE STREAM possible_fraud AS
SELECT card_number, count(*)
FROM authorization_attempts
WINDOW TUMBLING (SIZE 5 SECONDS)
GROUP BY card_number
HAVING count(*) > 3;
13
PRESENTATION TITLE ON ONE LINE
AND ON TWO LINES
First and last name
Position, company
KSQL Concepts
▪ STREAM and TABLE as first-class citizens
o Interpretations of topic content
▪ STREAM - data in motion
▪ TABLE - collected state of a stream
o One record per key (per window)
o Current values (compacted topic) ← Not yet in KSQL
▪ STREAM – TABLE Joins
14
PRESENTATION TITLE ON ONE LINE
AND ON TWO LINES
First and last name
Position, company
Window Aggregations
Three types supported (same as KStreams):
● TUMBLING: Fixed-size, non-overlapping, gap-less windows
• SELECT ip, count(*) AS hits FROM clickstream
WINDOW TUMBLING (size 1 minute) GROUP BY ip;
● HOPPING: Fixed-size, overlapping windows
• SELECT ip, SUM(bytes) AS bytes_per_ip_and_bucket FROM clickstream
WINDOW HOPPING ( size 20 second, advance by 5 second) GROUP BY ip;
● SESSION: Dynamically-sized, non-overlapping, data-driven window
• SELECT ip, SUM(bytes) AS bytes_per_ip FROM clickstream
WINDOW SESSION (20 second) GROUP BY ip;
15
PRESENTATION TITLE ON ONE LINE
AND ON TWO LINES
First and last name
Position, company
Streaming ETL, powered by Apache Kafka and Confluent
Platform
16
KSQL
PRESENTATION TITLE ON ONE LINE
AND ON TWO LINES
First and last name
Position, company
Simple Web Analytics Pipeline
● Pageview stream
● User table
● Materialized views
o Region visitor count
o Region visitor demography
17
CREATE STREAM pageviews (viewtime BIGINT, userid VARCHAR, pageid VARCHAR) WITH
(kafka_topic='pageviews', value_format=JSON);
CREATE TABLE users (registertime BIGINT, gender VARCHAR, regionid VARCHAR, userid
VARCHAR) WITH (kafka_topic='users', value_format='JSON');
PRESENTATION TITLE ON ONE LINE
AND ON TWO LINES
First and last name
Position, company
Simple Web Analytics Pipeline
18
Region visitor count
CREATE STREAM joined_pageviews AS
SELECT users.userid AS userid, pageid, regionid, gender
FROM pageviews LEFT JOIN users ON pageviews.userid = users.userid;
CREATE TABLE region_visitor_count AS
SELECT regionid , COUNT(*) AS visit_count
FROM joined_pageviews
WINDOW TUMBLING (size 30 second)
GROUP BY regionid;
PRESENTATION TITLE ON ONE LINE
AND ON TWO LINES
First and last name
Position, company
Simple Web Analytics Pipeline
19
Region visitor demography
CREATE TABLE region_visitor_demo_count AS
SELECT regionid, gender, COUNT(*) AS visit_count
FROM joined_pageviews
WINDOW TUMBLING (size 30 second)
GROUP BY gender, regionid;
PRESENTATION TITLE ON ONE LINE
AND ON TWO LINES
First and last name
Position, company
Streaming ETL, powered by Apache Kafka and Confluent
Platform
20
KSQL
PRESENTATION TITLE ON ONE LINE
AND ON TWO LINES
First and last name
Position, company
Confluent Platform: Enterprise Streaming based on Apache
Kafka™
21
Database
Changes
Log Events loT Data
Web
Events
…
CRM
Data Warehouse
Database
Hadoop
Data
Integration
…
Monitoring
Analytics
Custom Apps
Transformations
Real-time
Applications
…
Apache Open Source Confluent Open Source Confluent Enterprise
Confluent Platform
Confluent Platform
Apache Kafka™
Core | Connect API | Streams API
Data Compatibility
Schema Registry
Monitoring & Administration
Confluent Control Center | Security
Operations
Replicator | Auto Data Balancing
Development and Connectivity
Clients | Connectors | REST Proxy | KSQL | CLI
PRESENTATION TITLE ON ONE LINE
AND ON TWO LINES
First and last name
Position, company
Date to remember
22
• Kafka Summit 2018
• April 23-24 in London!
• More details:
https://kafka-summit.org/
PRESENTATION TITLE ON ONE LINE
AND ON TWO LINES
First and last name
Position, company
THANK YOU
hojjat@confluent.io
@hojjat
Please stay in touch
Any questions?
https://github.com/confluentinc/ksql/
https://www.confluent.io/download/

More Related Content

What's hot

Scylla Summit 2017 Keynote: NextGen NoSQL with CEO Dor Laor
Scylla Summit 2017 Keynote: NextGen NoSQL with CEO Dor LaorScylla Summit 2017 Keynote: NextGen NoSQL with CEO Dor Laor
Scylla Summit 2017 Keynote: NextGen NoSQL with CEO Dor LaorScyllaDB
 
Scylla Summit 2017: SMF: The Fastest RPC in the West
Scylla Summit 2017: SMF: The Fastest RPC in the WestScylla Summit 2017: SMF: The Fastest RPC in the West
Scylla Summit 2017: SMF: The Fastest RPC in the WestScyllaDB
 
Scylla Summit 2017: A Toolbox for Understanding Scylla in the Field
Scylla Summit 2017: A Toolbox for Understanding Scylla in the FieldScylla Summit 2017: A Toolbox for Understanding Scylla in the Field
Scylla Summit 2017: A Toolbox for Understanding Scylla in the FieldScyllaDB
 
If You Care About Performance, Use User Defined Types
If You Care About Performance, Use User Defined TypesIf You Care About Performance, Use User Defined Types
If You Care About Performance, Use User Defined TypesScyllaDB
 
Scylla Summit 2017: Migrating to Scylla From Cassandra and Others With No Dow...
Scylla Summit 2017: Migrating to Scylla From Cassandra and Others With No Dow...Scylla Summit 2017: Migrating to Scylla From Cassandra and Others With No Dow...
Scylla Summit 2017: Migrating to Scylla From Cassandra and Others With No Dow...ScyllaDB
 
Scylla Summit 2017: How Baidu Runs Scylla on a Petabyte-Level Big Data Platform
Scylla Summit 2017: How Baidu Runs Scylla on a Petabyte-Level Big Data PlatformScylla Summit 2017: How Baidu Runs Scylla on a Petabyte-Level Big Data Platform
Scylla Summit 2017: How Baidu Runs Scylla on a Petabyte-Level Big Data PlatformScyllaDB
 
Simplifying the Creation of Machine Learning Workflow Pipelines for IoT Appli...
Simplifying the Creation of Machine Learning Workflow Pipelines for IoT Appli...Simplifying the Creation of Machine Learning Workflow Pipelines for IoT Appli...
Simplifying the Creation of Machine Learning Workflow Pipelines for IoT Appli...ScyllaDB
 
Scylla Summit 2017: How to Use Gocql to Execute Queries and What the Driver D...
Scylla Summit 2017: How to Use Gocql to Execute Queries and What the Driver D...Scylla Summit 2017: How to Use Gocql to Execute Queries and What the Driver D...
Scylla Summit 2017: How to Use Gocql to Execute Queries and What the Driver D...ScyllaDB
 
Scylla Summit 2017: Snapfish's Journey Towards Scylla
Scylla Summit 2017: Snapfish's Journey Towards ScyllaScylla Summit 2017: Snapfish's Journey Towards Scylla
Scylla Summit 2017: Snapfish's Journey Towards ScyllaScyllaDB
 
Scylla Summit 2017: Scylla on Samsung NVMe Z-SSDs
Scylla Summit 2017: Scylla on Samsung NVMe Z-SSDsScylla Summit 2017: Scylla on Samsung NVMe Z-SSDs
Scylla Summit 2017: Scylla on Samsung NVMe Z-SSDsScyllaDB
 
Scylla Summit 2017: How to Optimize and Reduce Inter-DC Network Traffic and S...
Scylla Summit 2017: How to Optimize and Reduce Inter-DC Network Traffic and S...Scylla Summit 2017: How to Optimize and Reduce Inter-DC Network Traffic and S...
Scylla Summit 2017: How to Optimize and Reduce Inter-DC Network Traffic and S...ScyllaDB
 
Scylla Summit 2017: Stretching Scylla Silly: The Datastore of a Graph Databas...
Scylla Summit 2017: Stretching Scylla Silly: The Datastore of a Graph Databas...Scylla Summit 2017: Stretching Scylla Silly: The Datastore of a Graph Databas...
Scylla Summit 2017: Stretching Scylla Silly: The Datastore of a Graph Databas...ScyllaDB
 
Optimizing Apache Spark UDFs
Optimizing Apache Spark UDFsOptimizing Apache Spark UDFs
Optimizing Apache Spark UDFsDatabricks
 
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark Summit
 
Project Gemini - a fuzzing tool used by Scylla to guarantee that data, once w...
Project Gemini - a fuzzing tool used by Scylla to guarantee that data, once w...Project Gemini - a fuzzing tool used by Scylla to guarantee that data, once w...
Project Gemini - a fuzzing tool used by Scylla to guarantee that data, once w...ScyllaDB
 
The Apache Spark File Format Ecosystem
The Apache Spark File Format EcosystemThe Apache Spark File Format Ecosystem
The Apache Spark File Format EcosystemDatabricks
 
Materialized Column: An Efficient Way to Optimize Queries on Nested Columns
Materialized Column: An Efficient Way to Optimize Queries on Nested ColumnsMaterialized Column: An Efficient Way to Optimize Queries on Nested Columns
Materialized Column: An Efficient Way to Optimize Queries on Nested ColumnsDatabricks
 
Delta Lake: Optimizing Merge
Delta Lake: Optimizing MergeDelta Lake: Optimizing Merge
Delta Lake: Optimizing MergeDatabricks
 
Latest performance changes by Scylla - Project optimus / Nolimits
Latest performance changes by Scylla - Project optimus / Nolimits Latest performance changes by Scylla - Project optimus / Nolimits
Latest performance changes by Scylla - Project optimus / Nolimits ScyllaDB
 

What's hot (20)

Scylla Summit 2017 Keynote: NextGen NoSQL with CEO Dor Laor
Scylla Summit 2017 Keynote: NextGen NoSQL with CEO Dor LaorScylla Summit 2017 Keynote: NextGen NoSQL with CEO Dor Laor
Scylla Summit 2017 Keynote: NextGen NoSQL with CEO Dor Laor
 
Scylla Summit 2017: SMF: The Fastest RPC in the West
Scylla Summit 2017: SMF: The Fastest RPC in the WestScylla Summit 2017: SMF: The Fastest RPC in the West
Scylla Summit 2017: SMF: The Fastest RPC in the West
 
Scylla Summit 2017: A Toolbox for Understanding Scylla in the Field
Scylla Summit 2017: A Toolbox for Understanding Scylla in the FieldScylla Summit 2017: A Toolbox for Understanding Scylla in the Field
Scylla Summit 2017: A Toolbox for Understanding Scylla in the Field
 
If You Care About Performance, Use User Defined Types
If You Care About Performance, Use User Defined TypesIf You Care About Performance, Use User Defined Types
If You Care About Performance, Use User Defined Types
 
Scylla Summit 2017: Migrating to Scylla From Cassandra and Others With No Dow...
Scylla Summit 2017: Migrating to Scylla From Cassandra and Others With No Dow...Scylla Summit 2017: Migrating to Scylla From Cassandra and Others With No Dow...
Scylla Summit 2017: Migrating to Scylla From Cassandra and Others With No Dow...
 
Scylla Summit 2017: How Baidu Runs Scylla on a Petabyte-Level Big Data Platform
Scylla Summit 2017: How Baidu Runs Scylla on a Petabyte-Level Big Data PlatformScylla Summit 2017: How Baidu Runs Scylla on a Petabyte-Level Big Data Platform
Scylla Summit 2017: How Baidu Runs Scylla on a Petabyte-Level Big Data Platform
 
Simplifying the Creation of Machine Learning Workflow Pipelines for IoT Appli...
Simplifying the Creation of Machine Learning Workflow Pipelines for IoT Appli...Simplifying the Creation of Machine Learning Workflow Pipelines for IoT Appli...
Simplifying the Creation of Machine Learning Workflow Pipelines for IoT Appli...
 
Scylla Summit 2017: How to Use Gocql to Execute Queries and What the Driver D...
Scylla Summit 2017: How to Use Gocql to Execute Queries and What the Driver D...Scylla Summit 2017: How to Use Gocql to Execute Queries and What the Driver D...
Scylla Summit 2017: How to Use Gocql to Execute Queries and What the Driver D...
 
Scylla Summit 2017: Snapfish's Journey Towards Scylla
Scylla Summit 2017: Snapfish's Journey Towards ScyllaScylla Summit 2017: Snapfish's Journey Towards Scylla
Scylla Summit 2017: Snapfish's Journey Towards Scylla
 
Scylla Summit 2017: Scylla on Samsung NVMe Z-SSDs
Scylla Summit 2017: Scylla on Samsung NVMe Z-SSDsScylla Summit 2017: Scylla on Samsung NVMe Z-SSDs
Scylla Summit 2017: Scylla on Samsung NVMe Z-SSDs
 
Scylla Summit 2017: How to Optimize and Reduce Inter-DC Network Traffic and S...
Scylla Summit 2017: How to Optimize and Reduce Inter-DC Network Traffic and S...Scylla Summit 2017: How to Optimize and Reduce Inter-DC Network Traffic and S...
Scylla Summit 2017: How to Optimize and Reduce Inter-DC Network Traffic and S...
 
Scylla Summit 2017: Stretching Scylla Silly: The Datastore of a Graph Databas...
Scylla Summit 2017: Stretching Scylla Silly: The Datastore of a Graph Databas...Scylla Summit 2017: Stretching Scylla Silly: The Datastore of a Graph Databas...
Scylla Summit 2017: Stretching Scylla Silly: The Datastore of a Graph Databas...
 
Optimizing Apache Spark UDFs
Optimizing Apache Spark UDFsOptimizing Apache Spark UDFs
Optimizing Apache Spark UDFs
 
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
 
Xephon K A Time series database with multiple backends
Xephon K A Time series database with multiple backendsXephon K A Time series database with multiple backends
Xephon K A Time series database with multiple backends
 
Project Gemini - a fuzzing tool used by Scylla to guarantee that data, once w...
Project Gemini - a fuzzing tool used by Scylla to guarantee that data, once w...Project Gemini - a fuzzing tool used by Scylla to guarantee that data, once w...
Project Gemini - a fuzzing tool used by Scylla to guarantee that data, once w...
 
The Apache Spark File Format Ecosystem
The Apache Spark File Format EcosystemThe Apache Spark File Format Ecosystem
The Apache Spark File Format Ecosystem
 
Materialized Column: An Efficient Way to Optimize Queries on Nested Columns
Materialized Column: An Efficient Way to Optimize Queries on Nested ColumnsMaterialized Column: An Efficient Way to Optimize Queries on Nested Columns
Materialized Column: An Efficient Way to Optimize Queries on Nested Columns
 
Delta Lake: Optimizing Merge
Delta Lake: Optimizing MergeDelta Lake: Optimizing Merge
Delta Lake: Optimizing Merge
 
Latest performance changes by Scylla - Project optimus / Nolimits
Latest performance changes by Scylla - Project optimus / Nolimits Latest performance changes by Scylla - Project optimus / Nolimits
Latest performance changes by Scylla - Project optimus / Nolimits
 

Similar to Streaming ETL with Kafka and KSQL

KSQL: Open Source Streaming for Apache Kafka
KSQL: Open Source Streaming for Apache KafkaKSQL: Open Source Streaming for Apache Kafka
KSQL: Open Source Streaming for Apache Kafkaconfluent
 
Scylla Summit 2018: Scalable Stream Processing with KSQL, Kafka and ScyllaDB
Scylla Summit 2018: Scalable Stream Processing with KSQL, Kafka and ScyllaDBScylla Summit 2018: Scalable Stream Processing with KSQL, Kafka and ScyllaDB
Scylla Summit 2018: Scalable Stream Processing with KSQL, Kafka and ScyllaDBScyllaDB
 
KSQL - Stream Processing simplified!
KSQL - Stream Processing simplified!KSQL - Stream Processing simplified!
KSQL - Stream Processing simplified!Guido Schmutz
 
Scylla Summit 2017: Scylla's Open Source Monitoring Solution
Scylla Summit 2017: Scylla's Open Source Monitoring SolutionScylla Summit 2017: Scylla's Open Source Monitoring Solution
Scylla Summit 2017: Scylla's Open Source Monitoring SolutionScyllaDB
 
Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apac...
Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apac...Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apac...
Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apac...Databricks
 
KSQL Deep Dive - The Open Source Streaming Engine for Apache Kafka
KSQL Deep Dive - The Open Source Streaming Engine for Apache KafkaKSQL Deep Dive - The Open Source Streaming Engine for Apache Kafka
KSQL Deep Dive - The Open Source Streaming Engine for Apache KafkaKai Wähner
 
MongoDB.local Berlin: Building a GraphQL API with MongoDB, Prisma and Typescript
MongoDB.local Berlin: Building a GraphQL API with MongoDB, Prisma and TypescriptMongoDB.local Berlin: Building a GraphQL API with MongoDB, Prisma and Typescript
MongoDB.local Berlin: Building a GraphQL API with MongoDB, Prisma and TypescriptMongoDB
 
GraphQL & Prisma from Scratch
GraphQL & Prisma from ScratchGraphQL & Prisma from Scratch
GraphQL & Prisma from ScratchNikolas Burk
 
Witsml data processing with kafka and spark streaming
Witsml data processing with kafka and spark streamingWitsml data processing with kafka and spark streaming
Witsml data processing with kafka and spark streamingMark Kerzner
 
Build a Bridge to Cloud with Apache Kafka® for Data Analytics Cloud Services
Build a Bridge to Cloud with Apache Kafka® for Data Analytics Cloud ServicesBuild a Bridge to Cloud with Apache Kafka® for Data Analytics Cloud Services
Build a Bridge to Cloud with Apache Kafka® for Data Analytics Cloud Servicesconfluent
 
Planning with Polyalgebra: Bringing Together Relational, Complex and Machine ...
Planning with Polyalgebra: Bringing Together Relational, Complex and Machine ...Planning with Polyalgebra: Bringing Together Relational, Complex and Machine ...
Planning with Polyalgebra: Bringing Together Relational, Complex and Machine ...Julian Hyde
 
KSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for KafkaKSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for Kafkaconfluent
 
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...Codemotion
 
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...Codemotion
 
KSQL – The Open Source SQL Streaming Engine for Apache Kafka (Big Data Spain ...
KSQL – The Open Source SQL Streaming Engine for Apache Kafka (Big Data Spain ...KSQL – The Open Source SQL Streaming Engine for Apache Kafka (Big Data Spain ...
KSQL – The Open Source SQL Streaming Engine for Apache Kafka (Big Data Spain ...Kai Wähner
 
London Apache Kafka Meetup (Jan 2017)
London Apache Kafka Meetup (Jan 2017)London Apache Kafka Meetup (Jan 2017)
London Apache Kafka Meetup (Jan 2017)Landoop Ltd
 
Kafka Streams vs. KSQL for Stream Processing on top of Apache Kafka
Kafka Streams vs. KSQL for Stream Processing on top of Apache KafkaKafka Streams vs. KSQL for Stream Processing on top of Apache Kafka
Kafka Streams vs. KSQL for Stream Processing on top of Apache KafkaKai Wähner
 

Similar to Streaming ETL with Kafka and KSQL (20)

KSQL: Open Source Streaming for Apache Kafka
KSQL: Open Source Streaming for Apache KafkaKSQL: Open Source Streaming for Apache Kafka
KSQL: Open Source Streaming for Apache Kafka
 
Scylla Summit 2018: Scalable Stream Processing with KSQL, Kafka and ScyllaDB
Scylla Summit 2018: Scalable Stream Processing with KSQL, Kafka and ScyllaDBScylla Summit 2018: Scalable Stream Processing with KSQL, Kafka and ScyllaDB
Scylla Summit 2018: Scalable Stream Processing with KSQL, Kafka and ScyllaDB
 
KSQL - Stream Processing simplified!
KSQL - Stream Processing simplified!KSQL - Stream Processing simplified!
KSQL - Stream Processing simplified!
 
Scylla Summit 2017: Scylla's Open Source Monitoring Solution
Scylla Summit 2017: Scylla's Open Source Monitoring SolutionScylla Summit 2017: Scylla's Open Source Monitoring Solution
Scylla Summit 2017: Scylla's Open Source Monitoring Solution
 
KSQL Intro
KSQL IntroKSQL Intro
KSQL Intro
 
Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apac...
Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apac...Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apac...
Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apac...
 
KSQL Deep Dive - The Open Source Streaming Engine for Apache Kafka
KSQL Deep Dive - The Open Source Streaming Engine for Apache KafkaKSQL Deep Dive - The Open Source Streaming Engine for Apache Kafka
KSQL Deep Dive - The Open Source Streaming Engine for Apache Kafka
 
MongoDB.local Berlin: Building a GraphQL API with MongoDB, Prisma and Typescript
MongoDB.local Berlin: Building a GraphQL API with MongoDB, Prisma and TypescriptMongoDB.local Berlin: Building a GraphQL API with MongoDB, Prisma and Typescript
MongoDB.local Berlin: Building a GraphQL API with MongoDB, Prisma and Typescript
 
GraphQL & Prisma from Scratch
GraphQL & Prisma from ScratchGraphQL & Prisma from Scratch
GraphQL & Prisma from Scratch
 
Witsml data processing with kafka and spark streaming
Witsml data processing with kafka and spark streamingWitsml data processing with kafka and spark streaming
Witsml data processing with kafka and spark streaming
 
Build a Bridge to Cloud with Apache Kafka® for Data Analytics Cloud Services
Build a Bridge to Cloud with Apache Kafka® for Data Analytics Cloud ServicesBuild a Bridge to Cloud with Apache Kafka® for Data Analytics Cloud Services
Build a Bridge to Cloud with Apache Kafka® for Data Analytics Cloud Services
 
Planning with Polyalgebra: Bringing Together Relational, Complex and Machine ...
Planning with Polyalgebra: Bringing Together Relational, Complex and Machine ...Planning with Polyalgebra: Bringing Together Relational, Complex and Machine ...
Planning with Polyalgebra: Bringing Together Relational, Complex and Machine ...
 
Polyalgebra
PolyalgebraPolyalgebra
Polyalgebra
 
KSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for KafkaKSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for Kafka
 
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
 
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
 
KSQL – The Open Source SQL Streaming Engine for Apache Kafka (Big Data Spain ...
KSQL – The Open Source SQL Streaming Engine for Apache Kafka (Big Data Spain ...KSQL – The Open Source SQL Streaming Engine for Apache Kafka (Big Data Spain ...
KSQL – The Open Source SQL Streaming Engine for Apache Kafka (Big Data Spain ...
 
London Apache Kafka Meetup (Jan 2017)
London Apache Kafka Meetup (Jan 2017)London Apache Kafka Meetup (Jan 2017)
London Apache Kafka Meetup (Jan 2017)
 
Exploring Relay land
Exploring Relay landExploring Relay land
Exploring Relay land
 
Kafka Streams vs. KSQL for Stream Processing on top of Apache Kafka
Kafka Streams vs. KSQL for Stream Processing on top of Apache KafkaKafka Streams vs. KSQL for Stream Processing on top of Apache Kafka
Kafka Streams vs. KSQL for Stream Processing on top of Apache Kafka
 

More from ScyllaDB

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
What Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLWhat Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLScyllaDB
 
Low Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsLow Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasScyllaDB
 
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBBeyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasScyllaDB
 
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...ScyllaDB
 
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...ScyllaDB
 
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr SarnaDatabase Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr SarnaScyllaDB
 
Replacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDBReplacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDBScyllaDB
 
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear ScalabilityPowering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear ScalabilityScyllaDB
 
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptxScyllaDB
 
Getting the most out of ScyllaDB
Getting the most out of ScyllaDBGetting the most out of ScyllaDB
Getting the most out of ScyllaDBScyllaDB
 
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a MigrationNoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a MigrationScyllaDB
 
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration LogisticsNoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration LogisticsScyllaDB
 
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and ChallengesNoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and ChallengesScyllaDB
 
ScyllaDB Virtual Workshop
ScyllaDB Virtual WorkshopScyllaDB Virtual Workshop
ScyllaDB Virtual WorkshopScyllaDB
 
DBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & TradeoffsDBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & TradeoffsScyllaDB
 
Build Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDBBuild Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDBScyllaDB
 
NoSQL Data Modeling 101
NoSQL Data Modeling 101NoSQL Data Modeling 101
NoSQL Data Modeling 101ScyllaDB
 

More from ScyllaDB (20)

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
What Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLWhat Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQL
 
Low Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsLow Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & Pitfalls
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance Dilemmas
 
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBBeyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance Dilemmas
 
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
 
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
 
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr SarnaDatabase Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
 
Replacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDBReplacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDB
 
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear ScalabilityPowering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
 
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
 
Getting the most out of ScyllaDB
Getting the most out of ScyllaDBGetting the most out of ScyllaDB
Getting the most out of ScyllaDB
 
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a MigrationNoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
 
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration LogisticsNoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
 
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and ChallengesNoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
 
ScyllaDB Virtual Workshop
ScyllaDB Virtual WorkshopScyllaDB Virtual Workshop
ScyllaDB Virtual Workshop
 
DBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & TradeoffsDBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & Tradeoffs
 
Build Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDBBuild Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDB
 
NoSQL Data Modeling 101
NoSQL Data Modeling 101NoSQL Data Modeling 101
NoSQL Data Modeling 101
 

Recently uploaded

Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 

Recently uploaded (20)

Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 

Streaming ETL with Kafka and KSQL

  • 1. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First and last name Position, company Streaming ETL in Kafka for Everyone with KSQL Software Engineer, Confluent Inc. Hojjat Jafarpour
  • 2. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First and last name Position, company Hojjat Jafarpour 2 Software Engineer at Confluent ○Starter KSQL project at Confluent Previously at Tidemark, Quantcast, Informatica and NEC Labs PhD in Computer Science from UC Irvine ○Data management, pub/sub and streaming hojjat@confluent.io @hojjat
  • 3. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First and last name Position, company Streaming ETL, with Apache Kafka and Confluent Platform 3
  • 4. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First and last name Position, company 4
  • 5. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First and last name Position, company 5
  • 6. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First and last name Position, company 6
  • 7. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First and last name Position, company 7
  • 8. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First and last name Position, company Kafka Connect : Stream data in and out of Kafka 8 Amazon S3
  • 9. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First and last name Position, company Single Message Transform (SMT) 9 ▪ Modify events before storing in Kafka: o Mask/drop sensitive information o Set partitioning key o Store lineage ▪ Modify events going out of Kafka: o Route high priority events to faster data stores o Direct events to different Elasticsearch indexes o Cast data types to match destination
  • 10. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First and last name Position, company 10 But I need to join…aggregate…filter …
  • 11. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First and last name Position, company KSQL from Confluent 11 A Developer Preview of KSQL An Open Source Streaming SQL Engine for Apache KafkaTM
  • 12. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First and last name Position, company KSQL: a Streaming SQL Engine for Apache Kafka™ from Confluent ▪ Enables stream processing with zero coding required ▪ The simplest way to process streams of data in real-time ▪ Powered by Kafka: scalable, distributed, battle-tested ▪ All you need is Kafka–No complex deployments of bespoke systems for stream processing 12 Ksql>
  • 13. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First and last name Position, company KSQL: the Simplest Way to Do Stream Processing CREATE STREAM possible_fraud AS SELECT card_number, count(*) FROM authorization_attempts WINDOW TUMBLING (SIZE 5 SECONDS) GROUP BY card_number HAVING count(*) > 3; 13
  • 14. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First and last name Position, company KSQL Concepts ▪ STREAM and TABLE as first-class citizens o Interpretations of topic content ▪ STREAM - data in motion ▪ TABLE - collected state of a stream o One record per key (per window) o Current values (compacted topic) ← Not yet in KSQL ▪ STREAM – TABLE Joins 14
  • 15. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First and last name Position, company Window Aggregations Three types supported (same as KStreams): ● TUMBLING: Fixed-size, non-overlapping, gap-less windows • SELECT ip, count(*) AS hits FROM clickstream WINDOW TUMBLING (size 1 minute) GROUP BY ip; ● HOPPING: Fixed-size, overlapping windows • SELECT ip, SUM(bytes) AS bytes_per_ip_and_bucket FROM clickstream WINDOW HOPPING ( size 20 second, advance by 5 second) GROUP BY ip; ● SESSION: Dynamically-sized, non-overlapping, data-driven window • SELECT ip, SUM(bytes) AS bytes_per_ip FROM clickstream WINDOW SESSION (20 second) GROUP BY ip; 15
  • 16. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First and last name Position, company Streaming ETL, powered by Apache Kafka and Confluent Platform 16 KSQL
  • 17. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First and last name Position, company Simple Web Analytics Pipeline ● Pageview stream ● User table ● Materialized views o Region visitor count o Region visitor demography 17 CREATE STREAM pageviews (viewtime BIGINT, userid VARCHAR, pageid VARCHAR) WITH (kafka_topic='pageviews', value_format=JSON); CREATE TABLE users (registertime BIGINT, gender VARCHAR, regionid VARCHAR, userid VARCHAR) WITH (kafka_topic='users', value_format='JSON');
  • 18. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First and last name Position, company Simple Web Analytics Pipeline 18 Region visitor count CREATE STREAM joined_pageviews AS SELECT users.userid AS userid, pageid, regionid, gender FROM pageviews LEFT JOIN users ON pageviews.userid = users.userid; CREATE TABLE region_visitor_count AS SELECT regionid , COUNT(*) AS visit_count FROM joined_pageviews WINDOW TUMBLING (size 30 second) GROUP BY regionid;
  • 19. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First and last name Position, company Simple Web Analytics Pipeline 19 Region visitor demography CREATE TABLE region_visitor_demo_count AS SELECT regionid, gender, COUNT(*) AS visit_count FROM joined_pageviews WINDOW TUMBLING (size 30 second) GROUP BY gender, regionid;
  • 20. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First and last name Position, company Streaming ETL, powered by Apache Kafka and Confluent Platform 20 KSQL
  • 21. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First and last name Position, company Confluent Platform: Enterprise Streaming based on Apache Kafka™ 21 Database Changes Log Events loT Data Web Events … CRM Data Warehouse Database Hadoop Data Integration … Monitoring Analytics Custom Apps Transformations Real-time Applications … Apache Open Source Confluent Open Source Confluent Enterprise Confluent Platform Confluent Platform Apache Kafka™ Core | Connect API | Streams API Data Compatibility Schema Registry Monitoring & Administration Confluent Control Center | Security Operations Replicator | Auto Data Balancing Development and Connectivity Clients | Connectors | REST Proxy | KSQL | CLI
  • 22. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First and last name Position, company Date to remember 22 • Kafka Summit 2018 • April 23-24 in London! • More details: https://kafka-summit.org/
  • 23. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First and last name Position, company THANK YOU hojjat@confluent.io @hojjat Please stay in touch Any questions? https://github.com/confluentinc/ksql/ https://www.confluent.io/download/