Serverless Stream Processing
Bill Bejeck
@bbejeck
Nice to meet you!
• Member of the DevX team
• Prior to DevX ~3 years as engineer on Kafka Streams team
• Apache Kafka® Committer and PMC member
• Author of “Kafka Streams in Action” - 2nd edition underway!
Agenda
• Meetup Details
• Serverless Vegetable Soup
• ksqlDB Introduction
• AWS Lambda
• Application walkthrough
Meetup Details
• Clone the GitHub repo for the meetup
  • https://github.com/confluentinc/CCloud-Serverless-Integration
• Install the Confluent CLI
  • https://docs.confluent.io/confluent-cli/current/install.html
• Install the AWS CLI
  • https://aws.amazon.com/cli/
• Get a Confluent Cloud account
  • https://confluent.cloud/signup
  • Use code CC60COMM
• Get an AWS account
  • https://aws.amazon.com/free/
  • Create a root account
Meetup Details
• Kick off the application script
• <repo directory>/ccloud-build-app.sh
Serverless Vegetable Soup
Stateful vs Stateless
• Stateless is simpler
• Predicate.apply(T parameter)
• Idempotent functions
• Can’t make decisions based on previous inputs
• Stateful is more complex
• Aggregations
• Keeps state of previous records
• Can make decisions in near-real time when combined with stream processing
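The contrast above can be sketched in a few lines of Python. This is only an illustration: `is_large_trade`, `SharesPerAccount`, and the sample trades are hypothetical names and data, not part of the meetup repo.

```python
from collections import defaultdict


def is_large_trade(trade, threshold=1000):
    """Stateless: the result depends only on the record passed in."""
    return trade["quantity"] >= threshold


class SharesPerAccount:
    """Stateful: keeps a running total per account across records."""

    def __init__(self):
        self.totals = defaultdict(int)

    def add(self, trade):
        # The answer depends on every earlier trade for this account.
        self.totals[trade["account"]] += trade["quantity"]
        return self.totals[trade["account"]]


# Hypothetical sample data for illustration only.
trades = [
    {"account": "A", "quantity": 600},
    {"account": "B", "quantity": 2000},
    {"account": "A", "quantity": 700},
]

large = [is_large_trade(t) for t in trades]
aggregator = SharesPerAccount()
running = [aggregator.add(t) for t in trades]
```

Neither trade for account A is large on its own, but the running total crosses the threshold on the second one. Only the stateful version can see that.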
Stream Processing
• Why do we need it?
• Hotel interactions
• Check-in, mini-bar purchases
• Fraud detection
• No good to look at yesterday's transactions
• Supply chain
• Route a truck in real-time
Stream Processing
• Applications can respond immediately to events
• Context-aware decisions in real time
• Implies the need for state – hard to make decisions on isolated events
• State needs to be easy to manage
• Needs to be highly available
Serverless
• Not concerned with infrastructure
• Someone else’s issue! *
• Focus is solely on your business problem
• Simplified development process *
• Only pay for what’s used
• Flexible options driven by needs/requirements
Functions as a Service (FaaS)
• Extension of Serverless (my interpretation)
• Discrete chunk of code vs. an application
• Triggered by specific event – typically not long running
• Scales with load
• Sporadic in nature
• Executes when needed
• Only pay for what’s used
ksqlDB Introduction
ksqlDB at a Glance
• ksqlDB is a database for building real-time applications that leverage stream processing
• Easily build real-time apps
• Build a complete real-time application with just a few SQL statements
• Compute (ksqlDB) and storage (Kafka)
• Joins, aggregates, push & pull queries, filters, user-defined functions, connectors
CREATE TABLE activePromotions AS
    SELECT rideId,
           qualifyPromotion(distanceToDst) AS promotion
    FROM locations
    GROUP BY rideId
    EMIT CHANGES;
ksqlDB and events
• Enables event-based applications
• Stateful operations are easy and seamless to use
ksqlDB Input Data
{
  "side": "BUY",
  "quantity": 2000,
  "symbol": "CFLT",
  "price": 100,
  "account": "Vandeley",
  "userid": 100
}
ksqlDB Create Stream
CREATE STREAM stocktrade (side VARCHAR,
                          quantity INT,
                          symbol VARCHAR,
                          price INT,
                          account VARCHAR,
                          userid INT)
    WITH (kafka_topic = 'stocktrade',
          partitions = 6,
          value_format = 'JSON');
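To make the schema concrete, here is a rough Python sketch of what it means to read a JSON record into the typed columns declared on the stream. The `StockTrade` class and `from_json` helper are hypothetical, and the sketch is stricter than ksqlDB: it raises an error on a missing field, where ksqlDB would produce a NULL column.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class StockTrade:
    """Mirrors the columns declared on the stocktrade stream."""
    side: str
    quantity: int
    symbol: str
    price: int
    account: str
    userid: int

    @classmethod
    def from_json(cls, raw):
        data = json.loads(raw)
        # Pick out only the declared columns; extra JSON fields are ignored.
        return cls(**{f.name: data[f.name] for f in fields(cls)})


raw = (
    '{"side": "BUY", "quantity": 2000, "symbol": "CFLT", '
    '"price": 100, "account": "Vandeley", "userid": 100, "extra": true}'
)
trade = StockTrade.from_json(raw)
```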
ksqlDB Create Table
CREATE TABLE users (userid VARCHAR PRIMARY KEY,
                    registertime BIGINT,
                    regionid VARCHAR)
    WITH (kafka_topic = 'users',
          value_format = 'JSON');
ksqlDB Persistent Query (Push)
CREATE TABLE SHARE_PRICE AS
    SELECT symbol,
           AVG(price) AS AVERAGE_SHARE_PRICE
    FROM stocktrade
    WINDOW TUMBLING (SIZE 5 MINUTES)
    GROUP BY symbol
    EMIT CHANGES;
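As a rough mental model of the windowed aggregation, the Python sketch below keeps one running average per (symbol, window) pair. It is an in-memory illustration only, not how ksqlDB is implemented; `TumblingAverage`, `window_start`, and the timestamps are hypothetical.

```python
from collections import defaultdict

WINDOW_MS = 5 * 60 * 1000  # 5-minute tumbling windows


def window_start(ts_ms, size_ms=WINDOW_MS):
    """Align a timestamp to the start of its tumbling window."""
    return ts_ms - (ts_ms % size_ms)


class TumblingAverage:
    """Keeps sum and count per (symbol, window start)."""

    def __init__(self):
        self._sum = defaultdict(int)
        self._count = defaultdict(int)

    def update(self, symbol, price, ts_ms):
        key = (symbol, window_start(ts_ms))
        self._sum[key] += price
        self._count[key] += 1
        # Emit the updated row, similar in spirit to EMIT CHANGES.
        return key, self._sum[key] / self._count[key]


table = TumblingAverage()
changes = [
    table.update("CFLT", 100, 0),        # first window
    table.update("CFLT", 110, 60_000),   # same window, one minute later
    table.update("CFLT", 90, 300_000),   # next window starts at 5 minutes
]
```

Each call returns the changed row, so a new window starts a fresh average instead of carrying the old one forward.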
ksqlDB Pull Query
SELECT * FROM SHARE_PRICE
    WHERE symbol = 'CFLT';
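The difference between push and pull queries can be pictured with a small in-memory table. This is a hedged sketch: `MaterializedTable` is a hypothetical class, and it ignores windowing, whereas a pull query against a windowed table in ksqlDB returns a row per window.

```python
class MaterializedTable:
    """Latest value per key, with change notifications."""

    def __init__(self):
        self._rows = {}
        self._subscribers = []

    def subscribe(self, callback):
        # Push style: the caller is notified on every change.
        self._subscribers.append(callback)

    def upsert(self, key, value):
        self._rows[key] = value
        for callback in self._subscribers:
            callback(key, value)

    def get(self, key):
        # Pull style: a point-in-time lookup of the current value.
        return self._rows.get(key)


share_price = MaterializedTable()
pushed = []
share_price.subscribe(lambda key, value: pushed.append((key, value)))

share_price.upsert("CFLT", 100.0)
share_price.upsert("CFLT", 105.0)

current = share_price.get("CFLT")
```

The subscriber sees every update as it happens, while `get` returns only the latest value at the moment it is called.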
ksqlDB Availability
ksqlDB Failover
AWS Lambda
• Exemplifies FaaS
• Key component of AWS Serverless Application Model
  • https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/what-is-sam.html
• Upload code as a zip file
• Configure the trigger
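As an illustration of the kind of code that gets uploaded, here is a minimal Python handler sketch. It assumes the event has the shape Lambda uses for Kafka event source mappings: a `records` map keyed by topic-partition, with base64-encoded values. The meetup repo may wire things differently, and the returned summary fields are hypothetical.

```python
import base64
import json


def lambda_handler(event, context):
    """Decode each Kafka record and summarize the batch."""
    trades = []
    # Assumed shape: {"records": {"<topic>-<partition>": [{"value": <base64>}, ...]}}
    for batch in event.get("records", {}).values():
        for record in batch:
            payload = base64.b64decode(record["value"])
            trades.append(json.loads(payload))
    total_value = sum(t["quantity"] * t["price"] for t in trades)
    return {"trade_count": len(trades), "total_value": total_value}


# Local smoke test with a hand-built event (hypothetical data).
sample_trade = {
    "side": "BUY", "quantity": 2000, "symbol": "CFLT",
    "price": 100, "account": "Vandeley", "userid": 100,
}
encoded = base64.b64encode(json.dumps(sample_trade).encode("utf-8")).decode("ascii")
sample_event = {"records": {"stocktrade-0": [{"value": encoded}]}}

result = lambda_handler(sample_event, None)
```

The handler is a single function with no server code around it, which is the "discrete chunk of code" idea from the FaaS slide.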
Application
Application overview
Additional Steps
• Create the AWS Lambda
• Set properties in configs.sh
• View Lambda logs
• Execute SQL for processing Lambda output
• View the Stream Lineage on Confluent
Clean Up
• Important to run the cleanup steps outlined in the repo
• Remove all AWS components
• Remove all Confluent Cloud components
For More Details
• Serverless processing whitepaper
• https://www.confluent.io/resources/white-paper/stateful-serverless-architectures-with-ksqldb-aws-lambda/
• Blog on serverless stream processing
• https://www.confluent.io/blog/serverless-event-stream-processing/
Resources
• Tutorials and ksqlDB use cases – https://developer.confluent.io/tutorials
• ksqlDB Documentation - https://docs.ksqldb.io/en/latest/
• Confluent Developer - developer.confluent.io
• AWS SAM - https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/what-is-sam.html
Thank you!
@bbejeck
bill@confluent.io
cnfl.io/meetups cnfl.io/slack
cnfl.io/blog
Learn Kafka!
Confluent Developer
developer.confluent.io

Serverless Stream Processing with Bill Bejeck