6. CREATE STREAM possible_fraud AS
SELECT card_number, count(*)
FROM authorization_attempts
WINDOW TUMBLING (SIZE 5 MINUTE)
GROUP BY card_number
HAVING count(*) > 3;
(Diagram: the authorization_attempts stream flowing into the possible_fraud stream)
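The KSQL statement above counts authorization attempts per card in tumbling 5-minute windows and keeps only cards that exceed three attempts. A minimal sketch of those window-count semantics, as a toy Python simulation (the function and event shapes are illustrative, not a KSQL API):

```python
from collections import defaultdict

WINDOW_SIZE = 5 * 60  # tumbling window size: 5 minutes, in seconds

def possible_fraud(attempts, threshold=3):
    """Count attempts per card in each tumbling 5-minute window and
    keep (window_start, card) pairs exceeding the threshold,
    mimicking GROUP BY card_number ... HAVING count(*) > 3."""
    counts = defaultdict(int)
    for ts, card in attempts:
        # tumbling windows: each timestamp belongs to exactly one window
        window_start = (ts // WINDOW_SIZE) * WINDOW_SIZE
        counts[(window_start, card)] += 1
    return {k: v for k, v in counts.items() if v > threshold}

# Four attempts by card "4111" inside the first window trip the rule;
# card "5500" stays under the threshold in every window.
events = [(10, "4111"), (40, "4111"), (90, "4111"), (200, "4111"),
          (120, "5500"), (400, "5500")]
print(possible_fraud(events))  # {(0, '4111'): 4}
```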
22. Kafka: a Streaming Platform
(Diagram: The Log at the core, surrounded by Connectors, a Producer, a Consumer, and a Streaming Engine)
23. SELECT card_number, count(*)
FROM authorization_attempts
WINDOW TUMBLING (SIZE 5 MINUTE)
GROUP BY card_number
HAVING count(*) > 3;
KSQL is SQL over Kafka Streams
24. Kafka Streams is just an API
public static void main(String[] args) {
  StreamsBuilder builder = new StreamsBuilder();
  builder.stream("caterpillars")
         .map((k, v) -> coolTransformation(k, v))
         .to("butterflies");
  new KafkaStreams(builder.build(), props()).start();
}
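The slide's point is that Kafka Streams is just a fluent API: read a stream, map a function over it, write the result out. A toy Python stand-in for that builder style (the `Stream` class and `cool_transformation` are illustrative, not the real Kafka Streams API):

```python
class Stream:
    """Toy stand-in for a KStream: holds (key, value) records and
    supports map/to chaining like the Java builder API above."""
    def __init__(self, records):
        self.records = list(records)

    def map(self, fn):
        # apply fn to every (key, value) pair, producing a new stream
        return Stream(fn(k, v) for k, v in self.records)

    def to(self, sink):
        # "write" the stream out to a sink topic (here, just a list)
        sink.extend(self.records)

def cool_transformation(k, v):
    # hypothetical transformation: uppercase the value
    return k, v.upper()

butterflies = []
Stream([("id1", "monarch"), ("id2", "swallowtail")]) \
    .map(cool_transformation) \
    .to(butterflies)
print(butterflies)  # [('id1', 'MONARCH'), ('id2', 'SWALLOWTAIL')]
```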
28. A KTable is just a stream with infinite retention
KStream orders = builder.stream("Orders");
KStream payments = builder.stream("Payments");
KTable customers = builder.table("Customers");
orders.join(payments, EmailTuple::new, JoinWindows.of(10*MIN))
      .join(customers, (tuple, cust) -> tuple.setCust(cust))
      .peek((key, tuple) -> emailer.sendMail(tuple));
(Diagram: the join runs against Kafka and the dataset moves to the client. Callout: materialize a table in two lines of code!)
29. Streaming is about
1. Processing data incrementally
2. Moving data to where it needs to be processed (quickly and efficiently)
30. Kafka: a Streaming Platform
(Diagram: The Log at the core, surrounded by Connectors, a Producer, a Consumer, and a Streaming Engine)
36. Buying an iPad with REST
(Diagram: the Webserver submits an order to the Orders Service; the Orders Service calls shipOrder() on the Shipping Service, which calls getCustomer() on the Customer Service)
37. Buying an iPad with Events
(Diagram: the Webserver submits an order; "Order Created" and "Customer Updated" events flow through the Message Broker (Kafka) to the Shipping Service, which uses them both as notification and as incrementally replicated data)
38. Events for Notification Only
(Diagram: the Orders Service publishes an "Order Created" event to the Message Broker (Kafka) purely as a notification; the Shipping Service reacts to it but still fetches customer data via a getCustomer() REST call to the Customer Service)
39. Events for Data Locality
(Diagram: the Webserver submits an order and the Orders Service publishes "Order Created" to Kafka; the Customer Service publishes "Customer Updated" events, so customer data is replicated incrementally into the Shipping Service)
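The data-locality pattern above has the Shipping Service build its own replica of customer data from "Customer Updated" events, so handling an order needs no remote getCustomer() call. A minimal sketch of that idea (the class, event fields, and handler names are illustrative, not a Kafka client API):

```python
class ShippingService:
    """Keeps a local replica of customer data by consuming
    'Customer Updated' events; shipping an order then needs no
    remote getCustomer() call."""
    def __init__(self):
        self.customers = {}  # local materialized view, keyed by customer id
        self.shipped = []

    def on_customer_updated(self, event):
        # incremental replication: upsert the latest customer record
        self.customers[event["id"]] = event

    def on_order_created(self, event):
        # the data is already local: no REST call to the Customer Service
        customer = self.customers[event["customer_id"]]
        self.shipped.append((event["order_id"], customer["address"]))

svc = ShippingService()
svc.on_customer_updated({"id": "c1", "address": "12 Elm St"})
svc.on_order_created({"order_id": "o1", "customer_id": "c1"})
print(svc.shipped)  # [('o1', '12 Elm St')]
```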
50. Add Payments Service & Window
(Diagram: Orders provide notifications and Customers are replicated via Kafka; a new Payments Service buffers/windows records before they reach the Web App's scrollable grid)
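The buffer/window in the Payments Service holds whichever side arrives first, order or payment, and emits a joined record once both are present, so the grid never shows a half-joined row. A toy sketch of that buffering logic (class and method names are illustrative):

```python
class OrderPaymentBuffer:
    """Buffers whichever side (order or payment) arrives first and
    emits a joined record once both sides are present."""
    def __init__(self):
        self.pending_orders = {}
        self.pending_payments = {}
        self.emitted = []  # joined records ready for the grid

    def on_order(self, oid, order):
        if oid in self.pending_payments:
            self.emitted.append((oid, order, self.pending_payments.pop(oid)))
        else:
            self.pending_orders[oid] = order

    def on_payment(self, oid, payment):
        if oid in self.pending_orders:
            self.emitted.append((oid, self.pending_orders.pop(oid), payment))
        else:
            self.pending_payments[oid] = payment

buf = OrderPaymentBuffer()
buf.on_payment("o1", "paid")   # payment arrives before its order
buf.on_order("o1", "order#1")  # now both sides exist: emit the join
print(buf.emitted)  # [('o1', 'order#1', 'paid')]
```

A real implementation would also expire stale pending entries when the window closes; that bookkeeping is omitted here.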
57. WIRED Principles
• Windowed: Use an API built for async events
• Immutable: Store events in an immutable log
• Repeatable: Compose from side-effect free functions
• Evolutionary: Be pluggable. Have data available in the log.
• Data-Enabled: Push data to services where necessary