SlideShare a Scribd company logo
1 of 65
Download to read offline
ATM Fraud Detection
@rmoff
with Apache Kafka
and KSQL
Photoby FreddieCollins on Unsplash
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Spotting fraud in realtime
hotoby MirzaBabic on Unsplash
Photoby LasayeHommes on Unsplash
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
• Account id
• Location
• Amount
•
Inbound stream of ATM data
https://github.com/rmoff/gess
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Demo!
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Spot patterns within this stream
Ac. ID Transaction ID Time ATM
A42 xxx116d91d6-ef17 11:56:58 Midland
A42 116d91d6-ef17 11:58:19 Halifax
A42 09c2f660-ef17 19:31:11 Lloyds
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Spot patterns within this stream
Ac. ID Transaction ID Time ATM
A42 xxx116d91d6-ef17 11:56:58 Midland
A42 116d91d6-ef17 11:58:19 Halifax
A42 09c2f660-ef17 19:31:11 Lloyds
Legit
Legit
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Ac. ID Transaction ID Time ATM
A42 xxx116d91d6-ef17 11:56:58 Midland
A42 116d91d6-ef17 11:58:19 Halifax
A42 09c2f660-ef17 19:31:11 Lloyds
Spot patterns within this stream
Legit
Dodgy!
Legit
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Ac. ID Transaction ID Time ATM
A42 xxx116d91d6-ef17 11:56:58 Midland
A42 116d91d6-ef17 11:58:19 Halifax
A42 09c2f660-ef17 19:31:11 Lloyds
Spot patterns within this stream
Legit
Dodgy!
Legit
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
• Account id
• Location
• Amount
•
Inbound stream of ATM data
https://github.com/rmoff/gess
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
KSQL : Stream Processing with SQL
TXN_ID, ATM,
CUSTOMER_NAME,
CUSTOMER_PHONE
ATM_POSSIBLE_FRAUD;
SELECT
FROM
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Customer
details
ATM fraud txns
with customer
details
Elasticsearch
Notification
service
1. Spot fraud in stream of
transactions
2.Enrich transaction events
with customer data
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
KSQLis the
Streaming
SQL Engine
for
Apache Kafka
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
KSQL for Real-Time Monitoring
• Log data monitoring, tracking and alerting
• syslog data
• Sensor / IoT data
CREATE STREAM SYSLOG_INVALID_USERS AS
SELECT HOST, MESSAGE
FROM SYSLOG
WHERE MESSAGE LIKE '%Invalid user%';
http://cnfl.io/syslogs-filtering / http://cnfl.io/syslog-alerting
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
KSQL for Streaming ETL
CREATE STREAM vip_actions AS 

SELECT userid, page, action
FROM clickstream c
LEFT JOIN users u
ON c.userid = u.user_id 

WHERE u.level = 'Platinum';
Joining, filtering, and aggregating streams of event data
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
KSQL for Anomaly Detection
CREATE TABLE possible_fraud AS

SELECT card_number, count(*)

FROM authorization_attempts 

WINDOW TUMBLING (SIZE 5 SECONDS)

GROUP BY card_number

HAVING count(*) > 3;
Identifying patterns or anomalies in real-time data,
surfaced in milliseconds
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
CREATE STREAM pageviews
WITH (PARTITIONS=4,
VALUE_FORMAT='AVRO') AS 

SELECT * FROM pageviews_json;
KSQL for Data Transformation
Make simple derivations of existing topics from the command line
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
KSQL in Development and Production
Interactive KSQL

for development and testing
Headless KSQL

for Production
Desired KSQL queries
have been identified
REST
“Hmm, let me try

out this idea...”
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Stream Stream joins
Orders
Shipments
Which orders
haven't shipped?
order.id = shipment.order_id
Leadtime
shipment_ts - order_ts
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Stream Stream joins
ATM transactions
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Stream Stream joins
ATM transactions
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Demo!
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Self-Join (Cartesian product)
Ac. ID Transaction ID Time ATM
A42 xxx116d91d6-ef17 11:56:58 Midland
A42 116d91d6-ef17 11:58:19 Halifax
A42 09c2f660-ef17 19:31:11 Lloyds
T
Ac. ID Transaction ID Time ATM
A42 xxx116d91d6-ef17 11:56:58 Midland
A42 116d91d6-ef17 11:58:19 Halifax
A42 09c2f660-ef17 19:31:11 Lloyds
Ac. ID Transaction ID Time ATM
A42 xxx116d91d6-ef17 11:56:58 Midland
A42 116d91d6-ef17 11:58:19 Halifax
A42 09c2f660-ef17 19:31:11 Lloyds
T1 T2
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Ac. ID Transaction ID Time ATM
A42 xxx116d91d6-ef17 11:56:58 Midland
A42 116d91d6-ef17 11:58:19 Halifax
A42 09c2f660-ef17 19:31:11 Lloyds
Ac. ID Transaction ID Time ATM
A42 xxx116d91d6-ef17 11:56:58 Midland
A42 116d91d6-ef17 11:58:19 Halifax
A42 09c2f660-ef17 19:31:11 Lloyds
T1 T2
Self-Join (Cartesian product)
ATM_TXNS T1
INNER JOIN ATM_TXNS T2
ON T1.ACCOUNT_ID = T2.ACCOUNT_ID
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Self-Join (Cartesian product)
FROM ATM_TXNS T1
INNER JOIN ATM_TXNS T2
WITHIN 10 MINUTES
ON T1.ACCOUNT_ID = T2.ACCOUNT_ID
Ac. ID Transaction ID Time ATM
A42 xxx116d91d6-ef17 11:56:58 Midland
A42 116d91d6-ef17 11:58:19 Halifax
A42 09c2f660-ef17 19:31:11 Lloyds
Ac. ID Transaction ID Time ATM
A42 xxx116d91d6-ef17 11:56:58 Midland
A42 116d91d6-ef17 11:58:19 Halifax
A42 09c2f660-ef17 19:31:11 Lloyds
T1 T2
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Self-Join
T1 Txn ID T2 Txn ID T1 Time T2 Time T1 ATM T2 ATM
xxx116d91d6-ef17 116d91d6-ef17 11:56:58 11:58:19 Midland Halifax
116d91d6-ef17 xxx116d91d6-ef17 11:58:19 11:56:58 Halifax Midland
xxx116d91d6-ef17 xxx116d91d6-ef17 11:56:58 11:56:58 Midland Midland
116d91d6-ef17 116d91d6-ef17 11:58:19 11:58:19 Halifax Halifax
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Self-Join
T1 Txn ID T2 Txn ID T1 Time T2 Time T1 ATM T2 ATM
xxx116d91d6-ef17 116d91d6-ef17 11:56:58 11:58:19 Midland Halifax
116d91d6-ef17 xxx116d91d6-ef17 11:58:19 11:56:58 Halifax Midland
xxx116d91d6-ef17 xxx116d91d6-ef17 11:56:58 11:56:58 Midland Midland
116d91d6-ef17 116d91d6-ef17 11:58:19 11:58:19 Halifax Halifax
Self join on same txn IDs
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Exclude joins on the same txn
WHERE T1.TRANSACTION_ID !=
T2.TRANSACTION_ID
T1 Txn ID T2 Txn ID T1 Time T2 Time T1 ATM T2 ATM
xxx116d91d6-ef17 116d91d6-ef17 11:56:58 11:58:19 Midland Halifax
116d91d6-ef17 xxx116d91d6-ef17 11:58:19 11:56:58 Halifax Midland
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Exclude joins on the same txn
T1 Txn ID T2 Txn ID T1 Time T2 Time T1 ATM T2 ATM
xxx116d91d6-ef17 116d91d6-ef17 11:56:58 11:58:19 Midland Halifax
116d91d6-ef17 xxx116d91d6-ef17 11:58:19 11:56:58 Halifax Midland
Duplicate results (A:B / B:A)
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Join Windows
Ac. ID Transaction ID Time ATM
A42 xxx116d91d6-ef17 11:56:58 Midland
A42 116d91d6-ef17 11:58:19 Halifax
A42 09c2f660-ef17 19:31:11 Lloyds
Ac. ID Transaction ID Time ATM
A42 xxx116d91d6-ef17 11:56:58 Midland
A42 116d91d6-ef17 11:58:19 Halifax
A42 09c2f660-ef17 19:31:11 Lloyds
T1 T2
WITHIN 10 MINUTES
WHERE T1.TRANSACTION_ID !=
T2.TRANSACTION_ID
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Join Windows
Ac. ID Transaction ID Time ATM
A42 xxx116d91d6-ef17 11:56:58 Midland
A42 116d91d6-ef17 11:58:19 Halifax
A42 09c2f660-ef17 19:31:11 Lloyds
Ac. ID Transaction ID Time ATM
A42 xxx116d91d6-ef17 11:56:58 Midland
A42 116d91d6-ef17 11:58:19 Halifax
A42 09c2f660-ef17 19:31:11 Lloyds
T1 T2
WITHIN 10 MINUTES
WHERE T1.TRANSACTION_ID !=
T2.TRANSACTION_ID
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Join Windows
Ac. ID Transaction ID Time ATM
A42 xxx116d91d6-ef17 11:56:58 Midland
A42 116d91d6-ef17 11:58:19 Halifax
A42 09c2f660-ef17 19:31:11 Lloyds
Ac. ID Transaction ID Time ATM
A42 xxx116d91d6-ef17 11:56:58 Midland
A42 116d91d6-ef17 11:58:19 Halifax
A42 09c2f660-ef17 19:31:11 Lloyds
T1 T2
WITHIN 10 MINUTES
WHERE T1.TRANSACTION_ID !=
T2.TRANSACTION_ID
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Only join forward
Ac. ID Transaction ID Time ATM
A42 xxx116d91d6-ef17 11:56:58 Midland
A42 116d91d6-ef17 11:58:19 Halifax
A42 09c2f660-ef17 19:31:11 Lloyds
Ac. ID Transaction ID Time ATM
A42 xxx116d91d6-ef17 11:56:58 Midland
A42 116d91d6-ef17 11:58:19 Halifax
A42 09c2f660-ef17 19:31:11 Lloyds
T1 T2
WITHIN (0 MINUTES, 10 MINUTES)
WHERE T1.TRANSACTION_ID !=
T2.TRANSACTION_ID
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Only join forward
Ac. ID Transaction ID Time ATM
A42 xxx116d91d6-ef17 11:56:58 Midland
A42 116d91d6-ef17 11:58:19 Halifax
A42 09c2f660-ef17 19:31:11 Lloyds
Ac. ID Transaction ID Time ATM
A42 xxx116d91d6-ef17 11:56:58 Midland
A42 116d91d6-ef17 11:58:19 Halifax
A42 09c2f660-ef17 19:31:11 Lloyds
T1 T2
WITHIN (0 MINUTES, 10 MINUTES)
WHERE T1.TRANSACTION_ID !=
T2.TRANSACTION_ID
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Only join forward
T1 Txn ID T2 Txn ID T1 Time T2 Time T1 ATM T2 ATM
xxx116d91d6-ef17 116d91d6-ef17 11:56:58 11:58:19 Midland Halifax
WITHIN (0 MINUTES, 10 MINUTES)
Ignore events in the right-hand
stream prior to those in the left
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Only join forward
T1 Txn ID T2 Txn ID T1 Time T2 Time T1 ATM T2 ATM
xxx116d91d6-ef17 116d91d6-ef17 11:56:58 11:58:19 Midland Halifax
Legit Dodgy!
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Photoby EstebanLopez on Unsplash
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Calcuate distance between ATMs
GEO_DISTANCE(TX1.location->lat, TX1.location->lon,
TX2.location->lat, TX2.location->lon,
'KM')
TX1
TX2
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Calculate time between transactions
TX2.ROWTIME - TX1.ROWTIME AS
MILLISECONDS_DIFFERENCE
(TX2.ROWTIME - TX1.ROWTIME)
/ 1000 / 60 / 60 AS HOURS_DIFFERENCE
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Photoby EstebanLopez on Unsplash
GEO_DISTANCE(…) / HOURS_DIFFERENCE
AS KMH_REQUIRED
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
So speaking of time…
ksql> PRINT 'atm_txns_gess' ;
Format:JSON
{
"ROWTIME": 1544116309152,
"ROWKEY": "null",
"account_id": "a218",
"timestamp": "2018-12-06 17:09:58 +0000",
"atm": "HSBC",
…}
Kafka message
timestamp
2018-12-06 17:11:49
Event time
ksql> PRINT 'atm_txns_gess' ;
Format:JSON
{
"ROWTIME": 1544116309152,
"ROWKEY": "null",
"account_id": "a218",
"timestamp": "2018-12-06 17:09:58 +0000",
CREATE STREAM ATM_TXNS_GESS
(account_id VARCHAR, timestamp VARCHAR, …
WITH (KAFKA_TOPIC='atm_txns_gess',
TIMESTAMP='timestamp',
TIMESTAMP_FORMAT=
'yyyy-MM-dd HH:mm:ss X');
"timestamp": "2018-12-06 17:09:58 +0000",
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
But what about the account holder?
!
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Photoby SamuelZeller on Unsplash
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Customer
details
ATM fraud txns
with customer
details
Elasticsearch
Notification
service
1. Enrich transaction events
with customer data
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Streaming Integration with Kafka Connect
Kafka Brokers
Kafka Connect
Tasks Workers
Sourcessyslog
flat file
CSV
JSON
MQTT
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Streaming Integration with Kafka Connect
Kafka Brokers
Kafka Connect
Tasks Workers
Sinks
Amazon S3
MQTT
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Streaming Integration with Kafka Connect
Kafka Brokers
Kafka Connect
Tasks Workers
Sources Sinks
Amazon S3
MQTT
syslog
flat file
CSV
JSON
MQTT
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Confluent Hub
hub.confluent.io
• One-stop place to discover and
download :
• Connectors
• Transformations
• Converters
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Demo Time!
Customer
details
Kafka Connect
Debezium
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Do you think that’s a table
you are querying?
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
The Table Stream Duality
Account ID Balance
12345 €50
Account ID Amount
12345 + €50
12345 + €25
12345 -€60
Account ID Balance
12345 €75
Account ID Balance
12345 €15
Time
Stream Table
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
The truth is the log.
The database is a cache
of a subset of the log.
—Pat Helland
Immutability Changes Everything
http://cidrdb.org/cidr2015/Papers/CIDR15_Paper16.pdf
Photo by Bobby Burch on Unsplash
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Ac. ID Transaction ID Time ATM
A42 xxx116d91d6-ef17 11:56:58 Midland
A42 116d91d6-ef17 11:58:19 Halifax
A42 09c2f660-ef17 19:31:11 Lloyds
Spot patterns within this stream
Legit
Dodgy!
Legit
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Ac. ID T1 Time ATM T2 Time ATM
A42 11:56:58 Midland 11:58:19 Halifax
Suspect Transactions
Dodgy!
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Name Phone Ac. ID T1 Time ATM T2 Time ATM
Robin M 1234 567 A42 11:56:58 Midland 11:58:19 Halifax
Suspect Transactions
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Customer
details
ATM fraud txns
with customer
details
Elasticsearch
Notification
service
1. Spot fraud in stream of
transactions
2.Enrich transaction events
with customer data
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Customer
details
ATM fraud txns
with customer
details
Elasticsearch
Notification
service
1. Spot fraud in stream of
transactions
2.Enrich transaction events
with customer data
ATM_POSSIBLE_FRAUD_ENRICHED
atm_txns_gess
accounts
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
What can we do with it?
Photoby JoshuaRodriguez on Unsplash
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Realtime Operations View & Analysis
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Push notification to the customer
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Confluent Community Components
Apache Kafka with a bunch of cool stuff! For free!
Database Changes Log Events loT Data Web Events …
CRM
Data Warehouse
Database
Hadoop
Data

Integration
…
Monitoring
Analytics
Custom Apps
Transformations
Real-time Applications
…
Confluent Platform
Confluent Platform
Apache Kafka®
Core | Connect API | Streams API
Data Compatibility
Schema Registry
Monitoring & Administration
Confluent Control Center | Security
Operations
Replicator | Auto Data Balancing
Development and Connectivity
Clients | Connectors | REST Proxy | CLI
SQL Stream Processing
KSQL
Datacenter Public Cloud Confluent Cloud
CONFLUENT FULLY-MANAGEDCUSTOMER SELF-MANAGED
ATM Fraud Detection with Apache Kafka and KSQL
@rmoff
Free Books!
https://www.confluent.io/apache-kafka-stream-processing-book-bundle
@rmoff
robin@confluent.io
https://www.confluent.io/ksql
http://cnfl.io/slack https://cnfl.io/demo-scene

More Related Content

More from confluent

Citi TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep DiveCiti TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep Diveconfluent
 
Build real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with ConfluentBuild real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with Confluentconfluent
 
Q&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service MeshQ&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service Meshconfluent
 
Citi Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka MicroservicesCiti Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka Microservicesconfluent
 
Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3confluent
 
Citi Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging ModernizationCiti Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging Modernizationconfluent
 
Citi Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time dataCiti Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time dataconfluent
 
Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2confluent
 
Data In Motion Paris 2023
Data In Motion Paris 2023Data In Motion Paris 2023
Data In Motion Paris 2023confluent
 
Confluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with SynthesisConfluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with Synthesisconfluent
 
The Future of Application Development - API Days - Melbourne 2023
The Future of Application Development - API Days - Melbourne 2023The Future of Application Development - API Days - Melbourne 2023
The Future of Application Development - API Days - Melbourne 2023confluent
 
The Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data StreamsThe Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data Streamsconfluent
 
The Journey to Data Mesh with Confluent
The Journey to Data Mesh with ConfluentThe Journey to Data Mesh with Confluent
The Journey to Data Mesh with Confluentconfluent
 
Citi Tech Talk: Monitoring and Performance
Citi Tech Talk: Monitoring and PerformanceCiti Tech Talk: Monitoring and Performance
Citi Tech Talk: Monitoring and Performanceconfluent
 
Confluent Partner Tech Talk with Reply
Confluent Partner Tech Talk with ReplyConfluent Partner Tech Talk with Reply
Confluent Partner Tech Talk with Replyconfluent
 
Citi Tech Talk Disaster Recovery Solutions Deep Dive
Citi Tech Talk  Disaster Recovery Solutions Deep DiveCiti Tech Talk  Disaster Recovery Solutions Deep Dive
Citi Tech Talk Disaster Recovery Solutions Deep Diveconfluent
 
Citi Tech Talk: Hybrid Cloud
Citi Tech Talk: Hybrid CloudCiti Tech Talk: Hybrid Cloud
Citi Tech Talk: Hybrid Cloudconfluent
 
Partner Tech Talk Q3: Q&A with PS - Migration and Upgrade
Partner Tech Talk Q3: Q&A with PS - Migration and UpgradePartner Tech Talk Q3: Q&A with PS - Migration and Upgrade
Partner Tech Talk Q3: Q&A with PS - Migration and Upgradeconfluent
 
Confluent Partner Tech Talk with QLIK
Confluent Partner Tech Talk with QLIKConfluent Partner Tech Talk with QLIK
Confluent Partner Tech Talk with QLIKconfluent
 
Real-time Streaming for Government and the Public Sector
Real-time Streaming for Government and the Public SectorReal-time Streaming for Government and the Public Sector
Real-time Streaming for Government and the Public Sectorconfluent
 

More from confluent (20)

Citi TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep DiveCiti TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep Dive
 
Build real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with ConfluentBuild real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with Confluent
 
Q&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service MeshQ&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service Mesh
 
Citi Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka MicroservicesCiti Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka Microservices
 
Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3
 
Citi Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging ModernizationCiti Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging Modernization
 
Citi Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time dataCiti Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time data
 
Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2
 
Data In Motion Paris 2023
Data In Motion Paris 2023Data In Motion Paris 2023
Data In Motion Paris 2023
 
Confluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with SynthesisConfluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with Synthesis
 
The Future of Application Development - API Days - Melbourne 2023
The Future of Application Development - API Days - Melbourne 2023The Future of Application Development - API Days - Melbourne 2023
The Future of Application Development - API Days - Melbourne 2023
 
The Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data StreamsThe Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data Streams
 
The Journey to Data Mesh with Confluent
The Journey to Data Mesh with ConfluentThe Journey to Data Mesh with Confluent
The Journey to Data Mesh with Confluent
 
Citi Tech Talk: Monitoring and Performance
Citi Tech Talk: Monitoring and PerformanceCiti Tech Talk: Monitoring and Performance
Citi Tech Talk: Monitoring and Performance
 
Confluent Partner Tech Talk with Reply
Confluent Partner Tech Talk with ReplyConfluent Partner Tech Talk with Reply
Confluent Partner Tech Talk with Reply
 
Citi Tech Talk Disaster Recovery Solutions Deep Dive
Citi Tech Talk  Disaster Recovery Solutions Deep DiveCiti Tech Talk  Disaster Recovery Solutions Deep Dive
Citi Tech Talk Disaster Recovery Solutions Deep Dive
 
Citi Tech Talk: Hybrid Cloud
Citi Tech Talk: Hybrid CloudCiti Tech Talk: Hybrid Cloud
Citi Tech Talk: Hybrid Cloud
 
Partner Tech Talk Q3: Q&A with PS - Migration and Upgrade
Partner Tech Talk Q3: Q&A with PS - Migration and UpgradePartner Tech Talk Q3: Q&A with PS - Migration and Upgrade
Partner Tech Talk Q3: Q&A with PS - Migration and Upgrade
 
Confluent Partner Tech Talk with QLIK
Confluent Partner Tech Talk with QLIKConfluent Partner Tech Talk with QLIK
Confluent Partner Tech Talk with QLIK
 
Real-time Streaming for Government and the Public Sector
Real-time Streaming for Government and the Public SectorReal-time Streaming for Government and the Public Sector
Real-time Streaming for Government and the Public Sector
 

Recently uploaded

Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 

Recently uploaded (20)

Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 

ATM Fraud Detection with Apache Kafka and KSQL

  • 1. ATM Fraud Detection @rmoff with Apache Kafka
and KSQL Photoby FreddieCollins on Unsplash
  • 2. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Spotting fraud in realtime hotoby MirzaBabic on Unsplash
  • 4. ATM Fraud Detection with Apache Kafka and KSQL @rmoff • Account id • Location • Amount • Inbound stream of ATM data https://github.com/rmoff/gess
  • 5. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Demo!
  • 6. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Spot patterns within this stream Ac. ID Transaction ID Time ATM A42 xxx116d91d6-ef17 11:56:58 Midland A42 116d91d6-ef17 11:58:19 Halifax A42 09c2f660-ef17 19:31:11 Lloyds
  • 7. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Spot patterns within this stream Ac. ID Transaction ID Time ATM A42 xxx116d91d6-ef17 11:56:58 Midland A42 116d91d6-ef17 11:58:19 Halifax A42 09c2f660-ef17 19:31:11 Lloyds Legit Legit
  • 8. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Ac. ID Transaction ID Time ATM A42 xxx116d91d6-ef17 11:56:58 Midland A42 116d91d6-ef17 11:58:19 Halifax A42 09c2f660-ef17 19:31:11 Lloyds Spot patterns within this stream Legit Dodgy! Legit
  • 9. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Ac. ID Transaction ID Time ATM A42 xxx116d91d6-ef17 11:56:58 Midland A42 116d91d6-ef17 11:58:19 Halifax A42 09c2f660-ef17 19:31:11 Lloyds Spot patterns within this stream Legit Dodgy! Legit
  • 10. ATM Fraud Detection with Apache Kafka and KSQL @rmoff • Account id • Location • Amount • Inbound stream of ATM data https://github.com/rmoff/gess
  • 11. ATM Fraud Detection with Apache Kafka and KSQL @rmoff KSQL : Stream Processing with SQL TXN_ID, ATM, CUSTOMER_NAME, CUSTOMER_PHONE ATM_POSSIBLE_FRAUD; SELECT FROM
  • 12. ATM Fraud Detection with Apache Kafka and KSQL @rmoff
  • 13. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Customer details ATM fraud txns with customer details Elasticsearch Notification service 1. Spot fraud in stream of transactions 2.Enrich transaction events with customer data
  • 14. ATM Fraud Detection with Apache Kafka and KSQL @rmoff KSQLis the Streaming SQL Engine for Apache Kafka
  • 15. ATM Fraud Detection with Apache Kafka and KSQL @rmoff KSQL for Real-Time Monitoring • Log data monitoring, tracking and alerting • syslog data • Sensor / IoT data CREATE STREAM SYSLOG_INVALID_USERS AS SELECT HOST, MESSAGE FROM SYSLOG WHERE MESSAGE LIKE '%Invalid user%'; http://cnfl.io/syslogs-filtering / http://cnfl.io/syslog-alerting
  • 16. ATM Fraud Detection with Apache Kafka and KSQL @rmoff KSQL for Streaming ETL CREATE STREAM vip_actions AS 
 SELECT userid, page, action FROM clickstream c LEFT JOIN users u ON c.userid = u.user_id 
 WHERE u.level = 'Platinum'; Joining, filtering, and aggregating streams of event data
  • 17. ATM Fraud Detection with Apache Kafka and KSQL @rmoff KSQL for Anomaly Detection CREATE TABLE possible_fraud AS
 SELECT card_number, count(*)
 FROM authorization_attempts 
 WINDOW TUMBLING (SIZE 5 SECONDS)
 GROUP BY card_number
 HAVING count(*) > 3; Identifying patterns or anomalies in real-time data, surfaced in milliseconds
  • 18. ATM Fraud Detection with Apache Kafka and KSQL @rmoff CREATE STREAM pageviews WITH (PARTITIONS=4, VALUE_FORMAT='AVRO') AS 
 SELECT * FROM pageviews_json; KSQL for Data Transformation Make simple derivations of existing topics from the command line
  • 19. ATM Fraud Detection with Apache Kafka and KSQL @rmoff KSQL in Development and Production Interactive KSQL
 for development and testing Headless KSQL
 for Production Desired KSQL queries have been identified REST “Hmm, let me try
 out this idea...”
  • 20. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Stream Stream joins Orders Shipments Which orders haven't shipped? order.id = shipment.order_id Leadtime shipment_ts - order_ts
  • 21. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Stream Stream joins ATM transactions
  • 22. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Stream Stream joins ATM transactions
  • 23. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Demo!
  • 24. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Self-Join (Cartesian product) Ac. ID Transaction ID Time ATM A42 xxx116d91d6-ef17 11:56:58 Midland A42 116d91d6-ef17 11:58:19 Halifax A42 09c2f660-ef17 19:31:11 Lloyds T Ac. ID Transaction ID Time ATM A42 xxx116d91d6-ef17 11:56:58 Midland A42 116d91d6-ef17 11:58:19 Halifax A42 09c2f660-ef17 19:31:11 Lloyds Ac. ID Transaction ID Time ATM A42 xxx116d91d6-ef17 11:56:58 Midland A42 116d91d6-ef17 11:58:19 Halifax A42 09c2f660-ef17 19:31:11 Lloyds T1 T2
  • 25. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Ac. ID Transaction ID Time ATM A42 xxx116d91d6-ef17 11:56:58 Midland A42 116d91d6-ef17 11:58:19 Halifax A42 09c2f660-ef17 19:31:11 Lloyds Ac. ID Transaction ID Time ATM A42 xxx116d91d6-ef17 11:56:58 Midland A42 116d91d6-ef17 11:58:19 Halifax A42 09c2f660-ef17 19:31:11 Lloyds T1 T2 Self-Join (Cartesian product) ATM_TXNS T1 INNER JOIN ATM_TXNS T2 ON T1.ACCOUNT_ID = T2.ACCOUNT_ID
  • 26. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Self-Join (Cartesian product) FROM ATM_TXNS T1 INNER JOIN ATM_TXNS T2 WITHIN 10 MINUTES ON T1.ACCOUNT_ID = T2.ACCOUNT_ID Ac. ID Transaction ID Time ATM A42 xxx116d91d6-ef17 11:56:58 Midland A42 116d91d6-ef17 11:58:19 Halifax A42 09c2f660-ef17 19:31:11 Lloyds Ac. ID Transaction ID Time ATM A42 xxx116d91d6-ef17 11:56:58 Midland A42 116d91d6-ef17 11:58:19 Halifax A42 09c2f660-ef17 19:31:11 Lloyds T1 T2
  • 27. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Self-Join T1 Txn ID T2 Txn ID T1 Time T2 Time T1 ATM T2 ATM xxx116d91d6-ef17 116d91d6-ef17 11:56:58 11:58:19 Midland Halifax 116d91d6-ef17 xxx116d91d6-ef17 11:58:19 11:56:58 Halifax Midland xxx116d91d6-ef17 xxx116d91d6-ef17 11:56:58 11:56:58 Midland Midland 116d91d6-ef17 116d91d6-ef17 11:58:19 11:58:19 Halifax Halifax
  • 28. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Self-Join T1 Txn ID T2 Txn ID T1 Time T2 Time T1 ATM T2 ATM xxx116d91d6-ef17 116d91d6-ef17 11:56:58 11:58:19 Midland Halifax 116d91d6-ef17 xxx116d91d6-ef17 11:58:19 11:56:58 Halifax Midland xxx116d91d6-ef17 xxx116d91d6-ef17 11:56:58 11:56:58 Midland Midland 116d91d6-ef17 116d91d6-ef17 11:58:19 11:58:19 Halifax Halifax Self join on same txn IDs
  • 29. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Exclude joins on the same txn WHERE T1.TRANSACTION_ID != T2.TRANSACTION_ID T1 Txn ID T2 Txn ID T1 Time T2 Time T1 ATM T2 ATM xxx116d91d6-ef17 116d91d6-ef17 11:56:58 11:58:19 Midland Halifax 116d91d6-ef17 xxx116d91d6-ef17 11:58:19 11:56:58 Halifax Midland
  • 30. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Exclude joins on the same txn T1 Txn ID T2 Txn ID T1 Time T2 Time T1 ATM T2 ATM xxx116d91d6-ef17 116d91d6-ef17 11:56:58 11:58:19 Midland Halifax 116d91d6-ef17 xxx116d91d6-ef17 11:58:19 11:56:58 Halifax Midland Duplicate results (A:B / B:A)
  • 31. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Join Windows Ac. ID Transaction ID Time ATM A42 xxx116d91d6-ef17 11:56:58 Midland A42 116d91d6-ef17 11:58:19 Halifax A42 09c2f660-ef17 19:31:11 Lloyds Ac. ID Transaction ID Time ATM A42 xxx116d91d6-ef17 11:56:58 Midland A42 116d91d6-ef17 11:58:19 Halifax A42 09c2f660-ef17 19:31:11 Lloyds T1 T2 WITHIN 10 MINUTES WHERE T1.TRANSACTION_ID != T2.TRANSACTION_ID
  • 32. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Join Windows Ac. ID Transaction ID Time ATM A42 xxx116d91d6-ef17 11:56:58 Midland A42 116d91d6-ef17 11:58:19 Halifax A42 09c2f660-ef17 19:31:11 Lloyds Ac. ID Transaction ID Time ATM A42 xxx116d91d6-ef17 11:56:58 Midland A42 116d91d6-ef17 11:58:19 Halifax A42 09c2f660-ef17 19:31:11 Lloyds T1 T2 WITHIN 10 MINUTES WHERE T1.TRANSACTION_ID != T2.TRANSACTION_ID
  • 33. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Join Windows Ac. ID Transaction ID Time ATM A42 xxx116d91d6-ef17 11:56:58 Midland A42 116d91d6-ef17 11:58:19 Halifax A42 09c2f660-ef17 19:31:11 Lloyds Ac. ID Transaction ID Time ATM A42 xxx116d91d6-ef17 11:56:58 Midland A42 116d91d6-ef17 11:58:19 Halifax A42 09c2f660-ef17 19:31:11 Lloyds T1 T2 WITHIN 10 MINUTES WHERE T1.TRANSACTION_ID != T2.TRANSACTION_ID
  • 34. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Only join forward Ac. ID Transaction ID Time ATM A42 xxx116d91d6-ef17 11:56:58 Midland A42 116d91d6-ef17 11:58:19 Halifax A42 09c2f660-ef17 19:31:11 Lloyds Ac. ID Transaction ID Time ATM A42 xxx116d91d6-ef17 11:56:58 Midland A42 116d91d6-ef17 11:58:19 Halifax A42 09c2f660-ef17 19:31:11 Lloyds T1 T2 WITHIN (0 MINUTES, 10 MINUTES) WHERE T1.TRANSACTION_ID != T2.TRANSACTION_ID
  • 35. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Only join forward Ac. ID Transaction ID Time ATM A42 xxx116d91d6-ef17 11:56:58 Midland A42 116d91d6-ef17 11:58:19 Halifax A42 09c2f660-ef17 19:31:11 Lloyds Ac. ID Transaction ID Time ATM A42 xxx116d91d6-ef17 11:56:58 Midland A42 116d91d6-ef17 11:58:19 Halifax A42 09c2f660-ef17 19:31:11 Lloyds T1 T2 WITHIN (0 MINUTES, 10 MINUTES) WHERE T1.TRANSACTION_ID != T2.TRANSACTION_ID
  • 36. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Only join forward T1 Txn ID T2 Txn ID T1 Time T2 Time T1 ATM T2 ATM xxx116d91d6-ef17 116d91d6-ef17 11:56:58 11:58:19 Midland Halifax WITHIN (0 MINUTES, 10 MINUTES) Ignore events in the right-hand stream prior to those in the left
  • 37. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Only join forward T1 Txn ID T2 Txn ID T1 Time T2 Time T1 ATM T2 ATM xxx116d91d6-ef17 116d91d6-ef17 11:56:58 11:58:19 Midland Halifax Legit Dodgy!
  • 38. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Photoby EstebanLopez on Unsplash
  • 39. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Calcuate distance between ATMs GEO_DISTANCE(TX1.location->lat, TX1.location->lon, TX2.location->lat, TX2.location->lon, 'KM') TX1 TX2
  • 40. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Calculate time between transactions TX2.ROWTIME - TX1.ROWTIME AS MILLISECONDS_DIFFERENCE (TX2.ROWTIME - TX1.ROWTIME) / 1000 / 60 / 60 AS HOURS_DIFFERENCE
  • 41. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Photoby EstebanLopez on Unsplash GEO_DISTANCE(…) / HOURS_DIFFERENCE AS KMH_REQUIRED
  • 42. ATM Fraud Detection with Apache Kafka and KSQL @rmoff So speaking of time… ksql> PRINT 'atm_txns_gess' ; Format:JSON { "ROWTIME": 1544116309152, "ROWKEY": "null", "account_id": "a218", "timestamp": "2018-12-06 17:09:58 +0000", "atm": "HSBC", …} Kafka message timestamp 2018-12-06 17:11:49 Event time
  • 43. ksql> PRINT 'atm_txns_gess' ; Format:JSON { "ROWTIME": 1544116309152, "ROWKEY": "null", "account_id": "a218", "timestamp": "2018-12-06 17:09:58 +0000", CREATE STREAM ATM_TXNS_GESS (account_id VARCHAR, timestamp VARCHAR, … WITH (KAFKA_TOPIC='atm_txns_gess', TIMESTAMP='timestamp', TIMESTAMP_FORMAT= 'yyyy-MM-dd HH:mm:ss X'); "timestamp": "2018-12-06 17:09:58 +0000",
  • 44. ATM Fraud Detection with Apache Kafka and KSQL @rmoff But what about the account holder? !
  • 45. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Photoby SamuelZeller on Unsplash
  • 46. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Customer details ATM fraud txns with customer details Elasticsearch Notification service 1. Enrich transaction events with customer data
  • 47. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Streaming Integration with Kafka Connect Kafka Brokers Kafka Connect Tasks Workers Sourcessyslog flat file CSV JSON MQTT
  • 48. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Streaming Integration with Kafka Connect Kafka Brokers Kafka Connect Tasks Workers Sinks Amazon S3 MQTT
  • 49. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Streaming Integration with Kafka Connect Kafka Brokers Kafka Connect Tasks Workers Sources Sinks Amazon S3 MQTT syslog flat file CSV JSON MQTT
  • 50. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Confluent Hub hub.confluent.io • One-stop place to discover and download : • Connectors • Transformations • Converters
  • 51. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Demo Time! Customer details Kafka Connect Debezium
  • 52. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Do you think that’s a table you are querying?
  • 53. ATM Fraud Detection with Apache Kafka and KSQL @rmoff The Table Stream Duality Account ID Balance 12345 €50 Account ID Amount 12345 + €50 12345 + €25 12345 -€60 Account ID Balance 12345 €75 Account ID Balance 12345 €15 Time Stream Table
  • 54. ATM Fraud Detection with Apache Kafka and KSQL @rmoff The truth is the log. The database is a cache of a subset of the log. —Pat Helland Immutability Changes Everything http://cidrdb.org/cidr2015/Papers/CIDR15_Paper16.pdf Photo by Bobby Burch on Unsplash
  • 55. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Ac. ID Transaction ID Time ATM A42 xxx116d91d6-ef17 11:56:58 Midland A42 116d91d6-ef17 11:58:19 Halifax A42 09c2f660-ef17 19:31:11 Lloyds Spot patterns within this stream Legit Dodgy! Legit
  • 56. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Ac. ID T1 Time ATM T2 Time ATM A42 11:56:58 Midland 11:58:19 Halifax Suspect Transactions Dodgy!
  • 57. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Name Phone Ac. ID T1 Time ATM T2 Time ATM Robin M 1234 567 A42 11:56:58 Midland 11:58:19 Halifax Suspect Transactions
  • 58. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Customer details ATM fraud txns with customer details Elasticsearch Notification service 1. Spot fraud in stream of transactions 2.Enrich transaction events with customer data
  • 59. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Customer details ATM fraud txns with customer details Elasticsearch Notification service 1. Spot fraud in stream of transactions 2.Enrich transaction events with customer data ATM_POSSIBLE_FRAUD_ENRICHED atm_txns_gess accounts
  • 60. ATM Fraud Detection with Apache Kafka and KSQL @rmoff What can we do with it? Photoby JoshuaRodriguez on Unsplash
  • 61. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Realtime Operations View & Analysis
  • 62. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Push notification to the customer
  • 63. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Confluent Community Components Apache Kafka with a bunch of cool stuff! For free! Database Changes Log Events loT Data Web Events … CRM Data Warehouse Database Hadoop Data
 Integration … Monitoring Analytics Custom Apps Transformations Real-time Applications … Confluent Platform Confluent Platform Apache Kafka® Core | Connect API | Streams API Data Compatibility Schema Registry Monitoring & Administration Confluent Control Center | Security Operations Replicator | Auto Data Balancing Development and Connectivity Clients | Connectors | REST Proxy | CLI SQL Stream Processing KSQL Datacenter Public Cloud Confluent Cloud CONFLUENT FULLY-MANAGEDCUSTOMER SELF-MANAGED
  • 64. ATM Fraud Detection with Apache Kafka and KSQL @rmoff Free Books! https://www.confluent.io/apache-kafka-stream-processing-book-bundle