Detecting Real-Time Financial Fraud with Cloudflow on Kubernetes
Gerard Maas, Principal Engineer at Lightbend.
Agenda
● Intro: Productizing Data Science
● What’s Cloudflow
● Building a Fraud Detection Model
● Running the Model With Cloudflow
Cloudflow is a development toolkit that enables you to quickly
develop, orchestrate, and operate distributed streaming
applications on Kubernetes.
Cloudflow toolkit, as shown in the diagram:
● Streamlet API
● Blueprints
● Sandbox
● Build tool extensions
● kubectl cloudflow
● Operator
Streamlets
[Diagram: a streamlet wraps its logic between inlet(s) and outlet(s), each typed with a { Schema }. Streamlets are connected outlet to inlet; the diagram marks one connection ✔ and one ❌.]
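To make the streamlet idea concrete, here is a minimal plain-Python sketch of logic wrapped between a schema-typed inlet and outlet, with a connection check that compares schemas. All names (`Port`, `Streamlet`, `can_connect`, the `TX` and `SCORED` schemas) are hypothetical and chosen for illustration; this is not the Cloudflow Streamlet API, and reading ✔ / ❌ as "schemas match / do not match" is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical model of a streamlet; not the Cloudflow API.
Schema = Dict[str, type]  # field name -> expected Python type


@dataclass(frozen=True)
class Port:
    name: str
    schema: Schema


@dataclass(frozen=True)
class Streamlet:
    name: str
    inlet: Port
    outlet: Port
    logic: Callable[[dict], dict]


def can_connect(upstream: Streamlet, downstream: Streamlet) -> bool:
    """Allow a connection only when outlet and inlet schemas are identical."""
    return upstream.outlet.schema == downstream.inlet.schema


# Assumed example schemas.
TX = {"account": str, "amount": float}
SCORED = {"account": str, "amount": float, "score": float}

scorer = Streamlet("scorer", Port("in", TX), Port("out", SCORED),
                   lambda record: {**record, "score": 0.0})
printer = Streamlet("printer", Port("in", SCORED), Port("out", SCORED),
                    lambda record: record)
```

Here `can_connect(scorer, printer)` holds because both sides agree on `SCORED`, while `can_connect(printer, scorer)` does not, since `scorer` expects `TX` on its inlet.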
Cloudflow API :: Blueprints
Easily integrate streamlets written in Akka Streams, Spark Structured Streaming, and Flink
● Merge different input streams
● Validate record formats, field values
● Use ML for more sophisticated analysis
● Compute aggregations (e.g., statistics)
● Send results downstream
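The stages listed above (merge, validate, ML analysis, aggregation, send downstream) can be sketched as a chain of plain Python functions. This is only a stand-in for how a blueprint wires stages together, not Cloudflow code: the function names, the `amount` field, and the threshold that plays the role of the ML step are all assumptions.

```python
from statistics import mean


def merge(*streams):
    """Merge different input streams into one."""
    for stream in streams:
        yield from stream


def validate(records):
    """Keep only records with a non-negative numeric amount."""
    for record in records:
        amount = record.get("amount")
        if isinstance(amount, (int, float)) and amount >= 0:
            yield record


def analyse(records):
    """Placeholder for the ML step: a fixed threshold, not a real model."""
    for record in records:
        yield {**record, "suspicious": record["amount"] > 1000}


def aggregate(records):
    """Compute simple statistics over the analysed records."""
    records = list(records)
    return {
        "count": len(records),
        "mean_amount": mean(r["amount"] for r in records),
        "flagged": sum(r["suspicious"] for r in records),
    }


# Assumed sample inputs from two sources.
cards = [{"amount": 20.0}, {"amount": 5000.0}]
wires = [{"amount": -1.0}, {"amount": 80.0}]

summary = aggregate(analyse(validate(merge(cards, wires))))
print(summary)  # "send results downstream" is reduced to printing here
```

Each function consumes the output of the previous one, which mirrors the outlet-to-inlet wiring that a blueprint describes.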
The Data Science Process
we’re here
Img src: https://randalscottking.com/machine-learning-overview/
[Diagram: Transactions → Fraud Detection Model]
Process stages: Data Understanding → Data Preparation → Data Modelling → Validation → Deployment
The second diagram adds the stages: Data Ingestion → Data Cleaning & Enrichment → Model Scoring (with the model) → Result Propagation
[Diagram: Transactions → HTTP ingress [schema] → Enrichment (+features) → Fraud ML Scoring → Egress (console), with the py DT Model on a Persistent Volume]
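The pipeline in the diagram (HTTP ingress with a schema, enrichment with extra features, fraud scoring, console egress) can be approximated end to end in plain Python. This is a sketch under stated assumptions: the record schema, the two derived features, and the hand-written two-level decision rule are invented for illustration and stand in for the real model, which the diagram places on a Persistent Volume.

```python
import json

# Assumed record schema; the slides only show "[schema]" on the ingress.
SCHEMA = {"tx_id": str, "amount": float, "hour": int}


def ingress(payload: str) -> dict:
    """Parse a JSON payload and reject records that do not match SCHEMA."""
    record = json.loads(payload)
    for field, expected in SCHEMA.items():
        if not isinstance(record.get(field), expected):
            raise ValueError(f"invalid or missing field: {field}")
    return record


def enrich(record: dict) -> dict:
    """Add derived features (hypothetical ones) to the transaction."""
    return {**record,
            "is_night": record["hour"] < 6,
            "is_large": record["amount"] > 500.0}


def score(record: dict) -> dict:
    """Hand-written decision rule standing in for the trained model."""
    if record["is_large"]:
        fraud_score = 0.9 if record["is_night"] else 0.4
    else:
        fraud_score = 0.05
    return {**record, "fraud_score": fraud_score}


def egress(record: dict) -> dict:
    """Console egress: print the scored record and pass it on."""
    print(f"{record['tx_id']}: fraud_score={record['fraud_score']}")
    return record


def run(payload: str) -> dict:
    return egress(score(enrich(ingress(payload))))
```

For example, `run('{"tx_id": "t1", "amount": 950.0, "hour": 3}')` passes the schema check, is tagged as large and at night, and is scored by the first branch of the rule.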
Model serving options, each with external batch training:
● Embedded Model, e.g. SparkML
● External Model Service, e.g. TFServing
● Managed Streams, e.g. TFjvm
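The difference between the embedded-model and external-model-service options can be sketched as two implementations of one scoring interface. Everything here is hypothetical: `Scorer`, `EmbeddedScorer`, `RemoteScorer`, the `call` transport function, and the request and response shapes are illustrative assumptions, and the fake service replaces a real network call. The managed-streams option is not covered.

```python
from typing import Callable, Protocol


class Scorer(Protocol):
    def score(self, features: dict) -> float: ...


class EmbeddedScorer:
    """The model object lives inside the streaming process."""

    def __init__(self, model: Callable[[dict], float]):
        self._model = model

    def score(self, features: dict) -> float:
        return self._model(features)


class RemoteScorer:
    """The model lives behind a service.

    `call` is a hypothetical transport function (for example, a thin
    wrapper around an HTTP client); the payload shape is an assumption.
    """

    def __init__(self, call: Callable[[dict], dict]):
        self._call = call

    def score(self, features: dict) -> float:
        response = self._call({"instances": [features]})
        return float(response["predictions"][0])


def fake_service(request: dict) -> dict:
    """Stand-in for a model server: returns a fixed score per instance."""
    return {"predictions": [0.7 for _ in request["instances"]]}


embedded = EmbeddedScorer(lambda features: 0.2)
remote = RemoteScorer(fake_service)
```

Because both classes expose the same `score` method, the streaming logic that calls them does not need to know where the model runs.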
The Mission of ML in Cloudflow
Infuse “AI/ML” or smarter real-time analytics into existing apps
▪ Loan approval, device maintenance, next best offer, recommendation engine
Mix domain logic with streaming analytics and model serving
▪ Offer a programming model and runtime that facilitates the creation of new
data-driven services
Enable the productization of ML in the Enterprise
Create a flowing “stream” between Data Science and Data Engineering
Get started with Cloudflow at cloudflow.io
Join our contributor community at:
http://github.com/lightbend/cloudflow
Thank You
Gerard Maas
Principal Engineer
gerard.maas@lightbend.com
@maasg
