SlideShare a Scribd company logo
building intelligent data products
what actually is fraud
architecting flexible data ‘plumbing’
building solid data products on top of them
stephen whitworth
2 years at Hailo as data scientist/jack of some trades out of
university
product and marketplace analytics, agent based
modelling, data engineering, ‘ML’ services
data science/engineering at ravelin, specifically
focused on our detection capabilities
what is ravelin?
online fraud detection and prevention platform
stream application/server data to our events API
we give fraud probability + beautiful data visualisation
backed by techstars/passion/playfair/amadeus/indeed.com
founder/wonga founder amongst other great investors
fraud?
$14B
a dollar for every year the universe has existed
Same day delivery On-demand services
‘victimless crime’
police ill-equipped to handle
low barrier to entry from dark net
3D secure - conversion killer
traditional: human generated rules, born of deep expertise
order-centric view of the world
hybrid: augment expertise by learning rules from data
cards don’t commit fraud, people do
building
good
plumbing
receive firehose through API
decode arbitrary data and store
extract hundreds of features
http/slack/whatever notification to customer
in 100-300ms (ish)
run through N models and rule engine to get probability
BUZZWORDS ABOUND
go
postgres
AWS
microservices
zookeeper
NSQ python
event-driven
elasticsearch bigquery dynamodb
redis
instrumentation
different
databases
for different
needs
kudos if you get The Office reference
postgres: solid, start here
dynamodb: very high throughput, low latency data
bigquery: to answer any question you could possibly have
elasticsearch: rich querying in a reasonable amount of time
graph db: haven’t decided, recommendations?
asynchronous systemsfirehoses
nice deployment patterns
‘lambda architecture’ - the append only log
services store their own interpretation of events
services are almost entirely decoupled
asynchronous systemsfirehoses
error propagation is challenging
no guarantees of SLA - at least as slow as your queue
hard to know who or what is consuming your data
building
data
products
‘a random forest is like a room full of
experts who have seen different
cases of fraud from different
perspectives’
‘a random forest is like a room full of
experts who have seen different
cases of fraud from different
perspectives’
N
precision: of all of my predictions, what % was I correct?
recall: out of all of the fraudsters, what % did I catch?
implicit tradeoff between conversion and fraud loss
‘accuracy’ a useless metric for fraud
99.8% ACCURATE
keep model interfaces simple
hide arbitrarily complex transformations behind it
blend global and client specific models
building and training statistical models
currently batch
will combine with online
RANDOM FORESTS
‘a random forest is like a room full of
experts who have seen different
cases of fraud from different
perspectives’
RANDOM FORESTS
MONITORING
probabilistic, not deterministic
dogfood - use live robot customers
run models in ‘dark mode’ to determine performance
why not deep learning? ..yet
ability to debug random forests
had nice results with keras
serialisation and deployment: an unsolved problem
in beta and signing up clients
looking for on-demand services/marketplaces
talk to me afterwards
obligatory: we are hiring!
senior machine learning engineers/data scientists
stephen.whitworth@ravelin.com or talk to me after
@sjwhitworth
www.ravelin.com - @ravelinhq

More Related Content

Similar to Building Intelligent Data Products

Experiment
ExperimentExperiment
Experiment
jbashask
 
Nasscom how can you identify fraud in fintech lending using deep learning
Nasscom how can you identify fraud in fintech lending using deep learningNasscom how can you identify fraud in fintech lending using deep learning
Nasscom how can you identify fraud in fintech lending using deep learning
Ratnakar Pandey
 
LSI Spring Agent Open House 2014
LSI Spring Agent Open House 2014LSI Spring Agent Open House 2014
LSI Spring Agent Open House 2014
Ashlie Steele
 
Operationalize deep learning models for fraud detection with Azure Machine Le...
Operationalize deep learning models for fraud detection with Azure Machine Le...Operationalize deep learning models for fraud detection with Azure Machine Le...
Operationalize deep learning models for fraud detection with Azure Machine Le...
Francesca Lazzeri, PhD
 
Fraud Engineering, from Merchant Risk Council Annual Meeting 2012
Fraud Engineering, from Merchant Risk Council Annual Meeting 2012Fraud Engineering, from Merchant Risk Council Annual Meeting 2012
Fraud Engineering, from Merchant Risk Council Annual Meeting 2012
Nick Galbreath
 
shyampresentaaaaaaaaaaaaaaaaaaaaaaa.pptx
shyampresentaaaaaaaaaaaaaaaaaaaaaaa.pptxshyampresentaaaaaaaaaaaaaaaaaaaaaaa.pptx
shyampresentaaaaaaaaaaaaaaaaaaaaaaa.pptx
ShyamaprasadMS
 
DevSecCon London 2018: How to fit threat modelling into agile development: sl...
DevSecCon London 2018: How to fit threat modelling into agile development: sl...DevSecCon London 2018: How to fit threat modelling into agile development: sl...
DevSecCon London 2018: How to fit threat modelling into agile development: sl...
DevSecCon
 

Similar to Building Intelligent Data Products (20)

WeDo Technologies Blog 2014
WeDo Technologies Blog 2014WeDo Technologies Blog 2014
WeDo Technologies Blog 2014
 
Experiment
ExperimentExperiment
Experiment
 
Nasscom how can you identify fraud in fintech lending using deep learning
Nasscom how can you identify fraud in fintech lending using deep learningNasscom how can you identify fraud in fintech lending using deep learning
Nasscom how can you identify fraud in fintech lending using deep learning
 
LSI Spring Agent Open House 2014
LSI Spring Agent Open House 2014LSI Spring Agent Open House 2014
LSI Spring Agent Open House 2014
 
Operationalize deep learning models for fraud detection with Azure Machine Le...
Operationalize deep learning models for fraud detection with Azure Machine Le...Operationalize deep learning models for fraud detection with Azure Machine Le...
Operationalize deep learning models for fraud detection with Azure Machine Le...
 
Mark Villinski - Top 10 Tips for Educating Employees about Cybersecurity
Mark Villinski - Top 10 Tips for Educating Employees about CybersecurityMark Villinski - Top 10 Tips for Educating Employees about Cybersecurity
Mark Villinski - Top 10 Tips for Educating Employees about Cybersecurity
 
Fraud Engineering, from Merchant Risk Council Annual Meeting 2012
Fraud Engineering, from Merchant Risk Council Annual Meeting 2012Fraud Engineering, from Merchant Risk Council Annual Meeting 2012
Fraud Engineering, from Merchant Risk Council Annual Meeting 2012
 
ThreatMetrix ARRC 2016 presentation by Ted Egan
ThreatMetrix ARRC 2016 presentation by Ted EganThreatMetrix ARRC 2016 presentation by Ted Egan
ThreatMetrix ARRC 2016 presentation by Ted Egan
 
Gartner: Top 10 Technology Trends 2015
Gartner: Top 10 Technology Trends 2015Gartner: Top 10 Technology Trends 2015
Gartner: Top 10 Technology Trends 2015
 
shyampresentaaaaaaaaaaaaaaaaaaaaaaa.pptx
shyampresentaaaaaaaaaaaaaaaaaaaaaaa.pptxshyampresentaaaaaaaaaaaaaaaaaaaaaaa.pptx
shyampresentaaaaaaaaaaaaaaaaaaaaaaa.pptx
 
Ai and machine learning help detect, predict and prevent fraud - IBM Watson ...
Ai and machine learning help detect, predict and prevent fraud -  IBM Watson ...Ai and machine learning help detect, predict and prevent fraud -  IBM Watson ...
Ai and machine learning help detect, predict and prevent fraud - IBM Watson ...
 
Credit Card Fraud Detection Using ML In Databricks
Credit Card Fraud Detection Using ML In DatabricksCredit Card Fraud Detection Using ML In Databricks
Credit Card Fraud Detection Using ML In Databricks
 
GraphTalks Italy - Using graphs to fight financial fraud
GraphTalks Italy - Using graphs to fight financial fraudGraphTalks Italy - Using graphs to fight financial fraud
GraphTalks Italy - Using graphs to fight financial fraud
 
Falcon 012009
Falcon 012009Falcon 012009
Falcon 012009
 
GraphTalks Frankfurt - Leveraging Graph-Technology to fight financial fraud
GraphTalks Frankfurt - Leveraging Graph-Technology to fight financial fraudGraphTalks Frankfurt - Leveraging Graph-Technology to fight financial fraud
GraphTalks Frankfurt - Leveraging Graph-Technology to fight financial fraud
 
A Practical Guide to Post-EMV Card Not Present Fraud
A Practical Guide to Post-EMV Card Not Present FraudA Practical Guide to Post-EMV Card Not Present Fraud
A Practical Guide to Post-EMV Card Not Present Fraud
 
Data Loss Threats and Mitigations
Data Loss Threats and MitigationsData Loss Threats and Mitigations
Data Loss Threats and Mitigations
 
DevSecCon London 2018: How to fit threat modelling into agile development: sl...
DevSecCon London 2018: How to fit threat modelling into agile development: sl...DevSecCon London 2018: How to fit threat modelling into agile development: sl...
DevSecCon London 2018: How to fit threat modelling into agile development: sl...
 
A Novel Framework for Credit Card.
A Novel Framework for Credit Card.A Novel Framework for Credit Card.
A Novel Framework for Credit Card.
 
Detecting eCommerce Fraud with Neo4j and Linkurious
Detecting eCommerce Fraud with Neo4j and LinkuriousDetecting eCommerce Fraud with Neo4j and Linkurious
Detecting eCommerce Fraud with Neo4j and Linkurious
 

Recently uploaded

Recently uploaded (20)

"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya HalderCustom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
 
UiPath Test Automation using UiPath Test Suite series, part 1
UiPath Test Automation using UiPath Test Suite series, part 1UiPath Test Automation using UiPath Test Suite series, part 1
UiPath Test Automation using UiPath Test Suite series, part 1
 
IoT Analytics Company Presentation May 2024
IoT Analytics Company Presentation May 2024IoT Analytics Company Presentation May 2024
IoT Analytics Company Presentation May 2024
 
Free and Effective: Making Flows Publicly Accessible, Yumi Ibrahimzade
Free and Effective: Making Flows Publicly Accessible, Yumi IbrahimzadeFree and Effective: Making Flows Publicly Accessible, Yumi Ibrahimzade
Free and Effective: Making Flows Publicly Accessible, Yumi Ibrahimzade
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
 
Demystifying gRPC in .Net by John Staveley
Demystifying gRPC in .Net by John StaveleyDemystifying gRPC in .Net by John Staveley
Demystifying gRPC in .Net by John Staveley
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
AI revolution and Salesforce, Jiří Karpíšek
AI revolution and Salesforce, Jiří KarpíšekAI revolution and Salesforce, Jiří Karpíšek
AI revolution and Salesforce, Jiří Karpíšek
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptxUnpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
10 Differences between Sales Cloud and CPQ, Blanka Doktorová
10 Differences between Sales Cloud and CPQ, Blanka Doktorová10 Differences between Sales Cloud and CPQ, Blanka Doktorová
10 Differences between Sales Cloud and CPQ, Blanka Doktorová
 
Introduction to Open Source RAG and RAG Evaluation
Introduction to Open Source RAG and RAG EvaluationIntroduction to Open Source RAG and RAG Evaluation
Introduction to Open Source RAG and RAG Evaluation
 
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptxIOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
 

Building Intelligent Data Products