Imagine you have a Slack channel for your cool SaaS product where every new sign-up is posted. Finally one day, after endless growth marketing, you see a storm of new sign-ups! You are so happy, and it is great for a few weeks, until you start seeing sign-ups from fake and disposable email addresses. You start filtering out known disposable domains and add an email scoring system, but it's not perfect. You add all of these to feel safe, but your user growth is now taking a hit. You spend more on marketing but get fewer genuine new users.
It's usually a similar story for logins. A few accounts are taken over and you start enforcing 2FA on every login. Your activation funnel suffers first, and then the total number of daily active users drops.
The problem is that you're using a traditional approach to solve a modern, complex problem: account fraud at web scale. Fraud is, and has always been, a cat-and-mouse game. Fraudsters are intelligent and find new ways in when you block the old ones. This is where AI and ML are truly needed, because new patterns of fraud emerge every day.
In this talk, Amir will take you through CrossClassify's journey in developing XAI (explainable AI) solutions for account fraud prevention. He will cover how user authentication methods can be enhanced using behavioural biometrics to implement a continuous authentication system that monitors user behaviour throughout a session. He will also explore anomaly detection techniques and deep learning approaches that work better in the fraud domain, where heavily imbalanced datasets are the norm.
1. How AI is preventing account fraud at web scale
Amir Moghimi
Co-founder & CTO
crossclassify.com
2. Byron Bay data breach victim told to pay Adidas, National Basketball Association $1.2m by US courts
"The charges were cybersquatting, trademark infringement, IP infringement, things I don't know anything about."
ABC North Coast, 25 July 2023
3.
■ In a 2023 survey run by the AIC, 47% of respondents had experienced at least one cybercrime in the 12 months prior to the survey.
■ 20% of these cybercrimes were identity crime and misuse. *
■ No surprise, given the Optus, Latitude and Medibank data breaches.
* Australian Institute of Criminology
18. Unsupervised
Account sharing like Netflix, Spotify, gaming apps
[Scatter plot; axes: Dev Count, IP Count]
■ Accounts with normal behavior
■ One account with near-normal behavior
■ One account with more than 20 distinct devices and 10 different IPs (far from normal accounts)
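The idea on this slide, spotting an account whose device and IP counts sit far from the bulk of accounts without any fraud labels, can be sketched in a few lines. This is only an illustration, not the method used in the talk: it assumes a robust z-score based on the median and the median absolute deviation (MAD), and the account IDs, the counts, and the 3.5 threshold are all hypothetical.

```python
from statistics import median

def mad_scores(values):
    """Robust z-score for each value, based on the median and the MAD."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    # 1.4826 rescales the MAD to be comparable to a standard deviation
    # for normally distributed data; fall back to 1.0 if the MAD is zero.
    scale = 1.4826 * mad if mad > 0 else 1.0
    return [abs(v - med) / scale for v in values]

def flag_outlier_accounts(accounts, threshold=3.5):
    """accounts maps account_id -> (device_count, ip_count).

    An account is flagged when either count is far from the typical value.
    The threshold is an assumed default, not a value from the talk.
    """
    ids = list(accounts)
    dev_scores = mad_scores([accounts[a][0] for a in ids])
    ip_scores = mad_scores([accounts[a][1] for a in ids])
    return [a for a, d, i in zip(ids, dev_scores, ip_scores)
            if max(d, i) > threshold]

# Hypothetical data: mostly normal accounts, one near-normal account (a7),
# and one account resembling the slide's account-sharing case (a8).
accounts = {
    "a1": (1, 1), "a2": (2, 1), "a3": (1, 2), "a4": (2, 2),
    "a5": (3, 2), "a6": (2, 3), "a7": (4, 3),
    "a8": (22, 10),
}
flagged = flag_outlier_accounts(accounts)
print(flagged)
```

With this toy data only the account with 22 devices and 10 IPs crosses the threshold; the near-normal account stays unflagged. In practice the threshold would need tuning against real traffic.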
23. Explainability-Accuracy Trade-Off
[Figure: Learning Techniques (today), plotted by Prediction Accuracy against Explainability (notional)]
Techniques shown: Neural Nets, Deep Learning, Statistical Models, Ensemble Methods, Random Forests, Decision Trees, SVMs, AOGs, Bayesian Belief Nets, Markov Models, HBNs, MLNs, SRL, CRFs, Graphical Models
New Approach: Create a suite of machine learning techniques that produce more explainable models, while maintaining a high level of learning performance
35. Challenges
■ Imbalanced Dataset
■ Fraud Patterns Change/Evolve (Cat & Mouse)
■ Model Selection + Parameter Tuning
Make it balanced, Ensemble methods, Adaptive AI, Trial and Error, Proper Modeling
36. Trial and Error / Experience
■ At the beginning, interpretability & explainability are more important
■ As we get more data, more complicated methods with large numbers of parameters come into the picture
Model Selection
Parameter Tuning
Implement the Proper Modeling
40. Dataset sampling
✔ Non-fraud records: 32,269,123
✔ Fraud records: 181
✔ Too imbalanced for neural network classifiers
✔ Train and test split: 20% test size
✔ Undersampling for showcasing different model types:
• Undersampled the non-fraud records to 1,000
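The sampling steps listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the talk's actual pipeline: the records are synthetic (50,000 non-fraud rows standing in for the 32,269,123 on the slide), the helper names `undersample` and `stratified_split` are hypothetical, and the slide does not say whether the split happened before or after undersampling, so the order here is a guess.

```python
import random

def undersample(records, majority_label, n_keep, seed=0):
    """Keep n_keep random majority-class records and all other records."""
    rng = random.Random(seed)
    majority = [r for r in records if r[1] == majority_label]
    minority = [r for r in records if r[1] != majority_label]
    return rng.sample(majority, n_keep) + minority

def stratified_split(records, test_size=0.2, seed=0):
    """Split each class separately so both sets keep the class ratio."""
    rng = random.Random(seed)
    train, test = [], []
    for label in sorted({r[1] for r in records}):
        group = [r for r in records if r[1] == label]
        rng.shuffle(group)
        n_test = round(len(group) * test_size)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test

# Synthetic stand-in data: each record is (record_id, label), label 1 = fraud.
records = [(i, 0) for i in range(50_000)]
records += [(i, 1) for i in range(50_000, 50_181)]  # 181 fraud records

sampled = undersample(records, majority_label=0, n_keep=1_000)
train, test = stratified_split(sampled, test_size=0.2)

print(len(sampled), len(train), len(test))
```

Splitting per class matters at this size: with only 181 fraud records, a plain random split could leave the test set with very few fraud cases by chance.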