Graphs for Fraud Detection
rik@neotechnology.com
@rvanbruggen
Agenda
• About Graphs
• About Graph Databases
• Why Graph Databases matter for Fraud Detection
  – Short demonstration
• Case Studies
• Q&A
Introduction: about Graphs
Meet Leonhard Euler
• Swiss mathematician
• Inventor of Graph Theory (1736)
Königsberg (Prussia) - 1736
[Figure: the bridges of Königsberg drawn as a graph, with land masses A, B, C, D as nodes and bridges 1 to 7 as relationships]
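Euler's argument about the seven bridges can be sketched in a few lines. This is a minimal illustration, not a reproduction of the slide: which letter is the island and which bridge joins which pair of land masses are assumptions taken from the classic puzzle (the slide's own labelling did not survive), and the function names are made up for this example. The check also assumes the graph is connected.

```python
from collections import Counter

# Seven bridges between four land masses. ASSUMPTION: "A" is the island
# touched by five bridges; the slide's real labelling is not known.
BRIDGES = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"), ("B", "D"), ("C", "D"),
]


def odd_degree_nodes(edges):
    """Return the nodes touched by an odd number of edges, sorted by name."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(node for node, d in degree.items() if d % 2 == 1)


def has_euler_walk(edges):
    """True if a connected multigraph can be walked using every edge exactly once.

    Euler's condition: that is possible only when zero or two nodes have odd degree.
    """
    return len(odd_degree_nodes(edges)) in (0, 2)


print(odd_degree_nodes(BRIDGES))
print(has_euler_walk(BRIDGES))
```

All four land masses end up with an odd number of bridges, so no walk crosses every bridge exactly once.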
About Graph Databases
So what is a graph database?
• OLTP database
  – “end-user” transactions
• Model, store, manage data as a graph
What is a graph?
• Node
• Relationship
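To make the two building blocks concrete, here is a minimal in-memory sketch of nodes and relationships. It is an illustration only and not how Neo4j stores data; the class names, the `relate` and `neighbours` helpers, and the sample people, cards and merchants are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    id: str
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: str  # id of the node the relationship leaves
    end: str    # id of the node the relationship points to
    type: str
    properties: dict = field(default_factory=dict)


class Graph:
    def __init__(self):
        self.nodes = {}          # node id -> Node
        self.relationships = []  # every Relationship, in insertion order

    def add_node(self, node_id, label, **properties):
        self.nodes[node_id] = Node(node_id, label, properties)
        return self.nodes[node_id]

    def relate(self, start, rel_type, end, **properties):
        # Both endpoints must already exist: a relationship never dangles.
        if start not in self.nodes or end not in self.nodes:
            raise KeyError("both endpoints must be added before relating them")
        rel = Relationship(start, end, rel_type, properties)
        self.relationships.append(rel)
        return rel

    def neighbours(self, node_id, rel_type=None):
        """Ids reached by outgoing relationships, optionally filtered by type."""
        return [r.end for r in self.relationships
                if r.start == node_id and rel_type in (None, r.type)]


# Hypothetical sample data.
graph = Graph()
graph.add_node("alice", "Person", name="Alice")
graph.add_node("card-1", "CreditCard")
graph.add_node("shop-9", "Merchant")
graph.relate("alice", "HOLDS", "card-1")
graph.relate("card-1", "USED_AT", "shop-9", amount=42.0)

print(graph.neighbours("alice"))
```

Both nodes and relationships carry their own properties, which is what lets the whiteboard drawing and the stored data look the same.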
Contrast with Relational
Graphs are often referred to as “Whiteboard Friendly”. The data model reflects the way a domain expert would naturally draw their data on a whiteboard.
“The schema is the data”. Schema flexibility allows the system to change in response to a changing environment.
What are graphs good for?
Complex Querying
Examples of complex queries?
1. Semi-structure in datasets
– Normalization introduces complexity
– Forces developers to develop all kinds of logic to deal with this variability in their application logic
Examples of complex queries:
2. Connectedness in data
Lots of normalized relationships between the different entities force developers to do
• Deep joins
• Recursive joins
• Pathfinding operations
• “open-ended” queries
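An “open-ended” query of this kind, where the number of hops is not known in advance, can be sketched as a bounded breadth-first traversal. This is a hedged illustration over a plain adjacency dictionary, not a graph database API; the function name `within_hops` and the sample people, phone and address are hypothetical.

```python
from collections import deque


def within_hops(adjacency, start, max_hops):
    """Return {node: hop_count} for every node reachable from start in at most max_hops steps."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    return seen


# Hypothetical data: people linked through a shared phone number and address.
adjacency = {
    "alice": ["phone-1"],
    "phone-1": ["alice", "bob"],
    "bob": ["phone-1", "address-7"],
    "address-7": ["bob", "carol"],
    "carol": ["address-7"],
}

print(within_hops(adjacency, "alice", 2))
```

In a relational model each extra hop would typically be another join; here the depth is just a parameter.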
Examples of Connectedness
Graphs in Fraud Detection?
Recommender systems
Fraud Detection systems
Graphs in Fraud Detection Systems
• Real time aspect
• Detection Practices rely on Graph Algorithms
• Operational efficiency
Real time fraud detection?
• Context is everything
  – You don’t want to be blocking credit cards for no reason… false positives are fatal…
• Complexity matters
  – Need to outsmart the “bad guys”
  – Assume that they can and will understand/beat the system
• Visualization matters
  – Manual intervention relies on fast understanding of context, and visualization helps there
Fraud Detection practices: Graph Algorithms
● Helpful for navigating complex networks
  ● Tell me how A and B are related
  ● The things on the path between A and B could very well be interesting
  ● ShortestPath, AllShortestPaths, Weighted ShortestPath (Dijkstra, A*)
● Helpful for understanding the important parts of a network
  ● Clusters
  ● Bridges
  ● Centrality
  ● Betweenness Centrality
  ● (Page)Ranking
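The “how are A and B related” question can be sketched with two textbook routines: an unweighted shortest path (breadth-first search) and a weighted one (Dijkstra). This is a minimal sketch in plain Python, not the implementation used in Neo4j; the function names and the sample data are hypothetical, and the weighted version assumes non-negative costs.

```python
import heapq
from collections import deque


def _rebuild(parents, node):
    """Follow parent links back to the source and return the path in order."""
    path = []
    while node is not None:
        path.append(node)
        node = parents[node]
    return path[::-1]


def shortest_path(adjacency, source, target):
    """Fewest-hops path from source to target, or None if unreachable."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return _rebuild(parents, node)
        for nxt in adjacency.get(node, ()):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None


def weighted_shortest_path(weighted, source, target):
    """Dijkstra: return (cost, path) or None. weighted maps node -> {neighbour: cost >= 0}."""
    best = {source: 0}
    parents = {source: None}
    heap = [(0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry, a cheaper route was already found
        if node == target:
            return cost, _rebuild(parents, node)
        for nxt, weight in weighted.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                parents[nxt] = node
                heapq.heappush(heap, (new_cost, nxt))
    return None


# Hypothetical data: two people connected through a shared phone and address.
links = {
    "A": ["phone-1"],
    "phone-1": ["A", "X"],
    "X": ["phone-1", "address-7"],
    "address-7": ["X", "B"],
    "B": ["address-7"],
}
costs = {"a": {"b": 1, "c": 5}, "b": {"c": 1}, "c": {}}

print(shortest_path(links, "A", "B"))
print(weighted_shortest_path(costs, "a", "c"))
```

The nodes in the middle of the returned path (the shared phone, the intermediary, the shared address) are the “things on the path” that tend to be interesting to an investigator.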
Operational Efficiency
• Graph data model removes the need for many “batch operations”
  – No need to precalculate, just feed it into the graph
• Complex pattern matching in milliseconds
• Graph Locality == Predictability and Speed, even over large datasets
Short demo - 1
● Circular patterns often indicate some kind of attempt to “trick the system”
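A circular pattern can be looked for with a bounded depth-first search that tries to return to its starting node. The sketch below is an assumption about what such a check might look like, not the query used in the demo; the function name `find_cycle`, the `max_length` limit and the account ids are hypothetical.

```python
def find_cycle(transfers, start, max_length=6):
    """Return one simple directed cycle through start with at most max_length edges, or None.

    transfers maps a node to the nodes it points to (for example, account -> payee accounts).
    """
    def walk(node, path):
        for nxt in transfers.get(node, ()):
            if nxt == start:
                return path + [start]  # closed the loop
            if nxt not in path and len(path) < max_length:
                found = walk(nxt, path + [nxt])
                if found:
                    return found
        return None

    return walk(start, [start])


# Hypothetical data: money moves acct1 -> acct2 -> acct3 and back to acct1.
transfers = {
    "acct1": ["acct2"],
    "acct2": ["acct3"],
    "acct3": ["acct1"],
    "acct4": ["acct1"],
}

print(find_cycle(transfers, "acct1"))
```

Bounding the cycle length keeps the search local to the neighbourhood of the node being checked, which is what makes this kind of pattern match feasible at transaction time.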
Short demo - 2
Use Cases (neo4j.com/use-cases)
Customers (neo4j.com/customers)
Graph Gists (http://gist.neo4j.org/)
Neo4j versions / licenses
Neo4j License Overview
• Personal
• Startup / Departmental
• Enterprise deployment models
• Open source
• Commercial license terms available
• Specific OEM models
• Developer Seats ($6K*/Developer/Year)
  – Includes access by programmers to licensed test instances, and private instances on the programmer’s personal machine for the sole purpose of writing, debugging, or testing software designed to access Neo4j
• Test Instances ($6K/Instance/Year)
  – Instances whose purpose is to ensure that the software accessing Neo4j is meeting specification (e.g. System Test, Integration Test, UAT, Performance Test, Staging)
• Production Instances (Bundle / Core Pricing)
  – Instances that store and process data in a way that benefits and advances an organization’s goals
  – May be accessed by applications and/or end users
*Or otherwise, depending on the Bundle, and negotiation
Future trainings and events!
Q&A, Conclusion, Next Steps
Neo Technology: www.neotechnology.com
Neo4j: www.neo4j.org
rik@neotechnology.com or +32 478 686800

201411203 goto night on graphs for fraud detection
