Analyzing Fraud with
Graph Databases
What Does Fraud Look Like?
• Organized in groups
• Synthetic identities
• Stolen identities
• Hijacked devices
Types of Fraud
• Insurance Fraud
• eCommerce Fraud
• Credit Card Fraud
• Rogue Merchants
• Fraud we don’t know about yet…
Traditional Fraud Detection Methods
1. Endpoint-Centric: analysis of users and their endpoints
2. Navigation-Centric: analysis of navigation behavior and suspect patterns
3. Account-Centric: analysis of anomalous behavior by channel
Examples: PCs, mobile phones, IP addresses, user IDs, comparing transactions, identity vetting
Methods 1 to 3 above are all forms of DISCRETE ANALYSIS.
Fraud Detection with Discrete Analysis
[Chart: axes Revolving Debt and Number of Accounts; a region labeled Normal behavior, with two areas labeled INVESTIGATE]
Weaknesses
Difficult or impossible to detect:
• Synthetic identities
• Stolen identities
• Fraud rings
• Nth-degree links
• And more…
Modern Fraud Detection
DISCRETE ANALYSIS
1. Endpoint-Centric: analysis of users and their endpoints
2. Navigation-Centric: analysis of navigation behavior and suspect patterns
3. Account-Centric: analysis of anomalous behavior by channel
CONNECTED ANALYSIS
4. Cross-Channel: analysis of anomalous behavior correlated across channels
5. Entity Linking: analysis of relationships to detect organized crime and collusion
Fraud Detection with Connected Analysis
[Chart: axes Revolving Debt and Number of Accounts; regions labeled Normal behavior and Fraudulent pattern]
[Graph: ACCOUNT HOLDER 1, ACCOUNT HOLDER 2 and ACCOUNT HOLDER 3, together with CREDIT CARD, BANK ACCOUNT (×3), UNSECURED LOAN (×2), ADDRESS, PHONE NUMBER (×2) and SSN 2 (×2) nodes]
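Connected analysis of account holders can be sketched in plain Python without a graph database. The sketch below is an illustration only, not the approach Neo4j takes internally: the sample holders, the identifier strings and the `find_rings` helper are all hypothetical. It groups holders that are linked, directly or through intermediaries, by a shared address, phone number or SSN.

```python
from collections import defaultdict

# Hypothetical sample data: each account holder and the identifiers on file.
HOLDERS = {
    "holder_1": {"address:1 Main St", "phone:555-0100"},
    "holder_2": {"phone:555-0100", "ssn:2"},
    "holder_3": {"ssn:2", "address:9 Elm Rd"},
    "holder_4": {"address:77 Oak Ave", "phone:555-0199"},
}


def find_rings(holders, min_size=2):
    """Group holders linked, directly or transitively, by shared identifiers."""
    parent = {h: h for h in holders}

    def find(h):
        # Follow parent pointers to the group representative, halving paths.
        while parent[h] != h:
            parent[h] = parent[parent[h]]
            h = parent[h]
        return h

    # Index identifiers by the holders that use them.
    by_identifier = defaultdict(list)
    for holder, identifiers in holders.items():
        for identifier in identifiers:
            by_identifier[identifier].append(holder)

    # Holders sharing an identifier are merged into the same group.
    for members in by_identifier.values():
        for other in members[1:]:
            parent[find(other)] = find(members[0])

    groups = defaultdict(set)
    for holder in holders:
        groups[find(holder)].add(holder)
    return [sorted(g) for g in groups.values() if len(g) >= min_size]


rings = find_rings(HOLDERS)
```

In this sample, `holder_1` and `holder_3` share no identifier directly and are only connected through `holder_2`, which is the kind of second-degree link that a per-account check would miss.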
“Don’t consider traditional technology adequate to keep up with criminal trends”
Market Guide for Online Fraud Detection, April 27, 2015
Architecture
with Neo4j
Relational/tabular database
Data sources: Money Transferring, Purchases, Bank Services
Data scientists develop batch jobs
+ Good for discrete analysis
– No holistic view of data relationships
– Slow query speed for connections
Data Lake
Data sources: Insurance Claims, Relational database, Merchant Data, Credit Score Data, Other 3rd Party Data, Money Transferring, Purchases, Bank Services
Data scientists develop patterns
+ Good for MapReduce
+ Good for analytical workloads
– No holistic view
– Non-operational workloads
– Weeks-to-months processes
Neo4j Cluster
Neo4j powers a 360° view of transactions in real time
Data sources: Merchant Data, Credit Score Data, Other 3rd Party Data, Money Transferring, Purchases, Bank Services, Insurance Claims
SYNC RELEVANT DATA: Relational database, Data Lake
SENSE: transaction stream
RESPOND: alerts & notification
Visualization UI
Data scientists: develop patterns, fine-tune patterns
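The SENSE and RESPOND steps can be illustrated with a small in-memory loop. This is a rough sketch under stated assumptions, not the actual Neo4j architecture: the `FraudMonitor` class, the `respond` function, the transaction fields and the threshold are all hypothetical stand-ins for a graph cluster and an alerting channel.

```python
from collections import defaultdict


class FraudMonitor:
    """Keeps a small in-memory map of identifier -> accounts and raises alerts."""

    def __init__(self, max_accounts_per_identifier=2):
        self.max_accounts = max_accounts_per_identifier
        self.accounts_by_identifier = defaultdict(set)

    def sense(self, transaction):
        """Ingest one transaction and return any alerts it triggers."""
        alerts = []
        account = transaction["account"]
        for identifier in transaction["identifiers"]:
            linked = self.accounts_by_identifier[identifier]
            linked.add(account)
            # Too many accounts behind one identifier is treated as suspicious.
            if len(linked) > self.max_accounts:
                alerts.append((identifier, sorted(linked)))
        return alerts


def respond(alerts):
    """Stand-in for the alerting channel: format alerts as messages."""
    return [f"ALERT {ident} shared by {', '.join(accts)}" for ident, accts in alerts]


# Hypothetical transaction stream.
STREAM = [
    {"account": "A1", "identifiers": ["device:d1", "ip:10.0.0.1"]},
    {"account": "A2", "identifiers": ["device:d1"]},
    {"account": "A3", "identifiers": ["device:d1", "ip:10.0.0.9"]},
]

monitor = FraudMonitor()
messages = []
for tx in STREAM:
    messages.extend(respond(monitor.sense(tx)))
```

Each transaction is checked as it arrives, so the alert fires on the third transaction instead of waiting for a batch job.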
New perspective
into business
Augments classic tools:
• Rule-based scoring
• Predictive analytics
• Case management
The Impact of Fraud
Banking: $16B payment card fraud in 2014*
Payment card fraud alone accounts for over 16 billion dollars in losses for the banking sector in the US.
E-commerce: $32B yearly e-commerce fraud**
Fraud in e-commerce is estimated to cost over 32 billion dollars annually in the US.
Insurance: $80B estimated yearly impact***
The impact of fraud on the insurance industry is estimated to be $80 billion annually in the US.
*) Business Wire: http://www.businesswire.com/news/home/20150804007054/en/Global-Card-Fraud-Losses-Reach-16.31-Billion#.VcJZlvlVhBc
**) E-commerce expert Andreas Thim, Klarna, 2015
***) Coalition against insurance fraud: http://www.insurancefraud.org/article.htm?RecID=3274#.UnWuZ5E7ROA
Paper Collisions
Insurance scammers invent automobile accidents complete with fake drivers, passengers and witnesses.
Insurance Fraud Example
[Diagram: Accidents, Cars and People (drivers, passengers, witnesses), plus Doctor and Attorney; relationships: Drives, Is Passenger]
Insurance Fraud Graph
View of fraud ring in a graph database
[Graph: nodes Accident 1 to 2, Person 1 to 6, Car 1 to 4; relationship types INVOLVES, DRIVES, REPRESENTS, WITNESSES, ADJUSTS, HEALS]
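A paper-collision ring tends to reuse the same people across staged accidents, often in different roles. The following sketch shows one way to surface that from claim records. The triples, the `IS_PASSENGER` relationship name and both helper functions are hypothetical, chosen only to mirror the relationship types shown in the graph.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical claims data as (person, relationship, accident) triples.
INVOLVEMENTS = [
    ("person_1", "DRIVES", "accident_1"),
    ("person_2", "IS_PASSENGER", "accident_1"),
    ("person_3", "WITNESSES", "accident_1"),
    ("person_2", "DRIVES", "accident_2"),
    ("person_3", "IS_PASSENGER", "accident_2"),
    ("person_1", "WITNESSES", "accident_2"),
    ("person_7", "DRIVES", "accident_3"),
]


def shared_participants(involvements, min_shared=2):
    """Return accident pairs whose participants overlap by at least min_shared people."""
    people_by_accident = defaultdict(set)
    for person, _role, accident in involvements:
        people_by_accident[accident].add(person)

    suspicious = {}
    for a, b in combinations(sorted(people_by_accident), 2):
        common = people_by_accident[a] & people_by_accident[b]
        if len(common) >= min_shared:
            suspicious[(a, b)] = sorted(common)
    return suspicious


def role_switchers(involvements):
    """Return people who appear in more than one role across accidents."""
    roles = defaultdict(set)
    for person, role, _accident in involvements:
        roles[person].add(role)
    return {p: sorted(r) for p, r in roles.items() if len(r) > 1}


overlaps = shared_participants(INVOLVEMENTS)
switchers = role_switchers(INVOLVEMENTS)
```

Neither signal proves fraud on its own. Both are starting points for an investigator, and the thresholds would need tuning on real claim data.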
Dashboard Example
Patterns
KDD 2015
Subgraphs
Subset of nodes and relationships in the data
Ego networks
The subgraph made up of all neighbors of a node (the ego, E) and the relationships among them
Generalizable to Nth-degree neighbors
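The ego network definition translates directly into a breadth-first search limited to a fixed radius. The sketch below assumes an undirected graph stored as a plain adjacency mapping. The sample graph and the `ego_network` function are hypothetical, and the ego node itself is included in the result, which is a common convention but not something the slide states.

```python
from collections import deque

# Hypothetical undirected graph as an adjacency mapping.
GRAPH = {
    "E": {"a", "b", "c"},
    "a": {"E", "b"},
    "b": {"E", "a", "d"},
    "c": {"E"},
    "d": {"b", "e"},
    "e": {"d"},
}


def ego_network(graph, ego, degree=1):
    """Return (nodes, edges) of the subgraph within `degree` hops of `ego`."""
    distance = {ego: 0}
    queue = deque([ego])
    while queue:
        node = queue.popleft()
        if distance[node] == degree:
            continue  # do not expand past the requested radius
        for neighbor in graph[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)

    nodes = set(distance)
    # Keep every relationship whose two endpoints are both inside the ego network.
    edges = {
        tuple(sorted((u, v)))
        for u in nodes
        for v in graph[u]
        if v in nodes
    }
    return nodes, edges


first_degree = ego_network(GRAPH, "E")
second_degree = ego_network(GRAPH, "E", degree=2)
```

Raising `degree` widens the same data to a larger neighborhood, which is the Nth-degree generalization mentioned above.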
Same Data, Different Perspective
Mapping Ego Networks
[Plots: fitted lines labeled slope=2, slope=1 and slope=1.35]
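The slide text does not say what the slope values measure. One plausible reading, stated here purely as an assumption, is that each ego network is plotted as edge count against node count on log-log axes. Under that reading a star-shaped ego network gives a slope near 1 and a fully connected one gives a slope near 2, so a fitted value such as 1.35 would sit between the two. The helper names below are hypothetical.

```python
import math


def fit_slope(points):
    """Least-squares slope of log(edges) against log(nodes)."""
    xs = [math.log(n) for n, _ in points]
    ys = [math.log(e) for _, e in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def star(n):
    """Ego network where the n neighbors are linked only to the ego."""
    return n + 1, n  # (nodes, edges)


def clique(n):
    """Ego network where the ego and its n neighbors are all linked to each other."""
    nodes = n + 1
    return nodes, nodes * (nodes - 1) // 2


sizes = [10, 20, 50, 100, 200, 500]
star_slope = fit_slope([star(n) for n in sizes])
clique_slope = fit_slope([clique(n) for n in sizes])
```

If this interpretation holds, ego networks that fall far from the fitted line are the ones worth a closer look.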
Personal Networks are Stars
[Figure labels: 128.240.229.18, fred@rbs.co.uk, 1234LOL]
Overlapping Stars
[Figure labels: 128.240.229.18, fred@rbs.co.uk, 1234LOL, nick@bearings.com, Ca$hMon£y]
Hmm…
Sample Query
MATCH (u1:User {name: "Rik"})--(x)--(u2:User)
WHERE u1 <> u2 AND NOT x:IP  // a network in common is OK
RETURN x
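For readers without a Neo4j instance at hand, the same check can be approximated in plain Python. This is a loose in-memory analogue of the query, not a Cypher implementation: the node labels (including treating `1234LOL` as a password), the relationships and the `shared_non_ip` function are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical property graph: node -> label, plus undirected relationships.
LABELS = {
    "Rik": "User",
    "Fred": "User",
    "Nick": "User",
    "128.240.229.18": "IP",
    "fred@rbs.co.uk": "Email",
    "1234LOL": "Password",
}
RELATIONSHIPS = [
    ("Rik", "128.240.229.18"),
    ("Rik", "1234LOL"),
    ("Fred", "128.240.229.18"),
    ("Fred", "fred@rbs.co.uk"),
    ("Nick", "1234LOL"),
]


def shared_non_ip(labels, relationships, user):
    """Find non-IP nodes that link `user` to at least one other user."""
    neighbors = defaultdict(set)
    for a, b in relationships:
        neighbors[a].add(b)
        neighbors[b].add(a)

    shared = {}
    for x in neighbors[user]:
        if labels[x] == "IP":
            continue  # a network in common is OK
        others = {
            u for u in neighbors[x]
            if u != user and labels[u] == "User"
        }
        if others:
            shared[x] = sorted(others)
    return shared


suspicious = shared_non_ip(LABELS, RELATIONSHIPS, "Rik")
```

Here Rik and Fred share only an IP address, which is ignored, while Rik and Nick share a credential, which is reported.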
Remember, It Scales
Why Neo4j? Who’s using it?
Financial institutions use Neo4j to:
• Detect & prevent fraud in real time
• Speed up credit risk analysis and transactions
• Reduce chargebacks
• Quickly adapt to new methods of fraud
Finance, Government, Online Retail
Don’t be a lonely node.
Connect with us :-)

GraphTalks Copenhagen - Analyzing Fraud with Graph Databases