Anti-Fraud and eDiscovery using
Graph Databases and Graph
Visualization
Corey Lanum
We are hiring!
Corey Lanum
• 10 years with i2 (now IBM), developing visualization and
analytical solutions for large government and enterprise
customers
– Major insurance companies
• Auto
• Health
– Government Agencies
• RCMP
• FBI
• California Department of Justice
Fraud
Fraud consists of misrepresentation for
personal financial gain
– Personal Misrepresentation
– Pretending to be someone
else to collect money
intended for others
– Transactional Misrepresentation
– Fabricating details of a
transaction to avoid scrutiny
– Fabrication or exaggeration
of insurance claims
Fraud Detection
• Why Graph Databases?
– Almost all fraud cases involve the fabrication of a relationship, so
it makes sense to model your data to highlight relationships
• Why Visualization?
– Visualization of these relationships helps investigators and
analysts determine which patterns are normal and which are
abnormal, and flag the abnormal patterns for further scrutiny
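As a minimal sketch of modeling data to highlight relationships (standard-library Python, not Neo4j or KeyLines code): treat each record as a link between a person and an identifier they supplied, then flag identifiers shared by unusually many people. The sample records, the `flag_shared_identifiers` helper, and the threshold of 3 are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical sample records: (person, identifier type, identifier value).
RECORDS = [
    ("alice", "phone", "555-0100"),
    ("bob", "phone", "555-0100"),
    ("carol", "phone", "555-0100"),
    ("dave", "phone", "555-0199"),
    ("alice", "address", "1 Main St"),
    ("dave", "address", "9 Oak Ave"),
]


def flag_shared_identifiers(records, threshold=3):
    """Return identifiers linked to at least `threshold` distinct people."""
    people_by_identifier = defaultdict(set)
    for person, kind, value in records:
        # Each record is an edge between a person node and an identifier node.
        people_by_identifier[(kind, value)].add(person)
    return {
        identifier: sorted(people)
        for identifier, people in people_by_identifier.items()
        if len(people) >= threshold
    }


flagged = flag_shared_identifiers(RECORDS)
print(flagged)
```

What counts as "abnormal" depends on the data set; in practice an analyst would likely tune the threshold after looking at the visualized network.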
Fraud Investigation
• Once we have uncovered a fraudulent
transaction, how do we determine who is
responsible, and prove
misrepresentation?
– Who had access?
– Who benefited?
– Did they work alone?
• 270 public and private sector organizations in
the UK are members of CIFAS
• CIFAS maintains two large databases, one of all
reported fraud instances and one for reported
staff fraud
• CIFAS has contracted to use KeyLines to
visualize connections between fraud instances
Neo4j and KeyLines
KeyLines
Visualise and analyse networks in the browser
• Communication networks
• Social networks
• Fraud networks
Features
• Pure HTML5
• Works on IE6, 7, 8 via Flash
• Graph layouts
• Graph analytics
– SNA measures, path finding & more
• Full event model
• Full workflow support
– Image generation for reports, undo stack, etc
• Very quick integration time
• Thorough documentation
• Good performance
• Great support
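To make "SNA measures" and "path finding" concrete, here is a generic sketch in standard-library Python. It does not use or imitate the KeyLines API; the `NETWORK` data and both function names are hypothetical. It computes degree centrality (one common social network analysis measure) and a shortest path by breadth-first search.

```python
from collections import deque

# Hypothetical undirected network as an adjacency mapping.
NETWORK = {
    "a": {"b", "c"},
    "b": {"a", "d"},
    "c": {"a", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}


def degree_centrality(network):
    """Fraction of the other nodes that each node is directly linked to."""
    others = len(network) - 1
    return {node: len(neighbours) / others for node, neighbours in network.items()}


def shortest_path(network, start, goal):
    """Breadth-first search; returns one shortest path, or None if unreachable."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        # Sorting makes the choice among equally short paths deterministic.
        for neighbour in sorted(network[node]):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


print(degree_centrality(NETWORK)["d"])
print(shortest_path(NETWORK, "a", "e"))
```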
KeyLines / Neo Architecture
Credit Card Fraud Scenario
• Employees of a retail merchant swipe
customers’ cards and steal data before
processing transaction
• Cardholders later notice fraudulent
charges on their bill
• How do we walk back to determine who is
responsible?
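One way to "walk back" is to look for the point of purchase that the defrauded cardholders have in common. The sketch below is standard-library Python under assumed data: the transaction log, the victim list, and the `common_points` helper are hypothetical, and a real investigation would also restrict the log to the period before the fraudulent charges.

```python
from collections import Counter

# Hypothetical transaction log: (cardholder, merchant, employee who processed it).
TRANSACTIONS = [
    ("card_1", "grocer", "emp_a"),
    ("card_1", "cafe", "emp_x"),
    ("card_2", "cafe", "emp_x"),
    ("card_2", "garage", "emp_b"),
    ("card_3", "cafe", "emp_x"),
    ("card_3", "cafe", "emp_y"),
    ("card_4", "grocer", "emp_a"),
]
# Cardholders who reported fraudulent charges.
VICTIMS = {"card_1", "card_2", "card_3"}


def common_points(transactions, victims):
    """Count, per (merchant, employee), how many distinct victims they served."""
    seen = set()
    counts = Counter()
    for card, merchant, employee in transactions:
        key = (card, merchant, employee)
        # Count each victim at most once per (merchant, employee) pair.
        if card in victims and key not in seen:
            seen.add(key)
            counts[(merchant, employee)] += 1
    return counts.most_common()


ranking = common_points(TRANSACTIONS, VICTIMS)
print(ranking[0])
```

The top-ranked pair is only a lead for investigators, not proof: a busy till will serve many cardholders, victims or not.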
Insurance Fraud
• A claim on an insurance policy that one is
not entitled to make
– Staged auto accidents
– Doctors billing for services they never
performed
– Claiming pre-existing damage was caused by
a covered event
• Misrepresentation on the policy application
to pay lower premiums
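Staged accidents often involve the same people reappearing across claims, so one plausible check is to group claims that are linked, directly or indirectly, by shared participants. This is a sketch in standard-library Python; the claim data and the `claim_clusters` helper are hypothetical and not taken from the talk.

```python
from collections import defaultdict

# Hypothetical claims: claim id -> people involved (drivers, passengers, witnesses).
CLAIMS = {
    "claim_1": {"ann", "ben"},
    "claim_2": {"ben", "cat"},
    "claim_3": {"cat", "dan"},
    "claim_4": {"eve", "fay"},
}


def claim_clusters(claims):
    """Group claims that are linked, directly or indirectly, by shared people."""
    claims_by_person = defaultdict(set)
    for claim, people in claims.items():
        for person in people:
            claims_by_person[person].add(claim)

    clusters, visited = [], set()
    for start in claims:
        if start in visited:
            continue
        visited.add(start)
        cluster, stack = set(), [start]
        while stack:
            claim = stack.pop()
            cluster.add(claim)
            for person in claims[claim]:
                # Follow each participant to every other claim they appear in.
                for linked in claims_by_person[person] - visited:
                    visited.add(linked)
                    stack.append(linked)
        clusters.append(sorted(cluster))
    return clusters


clusters = claim_clusters(CLAIMS)
print(clusters)
```

A large cluster is not fraudulent in itself, but it is the kind of abnormal pattern worth drawing and handing to an investigator.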
eDiscovery
• Similar to Fraud detection
• Large volumes of transactional data – need to
understand patterns in the data
• Can’t afford to pay lawyers to read every document
• eDiscovery tools help to identify which documents or
communications may be relevant by using a number of
algorithms
• Neo4j and Graph Visualization can help!
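As one illustration of how a graph can narrow down the documents to review, the sketch below keeps only messages whose sender and recipient are both within a few hops of a person of interest in the communication network. It is standard-library Python under assumed data; the message log, the `relevant_messages` helper, and the two-hop default are hypothetical, and real eDiscovery tools combine many other signals.

```python
from collections import defaultdict, deque

# Hypothetical message log: (message id, sender, recipient).
MESSAGES = [
    ("m1", "ceo", "cfo"),
    ("m2", "cfo", "broker"),
    ("m3", "broker", "trader"),
    ("m4", "intern", "hr"),
]


def relevant_messages(messages, person_of_interest, max_hops=2):
    """Return ids of messages whose parties are all within `max_hops` of the person."""
    contacts = defaultdict(set)
    for _, sender, recipient in messages:
        contacts[sender].add(recipient)
        contacts[recipient].add(sender)

    # Breadth-first search outward from the person of interest.
    hops = {person_of_interest: 0}
    queue = deque([person_of_interest])
    while queue:
        person = queue.popleft()
        if hops[person] == max_hops:
            continue  # Do not expand beyond the hop limit.
        for contact in contacts[person]:
            if contact not in hops:
                hops[contact] = hops[person] + 1
                queue.append(contact)

    return [
        message_id
        for message_id, sender, recipient in messages
        if sender in hops and recipient in hops
    ]


print(relevant_messages(MESSAGES, "ceo"))
```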
Costs of Fraud
• Industry estimates put the cost at $2.5 trillion per
year
• By making it easier to both detect and
investigate fraud, we reduce the incentives
to conduct fraud in the first place
• Neo4j and KeyLines are perfect
technologies to assist in this endeavour
Thanks!
corey@cambridge-intelligence.com
All logos, trademarks, service marks and copyrights used in this
presentation belong to their respective owners
Roadmap
• Larger and larger
networks
– Filtering
– Combining nodes
together
– Improved analytics for
node importance
– Faster rendering (long
term)
• Dynamic networks
– Filtering
– Timeline, time slider
• Location information
– Map underlays
– Geographic node layout
• Real time networks
– Visual activity indicators
• Information synthesis
– Shapes, boxes,
attributes for annotation
– Snap to grid
– Elbows on links

Anti-Fraud and eDiscovery using Graph Databases and Graph Visualization - Corey Lanum @ GraphConnect Boston 2013


Editor's Notes

  • #11 Bring up neo demo
  • #17 This slide is to explain the main drivers for the features we are planning. The drivers are: large networks, dynamic networks, location info, real-time dashboards, and a need for users to draw stuff ‘on top’ of the networks