Investigating fraud often involves identifying suspicious patterns among mountains of uninteresting transactional data. A new partnership between Neo Technology and Cambridge Intelligence allows fraud investigators and data analysts to uncover these patterns far more easily. KeyLines is a web-based graph visualization engine tightly integrated with Neo4j's data model. By combining it with the Neo4j graph database, investigators and analysts can drill down visually from aggregate data to individual suspicious data elements quickly, without needing significant expertise in query languages. This presentation will summarize the Neo Technology and Cambridge Intelligence partnership, discuss the technical integration between the two products, and demonstrate a number of fraud detection scenarios across multiple domains and data types.
3. Corey Lanum
• 10 years with i2 (now IBM), developing visualization and
analytical solutions for large government and enterprise
customers
– Major insurance companies
• Auto
• Health
– Government Agencies
• RCMP
• FBI
• California Department of Justice
4. Fraud
Fraud consists of misrepresentation for
personal financial gain
– Personal Misrepresentation
• Pretending to be someone
else to collect money
intended for others
– Transactional Misrepresentation
• Fabricating details of a
transaction to avoid scrutiny
• Fabrication or exaggeration
of insurance claims
5. Fraud Detection
• Why Graph Databases?
– Almost all fraud cases involve the fabrication of a relationship, so
it makes sense to model your data to highlight relationships
• Why Visualization?
– Visualization of these relationships helps investigators and
analysts determine which patterns are normal and which are
abnormal, and flag the abnormal patterns for further scrutiny
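To make the point about modelling data around relationships concrete, here is a minimal sketch in plain Python. It is not Neo4j or KeyLines code; the account records, attribute names, and helper functions are all hypothetical and chosen only for illustration. Accounts and their identifying attributes become nodes, and any attribute value connected to more than one account is surfaced as a link worth a closer look.

```python
from collections import defaultdict

# Hypothetical sample records: each account lists identifying attributes.
accounts = {
    "acct_1": {"phone": "555-0101", "address": "12 Elm St"},
    "acct_2": {"phone": "555-0199", "address": "12 Elm St"},
    "acct_3": {"phone": "555-0101", "address": "9 Oak Ave"},
    "acct_4": {"phone": "555-0142", "address": "3 Pine Rd"},
}


def build_graph(accounts):
    """Build an undirected graph linking account nodes to attribute nodes."""
    graph = defaultdict(set)
    for account, attrs in accounts.items():
        for kind, value in attrs.items():
            node = (kind, value)  # attribute nodes are (kind, value) tuples
            graph[account].add(node)
            graph[node].add(account)
    return graph


def shared_attributes(graph):
    """Return attribute nodes that are connected to more than one account."""
    return {
        node: sorted(neighbours)
        for node, neighbours in graph.items()
        if isinstance(node, tuple) and len(neighbours) > 1
    }


graph = build_graph(accounts)
shared = shared_attributes(graph)
for node, linked in shared.items():
    print(node, "is shared by", linked)
```

In a tabular model the same question needs a self-join per attribute type. In a graph model it is a single check on how many neighbours each attribute node has, which is also the kind of structure that shows up immediately in a visual layout.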
6. Fraud Investigation
• Once we have uncovered a fraudulent
transaction, how do we determine who is
responsible, and prove
misrepresentation?
– Who had access?
– Who benefited?
– Did they work alone?
7. • 270 public and private sector organizations in
the UK are members of CIFAS
• CIFAS maintains two large databases, one of all
reported fraud instances and one for reported
staff fraud
• CIFAS has contracted to use KeyLines to
visualize connections between fraud instances
9. KeyLines
Visualise and analyse networks in the browser
• Communication networks
• Social networks
• Fraud networks
Features
• Pure HTML5
• Works on IE6, 7, 8 via Flash
• Graph layouts
• Graph analytics
– SNA measures, path finding & more
• Full event model
• Full workflow support
– Image generation for reports, undo stack, etc
• Very quick integration time
• Thorough documentation
• Good performance
• Great support
11. Credit Card Fraud Scenario
• Employees of a retail merchant swipe
customers’ cards and steal data before
processing transaction
• Cardholders later notice fraudulent
charges on their bill
• How do we walk back to determine who is
responsible?
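One way to picture the walk back is as a search for a common point of purchase: start from the cardholders who reported fraud, follow their earlier transactions, and see which merchant and employee they have in common. The sketch below is plain Python over made-up data, not the Neo4j demo itself; the transaction tuples, victim list, and function name are assumptions for illustration.

```python
from collections import Counter

# Hypothetical sample data: (cardholder, merchant, employee who handled the sale).
transactions = [
    ("alice", "shop_a", "emp_1"),
    ("alice", "shop_b", "emp_7"),
    ("bob", "shop_b", "emp_7"),
    ("bob", "shop_c", "emp_3"),
    ("carol", "shop_b", "emp_7"),
    ("carol", "shop_a", "emp_2"),
    ("dave", "shop_a", "emp_1"),
]

# Cardholders who later noticed fraudulent charges (assumed input).
victims = {"alice", "bob", "carol"}


def common_points_of_purchase(transactions, victims):
    """Rank (merchant, employee) pairs by how many distinct victims they served."""
    seen = set()
    counts = Counter()
    for cardholder, merchant, employee in transactions:
        key = (cardholder, merchant, employee)
        # Count each victim at most once per merchant/employee pair.
        if cardholder in victims and key not in seen:
            seen.add(key)
            counts[(merchant, employee)] += 1
    return counts.most_common()


ranked = common_points_of_purchase(transactions, victims)
for (merchant, employee), n in ranked:
    print(f"{merchant} / {employee}: served {n} of {len(victims)} victims")
```

A pair that served every victim is only a lead, not proof: a busy till will be common to many cardholders whether or not fraud occurred, so the result still needs to be compared against how many non-victims the same employee served.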
12. Insurance Fraud
• A claim on an insurance policy that one is
not entitled to make
– Staged auto accidents
– Doctors billing for services they never
performed
– Claiming pre-existing damage was caused by
a covered event
• Misrepresentation on the policy application
to pay lower premiums
13. eDiscovery
• Similar to fraud detection
• Large volumes of transactional data – need to
understand patterns in the data
• Can’t afford to pay lawyers to read every document
• eDiscovery tools help to identify which documents or
communications may be relevant by using a number of
algorithms
• Neo4j and Graph Visualization can help!
14. Costs of Fraud
• Industry estimates are $2.5 Trillion per
year
• By making it easier to both detect and
investigate fraud, we reduce the incentives
to conduct fraud in the first place
• Neo4j and KeyLines are perfect
technologies to assist in this endeavour
16. Roadmap
• Larger and larger
networks
– Filtering
– Combining nodes
together
– Improved analytics for
node importance
– Faster rendering (long
term)
• Dynamic networks
– Filtering
– Timeline, time slider
• Location information
– Map underlays
– Geographic node layout
• Real time networks
– Visual activity indicators
• Information synthesis
– Shapes, boxes,
attributes for annotation
– Snap to grid
– Elbows on links
Editor's Notes
Bring up neo demo
This slide is to explain the main drivers for the features we are planning. The drivers are: large networks, dynamic networks, location info, real-time dashboards, and a need for users to draw stuff ‘on top’ of the networks.