Webinar | Fighting Fraud with Graph Databases
DataStax and Cambridge Intelligence
Agenda
• Introduction - Andrea Cross, Senior Marketing Director, DataStax (5 mins)
• Setting the Scene - Kaush Kotak, Project Manager, Cambridge Intelligence (10 mins)
• DSE Graph - Gehrig Kunz, Product Marketing Manager, DataStax (15 mins)
• Fraud Management with Graphs - Kaush Kotak (20 mins)
• Q & A (10 mins)
All Rights Reserved | Confidential
Fraud management with graphs
Graph technologies are an integral part of modern fraud management solutions
Traditional fraud solutions
Traditional “monolithic”, single-supplier, full-stack solutions:
• $$$ price tags: large investments lead to a “make do” attitude and reluctance to change
• Closed architectures: restricted extensibility and interoperability, fixed models and ways of working
• Scale: unable to cope with demands for information from new, disparate data sources
• Limited analytics/visualization: leads to poor decisions
• Stalled innovation: lack of agility creates opportunities for exploitation
A new approach
New evolving “build your own” architecture
• Use best of breed tools
• Keep ahead of fraudsters
• Grow and develop as your business changes
Pipeline stages:
1. Data Sources: structured, unstructured and event data
2. Structure/Load: NLP, Entity Extraction, Entity Resolution
3. Store: fraud database
4. Process: rule-based scoring & predictive analytics
5. Case Management
6. Visualize: aggregate & network view
Diagram labels: data streamed or loaded; alerts; new rules; feedback loop; additional core data retrieved on demand for visualization
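The “Process” stage and the feedback loop of new rules can be sketched roughly as follows. This is a minimal illustration, not the presenters' implementation: the rule names, weights, case fields and the alert threshold of 50 are all hypothetical.

```python
# Hypothetical rule set: (name, predicate, weight). Not taken from the webinar.
RULES = [
    ("shared_phone", lambda case: case["shared_phone_count"] > 1, 30),
    ("recent_rejection", lambda case: case["rejected_claims"] > 0, 40),
]

ALERT_THRESHOLD = 50  # assumed cut-off for raising an alert


def score_case(case, rules):
    """Return (total score, names of the rules that fired) for one case."""
    fired = [(name, weight) for name, predicate, weight in rules if predicate(case)]
    total = sum(weight for _name, weight in fired)
    return total, [name for name, _weight in fired]


# Illustrative case record with made-up field names.
CASE = {"shared_phone_count": 3, "rejected_claims": 0, "injury_claim_total": 60000}

SCORE_BEFORE, FIRED_BEFORE = score_case(CASE, RULES)

# Feedback loop: an analyst adds a new rule after an investigation.
NEW_RULES = RULES + [
    ("large_injury_claims", lambda case: case["injury_claim_total"] > 50000, 50),
]
SCORE_AFTER, FIRED_AFTER = score_case(CASE, NEW_RULES)
ALERT_AFTER = SCORE_AFTER >= ALERT_THRESHOLD
```

With the original rules the case scores below the assumed threshold; once the new rule is fed back in, the same case raises an alert.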
DataStax & KeyLines
Pipeline stages:
1. Data Sources: structured, unstructured and event data
2. Structure/Load: NLP, Entity Extraction, Entity Resolution
3. Store: fraud database
4. Process: rule-based scoring & predictive analytics
5. Case Management
6. Visualize: aggregate & network view
Diagram labels: data streamed or loaded; alerts; new rules; feedback loop; additional core data retrieved on demand for visualization
Demo Preview
DataStax Introduction
Investigating the basics
Cloud Applications:
• Contextual
• Always-On
• Real-time
• Distributed
• Scale
Real-time fraud detection
[Diagram labels: Your Application; Message Queue; DataStax Driver; Single DSE Cluster (Transactional, Streaming Analytics, Batch Analytics)]
To their credit…
• Fraudsters are getting wiser
• In isolation, all of these transactions are ‘ok’
• So how can we identify fraud that doesn’t have obvious outliers?
Spoiler: it’s in the relationships
Graph to the rescue
Defining a graph problem
– Value is in the relationships
– A large number of joins
• Self-JOINs
Questionable query performance
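To make the join problem concrete: in a relational edge table, each extra hop between accounts typically needs another self-JOIN, whereas a graph traversal just keeps following edges. The sketch below is plain Python rather than DSE Graph or Gremlin, and the accounts, phones and addresses are invented for illustration.

```python
from collections import deque

# Hypothetical edge list linking accounts to shared attributes (made-up data).
EDGES = [
    ("acct_1", "phone_A"),
    ("acct_2", "phone_A"),
    ("acct_2", "addr_X"),
    ("acct_3", "addr_X"),
    ("acct_4", "addr_Y"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map from an edge list."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def within_hops(adjacency, start, max_hops):
    """Breadth-first search: map each node reachable from start to its hop count."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen[neighbor] = seen[node] + 1
                queue.append(neighbor)
    seen.pop(start)
    return seen


ADJACENCY = build_adjacency(EDGES)
REACHABLE = within_hops(ADJACENCY, "acct_1", 4)
```

Here `acct_3` is found four hops from `acct_1` (via a shared phone and a shared address); the equivalent SQL would chain several self-JOINs on the edge table.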
Value in the relationships
Understanding the difference
[Chart: axes “Connectedness of data” and “Performance”; series “Graph database” and “Relational database”]
And easier to build
DataStax Enterprise Graph
• Real-time graph database
• Manage complex and highly connected data
• Discover commonalities and anomalies in data
• Stored as Cassandra tables behind the scenes
DataStax Enterprise multi-model/mixed workload
[Diagram labels: Indexing & Search; Streaming Analytics; Graph; Batch Analytics]
Analytics at your command
• DSE Graph has seamless support for DSE Analytics powered by Apache Spark.
• No need to learn Spark: the Gremlin language is used for both OLTP and OLAP.
• Analytics workloads can be separated from OLTP workloads.
Search your graph data
• DSE Graph has seamless support for DSE Search powered by Apache Solr.
• Simple, schema-driven index management.
• DSE Graph’s query optimizer automatically uses Solr behind the scenes.
Powering cloud applications
• Always-on: designed to handle any failure, no matter how catastrophic
• Effortless scale: take advantage of every opportunity. Focus on what matters most to you.
• Instant insight: built into your application to create actionable, modern experiences.
What is graph visualization?
Our focus is graph/network visualization (aka link analysis)
[Diagram: two nodes (vertices) joined by a link (edge)]
Data in which the connections between entities are important to understand, e.g. social network data, transaction data, etc.
Why visualize graphs?
1. The human brain is unrivaled for pattern finding
2. “Consume” a large amount of data
3. See data in full context
4. Intuitively explore data and connections
5. Explain what is happening
6. Communicate complex insight to non-experts
Graph visualization
Graph patterns show how your business works
[Figure annotation: Business As Usual]
[Figure annotation: Unusually high connectivity, requires investigation]
Fraud use cases
• Application Fraud
• Social Media Fraud
• Identity Theft
• Account Takeover
• Claims Fraud
• Transaction Fraud
• + many others
Fraud investigation
• Unusual connectivity -> “suspect” behavior
• Visualization helps humans uncover these anomalies
• Anomaly analysis & investigation
Example nodes:
• Cases
• People
• Addresses
• Vehicles
Example links:
• Transactions
• Ownership
• Family relationships
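The idea that unusual connectivity signals “suspect” behavior can be sketched as a simple degree check over nodes and links like those listed above. This is an illustrative assumption, not the method shown in the demo: the data is invented, and flagging nodes whose degree exceeds three times the median is an arbitrary choice.

```python
from collections import Counter
from statistics import median

# Hypothetical links between people, addresses and vehicles (made-up data).
LINKS = [
    ("person_1", "addr_1"), ("person_2", "addr_2"), ("person_3", "addr_3"),
    ("person_4", "addr_9"), ("person_5", "addr_9"), ("person_6", "addr_9"),
    ("person_7", "addr_9"), ("person_8", "addr_9"),
    ("person_1", "vehicle_1"), ("person_2", "vehicle_2"),
]


def degree_counts(links):
    """Count how many links touch each node."""
    degrees = Counter()
    for source, target in links:
        degrees[source] += 1
        degrees[target] += 1
    return degrees


def flag_high_connectivity(links, factor=3.0):
    """Return nodes whose degree exceeds factor times the median degree."""
    degrees = degree_counts(links)
    typical = median(degrees.values())
    return sorted(node for node, degree in degrees.items() if degree > factor * typical)


SUSPECTS = flag_high_connectivity(LINKS)
```

In this toy data only the address shared by five people stands out; an analyst would then inspect that node visually before drawing any conclusion.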
KeyLines Demo
Investigation vs detection
Investigation (known):
• Suspect behavior is understood
• Rules can be defined
• Automation possible
• Outliers and anomalies require investigation
• Analysts use visualization to verify unusual cases
Detection (unknown):
• Unknown fraudulent behavior
• Generally not automated
• Analysts use visualization to understand MO and patterns
• Requires domain knowledge
• Time consuming
• Uncover detection rules
[Diagram: Investigation (known) and Detection (unknown), linked by “adapt rules”]
Real fraud examples
Link through account/policy holder with same address, email & phone
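A pattern like this one, policy holders linked through the same address, email and phone, can be found by indexing holders on each attribute value. The sketch below assumes a simple in-memory record per holder; the field names and sample values are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical policy holder records (made-up data and field names).
HOLDERS = {
    "policy_1": {"address": "1 High St", "email": "a@example.com", "phone": "555-0100"},
    "policy_2": {"address": "1 High St", "email": "a@example.com", "phone": "555-0100"},
    "policy_3": {"address": "1 High St", "email": "c@example.com", "phone": "555-0199"},
    "policy_4": {"address": "9 Low Rd", "email": "d@example.com", "phone": "555-0142"},
}


def shared_attribute_links(holders, min_shared=1):
    """Map each pair of holders to the attributes on which they match."""
    index = defaultdict(set)
    for holder, attrs in holders.items():
        for key, value in attrs.items():
            index[(key, value)].add(holder)

    links = defaultdict(set)
    for (key, _value), members in index.items():
        for a, b in combinations(sorted(members), 2):
            links[(a, b)].add(key)

    return {pair: sorted(keys) for pair, keys in links.items() if len(keys) >= min_shared}


ALL_LINKS = shared_attribute_links(HOLDERS)
STRONG_LINKS = shared_attribute_links(HOLDERS, min_shared=3)
```

Requiring all three attributes to match keeps only the strongest link; a lower `min_shared` surfaces weaker links, such as a shared address alone, that may be perfectly innocent.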
Real fraud examples
Initial claim rejected, but the policy holder has claimed again for a similar incident a week later
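Once a pattern like this is understood, it can be expressed as a rule. The sketch below is one possible reading, under stated assumptions: “similar incident” is approximated as the same claim type, and the 14-day window, field names and sample claims are all hypothetical.

```python
from datetime import date, timedelta

# Hypothetical claim records (made-up data and field names).
CLAIMS = [
    {"id": "c1", "holder": "policy_1", "type": "water damage",
     "filed": date(2016, 3, 1), "status": "rejected"},
    {"id": "c2", "holder": "policy_1", "type": "water damage",
     "filed": date(2016, 3, 8), "status": "open"},
    {"id": "c3", "holder": "policy_2", "type": "theft",
     "filed": date(2016, 3, 2), "status": "rejected"},
    {"id": "c4", "holder": "policy_2", "type": "theft",
     "filed": date(2016, 6, 1), "status": "open"},
]


def reclaims_after_rejection(claims, window=timedelta(days=14)):
    """Return (rejected id, later id) pairs: same holder and type, filed within window."""
    flagged = []
    for earlier in claims:
        if earlier["status"] != "rejected":
            continue
        for later in claims:
            same_kind = (later["holder"] == earlier["holder"]
                         and later["type"] == earlier["type"])
            gap = later["filed"] - earlier["filed"]
            if same_kind and timedelta(0) < gap <= window:
                flagged.append((earlier["id"], later["id"]))
    return flagged


FLAGGED = reclaims_after_rejection(CLAIMS)
```

Only the claim refiled a week after rejection is flagged; the second holder's later claim falls outside the assumed window.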
Real fraud examples
Third parties involved in two claims share links with policy holders
Real fraud examples
Multiple third parties, large number of personal injury claims (>$50k), inconsistent pattern of activity
Introducing KeyLines
KeyLines is a powerful SDK for building network visualization web applications:
• Cross-browser compatibility
• Works on any device
• Full customization
• Powerful functionality
• A fast developer experience
• Rapid deployment
• Easy maintenance
Summary
• Graph visualization and graph analytics are critically important for fraud investigation and detection
• Users need to see graphs and interact with them visually to get insight
• KeyLines and DSE Graph are a powerful combination for fraud management solutions
• DSE Graph offers great potential to scale to large graphs!
What’s next?
Free online training from DataStax Academy:
– DS330: DSE Graph
• 29 units
• ~12 hours
DSE Graph Resources: http://www.datastax.com/dse-graph-campaign/
Q&A
KeyLines/DataStax Architecture
[Diagram labels: Gremlin - JavaScript; WebSocket; TinkerPop]
Apache TinkerPop and Apache TinkerPop Gremlin logos are trademarks of The Apache Software Foundation. DataStax logo owned by DataStax.
