1
Smarter Fraud Detection
Nick Johnson
Sr. Product Marketing Manager
Neo4j Graph Data Science
2
80%
Gartner
of data and analytics innovations will
use graph technologies by 2025.
3
Agenda
1 What Is a Graph?
2 The Big Questions
3 Smarter Fraud Detection: Proactively Identify Fraud
4 How Banking Circle Increased Fraud Detection 300%
5 Next Steps and Resources
4
What Is a Graph?
5
Node: Represents an entity in the
graph
Relationships (edges / links):
Connect nodes to each other
Property: Describes a node or
relationship: name, age, height, etc.
Graphs Show Data Based on Relationships
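[Figure: example graph. Person nodes "Name: Mel, Born: May 29, 1970, Twitter: @mel" and "Name: Ash, Born: Dec 5, 1975", and a car node "Brand: Volvo, Model: V70", connected by LOVES and LIVES WITH relationships; one relationship carries the property "Since: Jan 10, 2011".]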
6
Graphs Show Data Based on Relationships
Movies and People
Nodes: People and movies
Relationships: the role each
person plays related to the
movies
Property: release date, tagline,
title, etc.
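
The node, relationship, and property vocabulary above can be sketched in a few lines of Python. This is a minimal illustration, not Neo4j's data model or API: the `Node` and `Relationship` classes and the `outgoing` helper are hypothetical names, the property values come from the Mel and Ash example, and the relationship directions are assumptions because the slide text does not state them. The car node is left out for the same reason.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An entity in the graph, with a label and free-form properties."""
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A typed, directed connection between two nodes."""
    start: Node
    rel_type: str
    end: Node
    properties: dict = field(default_factory=dict)


# Property values taken from the slide.
mel = Node("Person", {"name": "Mel", "born": "May 29, 1970", "twitter": "@mel"})
ash = Node("Person", {"name": "Ash", "born": "Dec 5, 1975"})

# Directions below are assumed for illustration only.
relationships = [
    Relationship(mel, "LOVES", ash),
    Relationship(ash, "LOVES", mel),
    Relationship(mel, "LIVES_WITH", ash),
]


def outgoing(node, rel_type):
    """Return the nodes reached from `node` over relationships of `rel_type`."""
    return [r.end for r in relationships
            if r.start is node and r.rel_type == rel_type]
```

Calling `outgoing(mel, "LOVES")` walks the relationship list and returns the nodes Mel points at, which is the basic traversal that a graph database performs natively instead of through table joins.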
7
The Big Questions
What’s important? What’s unusual? What’s next?
8
Use Graph Data Science to Answer:
What’s important?
9
Who has the most connections?
Who has the highest page rank?
Who is an influencer?
Use the Data You Already Have to Answer:
What’s important?
Prioritization
Listen for words like:
● Best
● Top performing
● Highest converting
● Most challenging
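
As a rough sketch of how "most connections" and "highest page rank" can be computed, the pure-Python functions below count degrees and run a basic power-iteration PageRank over a list of directed edges. The function names, the damping factor of 0.85, the fixed iteration count, and the sample edges are all assumptions for illustration; a graph library would normally provide tuned implementations.

```python
def degree_counts(edges):
    """Count how many edges touch each node (in-degree plus out-degree)."""
    counts = {}
    for src, dst in edges:
        counts[src] = counts.get(src, 0) + 1
        counts[dst] = counts.get(dst, 0) + 1
    return counts


def pagerank(edges, damping=0.85, iterations=50):
    """Basic power-iteration PageRank over directed (src, dst) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for n in nodes:
            # Nodes with no outgoing links spread their rank over all nodes.
            targets = out_links[n] or nodes
            share = damping * rank[n] / len(targets)
            for t in targets:
                new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical sample data: three nodes point at "c", and "c" points at "a".
sample_edges = [("a", "c"), ("b", "c"), ("d", "c"), ("c", "a")]
sample_degrees = degree_counts(sample_edges)
sample_ranks = pagerank(sample_edges)
```

Degree only counts direct connections, while PageRank also rewards being pointed at by well-connected nodes, which is why the two measures can rank the same graph differently.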
What’s unusual?
10
Where is a community forming?
What are the group dynamics?
What’s unusual about this data?
Use the Data You Already Have to Answer:
What’s unusual?
Anomaly Detection
Listen for words around behavior like:
● Unusual
● Anomalous
● Strange
● Odd
● Weird
What’s next?
11
What’s the most common path?
Who is in the same community?
What relationship will form?
Use the Data You Already Have to Answer:
What’s next?
Predictions
Listen for words like:
● Recommend
● Optimize
● Improve
● Likely
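
One simple way to approach "What relationship will form?" is to score unconnected pairs by how many neighbors they share. The sketch below uses the common-neighbors heuristic, which is a stand-in chosen for brevity and not necessarily what any particular product uses; the function name and the sample names are hypothetical.

```python
from itertools import combinations


def common_neighbor_scores(edges):
    """Score each unconnected pair of nodes by its number of shared neighbors."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    scores = {}
    for a, b in combinations(sorted(neighbors), 2):
        if b in neighbors[a]:
            continue  # already connected, nothing to predict
        shared = len(neighbors[a] & neighbors[b])
        if shared:
            scores[(a, b)] = shared
    return scores


# Hypothetical undirected friendships forming a four-person ring.
sample_friendships = [("ann", "bob"), ("ann", "cat"), ("bob", "dan"), ("cat", "dan")]
sample_scores = common_neighbor_scores(sample_friendships)
```

Pairs with higher scores are better candidates for a recommendation or a predicted link; graph embeddings and trained models refine the same idea with richer features.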
12
Smarter Fraud Detection:
Proactively Identify Fraudulent
Behavior
13
$5.8 billion
CNBC
of consumer money lost to fraud in
2021
19% increase
Experian
in reported fraud cases to the U.S.
Federal Trade Commission in 2021
from the year prior
Source: CNBC Source: Experian
14
Source: WSJ
Source: United States Department of Justice
15
Types of Fraud Detected with Graph Data Science
Types
of
Fraud
Credit Card Fraud
Insurance Fraud
Identity Theft
Wire Fraud
Insider Trading
Money Laundering
Tax Fraud
Accounting Fraud
Subscription Fraud
E-commerce Fraud
Streaming Fraud
16
The Traditional Data Science Approach
Problems
● Who are the suspicious actors?
● What accounts have strange
activity?
● Where is the unusual activity
happening?
What’s unusual?
Traditional methods
Manually join tables and
searches across sources to
flag suspicious accounts
Fractured information across
sources misses relationships
Incomplete picture of
underlying network, patterns,
actors, and tools
17
The Graph Data Science Approach
Traditional methods
● Manually join tables and searches across sources to flag suspicious accounts
● Fractured information across sources misses relationships
● Incomplete picture of underlying network, patterns, actors, and tools
Neo4j Graph Data Science
● Identify fraudulent actors and patterns with anomaly and community detection algorithms
● Predict fraudulent accounts and transactions with graph embeddings and ML features
● Visually map and explore relationships between actors, identifiers, and events in a graph
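
To make the community detection idea concrete, here is a minimal sketch that groups accounts through the identifiers they share, using connected components computed with a union-find structure. Connected components is one of the simplest community-style techniques and is used here as an illustration only; the function names, the identifier strings, and the rule of flagging any group with two or more accounts are assumptions, not a description of a production fraud model.

```python
def connected_components(edges):
    """Group nodes into connected components with a union-find structure."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for node in parent:
        groups.setdefault(find(node), set()).add(node)
    return sorted(groups.values(), key=lambda group: sorted(group))


def shared_identifier_groups(edges):
    """Return the account sets from components containing 2+ accounts."""
    flagged = []
    for group in connected_components(edges):
        accounts = {n for n in group if n.startswith("acct:")}
        if len(accounts) >= 2:
            flagged.append(accounts)
    return flagged


# Hypothetical (account, identifier) links; none of these values are real.
sample_links = [
    ("acct:1", "phone:555-0100"),
    ("acct:2", "phone:555-0100"),
    ("acct:3", "ssn:A"),
    ("acct:4", "ssn:A"),
    ("acct:5", "email:x@example.com"),
]
sample_groups = shared_identifier_groups(sample_links)
```

In a relational setup the same grouping needs a self-join per identifier type; in a graph, every identifier is just another node, so one pass over the relationships covers phones, SSNs, and emails alike.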
18
Fraudulent Actors Represented In a Graph
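
The pattern described in the speaker note for this slide (people who hold no bank account but share an SSN with an account holder) can be expressed as a small query over relationship triples. This is a sketch under stated assumptions: the identifier values and relationship names are hypothetical placeholders, only the relationships needed for the rule are encoded, and which account holder shares an SSN with synthetic person two is assumed because the note does not say.

```python
# (person, relationship, target) triples; all identifiers are placeholders.
sample_triples = [
    ("holder_1", "HAS_BANK_ACCOUNT", "account_1"),
    ("holder_2", "HAS_BANK_ACCOUNT", "account_2"),
    ("holder_3", "HAS_BANK_ACCOUNT", "account_3"),
    ("holder_1", "HAS_SSN", "ssn_1"),
    ("holder_2", "HAS_SSN", "ssn_2"),
    ("holder_3", "HAS_SSN", "ssn_3"),
    # Parent and adult child linked to each other's SSNs via joint tax filing.
    ("holder_1", "HAS_SSN", "ssn_3"),
    ("holder_3", "HAS_SSN", "ssn_1"),
    # Synthetic person one shares an SSN with account holder one.
    ("synthetic_1", "HAS_SSN", "ssn_1"),
    # Assumption: synthetic person two shares an SSN with account holder two.
    ("synthetic_2", "HAS_SSN", "ssn_2"),
]


def flag_synthetic_identities(triples):
    """Flag people with no bank account who share an SSN with an account holder."""
    has_account = {person for person, rel, _ in triples
                   if rel == "HAS_BANK_ACCOUNT"}

    ssn_users = {}
    for person, rel, target in triples:
        if rel == "HAS_SSN":
            ssn_users.setdefault(target, set()).add(person)

    flagged = set()
    for users in ssn_users.values():
        if len(users) > 1 and users & has_account:
            flagged |= users - has_account
    return sorted(flagged)


sample_flagged = flag_synthetic_identities(sample_triples)
```

Note that holder three is not flagged even though they share SSNs with holder one, because they hold a bank account of their own; the rule targets only the combination of a shared SSN and no account.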
19
How Banking Circle Increased
Fraud Detection 300%
20
Fraud Detection
Identify suspicious activity.
Results:
• 300%+ increase in fraud detection
• 10% True positive alert escalations (industry is <1%)
• Reduced overall number of alert escalations
21
Next Steps & Resources
Business Resources
Read the E-book
View the
Infographic
Technical Resources
Read the Blog
Get the
White Paper

Smarter Fraud Detection With Graph Data Science

Editor's Notes

  • #3 Forward-looking organizations are adopting graph analytics and graph data science to power business-critical decision making. Gartner predicts that 80% of data and analytics innovations will use graph tech by 2025. It’s a simple concept with endless enterprise and industry applications. My goal today is to help you imagine what’s possible and give you an example use case for understanding when graphs are a better way to solve your most challenging business problems.
  • #4 So I’ll start today with a brief overview of graphs so we’re all working from the same understanding. Next I’ll hit on the big questions. These are the three high-level questions that Graph Data Science excels at answering. Then I’ll dive into Fraud Detection: what it is, how detecting it saves your organization time, money, and reputation, and how graphs can help. After that, I’ll show you an example of how Banking Circle, a global B2B banking service, increased fraud detection by 300% with Graph Data Science. Lastly we’ll conclude with a couple of resources for learning more and continuing your graph data science journey.
  • #5 So, what is a graph?
  • #6 At its most fundamental, a graph is simply a different way of structuring data. Instead of rows and columns, like in a traditional relational database table or dataframe, graphs use nodes (nouns) and relationships (verbs) as their primary structure. Properties describe a node or a relationship. Graphs can naturally be shown as networks of people (customers, employees, partners) or transactions (products or suppliers), to name a few.
  • #7 Let’s take the movie database IMDb as an example. Here we can show the relationships between people and movies (the nodes) based on the role each person plays in a movie. Properties such as release date, tagline, or title describe those nodes and relationships.
  • #8 Next let’s look at the types of questions Graph Data Science can help us answer.
  • #9 Graph Data Science particularly excels when your business question can be summed up as one of these three questions: What’s important? What’s unusual? What’s next? Here’s what I mean:
  • #10 What’s Important? (Prioritization) There are numerous examples of decision makers trying to determine project urgency and, therefore, prioritization. For example: Marketing: What is the most important piece of content, the most important webpage, the most important call to action? Product Teams: Where is the most friction? Support: Which article is the most important? Finance: Which report is most important for leadership teams? If you’re hearing words like best, top performing, highest converting, or most challenging, your decision makers are asking you about importance.
  • #11 “What’s unusual?” really gets at suspicious or strange behavior that is out of the ordinary. Departments across the enterprise might ask their data science counterparts to identify unusual behavior such as: IT: Where is unusual activity on my network devices? Finance: Where is unusual activity in my accounting department? SecOps: Where is unusual activity in my data center? Compliance: Is there unusual activity in contract language?
  • #12 Looking ahead and predicting the future is something most of us wish we could do with ease. Recommender systems are perhaps the most applicable example across every area of the business. Predictive insights using graphs can deliver answers to these questions and more. Marketing: What email should we send customers next? Product Teams: What product should we build next? Retailers: What product should we sell next? Human Resources: What training should an employee take next? Finance: How should we price our products next quarter? Operations: What is the fastest path from point A to point B?
  • #13 So what does problem solving with graph data science look like in practice? And why is it better at addressing real-life problems than traditional data science methods? Let’s take Fraud Detection as an example use case.
  • #14 Fraud detection is a huge business. The trust in our financial system depends on how well banking institutions and governments are able to detect and prevent fraud. We know the tactics of fraudsters are constantly evolving and we will never be able to fully prevent all cases of fraud, but we must continue to learn and adapt our methods as fraud tactics evolve to evade detection. And this is more important than ever right now. According to CNBC, $5.8 billion was lost to fraud in 2021 (click), and reports of fraud to the Federal Trade Commission rose a whopping 19% from the year prior.
  • #15 And when you start paying attention, fraud is everywhere. For example, on March 24, 2023, the Wall Street Journal reported that there is a rising trend in accounting fraud, which could be an indicator of looming economic troubles. As companies struggle to meet financial expectations, they may resort to fraudulent practices, causing investors to be misled and markets to be disrupted. It is essential for investors, regulators, and auditors to remain vigilant and detect potential red flags in order to minimize the impact of these fraudulent activities on the economy. (Click) Unfortunately, during the global COVID-19 pandemic, many businesses and individuals wrongfully took advantage of programs intended to help businesses pay their employees during lockdowns. Last year, the Department of Justice charged a Pennsylvania man with executing a $1.7 million Paycheck Protection Program (PPP) loan fraud scheme, according to the U.S. Attorney's Office for the District of New Jersey. The defendant allegedly submitted falsified documents, inflated payroll expenses, and created sham companies to obtain the loans, which were meant to support small businesses during the COVID-19 pandemic. If convicted, the man faces up to 20 years in prison for wire fraud and up to 30 years for bank fraud, along with substantial monetary penalties. These are just a few of the many types of fraud that businesses and governments encounter, so let’s go over some of the most common types of fraud. (Click)
  • #16 There are dozens of types of financial fraud. Credit card fraud is the unauthorized use of a person's credit card information to make purchases, obtain cash advances, or otherwise exploit the cardholder's financial resources. Insurance fraud is the act of intentionally providing false or misleading information to an insurance company to receive undeserved compensation or benefits. Identity theft is the malicious act of obtaining and using someone else's personal information, such as their name, Social Security number, or credit card details, without their consent. Wire fraud is a criminal act that involves using electronic communications, such as email, phone calls, or text messages, to deceive victims into transferring money or divulging sensitive information. Insider trading is the illegal practice of trading stocks or securities based on non-public, material information obtained from a company or its affiliates. Tax fraud is the deliberate act of providing false or misleading information on tax filings to evade tax obligations or obtain undeserved refunds. Accounting fraud is the intentional manipulation of financial statements or records to present a false or misleading picture of a company's financial health. But there are other types of non-financial fraud that are easy to detect with Graph Data Science. These include (click) Streaming Fraud: purposely manipulating streaming numbers on digital platforms using bots, fake accounts, or purchased streams to artificially inflate the popularity of a song, album, or video, ultimately undermining the integrity of streaming data and fair royalty distribution. Subscription Fraud: using stolen or fake personal information to create unauthorized accounts or obtain services on subscription-based platforms, resulting in financial losses for businesses and potential identity theft for victims. E-commerce Fraud: creating fake online businesses to receive payments without any intention of distributing goods or services.
  • #17 So how do you go about detecting fraud at enterprise scale? Let’s talk about what the situation looks like today using traditional data science techniques. Using traditional methods, data scientists need to join a bunch of data across systems and connect it together to build a broad picture of their customers, their associated metadata, and how they relate to one another. These are things like people, credit cards, Social Security numbers, bank accounts, phone numbers, addresses, and more. And not only do you need to see how all these pieces fit together, but you need to see how they’re related as well. For example: do two or more people share a credit card? Who are the guarantors on the loan? Do certain groups of people have access to a certain bank account, but not others? What about addresses? As you can see, these things can get very complex very quickly. When you try to picture these things using traditional tabular methods and joins, you get an incomplete picture of the underlying network. Searching for suspicious activities across multiple databases quickly becomes confusing, and it is easy to miss important information and relationships. (Click) So, if we go back to the graphy questions that we discussed in the previous section, do any of the questions we’re answering roll up to “What’s important?”, “What’s unusual?”, or “What’s next?” Some of your questions might be: Who are the suspicious actors? What accounts have strange activity? Where is the unusual activity happening? (click) It seems pretty clear you’re asking “What’s unusual?”, which is indeed a graphy question.
  • #18 So how do you go about using Graph Data Science to address fraud detection? (click) Fraud networks are just that, networks, which are a type of graph. Instead of combining tables and searching for matches across rows of data, you can map people, credit cards, Social Security numbers, addresses, loans, and so much more as nodes and relationships. Since fraudulent behaviors can be difficult to identify at a glance, Graph Data Science’s library of 65+ graph algorithms can help you identify fraudulent actors and patterns using anomaly and community detection algorithms. Once you identify common patterns, you can use graph embeddings and graph-native ML to flag and predict fraudulent accounts and transactions.
  • #19 Here we can see how financial transactions can be represented in a graph, but at first glance it can be hard to understand where the fraud is. You’ll notice account holders one, two, and three all have the same address. Account holders one and two are married, and account holder three is their adult child who is living with them. Account holders one and two share the same home phone number, while account holder three has their own. Account holder one, the parent of account holder three, and account holder three are associated with each other’s SSNs because of joint tax filing when account holder three was a dependent on the tax return. Synthetic person one shares a phone number with account holder three, and an SSN and an unsecured loan with account holder one. While synthetic person one could be another dependent or the spouse of account holder three, it’s suspicious that they don’t share an address or have a joint bank account. It’s also strange that the unsecured loan is between account holder one and synthetic person one. Next, we see synthetic person two has a credit card with account holder one and shares a phone number with account holders one and two, but does not live at the same address. In addition, there is a joint Social Security number, but we know account holders one and two only have one child, so this is immediately suspicious. But the biggest giveaway is that synthetic persons one and two don’t have bank accounts themselves and share SSNs with other people who are account holders. This seems like a case of identity theft that is being used to take out credit cards and loans. The analysis I did can be done using graph queries, but if you’re getting started or you have a bigger graph that is less obvious, it can be helpful to start with community detection algorithms, which can identify disjointed groups that share identifiers or communities that frequently interact. https://neo4j.com/blog/financial-services-neo4j-fraud-detection/
  • #20 So you see how easy it can be to identify fraud using Graph Data Science, but what does that look like in the real world?
  • #21 PwC, Global Economic Crime and Fraud Survey 2022: 70% of organizations experiencing fraud reported that the most disruptive incident came via an external attack or collusion between external and internal sources. Banking Circle, initially a fintech firm and now operating as a bank, has achieved remarkable results in fraud detection and anti-money laundering by using Neo4j Graph Data Science. The bank focuses on ensuring fast, safe, and secure cross-border transactions for business clients such as e-commerce companies. For example, if a purchase is made in Sweden but the merchant is located in China, the bank needs to verify the identities of the buyer and the merchant so that the money gets to its destination quickly and securely. Banking Circle’s approach to anti-money laundering and fraud detection with graph data science has resulted in a 300%+ increase in fraud detection and a 10% true positive alert escalation rate. By integrating transaction data with ownership information from open company registration databases, Banking Circle's team of data scientists developed a machine learning model to rank risks using graph data science algorithms. This forward-thinking method has not only reduced the overall number of alert escalations but also laid the foundation for future event-based improvements.
  • #24 Q: How can Graph Data Science be integrated with other tools, platforms, or machine learning models to enhance fraud detection capabilities? A: That’s a really great question. Graph Data Science can fit into data stacks and data pipelines seamlessly with its native connectors to popular tools used for accessing, storing, moving, and sharing data. These tools include Apache Spark and Apache Kafka Connectors, a native BI Connector, a Data Warehouse Connector, Graph topology export, and BigQuery integration. Additionally, Graph Data Science is compatible with all major clouds, with AuraDS Enterprise now available for early access in AWS and Azure. Q: Are there any specific industries or sectors where Graph Data Science has been particularly successful in fraud detection and prevention? A: Yes; we’ve seen lots of success in banking, insurance, and other financial services as well as within government and e-commerce companies. Q: How can organizations get started with implementing Graph Data Science for fraud detection? A: Getting started is all about understanding the problem you’re trying to solve, and knowing where your critical data is stored so you can transform it into nodes and relationships. We have an evaluation guide that walks you through step-by-step how to get started with your use case, and when you’re ready we have a whole host of Graph Data Science specialists who are ready to help you build out your proof of concept. Evaluation Guide: https://neo4j.com/whitepapers/graph-data-science-evaluation-checklist/