Scaling up business value with real-time operational graph analytics

Graph-based solutions have been on the market for over a decade, with deployments in financial services, healthcare, retail, and manufacturing. Earlier graph technology restricted them to simple queries (1 or 2 hops), modest data sizes, or slow response times, which limited their value.

A new generation of fast, scalable graph databases, led by TigerGraph, is opening up a new world of business insight and performance. Join us as we explore some exciting new use cases powered by a native parallel graph database with storage and computation capability at each node:

A large financial services payment provider is using graph-based pattern detection (7 to 11 hop queries) to detect more fraud and money laundering in real time, handling a peak volume of 256,000 transactions per second.
IceKredit, an innovative FinTech, is transforming the near-prime and sub-prime credit market in the United States, China, and South Asian countries with customer 360 analytics for credit approval and ongoing monitoring.
A biotech and pharmaceutical giant is building a prescriber and patient 360 graph and using multi-hop exploratory and analytic queries to understand the most efficient ways of launching a new drug for maximum return.
Wish.com is delivering real-time personalized recommendations to increase eCommerce revenue.
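
The hop counts quoted above (1 or 2 hops for older systems, 7 to 11 hops for the fraud use case) refer to how many edges a query follows outward from a starting vertex. As a rough illustration only, the following Python sketch collects every vertex within a given number of hops using breadth-first search. The toy graph, the account names, and the function `k_hop_neighbors` are hypothetical; this is not GSQL and does not reflect how TigerGraph executes queries.

```python
from collections import deque


def k_hop_neighbors(graph, start, max_hops):
    """Return {vertex: hop_distance} for every vertex within max_hops of start."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        if dist[vertex] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(vertex, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[vertex] + 1
                queue.append(neighbor)
    return dist


# Hypothetical toy graph: accounts linked by payments (undirected adjacency lists).
toy_graph = {
    "acct_A": ["acct_B"],
    "acct_B": ["acct_A", "acct_C"],
    "acct_C": ["acct_B", "acct_D"],
    "acct_D": ["acct_C"],
}

within_two = k_hop_neighbors(toy_graph, "acct_A", 2)
print(sorted(within_two.items()))
```

In a densely connected graph the number of vertices reached can grow very quickly with each extra hop, which is why deep multi-hop queries are demanding on storage and compute.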

Published in: Technology

  1. Scaling Up Business Value with Real-Time Operational Graph Analytics. Connected Data London, 7 November 2018. Victor Lee, Director of Product Management
  2. © 2018 TigerGraph. All Rights Reserved. Graph is Big Business: $2.5+ trillion combined market cap. Business is GRAPH.
  3. “Graph analysis is possibly the single most effective competitive differentiator for organizations pursuing data-driven operations and decisions after the design of data capture.” https://www.gartner.com/doc/2852717/it-market-clock-database-management
  4. Scaling Up with Big Data + Graph Analytics. Benefits: ● Richer knowledge ○ Knowledge Graph Model ● Better analytics ○ More, richer data ○ Ask harder questions: multi-hop queries ○ More valuable results. Challenges: ● Storage ● Speed performance ○ Searching ○ Traversing ○ Updating ○ Exponential slowdown? ● Cost. How do we fulfill the promise of graph analytics?
  5. China Mobile Detects Phone-Based Fraud with Real-Time Customer 360 Data Hub. Business Challenge: Phone scams are hard to detect, evolve quickly, and cost billions of dollars. ● 460M users (inside & outside network) ● 10B+ phone calls every month ● 10K peak calls per second. Solution: Run a real-time scam-detection graph query on every phone call as it occurs. ● Maintain a real-time operational graph with 460M user vertices & 2B call edges, with call details. ● Build a training set of 2M calls; run queries to measure 118 graph features. ● Feed graph feature data to machine learning; build a scam detection model. ● As a phone call is requested, measure graph features in real time to detect a possible scam. Business Benefits: ● If a scam is detected, recipients receive a warning on their phone BEFORE the call is answered. ● Improved customer satisfaction.
  6. Detecting Phone-Based Fraud by Analyzing Network or Graph Relationship Features. Good phone features: (1) High call back phone (2) Stable group (3) Long term phone (4) Many in-group connections (5) 3-step friend relation. Bad phone features: (1) Short term call duration (2) Empty stable group (3) No call back phone (4) Many rejected calls (5) Average distance > 3. Download the solution brief at https://info.tigergraph.com/MachineLearning
  7. Real-Time Operational Graph Analytics Platform Requirements. ● Scalability ○ Can handle the application's Big Data (gigabytes? terabytes? petabytes?) ○ Can expand as data grows → distributed graph ● Real-Time Speed @Scale ○ On complex analytics ○ Fast updates ○ Throughput/concurrency ● Query Language Support for Analytics ○ Describe common analytics patterns ○ Parallel processing
  8. Evolution of Graph Databases: Graph 1.0 (single server, non-parallel), Graph 2.0 (NoSQL base for storage), Graph 3.0 (native, parallel). Capabilities compared: native graph storage; parallel loading; parallel multi-hop analytics; parallel updates (in real time); scale-up to support query volume; privacy for sensitive data. Graph 3.0: hours to load terabytes; sub-second across 10+ hops; mutable/transactional; 2 billion+/day in production; MultiGraph service. Earlier generations: days to load terabytes; runs out of time/memory or times out after 2 hops; scales for simple queries (1-2 hops).
  9. OLTP and OLAP Together in Graph. OLTP: ● Real-time read and write ● ACID properties (guarantee that a transaction is correct) ● Concurrency (many transactions at the same time). OLAP: ● Multi-dimensional analysis ● Compute-intensive ● Aggregation. All graph databases: multi-dimensional data; real-time read. Some graph databases: + ACID + concurrency. TigerGraph: + real-time read & write + real-time, compute-intensive, multi-dimensional analysis + real-time aggregation + HTAP (Hybrid Transactions + Analytics).
  10. Delivering New Graph-Based Solutions with TigerGraph. [Architecture diagram; labels recovered below.] Source systems: ERP (Enterprise Resource Planning); Multichannel Marketing & Sales Force Automation; CRM (Customer Relationship Management); Master Data Management; Network and Sensor/IoT Systems (signals). Master data: Customer, Supplier, Device, Employee; Name, Address, Gender, Age. Transactional/operational data: Orders, Payments, Shipments, Invoices, Visits, Downloads. Other components: Data Warehouse, Data Mart, Data Lake, NoSQL, BI & Analytics Solutions, Batch and Streaming Machine Learning. Graph workloads: Historical Data Queries/Lookups; Complex Graph Analytics (Community Detection, Shortest Path, PageRank, ...); Graph Features. Business initiatives: Fraud & AML; Supply Chain Intelligence; Credit Risk Scoring & Monitoring; Product & Service Marketing; Network & IT Infrastructure Analytics; Location Analytics; Enterprise Knowledge Graph; Recommendation Engine; Real-time Customer 360 Data Hub; Faster than Real-time EMS; Cyber Security.
  11. Building a Real-time Customer 360 Data Hub with TigerGraph: Case Studies. Use case examples: Fraud & Money Laundering Detection; Product & Service Marketing (Large Multi-National Pharma); Cross-sell & Up-sell Recommendation; Credit Risk Scoring & Monitoring.
  12. Using C360 for Personalized Results in e-Real Estate. • Technology & data science are transforming the US real estate market. • Past: Agents controlled all data (Homes for sale? Comparable prices?) • Possible use cases: • Recommendations for home sellers • Recommendations for home buyers: which home to consider. Challenge: home buying is a rare event. Harder: Find the right test. • Matching buyers with sellers • Predicting selling price and buying price • Internal operations. US leader in e-Real Estate: • Seeing unique value in TigerGraph real-time operational analytics
  13. Network & IT Infrastructure Analytics (Impact Assessment). Business Challenge: Monitor complex energy infrastructure to detect and manage power outages. Solution: • Model the power system using a real-time graph to accelerate power flow and state calculation (eliminates data prep). • Leverage massively parallel computing in the graph for bus ordering & admittance graph forming to balance load spikes. • Visualize energy computation results in the graph for contingency analysis and action plans. Business Value: Completes execution within a SCADA sample cycle of 5 seconds. [Schema diagram labels: Disconnector, Connector, Breaker, ACline, Two Port Transformer, Neutral Point, Three Port Transformer, BUS, Substation, Unit, Load, Longitude, Latitude, Compensator.]
  14. Network & IT Infrastructure Analytics (Impact Assessment). [Schema diagram: Array, Pool, LUN, Server, Application, Service, BU.] BU: A business unit uses and pays for a service. Service: Each service calls multiple applications. Application: An application can run on multiple servers; applications may depend on other applications. Server: Each server may be assigned multiple LUNs. LUN: Logical Unit Number. Pool: Each pool has a port on a storage array device and is the container of logical LUNs. Array: A storage array is the container of logical pools. Attributes shown: PhysicalCapacity, PromisedCapacity, Alert; IsDecommissioned. Edge cardinalities shown: (1:n), (1:n), (n:1), (n:m), (n:m), (n:m). Business Challenge: Assess the business impact of a component outage or capacity overflow in a complex, interconnected infrastructure. Solution: 1. Maintain a real-time operational map of interconnected resources (arrays, pools, LUNs, virtual machines, servers, applications), all serving a specific service for a business unit . . . Business Value: Reduce server, disk storage, and network provisioning costs and downtime. Avoid interruptions for critical workloads.
  15. Network & IT Infrastructure Analytics (Impact Assessment). [Schema diagram: Array, Pool, LUN, Server, Application, Service, BU.] BU: A business unit uses and pays for a service. Service: Each service calls multiple applications. Application: An application can run on multiple servers; applications may depend on other applications. Server: Each server may be assigned multiple LUNs. LUN: Logical Unit Number. Pool: Each pool has a port on a storage array device and is the container of logical LUNs. Array: A storage array is the container of logical pools. Attributes shown: PhysicalCapacity, PromisedCapacity, Alert; IsDecommissioned. Edge cardinalities shown: (1:n), (1:n), (n:1), (n:m), (n:m), (n:m). Solution: 1. Maintain a real-time operational map of interconnected resources (arrays, pools, LUNs, virtual machines, servers, applications), all serving a specific service for a business unit. 2. Process IoT (Internet of Things) sensor signals from physical storage, server, and network equipment to identify at-risk components in real time. 3. Assess the impact of an at-risk component such as a LUN, server, or application on business services in order to offload critical workloads. 4. Identify the storage, server, or network elements likely to max out, based on physical and promised capacity/traffic, for proactive remediation. Business Value: Reduce server, disk storage, and network provisioning costs and downtime. Avoid interruptions for critical workloads.
  16. AML Workflow with TigerGraph and Machine Learning at Alipay
  17. Location Analytics Use Case 1: Matching Demand & Supply with Predicted Traffic Flow Data for Busy Locations. [Figure: predicted traffic inflow and outflow.] Saturday, 11:15 pm. Number of taxis needed: 105. Number of taxis available: 65. ● Scenario: Based on predicted traffic inflow and outflow for the Ginza brand street area next Saturday at 11:15 pm Japan Standard Time, there will be a shortage of 40 taxis for about 45 minutes. ● Recommended action: Message 40 taxi drivers in nearby areas that have an excess supply of taxis to come to the Ginza brand street area by 11:15 pm on Saturday. ● Outcome: Increase revenue by matching the expected spike in taxi demand with increased supply; reduce idle time for drivers.
  18. Location Analytics Use Case 1: Matching Demand & Supply with Predicted Traffic Flow Data for Busy Locations. [Map figure: a grid of meshes, each labeled with the number of taxis in that mesh. Legend: input; driver; suggested moving direction; in flow; out flow; flow prediction based on historical data.]
  19. Solving Problems with Graph Algorithms: an analytics toolbox specifically for graphs. Supervised learning: look for particular patterns/features, then correlate to known cases. Unsupervised learning: look for frequent/infrequent patterns & groupings.
  20. Graph Algorithms: Built-in Analytics Functions. Several categories: ● Path-related ○ What is the shortest path between two nodes? ● Centrality ○ Which nodes are the most centrally located? ● Community Detection ○ What are the natural groupings of nodes, based on connectedness? ● Similarity ○ Which nodes are very similar to one another, based on context?
  21. Graph Algorithm Use Case: Understanding & Leveraging Influencers in Medical Care. Given a national network of medical providers and prescribers, find connected communities and the top influencers within each community. • USA (private health care system): very decentralized; many layers and pay-per-service lead to inefficiency. • UK/Europe (national health care): Is the central authority making good decisions? Applies to other cases with many decision makers: • Viral/influencer marketing. Based on a project by a large US pharma company.
  22. Influencer Analysis as a Graph Analytics Problem. Analysis Steps: 1. Find the most influential provider in each region for a related group of medical codes (Diabetes, Cardiac Care, etc.). 2. Who is influenced by these leaders (e.g. other doctors, physical therapists, facilities)? 3. What is the community size and impact (patients and providers) around these hubs? Business Benefits: • Understand care and referral dynamics better • Target education at the influencers • Identify which influencers are also best-practice practitioners
  23. Step 1: How do I find the most influential provider in each region for a particular medical condition? A whole-graph compute problem. A. Analyze claims data to create referral relationships among providers (time series analysis). B. Create subsets of claims around each condition with a group of healthcare codes (e.g. CPT codes) for each region (e.g. local healthcare market). C. Use PageRank to score hubs within each market. Example: Condition: Diabetes; Healthcare Market: S. San Jose, CA; Hub Identified: Dr. Thomas.
  24. Step 2: Who is influenced by these leaders (e.g. other doctors, physical therapists, facilities)? Use community detection. A. Identify communities of providers around each hub for each region and for a specific condition. B. Track changes over time to detect significant shifts in communities. Example: Condition: Diabetes; Healthcare Market: S. San Jose, CA; Hub Identified: Dr. Thomas; Community Detected: Diabetes – S. San Jose – Dr. Thomas.
  25. Step 3: What is the community size and impact (patients and providers) around these hubs? A. Compute the cost of care for initial diagnosis and follow-on treatment for each community. B. Compare with other communities with a similar patient population. C. Track changes over time to detect significant changes in cost of care. Example: Condition: Diabetes; Healthcare Market: S. San Jose, CA; Hub Identified: Dr. Thomas; Community Detected: Diabetes – S. San Jose – Dr. Thomas; Cost of care: initial diagnosis, follow-on care (medicine, tests, treatment).
  26. GSQL Graph Algorithm Library. ● Well-designed, commonly used graph algorithms written in GSQL ● TigerGraph's advantages: ○ Native MPP design means faster execution ○ GSQL was designed for analytic functions ○ Open-source and user-extensible ■ Users can see the GSQL, learn from it, and modify it to customize their algorithms. ■ In other platforms, the algorithms are embedded functions which cannot be viewed or modified. ■ On GitHub: https://github.com/tigergraph/ecosys/tree/master/graph_algorithms
  27. Summary. ● This is the Year of Graph! ● Enterprises are ready to adopt Operational Graph Analytics with Big Data. ● Graph DBs need to provide scalability, speed, and query support. ● There are many use cases. ● Graph algorithm libraries will help to deliver the query support.
  28. Thank You! Developer Portal: www.tigergraph.com/developers/ Free Developer Edition, now runs on Linux, Docker, and VM (for Mac, Windows): www.tigergraph.com/download Victor Lee, Director, Product Development, TigerGraph, victor@tigergraph.com
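
Slide 23 describes scoring hubs within each healthcare market using PageRank over referral relationships. The following is a minimal, plain-Python sketch of that idea using standard power iteration. It is not the GSQL implementation from the algorithm library; the provider names, the referral edges, and the damping and iteration settings are all hypothetical and chosen only for illustration.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Score the vertices of a directed graph given as (source, target) pairs."""
    nodes = sorted({vertex for edge in edges for vertex in edge})
    out_links = {vertex: [] for vertex in nodes}
    for source, target in edges:
        out_links[source].append(target)

    n = len(nodes)
    rank = {vertex: 1.0 / n for vertex in nodes}
    for _ in range(iterations):
        # Rank held by vertices without outgoing edges is spread evenly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {vertex: base for vertex in nodes}
        for vertex in nodes:
            if out_links[vertex]:
                share = damping * rank[vertex] / len(out_links[vertex])
                for target in out_links[vertex]:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical referral edges for one market and one condition:
# (referring provider, provider referred to).
referrals = [
    ("dr_lee", "dr_thomas"),
    ("dr_patel", "dr_thomas"),
    ("dr_kim", "dr_thomas"),
    ("dr_thomas", "dr_patel"),
    ("dr_kim", "dr_lee"),
]

scores = pagerank(referrals)
hub = max(scores, key=scores.get)
print(hub, round(scores[hub], 3))
```

With this toy data the provider who receives the most referrals comes out as the hub. In the workflow on the slide, the same scoring would be repeated per condition and per market on the claim subsets built in steps A and B.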
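
Slides 14 and 15 describe assessing the business impact of an at-risk component by following the chain from storage arrays through pools, LUNs, servers, applications, and services to business units. A rough Python sketch of that traversal, under stated assumptions: the component names and the `supports` dictionary below are hypothetical, each edge points from a component to whatever depends on it, and a real deployment would run this as a graph query over live data.

```python
from collections import deque

# Hypothetical dependency edges: component -> components that depend on it.
supports = {
    "array_1": ["pool_1", "pool_2"],
    "pool_1": ["lun_1", "lun_2"],
    "pool_2": ["lun_3"],
    "lun_1": ["server_1"],
    "lun_2": ["server_2"],
    "lun_3": ["server_3"],
    "server_1": ["app_billing"],
    "server_2": ["app_billing"],
    "server_3": ["app_reports"],
    "app_billing": ["svc_payments"],
    "app_reports": ["svc_analytics"],
    "svc_payments": ["bu_retail"],
    "svc_analytics": ["bu_finance"],
}


def impacted(component, prefix):
    """Return sorted names reachable from component whose name starts with prefix."""
    seen = {component}
    queue = deque([component])
    while queue:
        current = queue.popleft()
        for dependent in supports.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return sorted(name for name in seen if name.startswith(prefix))


# Which business units are exposed if one LUN is at risk?
print(impacted("lun_3", "bu_"))
# Which services are exposed if a whole pool is at risk?
print(impacted("pool_1", "svc_"))
```

The breadth-first search visits each component once, so shared dependencies (two servers running the same application) are not double counted.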
