SlideShare a Scribd company logo
Graph Databases
and Machine
Learning: Finding a
Happy Marriage
Victor Lee, November 12, 2018
© 2018 TigerGraph. All Rights Reserved
Know Your Speaker
● Mission: Unleash the power of interconnected
data for deeper insights and better outcomes
● Technology: Industry’s First and Only Native
Massively Parallel Processing(MPP) Graph Technology
● Product: The world’s fastest graph database used by
organizations including AliPay, Intuit, Uber, Visa, Zillow
2
Victor Lee, Director of Product Management
PhD in Graph Data Mining
MS Electrical Engineering, Stanford
BS EE/CS, UC Berkeley
© 2018 TigerGraph. All Rights Reserved
Two Hot Items: A Good Match?
New, exciting way to represent
information and to query it.
3
Graph Database Machine Learning
Established powerhouse for
predictions and "smart"
systems, but still tricky to use.
© 2018 TigerGraph. All Rights Reserved
Graph analysis is possibly the single most
effective competitive differentiator for
organizations pursuing data-driven operations
and decisions after the design of data capture.”
4
© 2018 TigerGraph. All Rights Reserved
What is Machine Learning?
● Branch of AI
● Making predictions, models, or
optimizations using incomplete
or inexact data
○ Model? Weather
○ Optimization?
Best driving route,
best investment portfolio
5
© 2018 TigerGraph. All Rights Reserved
Uses of Machine Learning
● Virtual Personal Assistants
○ Voice-to-text
○ Semantic analysis
○ Formulate a response
● Real-time route optimization
○ Waze, Google Maps
● Image Recognition
○ Facial recognition
○ X-ray / CT scan reading
● Effective spam filters
6
© 2018 TigerGraph. All Rights Reserved
Machine Learning Techniques
● Supervised Learning
Humans provide "training data" with known correct answers
○ Decision Trees
○ Nearest Neighbor
○ Hidden Markov Model,Naïve Bayesian
○ Linear Regression
○ Support Vector Machines (SVM)
○ Neural Networks
● Unsupervised Learning
No "correct" answer; detect patterns or summarize
○ Clustering, Partitioning, Outlier detection
○ Neural Networks
○ Dimensionality reduction
7
© 2018 TigerGraph. All Rights
Reserved
8
Graphs (Networks) are all around us
• Data Never Sleeps
• Internet of Things
• BIG Data
• Social Networks
Graph: collection of
nodes and links ("edges")
between nodes
Node
1
Node
2
Edge
© 2018 TigerGraph. All Rights
Reserved
9
Graph Database - The Natural Choice for
Interconnected Data
• Natural storage model
for interconnected data
• Natural for transactions:
transaction = edge
• Natural for computational
knowledge/inference/
learning –
"connect the dots"
User Product Tag
Order
Location
© 2018 TigerGraph. All Rights Reserved
Matchmaking: Consider Each Party
Graph Databases
● Stores data as nodes & edges.
● An edge (link) IS a data object.
● Capabilities of platforms and
query languages vary:
○ All are good a storing data
○ Wide range of analytical
abilities
10
Machine Learning (ML)
● Wants data as vector, array, or
tensor
● Computationally intensive; long
and time-consuming
● Requires good quality input
data:
Garbage in, garbage out
● Many methods to choose from
© 2018 TigerGraph. All Rights Reserved
ML Features and Modeling
● Most/All ML methods try to correlate features
(properties/attributes) with the target result.
● Quality of modeling depends on quality of features
○ Having the right features
○ Having the right distribution of values & results
● Don't know which features are needed!
11
Feat 1 Feat 2 Feat 3 Result
Item 1 X 0 0 X
Item 2 X X 0 0
Item 3 X 0 X ?
© 2018 TigerGraph. All Rights Reserved
ML Features and Modeling
● If you don't know the right features yet, then just try
to have LOTS of features.
● Big Data 3 V's:
○ Volume
○ Variety
○ Velocity
● Example:
○ Suppose family medical history is the best predictor of an
ailment.
○ If you don't know family medical history, you won't be able
to make good predictions
12
© 2018 TigerGraph. All Rights Reserved
3 Possible Arrangements for Graph DB + ML
1. Graph Data leaves home and moves to ML.
2. ML moves into the Graph Database.
3. A Graph Database and ML partnership.
13
© 2018 TigerGraph. All Rights Reserved
#1 - Graph Data leaves home and moves to ML
● Export relevant graph data
● Simple analogy to Relational Database:
○ Table(s) of node data
○ Table(s) of edge data
● Traditional approach
○ ML is fed data from DBs or just data files
● Business Value
○ Easy to deploy
○ Graph DB still valuable for non-ML queries and analytics
14
NodeID attr1 attr2 attr3
EID NID1 NID2 Eattr1 Eattr2 Eattr3
© 2018 TigerGraph. All Rights Reserved
Exporting Graph Data to ML: Concerns
● Is exported data really "flat" enough?
○ Edge records refers to Node records
○ ML algorithms usually want records to be independent
○ One answer is graph embedding
● Graph Embedding
"Mapping a graph's nodes and edges to points in space."
○ Example: drawing a graph on paper is mapping to 2D space.
○ General graphs can be drawn in 3D space → tensor
15
NodeID attr1 attr2 attr3
EID NID1 NID2 Eattr1 Eattr2 Eattr3
Graph Embedding
© 2018 TigerGraph. All Rights Reserved
#2 - ML moves into the Graph Database
● Implement ML algorithm as an advanced query
● Novel Approach:
in-graph analytics
● Business Value
○ Integration
○ Eliminate need for separate systems
16
© 2018 TigerGraph. All Rights Reserved
ML in Graph Database: Concerns
1. Computational power of Graph DB
2. Graph query language's algorithmic expressiveness
3. Expressing ML algorithms in Graph terms
17
Only 3rd Gen. Native Parallel Graphs satisfy 1 and 2.
© 2018 TigerGraph. All Rights
Reserved
18
Native Graph Storage
Parallel Loading
Parallel Multi-Hop
Analytics
Parallel Updates
(in real-time)
Scale up to Support
Query Volume
Privacy for Sensitive
Data
Graph 1.0
single server,
non-parallel
Graph 2.0
NoSQL base
for storage
Graph 3.0
Native, Parallel
Hours to load
terabytes
Sub-second
across 10+
hops
Mutable/
Transactional
2 billion+/day
in production
Runs out of
time/memory
after 2 hops
Scales for
simple queries
(1-2 hops)
Times out
after 2 hops
Days to load
terabytes
Evolution of Graph
Databases
Days to load
terabytes
MultiGraph
service
© 2018 TigerGraph. All Rights
Reserved
Graph Query Languages
● SparQL (RDF): Designed for semantic reasoning, but
not computationally intensive work
● Gremlin (Apache): Older scheme. Declarative. Very
awkward to write complex tasks.
● Cypher (Neo4j): Known by many. Okay for analytics
if you couple it with Java.
● GSQL (TigerGraph): Designed for big data analytics.
○ Built-in parallelism and accumulation.
○ Syntax is natural and familiar for algorithms.
19
© 2018 TigerGraph. All Rights Reserved
Example: Graph Algorithm Libraries
• Some Graph Databases provide libraries of standard
graph algorithms
• Measure/discover characteristics of a graph
• PageRank, Community Detection, Shortest Path, etc.
• GSQL is well-suited for algorithms
• Library is written in GSQL
• Users can see the code and modify as desired
• Cypher is less well-suited
• Library is pre-compiled function calls
• User can't see how the algorithms are written or modify them
20
© 2018 TigerGraph. All Rights
Reserved
Expressing ML Algorithms in Graph Terms
● Not widely known how to execute most ML
algorithms on a graph.
● Same problem as with exporting the Graph data:
○ ML is used to flat data
○ How to make edges a benefit instead of a hinderance?
● This marriage proposal needs further work.
21
© 2018 TigerGraph. All Rights
Reserved
#3 - A Graph Database and ML partnership
● Multi-Step process:
a. First, Graph DB runs queries to extract graph-based
features
b. Send graph features to ML system.
c. ML system, enriched by new graph features, learns a
predictive model.
d. Model can be applied back to graph for real-time
prediction.
● Seems more like a transactional partnership than a
marriage.
22
© 2018 TigerGraph. All Rights
Reserved
Example: China Mobile Fraud Detection
using Graph Features for ML
23
© 2018 TigerGraph. All Rights
Reserved
Generating New Training Data for ML to
Detect Phone-Based Scam
24
Phone 2 Features
Phone 1 Features
(1) High call back phone
(2) Stable group
(3) Long term phone
(4) Many in-group
connections
(5) 3-step friend relation
(1) Short term call
duration
(2) Empty stable group
(3) No call back phone
(4) Many rejected calls
(5) Avg. distance > 3
Training Data
Tens – Hundreds
of Billions of calls
118 per phone x 600
Million phones =
70 Billion new features
Machine
Learning
Solution
Graph with 600M phones and 15B call edges,1000s of calls/second.
Feed ML with new training data with 118 features per phone.
© 2018 TigerGraph. All Rights Reserved
Graph-enhanced workflow for
Anti-Money Laundering
25
© 2018 TigerGraph. All Rights Reserved
Summary
Graph Databases and Machine Learning
both represent powerful tools for getting
more value from data.
3 Proposals for Marrying Graph DBs + ML:
● Export Graph Data to ML system
○ Conventional, but not clear how to treat the data.
● Perform ML within Graph Database
○ Need fast, scalable analytical Graph DB. ML methods not
yet ported to Graph DB.
● Export Graph Features to ML system
○ Improves ML results; in practice now.
26
© 2018 TigerGraph. All Rights Reserved
Final Thoughts
● Room for diversity.
● Healthy marriages evolve over time.
If you have any questions please
contact info@tigergraph.com
27

More Related Content

What's hot

Deep Dive on Amazon Athena - AWS Online Tech Talks
Deep Dive on Amazon Athena - AWS Online Tech TalksDeep Dive on Amazon Athena - AWS Online Tech Talks
Deep Dive on Amazon Athena - AWS Online Tech Talks
Amazon Web Services
 
The perfect couple: Uniting Large Language Models and Knowledge Graphs for En...
The perfect couple: Uniting Large Language Models and Knowledge Graphs for En...The perfect couple: Uniting Large Language Models and Knowledge Graphs for En...
The perfect couple: Uniting Large Language Models and Knowledge Graphs for En...
Neo4j
 
Graph-Based Network Topology Analysis for Telecom Operators
Graph-Based Network Topology Analysis for Telecom OperatorsGraph-Based Network Topology Analysis for Telecom Operators
Graph-Based Network Topology Analysis for Telecom Operators
Neo4j
 
How to Choose The Right Database on AWS - Berlin Summit - 2019
How to Choose The Right Database on AWS - Berlin Summit - 2019How to Choose The Right Database on AWS - Berlin Summit - 2019
How to Choose The Right Database on AWS - Berlin Summit - 2019
Randall Hunt
 
Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Ne...
Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Ne...Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Ne...
Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Ne...
Neo4j
 
Introduction to Knowledge Graphs and Semantic AI
Introduction to Knowledge Graphs and Semantic AIIntroduction to Knowledge Graphs and Semantic AI
Introduction to Knowledge Graphs and Semantic AI
Semantic Web Company
 
Neptune, the Graph Database | AWS Floor28
Neptune, the Graph Database | AWS Floor28Neptune, the Graph Database | AWS Floor28
Neptune, the Graph Database | AWS Floor28
Amazon Web Services
 
Graph Gurus Episode 26: Using Graph Algorithms for Advanced Analytics Part 1
Graph Gurus Episode 26: Using Graph Algorithms for Advanced Analytics Part 1Graph Gurus Episode 26: Using Graph Algorithms for Advanced Analytics Part 1
Graph Gurus Episode 26: Using Graph Algorithms for Advanced Analytics Part 1
TigerGraph
 
Architecture Patterns for Multi-Region Active-Active Applications (ARC209-R2)...
Architecture Patterns for Multi-Region Active-Active Applications (ARC209-R2)...Architecture Patterns for Multi-Region Active-Active Applications (ARC209-R2)...
Architecture Patterns for Multi-Region Active-Active Applications (ARC209-R2)...
Amazon Web Services
 
MLOps and Reproducible ML on AWS with Kubeflow and SageMaker
MLOps and Reproducible ML on AWS with Kubeflow and SageMakerMLOps and Reproducible ML on AWS with Kubeflow and SageMaker
MLOps and Reproducible ML on AWS with Kubeflow and SageMaker
Provectus
 
Athena & Glue
Athena & GlueAthena & Glue
Athena & Glue
Amazon Web Services
 
Scaling into Billions of Nodes and Relationships with Neo4j Graph Data Science
Scaling into Billions of Nodes and Relationships with Neo4j Graph Data ScienceScaling into Billions of Nodes and Relationships with Neo4j Graph Data Science
Scaling into Billions of Nodes and Relationships with Neo4j Graph Data Science
Neo4j
 
Building a modern data stack to maintain an efficient and safe electrical grid
Building a modern data stack to maintain an efficient and safe electrical gridBuilding a modern data stack to maintain an efficient and safe electrical grid
Building a modern data stack to maintain an efficient and safe electrical grid
Neo4j
 
Predicting Flight Delays with Spark Machine Learning
Predicting Flight Delays with Spark Machine LearningPredicting Flight Delays with Spark Machine Learning
Predicting Flight Delays with Spark Machine Learning
Carol McDonald
 
Neo4j : Graphes de Connaissance, IA et LLMs
Neo4j : Graphes de Connaissance, IA et LLMsNeo4j : Graphes de Connaissance, IA et LLMs
Neo4j : Graphes de Connaissance, IA et LLMs
Neo4j
 
Effective Data Lakes: Challenges and Design Patterns (ANT316) - AWS re:Invent...
Effective Data Lakes: Challenges and Design Patterns (ANT316) - AWS re:Invent...Effective Data Lakes: Challenges and Design Patterns (ANT316) - AWS re:Invent...
Effective Data Lakes: Challenges and Design Patterns (ANT316) - AWS re:Invent...
Amazon Web Services
 
Tiger graph 2021 corporate overview [read only]
Tiger graph 2021 corporate overview [read only]Tiger graph 2021 corporate overview [read only]
Tiger graph 2021 corporate overview [read only]
ercan5
 
Introduction to Sagemaker
Introduction to SagemakerIntroduction to Sagemaker
Introduction to Sagemaker
Amazon Web Services
 
Banking Circle: Money Laundering Beware: A Modern Approach to AML with Machin...
Banking Circle: Money Laundering Beware: A Modern Approach to AML with Machin...Banking Circle: Money Laundering Beware: A Modern Approach to AML with Machin...
Banking Circle: Money Laundering Beware: A Modern Approach to AML with Machin...
Neo4j
 
Ml ops on AWS
Ml ops on AWSMl ops on AWS
Ml ops on AWS
PhilipBasford
 

What's hot (20)

Deep Dive on Amazon Athena - AWS Online Tech Talks
Deep Dive on Amazon Athena - AWS Online Tech TalksDeep Dive on Amazon Athena - AWS Online Tech Talks
Deep Dive on Amazon Athena - AWS Online Tech Talks
 
The perfect couple: Uniting Large Language Models and Knowledge Graphs for En...
The perfect couple: Uniting Large Language Models and Knowledge Graphs for En...The perfect couple: Uniting Large Language Models and Knowledge Graphs for En...
The perfect couple: Uniting Large Language Models and Knowledge Graphs for En...
 
Graph-Based Network Topology Analysis for Telecom Operators
Graph-Based Network Topology Analysis for Telecom OperatorsGraph-Based Network Topology Analysis for Telecom Operators
Graph-Based Network Topology Analysis for Telecom Operators
 
How to Choose The Right Database on AWS - Berlin Summit - 2019
How to Choose The Right Database on AWS - Berlin Summit - 2019How to Choose The Right Database on AWS - Berlin Summit - 2019
How to Choose The Right Database on AWS - Berlin Summit - 2019
 
Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Ne...
Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Ne...Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Ne...
Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Ne...
 
Introduction to Knowledge Graphs and Semantic AI
Introduction to Knowledge Graphs and Semantic AIIntroduction to Knowledge Graphs and Semantic AI
Introduction to Knowledge Graphs and Semantic AI
 
Neptune, the Graph Database | AWS Floor28
Neptune, the Graph Database | AWS Floor28Neptune, the Graph Database | AWS Floor28
Neptune, the Graph Database | AWS Floor28
 
Graph Gurus Episode 26: Using Graph Algorithms for Advanced Analytics Part 1
Graph Gurus Episode 26: Using Graph Algorithms for Advanced Analytics Part 1Graph Gurus Episode 26: Using Graph Algorithms for Advanced Analytics Part 1
Graph Gurus Episode 26: Using Graph Algorithms for Advanced Analytics Part 1
 
Architecture Patterns for Multi-Region Active-Active Applications (ARC209-R2)...
Architecture Patterns for Multi-Region Active-Active Applications (ARC209-R2)...Architecture Patterns for Multi-Region Active-Active Applications (ARC209-R2)...
Architecture Patterns for Multi-Region Active-Active Applications (ARC209-R2)...
 
MLOps and Reproducible ML on AWS with Kubeflow and SageMaker
MLOps and Reproducible ML on AWS with Kubeflow and SageMakerMLOps and Reproducible ML on AWS with Kubeflow and SageMaker
MLOps and Reproducible ML on AWS with Kubeflow and SageMaker
 
Athena & Glue
Athena & GlueAthena & Glue
Athena & Glue
 
Scaling into Billions of Nodes and Relationships with Neo4j Graph Data Science
Scaling into Billions of Nodes and Relationships with Neo4j Graph Data ScienceScaling into Billions of Nodes and Relationships with Neo4j Graph Data Science
Scaling into Billions of Nodes and Relationships with Neo4j Graph Data Science
 
Building a modern data stack to maintain an efficient and safe electrical grid
Building a modern data stack to maintain an efficient and safe electrical gridBuilding a modern data stack to maintain an efficient and safe electrical grid
Building a modern data stack to maintain an efficient and safe electrical grid
 
Predicting Flight Delays with Spark Machine Learning
Predicting Flight Delays with Spark Machine LearningPredicting Flight Delays with Spark Machine Learning
Predicting Flight Delays with Spark Machine Learning
 
Neo4j : Graphes de Connaissance, IA et LLMs
Neo4j : Graphes de Connaissance, IA et LLMsNeo4j : Graphes de Connaissance, IA et LLMs
Neo4j : Graphes de Connaissance, IA et LLMs
 
Effective Data Lakes: Challenges and Design Patterns (ANT316) - AWS re:Invent...
Effective Data Lakes: Challenges and Design Patterns (ANT316) - AWS re:Invent...Effective Data Lakes: Challenges and Design Patterns (ANT316) - AWS re:Invent...
Effective Data Lakes: Challenges and Design Patterns (ANT316) - AWS re:Invent...
 
Tiger graph 2021 corporate overview [read only]
Tiger graph 2021 corporate overview [read only]Tiger graph 2021 corporate overview [read only]
Tiger graph 2021 corporate overview [read only]
 
Introduction to Sagemaker
Introduction to SagemakerIntroduction to Sagemaker
Introduction to Sagemaker
 
Banking Circle: Money Laundering Beware: A Modern Approach to AML with Machin...
Banking Circle: Money Laundering Beware: A Modern Approach to AML with Machin...Banking Circle: Money Laundering Beware: A Modern Approach to AML with Machin...
Banking Circle: Money Laundering Beware: A Modern Approach to AML with Machin...
 
Ml ops on AWS
Ml ops on AWSMl ops on AWS
Ml ops on AWS
 

Similar to Graph Databases and Machine Learning | November 2018

Scaling up business value with real-time operational graph analytics
Scaling up business value with real-time operational graph analyticsScaling up business value with real-time operational graph analytics
Scaling up business value with real-time operational graph analytics
Connected Data World
 
Graph Gurus Episode 8: Location, Location, Location - Geospatial Analysis wit...
Graph Gurus Episode 8: Location, Location, Location - Geospatial Analysis wit...Graph Gurus Episode 8: Location, Location, Location - Geospatial Analysis wit...
Graph Gurus Episode 8: Location, Location, Location - Geospatial Analysis wit...
TigerGraph
 
Big Data LDN 2018: DATA OPERATIONS PROBLEMS CREATED BY DEEP LEARNING, AND HOW...
Big Data LDN 2018: DATA OPERATIONS PROBLEMS CREATED BY DEEP LEARNING, AND HOW...Big Data LDN 2018: DATA OPERATIONS PROBLEMS CREATED BY DEEP LEARNING, AND HOW...
Big Data LDN 2018: DATA OPERATIONS PROBLEMS CREATED BY DEEP LEARNING, AND HOW...
Matt Stubbs
 
Big Stream Processing Systems, Big Graphs
Big Stream Processing Systems, Big GraphsBig Stream Processing Systems, Big Graphs
Big Stream Processing Systems, Big Graphs
Petr Novotný
 
Graph Gurus Episode 37: Modeling for Kaggle COVID-19 Dataset
Graph Gurus Episode 37: Modeling for Kaggle COVID-19 DatasetGraph Gurus Episode 37: Modeling for Kaggle COVID-19 Dataset
Graph Gurus Episode 37: Modeling for Kaggle COVID-19 Dataset
TigerGraph
 
Designing data pipelines for analytics and machine learning in industrial set...
Designing data pipelines for analytics and machine learning in industrial set...Designing data pipelines for analytics and machine learning in industrial set...
Designing data pipelines for analytics and machine learning in industrial set...
DataWorks Summit
 
Graph Data Science at Scale
Graph Data Science at ScaleGraph Data Science at Scale
Graph Data Science at Scale
Neo4j
 
Graph Gurus Episode 1: Enterprise Graph
Graph Gurus Episode 1: Enterprise GraphGraph Gurus Episode 1: Enterprise Graph
Graph Gurus Episode 1: Enterprise Graph
TigerGraph
 
DataOps: An Agile Method for Data-Driven Organizations
DataOps: An Agile Method for Data-Driven OrganizationsDataOps: An Agile Method for Data-Driven Organizations
DataOps: An Agile Method for Data-Driven Organizations
Ellen Friedman
 
Using Connected Data and Graph Technology to Enhance Machine Learning and Art...
Using Connected Data and Graph Technology to Enhance Machine Learning and Art...Using Connected Data and Graph Technology to Enhance Machine Learning and Art...
Using Connected Data and Graph Technology to Enhance Machine Learning and Art...
Neo4j
 
Graph Gurus Episode 5: Webinar PageRank
Graph Gurus Episode 5: Webinar PageRankGraph Gurus Episode 5: Webinar PageRank
Graph Gurus Episode 5: Webinar PageRank
TigerGraph
 
Graph Gurus 21: Integrating Real-Time Deep-Link Graph Analytics with Spark AI
Graph Gurus 21: Integrating Real-Time Deep-Link Graph Analytics with Spark AIGraph Gurus 21: Integrating Real-Time Deep-Link Graph Analytics with Spark AI
Graph Gurus 21: Integrating Real-Time Deep-Link Graph Analytics with Spark AI
TigerGraph
 
The Neo4j Data Platform for Today & Tomorrow.pdf
The Neo4j Data Platform for Today & Tomorrow.pdfThe Neo4j Data Platform for Today & Tomorrow.pdf
The Neo4j Data Platform for Today & Tomorrow.pdf
Neo4j
 
Lambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big dataLambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big data
Trieu Nguyen
 
Big data analysis
Big data analysisBig data analysis
Big data analysis
SAishwaryaDinesh
 
Cheryl Wiebe - Advanced Analytics in the Industrial World
Cheryl Wiebe - Advanced Analytics in the Industrial WorldCheryl Wiebe - Advanced Analytics in the Industrial World
Cheryl Wiebe - Advanced Analytics in the Industrial World
Rehgan Avon
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender Systems
Nick Pentreath
 
Scaling graph investigations with Math, GPUs, & Experts
Scaling graph investigations with Math, GPUs, & ExpertsScaling graph investigations with Math, GPUs, & Experts
Scaling graph investigations with Math, GPUs, & Experts
graphistry
 
Make your data fly - Building data platform in AWS
Make your data fly - Building data platform in AWSMake your data fly - Building data platform in AWS
Make your data fly - Building data platform in AWS
Kimmo Kantojärvi
 
Real-Time Fraud Detection at Scale—Integrating Real-Time Deep-Link Graph Anal...
Real-Time Fraud Detection at Scale—Integrating Real-Time Deep-Link Graph Anal...Real-Time Fraud Detection at Scale—Integrating Real-Time Deep-Link Graph Anal...
Real-Time Fraud Detection at Scale—Integrating Real-Time Deep-Link Graph Anal...
Databricks
 

Similar to Graph Databases and Machine Learning | November 2018 (20)

Scaling up business value with real-time operational graph analytics
Scaling up business value with real-time operational graph analyticsScaling up business value with real-time operational graph analytics
Scaling up business value with real-time operational graph analytics
 
Graph Gurus Episode 8: Location, Location, Location - Geospatial Analysis wit...
Graph Gurus Episode 8: Location, Location, Location - Geospatial Analysis wit...Graph Gurus Episode 8: Location, Location, Location - Geospatial Analysis wit...
Graph Gurus Episode 8: Location, Location, Location - Geospatial Analysis wit...
 
Big Data LDN 2018: DATA OPERATIONS PROBLEMS CREATED BY DEEP LEARNING, AND HOW...
Big Data LDN 2018: DATA OPERATIONS PROBLEMS CREATED BY DEEP LEARNING, AND HOW...Big Data LDN 2018: DATA OPERATIONS PROBLEMS CREATED BY DEEP LEARNING, AND HOW...
Big Data LDN 2018: DATA OPERATIONS PROBLEMS CREATED BY DEEP LEARNING, AND HOW...
 
Big Stream Processing Systems, Big Graphs
Big Stream Processing Systems, Big GraphsBig Stream Processing Systems, Big Graphs
Big Stream Processing Systems, Big Graphs
 
Graph Gurus Episode 37: Modeling for Kaggle COVID-19 Dataset
Graph Gurus Episode 37: Modeling for Kaggle COVID-19 DatasetGraph Gurus Episode 37: Modeling for Kaggle COVID-19 Dataset
Graph Gurus Episode 37: Modeling for Kaggle COVID-19 Dataset
 
Designing data pipelines for analytics and machine learning in industrial set...
Designing data pipelines for analytics and machine learning in industrial set...Designing data pipelines for analytics and machine learning in industrial set...
Designing data pipelines for analytics and machine learning in industrial set...
 
Graph Data Science at Scale
Graph Data Science at ScaleGraph Data Science at Scale
Graph Data Science at Scale
 
Graph Gurus Episode 1: Enterprise Graph
Graph Gurus Episode 1: Enterprise GraphGraph Gurus Episode 1: Enterprise Graph
Graph Gurus Episode 1: Enterprise Graph
 
DataOps: An Agile Method for Data-Driven Organizations
DataOps: An Agile Method for Data-Driven OrganizationsDataOps: An Agile Method for Data-Driven Organizations
DataOps: An Agile Method for Data-Driven Organizations
 
Using Connected Data and Graph Technology to Enhance Machine Learning and Art...
Using Connected Data and Graph Technology to Enhance Machine Learning and Art...Using Connected Data and Graph Technology to Enhance Machine Learning and Art...
Using Connected Data and Graph Technology to Enhance Machine Learning and Art...
 
Graph Gurus Episode 5: Webinar PageRank
Graph Gurus Episode 5: Webinar PageRankGraph Gurus Episode 5: Webinar PageRank
Graph Gurus Episode 5: Webinar PageRank
 
Graph Gurus 21: Integrating Real-Time Deep-Link Graph Analytics with Spark AI
Graph Gurus 21: Integrating Real-Time Deep-Link Graph Analytics with Spark AIGraph Gurus 21: Integrating Real-Time Deep-Link Graph Analytics with Spark AI
Graph Gurus 21: Integrating Real-Time Deep-Link Graph Analytics with Spark AI
 
The Neo4j Data Platform for Today & Tomorrow.pdf
The Neo4j Data Platform for Today & Tomorrow.pdfThe Neo4j Data Platform for Today & Tomorrow.pdf
The Neo4j Data Platform for Today & Tomorrow.pdf
 
Lambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big dataLambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big data
 
Big data analysis
Big data analysisBig data analysis
Big data analysis
 
Cheryl Wiebe - Advanced Analytics in the Industrial World
Cheryl Wiebe - Advanced Analytics in the Industrial WorldCheryl Wiebe - Advanced Analytics in the Industrial World
Cheryl Wiebe - Advanced Analytics in the Industrial World
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender Systems
 
Scaling graph investigations with Math, GPUs, & Experts
Scaling graph investigations with Math, GPUs, & ExpertsScaling graph investigations with Math, GPUs, & Experts
Scaling graph investigations with Math, GPUs, & Experts
 
Make your data fly - Building data platform in AWS
Make your data fly - Building data platform in AWSMake your data fly - Building data platform in AWS
Make your data fly - Building data platform in AWS
 
Real-Time Fraud Detection at Scale—Integrating Real-Time Deep-Link Graph Anal...
Real-Time Fraud Detection at Scale—Integrating Real-Time Deep-Link Graph Anal...Real-Time Fraud Detection at Scale—Integrating Real-Time Deep-Link Graph Anal...
Real-Time Fraud Detection at Scale—Integrating Real-Time Deep-Link Graph Anal...
 

More from TigerGraph

MAXIMIZING THE VALUE OF SCIENTIFIC INFORMATION TO ACCELERATE INNOVATION
MAXIMIZING THE VALUE OF SCIENTIFIC INFORMATION TO ACCELERATE INNOVATIONMAXIMIZING THE VALUE OF SCIENTIFIC INFORMATION TO ACCELERATE INNOVATION
MAXIMIZING THE VALUE OF SCIENTIFIC INFORMATION TO ACCELERATE INNOVATION
TigerGraph
 
Better Together: How Graph database enables easy data integration with Spark ...
Better Together: How Graph database enables easy data integration with Spark ...Better Together: How Graph database enables easy data integration with Spark ...
Better Together: How Graph database enables easy data integration with Spark ...
TigerGraph
 
Building an accurate understanding of consumers based on real-world signals
Building an accurate understanding of consumers based on real-world signalsBuilding an accurate understanding of consumers based on real-world signals
Building an accurate understanding of consumers based on real-world signals
TigerGraph
 
Care Intervention Assistant - Omaha Clinical Data Information System
Care Intervention Assistant - Omaha Clinical Data Information SystemCare Intervention Assistant - Omaha Clinical Data Information System
Care Intervention Assistant - Omaha Clinical Data Information System
TigerGraph
 
Correspondent Banking Networks
Correspondent Banking NetworksCorrespondent Banking Networks
Correspondent Banking Networks
TigerGraph
 
Delivering Large Scale Real-time Graph Analytics with Dell Infrastructure and...
Delivering Large Scale Real-time Graph Analytics with Dell Infrastructure and...Delivering Large Scale Real-time Graph Analytics with Dell Infrastructure and...
Delivering Large Scale Real-time Graph Analytics with Dell Infrastructure and...
TigerGraph
 
Deploying an End-to-End TigerGraph Enterprise Architecture using Kafka, Maria...
Deploying an End-to-End TigerGraph Enterprise Architecture using Kafka, Maria...Deploying an End-to-End TigerGraph Enterprise Architecture using Kafka, Maria...
Deploying an End-to-End TigerGraph Enterprise Architecture using Kafka, Maria...
TigerGraph
 
Fraud Detection and Compliance with Graph Learning
Fraud Detection and Compliance with Graph LearningFraud Detection and Compliance with Graph Learning
Fraud Detection and Compliance with Graph Learning
TigerGraph
 
Fraudulent credit card cash-out detection On Graphs
Fraudulent credit card cash-out detection On GraphsFraudulent credit card cash-out detection On Graphs
Fraudulent credit card cash-out detection On Graphs
TigerGraph
 
FROM DATAFRAMES TO GRAPH Data Science with pyTigerGraph
FROM DATAFRAMES TO GRAPH Data Science with pyTigerGraphFROM DATAFRAMES TO GRAPH Data Science with pyTigerGraph
FROM DATAFRAMES TO GRAPH Data Science with pyTigerGraph
TigerGraph
 
Customer Experience Management
Customer Experience ManagementCustomer Experience Management
Customer Experience Management
TigerGraph
 
Graph+AI for Fin. Services
Graph+AI for Fin. ServicesGraph+AI for Fin. Services
Graph+AI for Fin. Services
TigerGraph
 
Davraz - A graph visualization and exploration software.
Davraz - A graph visualization and exploration software.Davraz - A graph visualization and exploration software.
Davraz - A graph visualization and exploration software.
TigerGraph
 
Plume - A Code Property Graph Extraction and Analysis Library
Plume - A Code Property Graph Extraction and Analysis LibraryPlume - A Code Property Graph Extraction and Analysis Library
Plume - A Code Property Graph Extraction and Analysis Library
TigerGraph
 
TigerGraph.js
TigerGraph.jsTigerGraph.js
TigerGraph.js
TigerGraph
 
GRAPHS FOR THE FUTURE ENERGY SYSTEMS
GRAPHS FOR THE FUTURE ENERGY SYSTEMSGRAPHS FOR THE FUTURE ENERGY SYSTEMS
GRAPHS FOR THE FUTURE ENERGY SYSTEMS
TigerGraph
 
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...
TigerGraph
 
How to Build An AI Based Customer Data Platform: Learn the design patterns fo...
How to Build An AI Based Customer Data Platform: Learn the design patterns fo...How to Build An AI Based Customer Data Platform: Learn the design patterns fo...
How to Build An AI Based Customer Data Platform: Learn the design patterns fo...
TigerGraph
 
Machine Learning Feature Design with TigerGraph 3.0 No-Code GUI
Machine Learning Feature Design with TigerGraph 3.0 No-Code GUIMachine Learning Feature Design with TigerGraph 3.0 No-Code GUI
Machine Learning Feature Design with TigerGraph 3.0 No-Code GUI
TigerGraph
 
Recommendation Engine with In-Database Machine Learning
Recommendation Engine with In-Database Machine LearningRecommendation Engine with In-Database Machine Learning
Recommendation Engine with In-Database Machine Learning
TigerGraph
 

More from TigerGraph (20)

MAXIMIZING THE VALUE OF SCIENTIFIC INFORMATION TO ACCELERATE INNOVATION
MAXIMIZING THE VALUE OF SCIENTIFIC INFORMATION TO ACCELERATE INNOVATIONMAXIMIZING THE VALUE OF SCIENTIFIC INFORMATION TO ACCELERATE INNOVATION
MAXIMIZING THE VALUE OF SCIENTIFIC INFORMATION TO ACCELERATE INNOVATION
 
Better Together: How Graph database enables easy data integration with Spark ...
Better Together: How Graph database enables easy data integration with Spark ...Better Together: How Graph database enables easy data integration with Spark ...
Better Together: How Graph database enables easy data integration with Spark ...
 
Building an accurate understanding of consumers based on real-world signals
Building an accurate understanding of consumers based on real-world signalsBuilding an accurate understanding of consumers based on real-world signals
Building an accurate understanding of consumers based on real-world signals
 
Care Intervention Assistant - Omaha Clinical Data Information System
Care Intervention Assistant - Omaha Clinical Data Information SystemCare Intervention Assistant - Omaha Clinical Data Information System
Care Intervention Assistant - Omaha Clinical Data Information System
 
Correspondent Banking Networks
Correspondent Banking NetworksCorrespondent Banking Networks
Correspondent Banking Networks
 
Delivering Large Scale Real-time Graph Analytics with Dell Infrastructure and...
Delivering Large Scale Real-time Graph Analytics with Dell Infrastructure and...Delivering Large Scale Real-time Graph Analytics with Dell Infrastructure and...
Delivering Large Scale Real-time Graph Analytics with Dell Infrastructure and...
 
Deploying an End-to-End TigerGraph Enterprise Architecture using Kafka, Maria...
Deploying an End-to-End TigerGraph Enterprise Architecture using Kafka, Maria...Deploying an End-to-End TigerGraph Enterprise Architecture using Kafka, Maria...
Deploying an End-to-End TigerGraph Enterprise Architecture using Kafka, Maria...
 
Fraud Detection and Compliance with Graph Learning
Fraud Detection and Compliance with Graph LearningFraud Detection and Compliance with Graph Learning
Fraud Detection and Compliance with Graph Learning
 
Fraudulent credit card cash-out detection On Graphs
Fraudulent credit card cash-out detection On GraphsFraudulent credit card cash-out detection On Graphs
Fraudulent credit card cash-out detection On Graphs
 
FROM DATAFRAMES TO GRAPH Data Science with pyTigerGraph
FROM DATAFRAMES TO GRAPH Data Science with pyTigerGraphFROM DATAFRAMES TO GRAPH Data Science with pyTigerGraph
FROM DATAFRAMES TO GRAPH Data Science with pyTigerGraph
 
Customer Experience Management
Customer Experience ManagementCustomer Experience Management
Customer Experience Management
 
Graph+AI for Fin. Services
Graph+AI for Fin. ServicesGraph+AI for Fin. Services
Graph+AI for Fin. Services
 
Davraz - A graph visualization and exploration software.
Davraz - A graph visualization and exploration software.Davraz - A graph visualization and exploration software.
Davraz - A graph visualization and exploration software.
 
Plume - A Code Property Graph Extraction and Analysis Library
Plume - A Code Property Graph Extraction and Analysis LibraryPlume - A Code Property Graph Extraction and Analysis Library
Plume - A Code Property Graph Extraction and Analysis Library
 
TigerGraph.js
TigerGraph.jsTigerGraph.js
TigerGraph.js
 
GRAPHS FOR THE FUTURE ENERGY SYSTEMS
GRAPHS FOR THE FUTURE ENERGY SYSTEMSGRAPHS FOR THE FUTURE ENERGY SYSTEMS
GRAPHS FOR THE FUTURE ENERGY SYSTEMS
 
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...
 
How to Build An AI Based Customer Data Platform: Learn the design patterns fo...
How to Build An AI Based Customer Data Platform: Learn the design patterns fo...How to Build An AI Based Customer Data Platform: Learn the design patterns fo...
How to Build An AI Based Customer Data Platform: Learn the design patterns fo...
 
Machine Learning Feature Design with TigerGraph 3.0 No-Code GUI
Machine Learning Feature Design with TigerGraph 3.0 No-Code GUIMachine Learning Feature Design with TigerGraph 3.0 No-Code GUI
Machine Learning Feature Design with TigerGraph 3.0 No-Code GUI
 
Recommendation Engine with In-Database Machine Learning
Recommendation Engine with In-Database Machine LearningRecommendation Engine with In-Database Machine Learning
Recommendation Engine with In-Database Machine Learning
 

Recently uploaded

Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
Globus
 
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Globus
 
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxTop Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
rickgrimesss22
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
Matt Welsh
 
GlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote sessionGlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote session
Globus
 
Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
Max Andersen
 
Corporate Management | Session 3 of 3 | Tendenci AMS
Corporate Management | Session 3 of 3 | Tendenci AMSCorporate Management | Session 3 of 3 | Tendenci AMS
Corporate Management | Session 3 of 3 | Tendenci AMS
Tendenci - The Open Source AMS (Association Management Software)
 
A Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of PassageA Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of Passage
Philip Schwarz
 
2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx
Georgi Kodinov
 
Cyaniclab : Software Development Agency Portfolio.pdf
Cyaniclab : Software Development Agency Portfolio.pdfCyaniclab : Software Development Agency Portfolio.pdf
Cyaniclab : Software Development Agency Portfolio.pdf
Cyanic lab
 
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing SuiteAI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
Google
 
First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
Globus
 
Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"
Donna Lenk
 
Graphic Design Crash Course for beginners
Graphic Design Crash Course for beginnersGraphic Design Crash Course for beginners
Graphic Design Crash Course for beginners
e20449
 
Understanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSageUnderstanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSage
Globus
 
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Shahin Sheidaei
 
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Globus
 
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar
 
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, BetterWebinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
XfilesPro
 
Enterprise Resource Planning System in Telangana
Enterprise Resource Planning System in TelanganaEnterprise Resource Planning System in Telangana
Enterprise Resource Planning System in Telangana
NYGGS Automation Suite
 

Recently uploaded (20)

Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
 
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
 
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxTop Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
 
GlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote sessionGlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote session
 
Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
 
Corporate Management | Session 3 of 3 | Tendenci AMS
Corporate Management | Session 3 of 3 | Tendenci AMSCorporate Management | Session 3 of 3 | Tendenci AMS
Corporate Management | Session 3 of 3 | Tendenci AMS
 
A Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of PassageA Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of Passage
 
2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx
 
Cyaniclab : Software Development Agency Portfolio.pdf
Cyaniclab : Software Development Agency Portfolio.pdfCyaniclab : Software Development Agency Portfolio.pdf
Cyaniclab : Software Development Agency Portfolio.pdf
 
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing SuiteAI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
 
First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
 
Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"
 
Graphic Design Crash Course for beginners
Graphic Design Crash Course for beginnersGraphic Design Crash Course for beginners
Graphic Design Crash Course for beginners
 
Understanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSageUnderstanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSage
 
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
 
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...
 
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBroker
 
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, BetterWebinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
 
Enterprise Resource Planning System in Telangana
Enterprise Resource Planning System in TelanganaEnterprise Resource Planning System in Telangana
Enterprise Resource Planning System in Telangana
 

Graph Databases and Machine Learning | November 2018

  • 1. Graph Databases and Machine Learning: Finding a Happy Marriage Victor Lee, November 12, 2018
  • 2. © 2018 TigerGraph. All Rights Reserved Know Your Speaker ● Mission: Unleash the power of interconnected data for deeper insights and better outcomes ● Technology: Industry’s First and Only Native Massively Parallel Processing(MPP) Graph Technology ● Product: The world’s fastest graph database used by organizations including AliPay, Intuit, Uber, Visa, Zillow 2 Victor Lee, Director of Product Management PhD in Graph Data Mining MS Electrical Engineering, Stanford BS EE/CS, UC Berkeley
  • 3. © 2018 TigerGraph. All Rights Reserved Two Hot Items: A Good Match? New, exciting way to represent information and to query it. 3 Graph Database Machine Learning Established powerhouse for predictions and "smart" systems, but still tricky to use.
  • 4. © 2018 TigerGraph. All Rights Reserved Graph analysis is possibly the single most effective competitive differentiator for organizations pursuing data-driven operations and decisions after the design of data capture.” 4
  • 5. © 2018 TigerGraph. All Rights Reserved What is Machine Learning? ● Branch of AI ● Making predictions, models, or optimizations using incomplete or inexact data ○ Model? Weather ○ Optimization? Best driving route, best investment portfolio 5
  • 6. © 2018 TigerGraph. All Rights Reserved Uses of Machine Learning ● Virtual Personal Assistants ○ Voice-to-text ○ Semantic analysis ○ Formulate a response ● Real-time route optimization ○ Waze, Google Maps ● Image Recognition ○ Facial recognition ○ X-ray / CT scan reading ● Effective spam filters 6
  • 7. © 2018 TigerGraph. All Rights Reserved Machine Learning Techniques ● Supervised Learning Humans provide "training data" with known correct answers ○ Decision Trees ○ Nearest Neighbor ○ Hidden Markov Model,Naïve Bayesian ○ Linear Regression ○ Support Vector Machines (SVM) ○ Neural Networks ● Unsupervised Learning No "correct" answer; detect patterns or summarize ○ Clustering, Partitioning, Outlier detection ○ Neural Networks ○ Dimensionality reduction 7
  • 8. © 2018 TigerGraph. All Rights Reserved 8 Graphs (Networks) are all around us • Data Never Sleeps • Internet of Things • BIG Data • Social Networks Graph: collection of nodes and links ("edges") between nodes Node 1 Node 2 Edge
  • 9. © 2018 TigerGraph. All Rights Reserved 9 Graph Database - The Natural Choice for Interconnected Data • Natural storage model for interconnected data • Natural for transactions: transaction = edge • Natural for computational knowledge/inference/ learning – "connect the dots" User Product Tag Order Location
  • 10. © 2018 TigerGraph. All Rights Reserved Matchmaking: Consider Each Party Graph Databases ● Stores data as nodes & edges. ● An edge (link) IS a data object. ● Capabilities of platforms and query languages vary: ○ All are good a storing data ○ Wide range of analytical abilities 10 Machine Learning (ML) ● Wants data as vector, array, or tensor ● Computationally intensive; long and time-consuming ● Requires good quality input data: Garbage in, garbage out ● Many methods to choose from
  • 11. © 2018 TigerGraph. All Rights Reserved ML Features and Modeling ● Most/All ML methods try to correlate features (properties/attributes) with the target result. ● Quality of modeling depends on quality of features ○ Having the right features ○ Having the right distribution of values & results ● Don't know which features are needed! 11 Feat 1 Feat 2 Feat 3 Result Item 1 X 0 0 X Item 2 X X 0 0 Item 3 X 0 X ?
  • 12. © 2018 TigerGraph. All Rights Reserved ML Features and Modeling ● If you don't know the right features yet, then just try to have LOTS of features. ● Big Data 3 V's: ○ Volume ○ Variety ○ Velocity ● Example: ○ Suppose family medical history is the best predictor of an ailment. ○ If you don't know family medical history, you won't be able to make good predictions 12
  • 13. © 2018 TigerGraph. All Rights Reserved 3 Possible Arrangements for Graph DB + ML 1. Graph Data leaves home and moves to ML. 2. ML moves into the Graph Database. 3. A Graph Database and ML partnership. 13
  • 14. © 2018 TigerGraph. All Rights Reserved #1 - Graph Data leaves home and moves to ML ● Export relevant graph data ● Simple analogy to Relational Database: ○ Table(s) of node data ○ Table(s) of edge data ● Traditional approach ○ ML is fed data from DBs or just data files ● Business Value ○ Easy to deploy ○ Graph DB still valuable for non-ML queries and analytics 14 NodeID attr1 attr2 attr3 EID NID1 NID2 Eattr1 Eattr2 Eattr3
  • 15. © 2018 TigerGraph. All Rights Reserved Exporting Graph Data to ML: Concerns ● Is exported data really "flat" enough? ○ Edge records refers to Node records ○ ML algorithms usually want records to be independent ○ One answer is graph embedding ● Graph Embedding "Mapping a graph's nodes and edges to points in space." ○ Example: drawing a graph on paper is mapping to 2D space. ○ General graphs can be drawn in 3D space → tensor 15 NodeID attr1 attr2 attr3 EID NID1 NID2 Eattr1 Eattr2 Eattr3 Graph Embedding
  • 16. © 2018 TigerGraph. All Rights Reserved #2 - ML moves into the Graph Database ● Implement ML algorithm as an advanced query ● Novel Approach: in-graph analytics ● Business Value ○ Integration ○ Eliminate need for separate systems 16
  • 17. © 2018 TigerGraph. All Rights Reserved ML in Graph Database: Concerns 1. Computational power of Graph DB 2. Graph query language's algorithmic expressiveness 3. Expressing ML algorithms in Graph terms 17 Only 3rd Gen. Native Parallel Graphs satisfy 1 and 2.
  • 18. © 2018 TigerGraph. All Rights Reserved 18 Native Graph Storage Parallel Loading Parallel Multi-Hop Analytics Parallel Updates (in real-time) Scale up to Support Query Volume Privacy for Sensitive Data Graph 1.0 single server, non-parallel Graph 2.0 NoSQL base for storage Graph 3.0 Native, Parallel Hours to load terabytes Sub-second across 10+ hops Mutable/ Transactional 2 billion+/day in production Runs out of time/memory after 2 hops Scales for simple queries (1-2 hops) Times out after 2 hops Days to load terabytes Evolution of Graph Databases Days to load terabytes MultiGraph service
  • 19. © 2018 TigerGraph. All Rights Reserved Graph Query Languages ● SparQL (RDF): Designed for semantic reasoning, but not computationally intensive work ● Gremlin (Apache): Older scheme. Declarative. Very awkward to write complex tasks. ● Cypher (Neo4j): Known by many. Okay for analytics if you couple it with Java. ● GSQL (TigerGraph): Designed for big data analytics. ○ Built-in parallelism and accumulation. ○ Syntax is natural and familiar for algorithms. 19
  • 20. © 2018 TigerGraph. All Rights Reserved Example: Graph Algorithm Libraries • Some Graph Databases provide libraries of standard graph algorithms • Measure/discover characteristics of a graph • PageRank, Community Detection, Shortest Path, etc. • GSQL is well-suited for algorithms • Library is written in GSQL • Users can see the code and modify as desired • Cypher is less well-suited • Library is pre-compiled function calls • User can't see how the algorithms are written or modify them 20
  • 21. © 2018 TigerGraph. All Rights Reserved Expressing ML Algorithms in Graph Terms ● Not widely known how to execute most ML algorithms on a graph. ● Same problem as with exporting the Graph data: ○ ML is used to flat data ○ How to make edges a benefit instead of a hinderance? ● This marriage proposal needs further work. 21
  • 22. © 2018 TigerGraph. All Rights Reserved #3 - A Graph Database and ML partnership ● Multi-Step process: a. First, Graph DB runs queries to extract graph-based features b. Send graph features to ML system. c. ML system, enriched by new graph features, learns a predictive model. d. Model can be applied back to graph for real-time prediction. ● Seems more like a transactional partnership than a marriage. 22
  • 23. © 2018 TigerGraph. All Rights Reserved Example: China Mobile Fraud Detection using Graph Features for ML 23
  • 24. © 2018 TigerGraph. All Rights Reserved Generating New Training Data for ML to Detect Phone-Based Scam 24 Phone 2 Features Phone 1 Features (1) High call back phone (2) Stable group (3) Long term phone (4) Many in-group connections (5) 3-step friend relation (1) Short term call duration (2) Empty stable group (3) No call back phone (4) Many rejected calls (5) Avg. distance > 3 Training Data Tens – Hundreds of Billions of calls 118 per phone x 600 Million phones = 70 Billion new features Machine Learning Solution Graph with 600M phones and 15B call edges,1000s of calls/second. Feed ML with new training data with 118 features per phone.
  • 25. © 2018 TigerGraph. All Rights Reserved Graph-enhanced workflow for Anti-Money Laundering 25
  • 26. © 2018 TigerGraph. All Rights Reserved Summary Graph Databases and Machine Learning both represent powerful tools for getting more value from data. 3 Proposals for Marrying Graph DBs + ML: ● Export Graph Data to ML system ○ Conventional, but not clear how to treat the data. ● Perform ML within Graph Database ○ Need fast, scalable analytical Graph DB. ML methods not yet ported to Graph DB. ● Export Graph Features to ML system ○ Improves ML results; in practice now. 26
  • 27. © 2018 TigerGraph. All Rights Reserved Final Thoughts ● Room for diversity. ● Healthy marriages evolve over time. If you have any questions please contact info@tigergraph.com 27