Graph enhancements to Artificial Intelligence and Machine Learning are changing the landscape of intelligent applications. Beyond improving accuracy and modeling speed, graph technologies make building AI solutions more accessible. Join us to hear about four areas at the forefront of graph-enhanced AI and ML, and find out which techniques are commonly used today and which hold the potential for disrupting industries. We'll provide examples and specifically look at how:
- Graphs provide better accuracy through connected feature extraction
- Graphs provide better performance through contextual model optimization
- Graphs provide context through knowledge graphs
- Graphs add explainability to neural networks
Speakers: Jake Graham, Alicia Frame
2. How Graph Technology is Changing AI
Jake Graham & Alicia Frame, Neo4j
#UnifiedAnalytics #SparkAISummit
4. Where Do Graphs Matter?
• FinCrime Detection
• Drug Discovery
• Recommendations
• Cybersecurity
• Predictive Maintenance
• Customer Segmentation
• Churn Prediction
• Search/MDM
5. Labeled Property Graphs
Nodes
• Can have Labels to classify nodes
• Labels have native indexes
Relationships
• Relate nodes by type and direction
Properties
• Attributes of Nodes & Relationships
• Stored as Name/Value pairs
• Can have indexes and composite indexes
[Figure: example labeled property graph]
• PERSON node: name: “Dan”, born: May 29, 1970, twitter: “@dan”
• PERSON node: name: “Ann”, born: Dec 5, 1975
• CAR node: brand: “Volvo”, model: “V70”
• Relationship types: MARRIED TO, LIVES WITH, DRIVES, OWNS
• Other properties shown: since: Jan 10, 2011; Latitude: 37.5629900°; Longitude: -122.3255300°
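The labeled property graph on this slide can be sketched in plain Python. This is a minimal illustration, not Neo4j's storage model: the `Node` and `Relationship` classes and the `neighbors` helper are hypothetical names. Which person drives or owns the car, and which relationship carries `since`, are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set                      # labels classify the node, e.g. {"Person"}
    properties: dict = field(default_factory=dict)  # name/value pairs


@dataclass
class Relationship:
    start: Node                      # relationships have a direction...
    type: str                        # ...and a type
    end: Node
    properties: dict = field(default_factory=dict)  # they can hold properties too


dan = Node({"Person"}, {"name": "Dan", "born": "1970-05-29", "twitter": "@dan"})
ann = Node({"Person"}, {"name": "Ann", "born": "1975-12-05"})
car = Node({"Car"}, {"brand": "Volvo", "model": "V70"})

relationships = [
    Relationship(dan, "MARRIED_TO", ann),
    Relationship(dan, "LIVES_WITH", ann),
    # Assumption: Dan drives the car and the `since` property sits on DRIVES.
    Relationship(dan, "DRIVES", car, {"since": "2011-01-10"}),
    # Assumption: Ann owns the car.
    Relationship(ann, "OWNS", car),
]


def neighbors(node, rel_type):
    """Follow outgoing relationships of one type from a node."""
    return [r.end for r in relationships if r.start is node and r.type == rel_type]


driven = neighbors(dan, "DRIVES")
print(driven[0].properties["brand"], driven[0].properties["model"])
```

Keeping properties on relationships as well as nodes is what lets a query ask "since when" about a connection without an extra join table.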
6. Graphs provide more accurate predictions
With the data you already have
o Current data science models ignore network structure and complex relationships
o Graph models add highly predictive features to existing ML models
7. The idea is that graph networks are bigger than any one machine-learning
approach. Graphs bring an ability to generalize about structure that the
individual neural nets don't have.
Lest you think the authors think they've got it all figured out, the paper
lists some lingering shortcomings. Battaglia et al. pose the big question,
"Where do the graphs come from that graph networks operate over?"
9. Explore Graphs / Build Graphs
Explore Graphs
o Massively scalable
o Powerful data pipelining
o Robust ML Libraries
o Non-persistent, non-native graphs
Build Graphs
o Persistent, dynamic graphs
o Graph native query and algorithm performance
o Constantly growing list of graph algorithms and embeddings
12. Connecting the Dots at NASA
“Using Neo4j someone from our Orion project found information from the Apollo
project that prevented an issue, saving well over two years of work and one
million dollars of taxpayer funds.”
David Meza, Chief Knowledge Architect – NASA 2015
14. Mining Knowledge Graphs for Drug Discovery
• Hetionet is a knowledge graph integrating over 50 years of biomedical data
• Leveraged to predict new uses for drugs by using the graph topology to create features to predict new links
15. Knowledge Graphs - het.io
17. Knowledge Graphs: getting started
SparkCypher & SparkGraph
• Merge distributed data into dataframes
• Reshape your tables into graphs
• Explore Cypher queries
Neo4j Morpheus
• Build a graph data pipeline to bring into native graph
• Bring graph features back to ML pipeline
Neo4j Graph Platform
• Move to Neo4j to build expert queries and persist your graph
[Figure labels: Graph Transactions, Graph Analytics]
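"Reshape your tables into graphs" can be illustrated without any graph engine. The sketch below is plain Python, not SparkCypher or Morpheus code. The purchase rows, the column names, and the helpers `rows_to_graph` and `shared_target_pairs` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical rows, as they might come out of a dataframe:
# one row per (customer, product) purchase.
purchase_rows = [
    {"customer": "cust-1", "product": "prod-A"},
    {"customer": "cust-2", "product": "prod-A"},
    {"customer": "cust-2", "product": "prod-B"},
    {"customer": "cust-3", "product": "prod-B"},
    {"customer": "cust-4", "product": "prod-C"},
]


def rows_to_graph(rows, source_col, target_col, rel_type):
    """Turn each table row into one relationship between two nodes."""
    nodes = set()
    relationships = []
    for row in rows:
        source, target = row[source_col], row[target_col]
        nodes.update([source, target])
        relationships.append((source, rel_type, target))
    return nodes, relationships


def shared_target_pairs(relationships):
    """Link every pair of source nodes that point at the same target."""
    by_target = defaultdict(set)
    for source, _, target in relationships:
        by_target[target].add(source)
    pairs = set()
    for sources in by_target.values():
        ordered = sorted(sources)
        for i, first in enumerate(ordered):
            for second in ordered[i + 1:]:
                pairs.add((first, second))
    return pairs


nodes, relationships = rows_to_graph(purchase_rows, "customer", "product", "BOUGHT")
customer_pairs = shared_target_pairs(relationships)
print(sorted(customer_pairs))
```

The second helper shows why the reshaping matters: customers who never appear in the same row become directly connected once the shared product is treated as a node.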
19. Graph Feature Engineering
Make use of your existing machine learning pipeline:
• Tabular data from Spark
• Enriched with graph based features from Neo4j
• Combined into a single model building pipeline
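The three bullets above can be sketched as a small join: compute per-node graph features, then attach them to the existing tabular rows before model building. This is plain Python rather than Spark or Neo4j code. The customer rows, the edge list, and the helper names are hypothetical.

```python
# Hypothetical tabular rows from an existing pipeline.
customers = [
    {"customer_id": "c1", "monthly_spend": 40.0},
    {"customer_id": "c2", "monthly_spend": 15.5},
    {"customer_id": "c3", "monthly_spend": 72.0},
    {"customer_id": "c4", "monthly_spend": 8.0},
    {"customer_id": "c5", "monthly_spend": 23.0},  # has no relationships
]

# Hypothetical undirected relationships between customers.
edges = [("c1", "c2"), ("c1", "c3"), ("c2", "c3"), ("c3", "c4")]


def degree_and_triangles(edge_list):
    """Compute two simple graph features per node."""
    adjacency = {}
    for a, b in edge_list:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    features = {}
    for node, nbrs in adjacency.items():
        # A triangle exists when two neighbors of `node` are also connected.
        triangles = sum(
            1 for u in nbrs for w in nbrs if u < w and w in adjacency[u]
        )
        features[node] = {"degree": len(nbrs), "triangles": triangles}
    return features


def enrich(rows, key, graph_features, defaults):
    """Attach graph features to each row; unconnected rows get defaults."""
    return [{**row, **graph_features.get(row[key], defaults)} for row in rows]


graph_features = degree_and_triangles(edges)
enriched = enrich(customers, "customer_id", graph_features,
                  {"degree": 0, "triangles": 0})
for row in enriched:
    print(row)
```

The enriched rows keep their original columns, so an existing model-building step can consume them unchanged apart from the extra feature columns.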
20. Categories of Graph Features
• Pathfinding & Search: finds optimal paths or evaluates route availability and quality
• Centrality / Importance: determines the importance of distinct nodes in the network
• Community Detection: detects group clustering or partition options
• Similarity: evaluates how alike nodes are
• Heuristic Link Prediction: estimates the likelihood of nodes forming a relationship
• Embeddings: vectors that capture connectivity or topology
21. Financial Crime: Detecting Fraud
Many large financial institutions have existing pipelines to identify fraud
Graph based features improve accuracy:
• Connected components to identify disjoint graphs
• PageRank to measure influence
• Louvain to identify communities
• Jaccard to measure account similarity
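Three of the features listed above are small enough to sketch in plain Python: connected components (via union-find), PageRank (via power iteration), and Jaccard similarity. Louvain is left out because it does not fit in a short example. The account graph is made up, the function names are hypothetical, and this is not the Neo4j library implementation.

```python
# Hypothetical undirected account graph: an edge means two accounts
# share some identifier (assumption made for this sketch).
accounts = {
    "a1": {"a2", "a3", "a4"},
    "a2": {"a1"},
    "a3": {"a1"},
    "a4": {"a1"},
    "a5": {"a6"},
    "a6": {"a5"},
}


def connected_components(adj):
    """Union-find: map every node to a representative of its component."""
    parent = {v: v for v in adj}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for v, nbrs in adj.items():
        for w in nbrs:
            parent[find(v)] = find(w)
    return {v: find(v) for v in adj}


def pagerank(adj, damping=0.85, iterations=50):
    """Power iteration; nodes without neighbors spread rank uniformly."""
    n = len(adj)
    rank = {v: 1.0 / n for v in adj}
    for _ in range(iterations):
        dangling = sum(rank[v] for v in adj if not adj[v])
        new_rank = {v: (1 - damping) / n + damping * dangling / n for v in adj}
        for v, nbrs in adj.items():
            for w in nbrs:
                new_rank[w] += damping * rank[v] / len(nbrs)
        rank = new_rank
    return rank


def jaccard(adj, u, v):
    """Shared neighbors divided by all neighbors of either node."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


components = connected_components(accounts)
ranks = pagerank(accounts)
print(components)
print(max(ranks, key=ranks.get))
print(jaccard(accounts, "a2", "a3"))
```

Each result is a per-account (or per-pair) number, which is the form an existing fraud model needs in order to take it as an extra input column.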
23. Graph Feature Engineering: getting started
• Move to Neo4j to run native graph algorithms
• Write algorithm derived features to persistent graph
• Merge distributed data into dataframes
• Reshape your tables into graphs
• Explore graph algorithms
• Build a graph data pipeline to bring into native graph
• Bring graph features back to ML pipeline
[Figure labels: Graph Transactions, Graph Analytics]
24. Graph Features in Neo4j
Pathfinding & Search
• Parallel Breadth First Search
• Parallel Depth First Search
• Shortest Path
• Single-Source Shortest Path
• All Pairs Shortest Path
• Minimum Spanning Tree
• A* Shortest Path
• Yen’s K Shortest Path
• K-Spanning Tree (MST)
• Random Walk
Centrality / Importance
• Degree Centrality
• Closeness Centrality
• CC Variations: Harmonic, Dangalchev, Wasserman & Faust
• Betweenness Centrality
• Approximate Betweenness Centrality
• PageRank
• Personalized PageRank
• ArticleRank
• Eigenvector Centrality
Community Detection
• Triangle Count
• Clustering Coefficients
• Connected Components (Union Find)
• Strongly Connected Components
• Label Propagation
• Louvain Modularity – 1 Step & Multi-Step
• Balanced Triad (identification)
Similarity
• Euclidean Distance
• Cosine Similarity
• Jaccard Similarity
• Overlap Similarity
• Pearson Similarity
Link Prediction
• Adamic Adar
• Common Neighbors
• Preferential Attachment
• Resource Allocation
• Same Community
• Total Neighbors
neo4j.com/docs/graph-algorithms/current/
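Most of the link prediction heuristics listed above reduce to set arithmetic on the neighborhoods of two nodes. The sketch below uses the standard textbook definitions in plain Python and does not call the Neo4j library. The toy graph and the `link_scores` function are hypothetical, and Same Community is omitted because it needs community labels.

```python
import math

# Hypothetical undirected graph as an adjacency map.
adjacency = {
    "a": {"c", "d"},
    "b": {"c", "d"},
    "c": {"a", "b", "e"},
    "d": {"a", "b"},
    "e": {"c"},
}


def link_scores(adj, u, v):
    """Score how likely it is that u and v will become connected."""
    common = adj[u] & adj[v]
    return {
        # Number of neighbors the two nodes share.
        "common_neighbors": len(common),
        # Number of distinct neighbors across both nodes.
        "total_neighbors": len(adj[u] | adj[v]),
        # Product of the two degrees.
        "preferential_attachment": len(adj[u]) * len(adj[v]),
        # Shared neighbors weighted by 1 / log(degree); a shared neighbor of
        # two distinct nodes has degree >= 2, so the log is never zero.
        "adamic_adar": sum(1.0 / math.log(len(adj[w])) for w in common),
        # Shared neighbors weighted by 1 / degree.
        "resource_allocation": sum(1.0 / len(adj[w]) for w in common),
    }


scores = link_scores(adjacency, "a", "b")
for name, value in scores.items():
    print(name, round(value, 4))
```

Adamic Adar and Resource Allocation both discount shared neighbors that are connected to many nodes, on the idea that a highly connected shared neighbor is weaker evidence of a future link.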
26. Graph Embeddings
Embeddings transform graphs into a vector, or set of vectors, describing topology, connectivity, or attributes of nodes and edges in the graph
• Vertex embeddings: describe connectivity of each node
• Path embeddings: traversals across the graph
• Graph embeddings: encode an entire graph into a single vector
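Vertex embedding methods such as DeepWalk begin by sampling truncated random walks from each node and then train a skip-gram model on those walks. Only the sampling step is sketched here, in plain Python. The graph, the `random_walks` function, and its parameters are hypothetical, and no vectors are learned.

```python
import random

# Hypothetical undirected graph as an adjacency map.
adjacency = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}


def random_walks(adj, walks_per_node=2, walk_length=5, seed=42):
    """Sample fixed-length walks; each walk records one node's context."""
    rng = random.Random(seed)  # seeded so repeated runs give the same walks
    walks = []
    for _ in range(walks_per_node):
        for start in sorted(adj):
            walk = [start]
            while len(walk) < walk_length:
                neighbors = sorted(adj[walk[-1]])
                if not neighbors:
                    break  # isolated node: the walk cannot continue
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


walks = random_walks(adjacency)
for walk in walks:
    print(" -> ".join(walk))
```

Nodes that appear in similar walk contexts end up with similar vectors once a skip-gram model is trained on the walks, which is how connectivity is captured in the embedding.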
27. Graph Embeddings - Recommendations
Explainable Reasoning over Knowledge Graphs for Recommendation
29. Graph Embeddings: Getting Started
• Move to Neo4j to build expert queries and persist
• Stay tuned for DeepWalk and DeepGL
• Merge distributed data into dataframes
• Reshape your tables into graphs
• Explore graph algorithms
• Build a graph data pipeline to bring into native graph
• Bring graph features back to ML pipeline
[Figure labels: Graph Transactions, Graph Analytics]
31. Graph Native Learning
Deep Learning refers to training multi-layer neural networks using gradient descent
32. Graph Native Learning
Graph Native Learning refers to deep learning models that take a graph as input, perform computations, and return a graph.
Battaglia et al, 2018
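The "graph in, graph out" idea can be sketched with a single message-passing step in plain Python. This is a deliberately simplified illustration with a fixed averaging rule and no learned weights. It is not the graph network block defined by Battaglia et al., and the `message_passing_step` function and the dictionary layout are hypothetical.

```python
def message_passing_step(graph):
    """Take a graph, update every node from its neighbors, return a graph."""
    nodes, edges = graph["nodes"], graph["edges"]

    # Build undirected neighbor lists from the edge list.
    neighbors = {name: [] for name in nodes}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)

    updated = {}
    for name, value in nodes.items():
        # Each node receives its neighbors' features as messages...
        received = [nodes[n] for n in neighbors[name]]
        # ...and its new feature is the mean of its own value and the messages.
        updated[name] = (value + sum(received)) / (1 + len(received))

    # The structure is unchanged; only the node features are new.
    return {"nodes": updated, "edges": list(edges)}


graph = {
    "nodes": {"a": 1.0, "b": 3.0, "c": 5.0},  # one scalar feature per node
    "edges": [("a", "b"), ("b", "c")],
}

result = message_passing_step(graph)
print(result["nodes"])
```

Because the output has the same shape as the input, steps like this can be stacked, so information travels one more hop across the graph with each step. A trained model would replace the fixed average with learned functions.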
33. Graph Native Learning
Example: electron path prediction (Bradshaw et al, 2019)
Given reactants and reagents, what will the products be?
35. The Steps of Graph Data Science
[Figure: steps plotted against Delivery Timeline and Complexity]
• Query Based Knowledge Graph
• Query Based Feature Engineering
• Graph Algorithm Feature Engineering
• Graph Embeddings
• Graph Neural Networks
Groupings: Knowledge Graph, Graph Feature Engineering, Graph Native Learning
Neo4j for Graph Persistence
36. Resources
o O’Reilly Graph Algorithms Book
o Neo4j Graph Algorithms Library
o Check out the documentation
o Reach out to us