GraphAware®
GRAPH-POWERED MACHINE LEARNING
Vlasta Kůs, Data Scientist @ GraphAware
graphaware.com
@graph_aware, @VlastaKus
WHAT IS MACHINE LEARNING?
[Machine Learning is the] field of study that gives
computers the ability to learn without being explicitly
programmed.


— Arthur Samuel, 1959
WHAT IS A GRAPH?
G = (V, E)
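As a minimal sketch of the definition above, a graph can be held as a set of vertices plus a set of edges. The vertex names and the `neighbours` helper are made up for illustration only.

```python
# A graph G = (V, E): a set of vertices and a set of edges between them.
# The vertex names below are hypothetical.
V = {"alice", "bob", "carol"}
E = {("alice", "bob"), ("bob", "carol")}  # directed edges (source, target)


def neighbours(vertex, edges):
    """Return the vertices reachable from `vertex` over one outgoing edge."""
    return {target for source, target in edges if source == vertex}


# Every edge must connect two vertices that belong to V.
assert all(s in V and t in V for s, t in E)
bob_neighbours = neighbours("bob", E)
```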
WHY NEO4J?
It is a proper graph database
It is a proper database
MACHINE LEARNING LIFECYCLE
MACHINE LEARNING CHALLENGES
‣ The Source of Truth
‣ Performance
‣ Model Storing
‣ Real Time
THE SOURCE OF TRUTH
“Even the best learning algorithm on wrong data produces wrong results.”


— Alessandro Negro, 2015
THE SOURCE OF TRUTH
Source: Michele Banko and Eric Brill, Scaling to Very Very Large Corpora for Natural Language Disambiguation, 2001
THE SOURCE OF TRUTH
Source: https://www.domo.com/learn/data-never-sleeps-5
Predictive accuracy
Training Performance
Prediction Performance
Ability to scale
PERFORMANCE
NEO4J CAUSAL CLUSTER
Source: https://neo4j.com/docs/operations-manual/
Store the results of the training phase
Provide multiple access patterns
Mix models
The size depends on the algorithm
STORING THE RESULTS & MODEL
LEARN FAST
PREDICT FAST
REAL TIME
STORING DATA SOURCES: TENSOR
Simple Recommendation
f: User x Item -> Relevance Score
Context Aware Recommendation
f: User x Item x Context1 x Context2 x Context3 -> Relevance Score
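One way to picture the two signatures above is a sparse tensor keyed by (user, item, context...) tuples, where averaging over the context dimensions gives back the simple User x Item case. This is only a sketch: the users, items, contexts, scores and both helper functions are hypothetical.

```python
from collections import defaultdict

# Sparse tensor: only observed (user, item, *context) combinations are stored.
# All names and scores are hypothetical.
ratings = {
    ("u1", "film_a", "evening", "home"): 5.0,
    ("u1", "film_a", "morning", "commute"): 2.0,
    ("u2", "film_a", "evening", "home"): 4.0,
}


def relevance(user, item, *context):
    """f: User x Item x Context... -> score, or None if unobserved."""
    return ratings.get((user, item, *context))


def collapse_context(tensor):
    """Average over the context dimensions to get the User x Item matrix."""
    sums, counts = defaultdict(float), defaultdict(int)
    for (user, item, *_), score in tensor.items():
        sums[(user, item)] += score
        counts[(user, item)] += 1
    return {key: sums[key] / counts[key] for key in sums}


simple = collapse_context(ratings)
```

In a graph, each stored entry could become a rating node or relationship linked to its user, item and context nodes.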
The results of a machine learning process can also be stored in a graph.
Some examples are:

‣ Similarity (k-Nearest Neighbours)
‣ Cluster
‣ Spanning Tree
‣ Decision Tree
‣ Random Forest
‣ Markov Chain
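To illustrate the first item in the list, the sketch below computes a k-nearest-neighbour similarity model and returns it as a list of weighted edges, which is the shape that would be written to a graph. The feature vectors, the choice of cosine similarity and the function names are assumptions made for the example.

```python
import math

# Hypothetical item feature vectors; in practice they come from training data.
vectors = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
}


def cosine(x, y):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(p * q for p, q in zip(x, y))
    norm = math.sqrt(sum(p * p for p in x)) * math.sqrt(sum(q * q for q in y))
    return dot / norm


def knn_edges(vectors, k):
    """Return (source, target, similarity) edges to each node's k nearest nodes."""
    edges = []
    for source, x in vectors.items():
        scored = [
            (cosine(x, y), target)
            for target, y in vectors.items()
            if target != source
        ]
        scored.sort(reverse=True)  # most similar first
        edges.extend((source, target, sim) for sim, target in scored[:k])
    return edges


edges = knn_edges(vectors, k=1)
```

Each tuple maps naturally onto one relationship between two nodes, with the similarity as a property, so a prediction becomes a one-hop traversal.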
STORING RESULTS & MODELS
STORING RESULTS & MODELS
K-Nearest Neighbours
Markov Chain
Decision Tree
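A Markov chain is already a graph: states are nodes and transition probabilities are weighted edges. The following sketch estimates those edges from a single observed sequence. The state names and the `markov_edges` helper are hypothetical.

```python
from collections import Counter, defaultdict


def markov_edges(sequence):
    """Return {(state, next_state): probability} estimated from one sequence."""
    counts = defaultdict(Counter)
    for current, nxt in zip(sequence, sequence[1:]):
        counts[current][nxt] += 1
    edges = {}
    for current, followers in counts.items():
        total = sum(followers.values())
        for nxt, n in followers.items():
            edges[(current, nxt)] = n / total
    return edges


# Hypothetical sequence of observed states.
edges = markov_edges(["sunny", "sunny", "rainy", "sunny", "rainy", "rainy"])
```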
STORING DATA SOURCES: KNOWLEDGE GRAPH
‣ NLP and graphs: natural fit
GRAPH-BASED NATURAL LANGUAGE PROCESSING
‣ Knowledge enrichment
Source: http://nlp.stanford.edu:8080/corenlp/process
Unsupervised techniques tend to be underestimated …
‣ No need to spend time and money on massive labeled training datasets
‣ Often faster to train & faster to predict
‣ Unsupervised deep learning
UNSUPERVISED ML ALGORITHMS
Some graph-native algorithms that are relevant to machine learning processes:

‣ Random Walk
‣ PageRank
‣ Graph Matching
‣ Shortest Path
‣ Depth-First Graph Traversal
‣ Breadth-First Graph Traversal
‣ Minimum Spanning Tree
‣ Graph Clustering
‣ Node2vec
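Two of the algorithms listed above, breadth-first traversal and shortest path, fit in one short sketch: on an unweighted graph, breadth-first search finds a path with the fewest edges. The adjacency-list layout and the example graph are assumptions for illustration.

```python
from collections import deque


def shortest_path(adjacency, start, goal):
    """Breadth-first search; returns a path with the fewest edges, or None."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the parent links back to the start, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in adjacency.get(node, []):
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical directed graph as an adjacency list.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"]}
path = shortest_path(graph, "a", "e")
```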
GRAPH-BASED ML ALGORITHMS
[Figure: visualization of extracted keywords; terms shown include spacecraft, flight, system, software, hardware, shuttle, propulsion, actuator, mishap, investigation, risk, management, NASA and JPL]
GRAPH-BASED ML ALGORITHMS: PAGERANK
Keywords Extraction
Rada Mihalcea, Paul Tarau. 2004. TextRank: Bringing Order into Texts. Proceedings of EMNLP 2004, pages 404–411,
Barcelona, Spain. Association for Computational Linguistics. http://www.aclweb.org/anthology/W04-3252.
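The idea behind TextRank-style keyword extraction can be sketched in two steps: link words that occur close together, then rank the words with PageRank. The code below is a simplified illustration, not the cited paper's full pipeline: it assumes the tokens are already filtered and lemmatized, and the token list, window size and function names are hypothetical.

```python
from collections import defaultdict


def cooccurrence_graph(tokens, window=2):
    """Link tokens that appear within `window` positions of each other."""
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Plain power iteration on an undirected, unweighted graph."""
    scores = {node: 1.0 for node in graph}
    for _ in range(iterations):
        scores = {
            node: (1 - damping)
            + damping * sum(scores[n] / len(graph[n]) for n in graph[node])
            for node in graph
        }
    return scores


# Hypothetical pre-filtered tokens.
tokens = ["flight", "software", "flight", "system", "flight", "hardware"]
scores = pagerank(cooccurrence_graph(tokens))
keywords = sorted(scores, key=scores.get, reverse=True)
```

Words that co-occur with many other words collect the highest scores, so the head of `keywords` holds the candidate keywords.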
GRAPH-BASED ML ALGORITHMS: GRAPH CLUSTERING
Continuous Cellular Tower Data Analysis
Eagle N., Quinn J.A., Clauset A. (2009) Methodologies for Continuous Cellular Tower Data Analysis. In: Tokuda H., Beigl M.,
Friday A., Brush A.J.B., Tobe Y. (eds) Pervasive Computing. Pervasive 2009. Lecture Notes in Computer Science, vol 5538. Springer,
Berlin, Heidelberg
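As a deliberately simple stand-in for graph clustering (not the method of the cited paper), the sketch below keeps only strong edges and treats each connected component as a cluster. The tower IDs, edge weights, threshold and `cluster` function are all hypothetical.

```python
def cluster(nodes, weighted_edges, threshold):
    """Group nodes joined by edges with weight >= threshold (union-find)."""
    parent = {node: node for node in nodes}

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for a, b, weight in weighted_edges:
        if weight >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for node in nodes:
        groups.setdefault(find(node), set()).add(node)
    return list(groups.values())


# Hypothetical towers; a weight might count handovers between two towers.
towers = ["t1", "t2", "t3", "t4"]
switches = [("t1", "t2", 40), ("t2", "t3", 2), ("t3", "t4", 35)]
clusters = cluster(towers, switches, threshold=10)
```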
GRAPH VISUALIZATION
‣ Meet us tomorrow at Neo4j GraphTour
‣ Come to our meet-ups
graphaware.com/events
‣ Visit our blog
graphaware.com/blog
‣ Watch us
youtube.com -> GraphAware channel
‣ And most importantly …
Get in touch!
INTERESTED IN MORE?
www.graphaware.com @graph_aware

Graph-Powered Machine Learning