This is a novice-track talk, so all concepts and examples are kept simple
1. Basic graph theory concepts and definitions
2. A few real-world scenarios framed as graph data
3. Working with graphs in Python
The overall goal of this talk is to spark your interest in graph analytics and show you what’s out there, as a jumping-off point for you to go deeper
Graph: “A structure amounting to a set of objects in which some pairs of the objects are in some sense ‘related’. The objects correspond to mathematical abstractions called vertices (also called nodes or points) and each of the related pairs of vertices is called an edge (also called an arc or line)” – Richard Trudeau, Introduction to Graph Theory (1st edition, 1993)
Graph Analytics: “Analysis of data structured as a graph (sometimes also part of network analysis or link analysis depending on scope and context)” – Me, talking to a stress ball as I made these slides
• We see two vertices joined by a single edge
• Vertex 1 is adjacent to vertex 2
• The neighborhood of vertex 1 is the set of all vertices adjacent to it (just vertex 2 in this case)
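To make adjacency and neighborhoods concrete, here is a minimal sketch in plain Python. The helper names `add_edge` and `neighborhood` are made up for illustration; the graph is the two-vertex example described above.

```python
from collections import defaultdict

# Adjacency mapping: each vertex maps to the set of vertices it shares an edge with.
adjacency = defaultdict(set)

def add_edge(u, v):
    """Record an undirected edge between u and v (hypothetical helper)."""
    adjacency[u].add(v)
    adjacency[v].add(u)

def neighborhood(v):
    """Return the set of vertices adjacent to v."""
    return set(adjacency[v])

add_edge(1, 2)
print(neighborhood(1))    # {2}
print(2 in adjacency[1])  # True: vertex 1 is adjacent to vertex 2
```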
• We see that there is a loop on vertex a
• Vertices a and b have multiple edges between them
• Vertex c has a degree of 3
• There exists a path from vertex a to vertex e
• Vertices f, g, and h form a 3-cycle
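Degree is easy to compute from a plain edge list. The sketch below uses a hypothetical edge list chosen to match the bullets above (the slide's actual figure may differ) and follows the usual convention that a loop adds 2 to the degree of its vertex.

```python
from collections import Counter

# Hypothetical edge list matching the bullets above; the slide's figure may differ.
edges = [("a", "a"),               # loop on a
         ("a", "b"), ("a", "b"),   # multiple edges between a and b
         ("b", "c"), ("c", "d"), ("c", "e"), ("d", "e"),
         ("f", "g"), ("g", "h"), ("h", "f")]  # the 3-cycle f-g-h

degree = Counter()
for u, v in edges:
    degree[u] += 1
    degree[v] += 1   # when u == v (a loop) the same vertex is counted twice

print(degree["c"])  # 3
print(degree["a"])  # 4: the loop counts twice, plus the two edges to b
```

A handy sanity check is that the degrees always sum to twice the number of edges.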
• We have no single cut vertex or cut edge (one whose removal would split the graph into more disconnected pieces)
• We can separate this graph into two disconnected sets:
1) Vertex Set 1 = {a, b, c, d, e}
2) Vertex Set 2 = {f, g, h}
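Finding those disconnected sets can be sketched with a breadth-first search. Only the two vertex sets come from the slide; the edges below are assumed for illustration, and `connected_components` is a hypothetical helper rather than a library call.

```python
from collections import defaultdict, deque

# Hypothetical edges; only the two vertex sets come from the slide.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("c", "e"), ("d", "e"),
         ("f", "g"), ("g", "h"), ("h", "f")]

adjacency = defaultdict(set)
for u, v in edges:
    adjacency[u].add(v)
    adjacency[v].add(u)

def connected_components(adjacency):
    """Return a list of vertex sets, one per connected component."""
    seen, components = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Breadth-first search outward from an unvisited vertex.
        component, queue = {start}, deque([start])
        while queue:
            u = queue.popleft()
            for v in adjacency[u] - component:
                component.add(v)
                queue.append(v)
        seen |= component
        components.append(component)
    return components

components = connected_components(adjacency)
print(components)
```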
• Imagine the same vertex labels running along the top and left-hand sides of the matrix
• A one in a particular slot tells us that the two vertices are adjacent
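As a rough illustration of that matrix, the sketch below builds one from an edge list using nested lists. The 4-cycle is a made-up example, not the graph on the slide.

```python
vertices = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]  # hypothetical 4-cycle

index = {v: i for i, v in enumerate(vertices)}
matrix = [[0] * len(vertices) for _ in vertices]
for u, v in edges:
    matrix[index[u]][index[v]] = 1
    matrix[index[v]][index[u]] = 1  # undirected, so the matrix is symmetric

# Print with the vertex labels along the top and left-hand side.
print("  " + " ".join(vertices))
for v, row in zip(vertices, matrix):
    print(v, " ".join(map(str, row)))
```

For a directed graph you would fill in only `matrix[index[u]][index[v]]`, and the matrix would generally stop being symmetric.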
• In this graph two vertices are joined by a single directed edge
• There is a dipath (directed path) from vertex 1 to vertex 2 but not from vertex 2 to vertex 1
• Every vertex has ‘played’ every other vertex
• We can see that there is no clear winner (every vertex has an indegree and outdegree of 2)
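A tournament with no clear winner can be sketched in a few lines. The construction below is an assumption (the slide's exact edges are not in the text): five vertices, each 'beating' the next two around a circle.

```python
n = 5
# Hypothetical tournament: vertex i beats vertices i+1 and i+2 (mod 5).
edges = [(i, (i + k) % n) for i in range(n) for k in (1, 2)]

outdegree = {v: 0 for v in range(n)}
indegree = {v: 0 for v in range(n)}
for winner, loser in edges:
    outdegree[winner] += 1   # edges point from winner to loser
    indegree[loser] += 1

print(outdegree)  # every vertex has 2 wins
print(indegree)   # and 2 losses
```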
• Vertices from Set 1 = {a, b, c, d} are only adjacent to vertices from Set 2 = {e, f, g, h}
• This can be extended to tripartite graphs (3 sets) or as many sets as we like (n-partite graphs)
• Can we pair vertices from each set together?
We can pair every vertex from one set to a vertex from the other using only existing edges
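One way to find such a pairing is the classic augmenting-path approach, sketched below. The edges between the two sets are invented for illustration, and `maximum_matching` here is a hand-written helper, not a NetworkX function.

```python
# Hypothetical edges between Set 1 = {a, b, c, d} and Set 2 = {e, f, g, h}.
neighbors = {
    "a": ["e", "f"],
    "b": ["e"],
    "c": ["f", "g"],
    "d": ["g", "h"],
}

def maximum_matching(neighbors):
    """Augmenting-path matching; returns {set2_vertex: set1_vertex}."""
    match = {}  # Set 2 vertex -> Set 1 vertex currently paired with it

    def try_assign(u, visited):
        for v in neighbors[u]:
            if v in visited:
                continue
            visited.add(v)
            # Take v if it is free, or if its current partner can move elsewhere.
            if v not in match or try_assign(match[v], visited):
                match[v] = u
                return True
        return False

    for u in neighbors:
        try_assign(u, set())
    return match

matching = maximum_matching(neighbors)
print(sorted((u, v) for v, u in matching.items()))
# [('a', 'f'), ('b', 'e'), ('c', 'g'), ('d', 'h')]
```

Notice that a is first paired with e, then bumped over to f so that b (whose only option is e) can be matched too.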
• We can assign weights to edges of a graph
• As we follow a path through the graph, these weights accumulate
• For example, the path a -> b -> c has an associated weight of 0.5 + 0.4 = 0.9
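The accumulation is just a sum over consecutive edges. A minimal sketch using the two weights from the example above (`path_weight` is a hypothetical helper name):

```python
# Edge weights from the example above, keyed by (tail, head) pair.
weights = {("a", "b"): 0.5, ("b", "c"): 0.4}

def path_weight(path, weights):
    """Sum the weights of consecutive edges along a path."""
    return sum(weights[(u, v)] for u, v in zip(path, path[1:]))

total = path_weight(["a", "b", "c"], weights)
print(round(total, 2))  # 0.9
```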
• We can assign colors to vertices
• The graph we see here has a proper coloring (no two vertices of the same color are adjacent)
• We can also color edges!
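Checking whether a vertex coloring is proper takes one line. The graph and colorings below are made up for illustration; `is_proper_coloring` is a hypothetical helper.

```python
def is_proper_coloring(edges, color):
    """True when no edge joins two vertices of the same color."""
    return all(color[u] != color[v] for u, v in edges)

# Hypothetical 4-cycle a-b-c-d-a.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
good = {"a": "red", "b": "blue", "c": "red", "d": "blue"}   # colors alternate
bad = {"a": "red", "b": "red", "c": "blue", "d": "blue"}    # a and b clash

print(is_proper_coloring(edges, good))  # True
print(is_proper_coloring(edges, bad))   # False
```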
• Are we focused more on objects or the relationships/interactions between them?
• Are we looking at transition states?
• Is orientation important?
If you can imagine a graph to represent it, it’s probably worth giving it a shot, if only for your own learning and exploration!
• If the lines represent connections, what can we say about the people highlighted in red?
• What kinds of questions might a graph be able to answer?
• e and d have the highest degree
• What might the c-d-e cycle tell us?
• What can we say about cut vertices?
If we have page view data with timestamps, how might we represent this as a graph?
• What might loops or multiple edges between vertices represent?
• What types of data might we want to use as values on the edges?
• What might comparing indegrees and outdegrees on different vertices represent?
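One possible answer, sketched under assumptions: treat pages as vertices and each "page, then next page" step within a user's visit as a directed edge, with the number of times it happens as the edge weight. The log format and values below are invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical page view log: (user, timestamp, page).
views = [
    ("u1", 1, "home"), ("u1", 2, "search"), ("u1", 3, "search"), ("u1", 4, "product"),
    ("u2", 1, "home"), ("u2", 5, "product"),
    ("u3", 2, "home"), ("u3", 3, "search"),
]

by_user = defaultdict(list)
for user, timestamp, page in views:
    by_user[user].append((timestamp, page))

# Directed edge (page -> next page); the count acts as the edge weight.
transitions = Counter()
for visits in by_user.values():
    pages = [page for _, page in sorted(visits)]  # order each user's views by time
    transitions.update(zip(pages, pages[1:]))

outdegree = Counter()
indegree = Counter()
for (src, dst), count in transitions.items():
    outdegree[src] += count
    indegree[dst] += count

print(transitions[("home", "search")])    # 2: a repeated transition (multiple edges)
print(transitions[("search", "search")])  # 1: a loop, e.g. a second search in a row
print(indegree["home"], outdegree["home"])  # 0 3: an entry page
```

In this framing a page with high indegree and low outdegree is where visits tend to end, while the reverse suggests an entry point.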
If we have to regularly pick up a load at the train station, make deliveries to every factory, and then return to the garage, how can a graph help us find an optimal route?
• We can assign weights to each edge to represent distance, travel time, gas cost for the distance, etc.
• The route with the lowest total weight is the shortest, fastest, or cheapest one, depending on what the weights measure
• Note that edge weights are only displayed for f-e and f-a
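For a handful of stops, the lowest-weight route can be found by simply trying every order. The sketch below assumes the truck starts at the garage, and all stop names and travel times are invented; the slide's own weights are not in the text.

```python
from itertools import permutations

# Hypothetical travel times (minutes) between every pair of stops.
times = {
    ("garage", "station"): 10,
    ("garage", "f1"): 15, ("garage", "f2"): 20, ("garage", "f3"): 12,
    ("station", "f1"): 8, ("station", "f2"): 14, ("station", "f3"): 18,
    ("f1", "f2"): 6, ("f1", "f3"): 9, ("f2", "f3"): 7,
}

def cost(u, v):
    """Look up the undirected edge weight in either order."""
    return times[(u, v)] if (u, v) in times else times[(v, u)]

def route_cost(route):
    """Total weight accumulated along the route."""
    return sum(cost(u, v) for u, v in zip(route, route[1:]))

factories = ["f1", "f2", "f3"]
# Garage -> station -> every factory in some order -> garage.
routes = (["garage", "station", *order, "garage"] for order in permutations(factories))
best = min(routes, key=route_cost)

print(best, route_cost(best))
# ['garage', 'station', 'f1', 'f2', 'f3', 'garage'] 43
```

Brute force grows factorially with the number of factories, so this only works for small inputs; larger problems need heuristics or a dedicated solver.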
If the following people want to attend the following talks (a-h), what’s the minimum number of sessions we need to satisfy everyone?
• We can use the talks as vertices and add an edge between two talks whenever the same person wants to attend both
• The minimum number of colors needed for a proper coloring shows us the minimum number of sessions we need to satisfy everyone
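Here is a sketch of that idea with a simple greedy coloring. The attendee names and wish lists are hypothetical (the slide's data is not in the text). Keep in mind that a greedy coloring only gives an upper bound on the number of sessions; it is not guaranteed to find the true minimum on every graph.

```python
from itertools import combinations

# Hypothetical wish lists; the slide's actual data is not in the text.
wishes = {
    "Ana": ["a", "b", "c"],
    "Ben": ["c", "d"],
    "Cy": ["d", "e", "f"],
    "Di": ["f", "g", "h"],
}

# Talks are vertices; two talks conflict if one person wants both.
talks = sorted({t for wanted in wishes.values() for t in wanted})
conflicts = {t: set() for t in talks}
for wanted in wishes.values():
    for s, t in combinations(wanted, 2):
        conflicts[s].add(t)
        conflicts[t].add(s)

# Greedy coloring: visit talks from most to fewest conflicts and give each
# the lowest session number not already used by a conflicting talk.
session = {}
for talk in sorted(talks, key=lambda t: len(conflicts[t]), reverse=True):
    taken = {session[n] for n in conflicts[talk] if n in session}
    session[talk] = next(s for s in range(len(talks)) if s not in taken)

sessions_needed = len(set(session.values()))
print(sessions_needed)  # 3
```

Three is also the true minimum for this made-up data, because talks a, b, and c all conflict with one another and so need three different sessions.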
https://github.com/igraph/python-igraph
https://github.com/networkx
https://graph-tool.skewed.de
• GraphML (XML-based)
• GML (ASCII-based)
• NetworkX has built-in functions to work with a Pandas DataFrame or a NumPy array/matrix
import networkx as nx
import matplotlib.pyplot as plt

# Build an undirected graph on vertices 1 through 5
G = nx.Graph()
vertices = list(range(1, 6))
G.add_nodes_from(vertices)
G.add_edges_from([(1, 2), (2, 3), (5, 4),
                  (4, 2), (1, 3), (5, 1), (5, 2), (3, 4)])

# Compute one layout and reuse it so nodes, edges, and labels line up
pos = nx.spring_layout(G)
nx.draw_networkx_nodes(G, pos, node_size=20)
nx.draw_networkx_edges(G, pos, width=5)
nx.draw_networkx_labels(G, pos, font_size=14)
# nx.draw redraws the graph with default styling and hides the axes
nx.draw(G, pos)
plt.show()
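import networkx as nx
import matplotlib.pyplot as plt

# Build a weighted triangle on vertices a, b, c
G = nx.Graph()
G.add_nodes_from(['a', 'b', 'c'])
G.add_edge('a', 'b', weight=0.5)
G.add_edge('b', 'c', weight=0.2)
G.add_edge('c', 'a', weight=0.7)

pos = nx.spring_layout(G)
nx.draw_networkx_nodes(G, pos, node_size=500)
nx.draw_networkx_edges(G, pos, width=6)
nx.draw_networkx_labels(G, pos, font_size=14)
# Show just the weight on each edge rather than the whole attribute dict
nx.draw_networkx_edge_labels(G, pos,
                             edge_labels=nx.get_edge_attributes(G, 'weight'),
                             font_size=14)
# nx.draw redraws the graph with default styling and hides the axes
nx.draw(G, pos)
plt.show()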
>>> list(G.nodes())  # newer NetworkX returns a view, so convert it to a list
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, 16, 17, 18, 19, 20]
>>> nx.shortest_path(G, 1, 18)
[1, 3, 18]
>>> dict(G.degree())  # newer NetworkX returns a view, so convert it to a dict
{1: 4, 2: 3, 3: 4, 4: 4, 5: 4, 6: 3,
7: 3, 8: 3, 9: 4, 10: 3, 11: 2,
12: 2, 13: 2, 14: 4, 15: 3, 16: 3,
17: 2, 18: 3, 19: 3, 20: 3}
>>> nx.greedy_color(G)
{'d': 0, 'a': 0, 'e': 1, 'b': 1,
'c': 1, 'f': 2, 'h': 1, 'g': 0}
>>> temp = nx.greedy_color(G)
>>> len(set(temp.values()))
3
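import networkx as nx
import matplotlib.pyplot as plt

# Directed graph: each pair (u, v) is an edge pointing from u to v
G = nx.DiGraph([(1, 2), (1, 3), (4, 1), (1, 5), (2, 3),
                (2, 4), (2, 5), (3, 4), (3, 5), (4, 5)])
pos = nx.circular_layout(G)
nx.draw_networkx_nodes(G, pos, node_size=200)
nx.draw_networkx_edges(G, pos)
# The keyword is font_size (with an underscore), as in the earlier examples
nx.draw_networkx_labels(G, pos, font_size=14)
plt.show()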
>>> nx.has_path(G, 1, 5)
True
>>> nx.has_path(G, 5, 1)
False
>>> nx.shortest_path(G, 1, 4)
[1, 2, 4]
>>> nx.maximal_matching(G)
{(1, 4), (5, 2), (6, 3)}
• There’s a NetworkX tutorial tomorrow!
• In-browser Graphviz: webgraphviz.com
• Free graph theory textbook: An Introduction to Combinatorics and Graph Theory, David Guichard
• Open problems in graph theory: openproblemgarden.org
• Graph databases
• Association for Computational Linguistics (ACL) 2010 Workshop on Graph-based Methods for Natural Language Processing
• Free papers: researchgate.net

Graph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma

AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 

Graph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma

  • 1.
  • 2. This is a novice-track talk, so all concepts and examples are kept simple. 1. Basic graph theory concepts and definitions 2. A few real-world scenarios framed as graph data 3. Working with graphs in Python. The overall goal of this talk is to spark your interest and show you what’s out there as a jumping-off point for you to go deeper
  • 3. Graph: “A structure amounting to a set of objects in which some pairs of the objects are in some sense ‘related’. The objects correspond to mathematical abstractions called vertices (also called nodes or points) and each of the related pairs of vertices is called an edge (also called an arc or line)” – Richard Trudeau, Introduction to Graph Theory (1st edition, 1993) Graph Analytics: “Analysis of data structured as a graph (sometimes also part of network analysis or link analysis depending on scope and context)” – Me, talking to a stress ball as I made these slides
  • 4.
  • 5. • We see two vertices joined by a single edge • Vertex 1 is adjacent to vertex 2 • The neighborhood of vertex 1 is all adjacent vertices (vertex 2 in this case)
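Adjacency and neighborhoods map directly onto NetworkX calls. The sketch below rebuilds the two-vertex graph described on this slide; the variable names are illustrative only.

```python
import networkx as nx

# Two vertices joined by a single edge, as described on the slide.
G = nx.Graph()
G.add_edge(1, 2)

# Vertex 1 is adjacent to vertex 2 if an edge joins them.
is_adjacent = G.has_edge(1, 2)

# The neighborhood of vertex 1 is the set of all vertices adjacent to it.
neighborhood = sorted(G.neighbors(1))

print(is_adjacent)   # True
print(neighborhood)  # [2]
```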
  • 6.
  • 7. • We see that there is a loop on vertex a • Vertices a and b have multiple edges between them • Vertex c has a degree of 3 • There exists a path from vertex a to vertex e • Vertices f, g, and h form a 3-cycle
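Loops and multiple edges need a multigraph, which NetworkX provides as `nx.MultiGraph`. The edge list below is an assumption: the slide image is not in the transcript, so this is just one graph that satisfies the five statements above.

```python
import networkx as nx

# Hypothetical edge list chosen to match the statements on the slide.
G = nx.MultiGraph()
G.add_edge("a", "a")                      # loop on vertex a
G.add_edge("a", "b")
G.add_edge("a", "b")                      # second, parallel a-b edge
G.add_edges_from([("b", "c"), ("c", "d"), ("c", "e"), ("d", "e")])
G.add_edges_from([("f", "g"), ("g", "h"), ("h", "f")])  # 3-cycle

loop_count = nx.number_of_selfloops(G)
ab_edges = G.number_of_edges("a", "b")
degree_c = G.degree("c")
a_reaches_e = nx.has_path(G, "a", "e")
fgh_is_cycle = all(G.has_edge(u, v) for u, v in [("f", "g"), ("g", "h"), ("h", "f")])

print(loop_count, ab_edges, degree_c, a_reaches_e, fgh_is_cycle)
# 1 2 3 True True
```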
  • 8. • We have no single cut vertex or cut edge (one that would create more disjoint vertex/edge sets if removed) • We can separate this graph into two disconnected sets: 1) Vertex Set 1 = {a, b, c, d, e} 2) Vertex Set 2 = {f, g, h}
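NetworkX calls cut vertices "articulation points" and cut edges "bridges". The graph below is hypothetical (the slide image is unavailable), and it assumes the two vertex sets are joined by two crossing edges, so that no single removal disconnects anything.

```python
import networkx as nx

# Hypothetical graph: two clusters joined by two crossing edges.
G = nx.Graph()
G.add_edges_from([("a", "b"), ("b", "c"), ("c", "d"),
                  ("c", "e"), ("d", "e"), ("a", "e")])
G.add_edges_from([("f", "g"), ("g", "h"), ("h", "f")])
crossing = [("e", "f"), ("d", "g")]       # assumed links between the sets
G.add_edges_from(crossing)

# No single vertex or edge disconnects the graph.
cut_vertices = list(nx.articulation_points(G))
cut_edges = list(nx.bridges(G))

# Removing both crossing edges separates the two vertex sets.
H = G.copy()
H.remove_edges_from(crossing)
parts = sorted(sorted(component) for component in nx.connected_components(H))

print(cut_vertices, cut_edges)  # [] []
print(parts)  # [['a', 'b', 'c', 'd', 'e'], ['f', 'g', 'h']]
```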
  • 9. • Imagine symmetric vertex labels along the top and left-hand sides of the matrix • A one in a particular slot tells us that the two vertices are adjacent
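An adjacency matrix can be built by hand to see how the labels line up. The four-vertex graph is an assumption made for illustration; the matrix is built in plain Python so nothing beyond NetworkX is needed, although NetworkX can also export one with `nx.to_numpy_array`.

```python
import networkx as nx

# Hypothetical small graph used only to illustrate the matrix layout.
G = nx.Graph([(1, 2), (2, 3), (1, 3), (3, 4)])

# The same labels run along the top and down the left-hand side.
labels = sorted(G.nodes())
matrix = [[1 if G.has_edge(u, v) else 0 for v in labels] for u in labels]

for label, row in zip(labels, matrix):
    print(label, row)
# 1 [0, 1, 1, 0]
# 2 [1, 0, 1, 0]
# 3 [1, 1, 0, 1]
# 4 [0, 0, 1, 0]

# For an undirected graph the matrix equals its own transpose.
is_symmetric = all(matrix[i][j] == matrix[j][i]
                   for i in range(len(labels)) for j in range(len(labels)))
```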
  • 10. • In this graph two vertices are joined by a single directed edge • There is a dipath from vertex 1 to vertex 2 but not from vertex 2 to vertex 1
  • 11. • Every vertex has ‘played’ every other vertex • We can see that there is no clear winner (every vertex has indegree and outdegree of 2)
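A tournament like this can be sketched as a directed graph in which each pair of vertices is joined by exactly one directed edge. The five-vertex construction below is an assumption, chosen because it gives every vertex an indegree and outdegree of 2; the slide's own graph may be drawn differently.

```python
from itertools import combinations

import networkx as nx

# Assumed construction: vertex i "beats" vertices i+1 and i+2 (mod 5).
n = 5
G = nx.DiGraph()
for i in range(n):
    G.add_edge(i, (i + 1) % n)
    G.add_edge(i, (i + 2) % n)

# Every pair has played exactly once: one direction is present, never both.
everyone_played = all(G.has_edge(u, v) != G.has_edge(v, u)
                      for u, v in combinations(G, 2))

in_degrees = dict(G.in_degree())
out_degrees = dict(G.out_degree())

print(everyone_played)  # True
print(in_degrees)       # {0: 2, 1: 2, 2: 2, 3: 2, 4: 2}
print(out_degrees)      # {0: 2, 1: 2, 2: 2, 3: 2, 4: 2}
```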
  • 12. • Vertices from Set 1 = {a, b, c, d} are only adjacent to vertices from Set 2 = {e, f, g, h} • This can be extended to tripartite graphs (3 sets) or as many sets as we like (n-partite graphs) • Can we pair vertices from each set together?
  • 13. We can pair every vertex from one set to a vertex from the other using only existing edges
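Pairing vertices across the two sets is a bipartite matching problem. The sketch uses the two vertex sets from the slides, but the edges are hypothetical, since the slide image is not part of the transcript.

```python
import networkx as nx
from networkx.algorithms import bipartite

set_1 = {"a", "b", "c", "d"}
set_2 = {"e", "f", "g", "h"}

# Hypothetical edges; each one joins a vertex in Set 1 to one in Set 2.
G = nx.Graph()
G.add_edges_from([("a", "e"), ("a", "f"), ("b", "e"), ("c", "f"),
                  ("c", "g"), ("d", "g"), ("d", "h")])

is_bipartite = nx.is_bipartite(G)

# maximum_matching returns a dict containing both directions of each pair.
matching = bipartite.maximum_matching(G, top_nodes=set_1)
pairs = {u: matching[u] for u in sorted(set_1) if u in matching}

print(is_bipartite)  # True
print(pairs)         # {'a': 'f', 'b': 'e', 'c': 'g', 'd': 'h'}
```

With these edges b can only pair with e, which forces the remaining pairs, so the perfect matching is unique.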
  • 14. • We can assign weights to edges of a graph • As we follow a path through the graph, these weights accumulate • For example, the path a -> b -> c has an associated weight of 0.5 + 0.4 = 0.9
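The accumulated weight of a path can be computed by summing the `weight` attribute over consecutive vertex pairs. Only the two edge weights named on the slide are used; the helper name `path_weight` is introduced here for illustration.

```python
import networkx as nx

G = nx.Graph()
G.add_edge("a", "b", weight=0.5)
G.add_edge("b", "c", weight=0.4)

def path_weight(graph, path):
    """Sum the edge weights along consecutive vertices in path."""
    return sum(graph[u][v]["weight"] for u, v in zip(path, path[1:]))

total = path_weight(G, ["a", "b", "c"])
print(round(total, 10))  # 0.9
```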
  • 15. • We can assign colors to vertices • The graph we see here has a proper coloring (no two vertices of the same color are adjacent) • We can also color edges!
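Checking whether a coloring is proper takes one pass over the edges. The four-vertex cycle and its colors are assumptions made for illustration, and `is_proper_coloring` is a helper defined here rather than a NetworkX function.

```python
import networkx as nx

def is_proper_coloring(graph, colors):
    """Return True if no edge joins two vertices of the same color."""
    return all(colors[u] != colors[v] for u, v in graph.edges())

# Hypothetical example: a 4-cycle colored with two alternating colors.
G = nx.Graph([("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")])
good = {"a": "red", "b": "blue", "c": "red", "d": "blue"}
bad = {"a": "red", "b": "blue", "c": "blue", "d": "blue"}

good_is_proper = is_proper_coloring(G, good)
bad_is_proper = is_proper_coloring(G, bad)

print(good_is_proper)  # True
print(bad_is_proper)   # False (b and c are adjacent and both blue)
```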
  • 16. • Are we focused more on objects or the relationships/interactions between them? • Are we looking at transition states? • Is orientation important? If you can imagine a graph to represent it, it’s probably worth giving it a shot, if only for your own learning and exploration!
  • 17. • If the lines represent connections, what can we say about the people highlighted in red? • What kinds of questions might a graph be able to answer?
  • 18. • e and d have the highest degree • What might the c-d-e cycle tell us? • What can we say about cut vertices?
  • 19. If we have page view data with timestamps how might we represent this as a graph?
  • 20. • What might loops or multiple edges between vertices represent? • What types of data might we want to use as values on the edges? • What might comparing indegrees and outdegrees on different vertices represent?
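One possible answer, sketched under assumptions: treat pages as vertices and each consecutive pair of views by the same user as a directed edge, with the edge weight counting how often that transition occurs. The `views` records and page names below are invented for illustration.

```python
from collections import defaultdict

import networkx as nx

# Hypothetical page view records: (user, timestamp, page).
views = [
    ("u1", 1, "home"), ("u1", 2, "search"), ("u1", 3, "search"), ("u1", 4, "product"),
    ("u2", 1, "home"), ("u2", 5, "product"), ("u2", 9, "home"),
    ("u3", 1, "home"), ("u3", 2, "search"),
]

# Order each user's views by timestamp.
sessions = defaultdict(list)
for user, timestamp, page in sorted(views, key=lambda v: (v[0], v[1])):
    sessions[user].append(page)

# Each consecutive pair of pages becomes a directed edge; repeats add weight.
G = nx.DiGraph()
for pages in sessions.values():
    for src, dst in zip(pages, pages[1:]):
        if G.has_edge(src, dst):
            G[src][dst]["weight"] += 1
        else:
            G.add_edge(src, dst, weight=1)

home_to_search = G["home"]["search"]["weight"]   # 2: the same transition twice
has_refresh_loop = G.has_edge("search", "search")  # a loop: same page viewed twice in a row
home_out = G.out_degree("home", weight="weight")   # 3 weighted exits from home
home_in = G.in_degree("home", weight="weight")     # 1 weighted entry into home
```

In this framing a loop is a repeat view of the same page, and a page whose outdegree far exceeds its indegree acts as an entry point.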
  • 21. If we have to regularly pick up a load at the train station, make deliveries to every factory and then return to the garage how can a graph help us find an optimal route?
  • 22. • We can assign weights to each edge to represent distance, travel time, gas cost for the distance, etc. • The path with the lowest total weight represents the shortest/cheapest/fastest/etc. route • Note that edge weights are only displayed for f-e and f-a
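For a handful of stops, the cheapest route can be found by trying every delivery order. Everything specific below is an assumption: the vertex names, the three factories, and the weights are invented, and brute force is only practical for small graphs.

```python
from itertools import permutations

import networkx as nx

# Hypothetical weighted graph (weights could be distance, time, or cost).
G = nx.Graph()
G.add_weighted_edges_from([
    ("garage", "station", 4), ("garage", "f1", 3), ("garage", "f2", 6),
    ("garage", "f3", 5), ("station", "f1", 2), ("station", "f2", 5),
    ("station", "f3", 7), ("f1", "f2", 2), ("f1", "f3", 4), ("f2", "f3", 3),
])

def route_weight(graph, route):
    """Total weight of the edges along a route."""
    return sum(graph[u][v]["weight"] for u, v in zip(route, route[1:]))

# Start at the garage, pick up at the station, visit every factory, return.
factories = ["f1", "f2", "f3"]
candidates = (["garage", "station", *order, "garage"]
              for order in permutations(factories))
best_route = min(candidates, key=lambda route: route_weight(G, route))
best_weight = route_weight(G, best_route)

print(best_route)   # ['garage', 'station', 'f1', 'f2', 'f3', 'garage']
print(best_weight)  # 16
```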
  • 23. If the following people want to attend the following talks (a-h), what’s the minimum number of sessions we need to satisfy everyone?
  • 24. • We can use the talks as vertices and add edges between talks that have the same person interested • The minimum number of colors needed for a proper coloring shows us the minimum number of sessions we need to satisfy everyone
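The conflict graph described above can be built from a mapping of people to talks. The names and preferences are hypothetical. Note that `nx.greedy_color` returns a proper coloring but does not guarantee the minimum number of colors; in this example the triangle a-b-c forces at least three, so three is optimal here.

```python
from itertools import combinations

import networkx as nx

# Hypothetical preferences: person -> talks they want to attend.
interests = {
    "alice": ["a", "b", "c"],
    "bob": ["c", "d"],
    "carol": ["d", "e"],
    "dave": ["f", "g", "h"],
}

# Talks are vertices; two talks conflict if one person wants both.
G = nx.Graph()
G.add_nodes_from("abcdefgh")
for talks in interests.values():
    G.add_edges_from(combinations(talks, 2))

# Each color is a session; conflicting talks never share a session.
coloring = nx.greedy_color(G, strategy="largest_first")
sessions_needed = len(set(coloring.values()))
is_proper = all(coloring[u] != coloring[v] for u, v in G.edges())

print(sessions_needed)  # 3
print(is_proper)        # True
```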
  • 27. • GraphML (XML-based) • GML (ASCII-based) • NetworkX has built in functions to work with a Pandas DataFrame or a NumPy array/matrix
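Both file formats can be tried without touching disk by serializing to a string and parsing it back. This sketch covers GraphML and GML only; the three-vertex graph is an assumption.

```python
import networkx as nx

# Hypothetical weighted graph to serialize.
G = nx.Graph()
G.add_edge("a", "b", weight=0.5)
G.add_edge("b", "c", weight=0.2)

# GraphML (XML-based): generate the document as text, then parse it back.
graphml_text = "\n".join(nx.generate_graphml(G))
from_graphml = nx.parse_graphml(graphml_text)

# GML (ASCII-based): same round trip.
gml_text = "\n".join(nx.generate_gml(G))
from_gml = nx.parse_gml(gml_text)

print(graphml_text.splitlines()[0][:8])   # <graphml
print(from_graphml["a"]["b"]["weight"])   # 0.5
print(from_gml["b"]["c"]["weight"])       # 0.2
```

To work with files instead, `nx.write_graphml` / `nx.read_graphml` and `nx.write_gml` / `nx.read_gml` take a path.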
  • 28.
    import networkx as nx
    import matplotlib.pyplot as plt

    # Build an undirected graph on vertices 1 through 5
    G = nx.Graph()
    G.add_nodes_from(range(1, 6))
    G.add_edges_from([(1, 2), (2, 3), (5, 4), (4, 2), (1, 3), (5, 1), (5, 2), (3, 4)])

    # Compute one layout and reuse it so every layer lines up
    pos = nx.spring_layout(G)
    nx.draw_networkx_nodes(G, pos, node_size=20)
    nx.draw_networkx_edges(G, pos, width=5)
    nx.draw_networkx_labels(G, pos, font_size=14)
    # nx.draw redraws the graph with default styling and hides the axes
    nx.draw(G, pos)
    plt.show()
  • 29.
    import networkx as nx
    import matplotlib.pyplot as plt

    # Build a weighted triangle on vertices a, b, c
    G = nx.Graph()
    G.add_nodes_from(['a', 'b', 'c'])
    G.add_edge('a', 'b', weight=0.5)
    G.add_edge('b', 'c', weight=0.2)
    G.add_edge('c', 'a', weight=0.7)

    # Draw nodes, edges, vertex labels, and edge labels on a shared layout
    pos = nx.spring_layout(G)
    nx.draw_networkx_nodes(G, pos, node_size=500)
    nx.draw_networkx_edges(G, pos, width=6)
    nx.draw_networkx_labels(G, pos, font_size=14)
    # With no edge_labels argument, each edge is labeled with its attribute dict
    nx.draw_networkx_edge_labels(G, pos, font_size=14)
    nx.draw(G, pos)
    plt.show()
  • 30.
    >>> list(G.nodes())
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
    >>> nx.shortest_path(G, 1, 18)
    [1, 3, 18]
    >>> dict(G.degree())
    {1: 4, 2: 3, 3: 4, 4: 4, 5: 4, 6: 3, 7: 3, 8: 3, 9: 4, 10: 3, 11: 2, 12: 2, 13: 2, 14: 4, 15: 3, 16: 3, 17: 2, 18: 3, 19: 3, 20: 3}
  • 31.
  • 32.
    >>> nx.greedy_color(G)
    {'d': 0, 'a': 0, 'e': 1, 'b': 1, 'c': 1, 'f': 2, 'h': 1, 'g': 0}
    >>> temp = nx.greedy_color(G)
    >>> len(set(temp.values()))
    3
  • 33.
    import networkx as nx
    import matplotlib.pyplot as plt

    # Build a directed graph from a list of (source, target) edges
    G = nx.DiGraph([(1, 2), (1, 3), (4, 1), (1, 5), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5)])

    pos = nx.circular_layout(G)
    nx.draw_networkx_nodes(G, pos, node_size=200)
    nx.draw_networkx_edges(G, pos)
    nx.draw_networkx_labels(G, pos, font_size=14)
    plt.show()

    >>> nx.has_path(G, 1, 5)
    True
    >>> nx.has_path(G, 5, 1)
    False
    >>> nx.shortest_path(G, 1, 4)
    [1, 2, 4]
  • 35. • There’s a NetworkX tutorial tomorrow! • In-browser Graphviz: webgraphviz.com • Free graph theory textbook: An Introduction to Combinatorics and Graph Theory, David Guichard • Open problems in graph theory: openproblemgarden.org • Graph databases • Association for Computational Linguistics (ACL) 2010 Workshop on Graph-based Methods for Natural Language Processing • Free papers: researchgate.net