SlideShare a Scribd company logo
1 of 80
Download to read offline
WHEREWE ARE
•  Graphs
•  What is a graph?
•  Graph Structures
•  Graph Analytics
•  Traversals
•  Patterns
•  Recursion
•  Graph Representations
1
WHAT IS A GRAPH?
G = (V, E)
V is a set of vertices
(synonym: nodes)
E is a set of edges.
• Each edge is a pair
(vsource,vtarget)
• Edges may be considered
directed or undirected
2
b d
e c
f g
a
GRAPHS IN THEWILD
The Web
The Internet
Social networks
Communication logs
• Ex: PRISM
many more – ubiquitous!
(why?)
3
b d
e c
f g
a
GRAPH ANALYTICS
Structural Algorithms
Traversal Algorithms
Pattern-Matching Algorithms
4
b d
e c
f g
a
Analyzing Analytics,Bordawekar 2012
GRAPH ANALYTICS:SOME STRUCTURAL
TASKS
Basic Metrics
•  |V|, |E|
•  |E| is more interesting than |V|
•  in-degree(v), out-degree(v)
5
Analyzing Analytics,Bordawekar 2012
“I have a big graph”
“How many edges, and
what is the highest in or out
degree?”
b d
e c
f g
a
WHEREWE ARE
•  Graphs
•  What is a graph?
•  Graph Structures
•  Graph Analytics
•  Traversals
•  Patterns
•  Recursion
•  Graph Representations
6
STRUCTURAL TASK:THE HISTOGRAM OF A
GRAPH
Outdegree of a vertex = number of
outgoing edges
For each integer d, let n(d) = number of
vertices with outdegree d
The outdegree histogram of a graph =
the scatterplot (d, n(d))
7
0
2
4
2
1
1
1
d n(d)
0 1
1 3
2 2
3 0
4 1
0
1
2
3
4
0 1 2 3 4 5
d
n
Outdegree 1 is
seen at 3 nodessrc: Dan Suciu
HISTOGRAMS TELL US SOMETHING ABOUT
THE GRAPH
8
What can you
say about these
graphs?
0
20
40
60
80
100
120
0 5 10
x10000
0
50
100
150
0 5 10
x10000
0
20
40
60
80
100
120
0 5 10
x10000
src: Dan Suciu
EXPONENTIAL DISTRIBUTION
9
(generally, cxd, for some x < 1)
A random graph has exponential distribution
Best seen when n is on a log scale
1
100
10000
1000000
0 5 10
n
0
500000
1000000
1500000
0 5 10
n
Quickly vanishing
src: Dan Suciu
src: Dan Suciu
Long tail
ZIPF DISTRIBUTION
10
for some value x>0
Human-generated data has Zipf distribution: letters in
alphabet, words in vocabulary, etc.
Best seen in a log-log scale
1000
10000
100000
1 4 16
n
10
100
1000
10000
100000
0 2 4 6 81012141618
n
src: Dan Suciu
A HISTOGRAM OF THEWEB
11
Late 1990’s
200M Webpages
Exponential ?
Zipf ?
Broder et al. 2000
THE BOWTIE STRUCTURE OF THEWEB
12
Broder et al. 2000
WHEREWE ARE
•  Graphs
•  What is a graph?
•  Graph Structures
•  Graph Analytics
•  Traversals
•  Patterns
•  Recursion
•  Graph Representations
13
GRAPH ANALYTICS:SOME STRUCTURAL
TASKS
Diameter
• Longest of all shortest paths
14
b d
e c
f g
a
Analyzing Analytics,Bordawekar 2012
What is the
diameter of this
graph?
GRAPH ANALYTICS:SOME STRUCTURAL
TASKS
Connectivity Coefficient
•  Minimum number of vertices you need to remove that will
disconnect the graph
15
b d
e c
f g
a
What is the connectivity coefficient
of this graph?
Why might you want to compute
the connectivity coefficient?
May depend on what is meant by “connectivity”
x and y are strongly connected if
x is reachable from y and y is reachable from x
x and y are connected if
x is reachable from y or y is reachable from x
Analyzing Analytics,Bordawekar 2012
GRAPH ANALYTICS:SOME STRUCTURAL
TASKS
We may want to understand the “importance” of a vertex
Various notions of Centrality
•  Closeness Centrality of a vertex:
•  Average length of all its shortest paths
•  Betweeness Centrality of a vertex v:
•  the fraction of all shortest paths that pass through v
16
b d
e c
f g
a
What is the betweeness
centrality of vertex e?
Why might you want to compute
betweenness centrality?
Analyzing Analytics,Bordawekar 2012
GRAPH ANALYTICS:SOME STRUCTURAL
TASKS
Degree Centrality
• Degree centrality of v: degree(v) / |E|
•  “Important” nodes are those with high degree
But: say we both have 5 friends
• We have the same degree centrality
• But what if your 5 friends are Barack Obama, Larry
Page, Bill Gates, the Dalai Lama, and Oprah Winfrey?
• Shouldn’t you be considered more “important?”
17
Analyzing Analytics,Bordawekar 2012
EIGENVECTOR CENTRALITY (I.E.,
PAGERANK)
18
Basic idea (but oversimplified)
while not converged:
for each vertex v
rank(v) = sum of ranks from incoming edges
Two problems:
1) If a page with millions of outgoing links
links to me, that’s less valuable than a page
with only a few outgoing links.
2) If I’m 27 hops away from Barack Obama, he
shouldn’t really influence my rank. We need
a damping factor.
b d
e c
f g
a
PAGERANK:2ND ATTEMPT
19
•  Assume edges BàA, CàA, DàA, …
•  PR(X) is the PageRank of vertex X
•  L(X) is the number of outgoing links from X
•  d is the damping factor
•  N is the number of vertices
while not converged:
ensures ranks sum to 1
WHEREWE ARE
•  Graphs
•  What is a graph?
•  Graph Structures
•  Graph Analytics
•  Traversals
•  Patterns
•  Recursion
•  Graph Representations
20
TRAVERSAL TASKS
Minimum Spanning Tree
•  Find the smallest subset of edges that (weakly) connect the
graph
4
b d
e c
f g
a
What is a minimum
spanning tree of this
graph?
Why might you
want to compute
this?
TRAVERSAL TASKS:EULER’S BRIDGES OF
KONIGSBERG
5
Can we visit every vertex and cross
each bridge only once?
Observation: if you enter by a bridge,
you must leave by a bridge
Result: If you don’t care where you start
and end, then at most two vertices can
have an odd degree.
Result: If you want to start and end in the
same place, every vertex must have an
even degree.
Very easy to check!
TRAVERSAL TASKS:HAMILTONIAN PATHS
6
Can we create a path that visits every vertex only
once?
A lot harder! Why?
Intuition: Each vertex is involved in a lot of possible
paths. But we can only “use” the vertex once. Which
is the “best” way to use the vertex?
Extension: Assume there is a cost to traversing each
edge. Can we find the path that visits every vertex
with the minimum cost?
No efficient algorithm can exist. Heuristics and
approximations are the best we can do.
b d
e c
f g
a
TRAVERSAL TASKS
Maximum Flow
• Input: A graph with labeled edges indicating the
“capacity” of that edge, and special vertices called
sources and sinks
• Find a subgraph that maximizes flow between sources
and sinks
7
b d
e c
f g
a
5
4
4
2
3
3 2
3
2 4
5
Condition: for each vertex, incoming
flow must equal outgoing flow
WHEREWE ARE
•  Graphs
•  Traversals
•  Patterns
•  Pattern Matching
•  Pattern Expressions
•  Recursion
•  Graph Representations
25
PATTERN-MATCHING TASKS
Find all instances of a pattern
May involve vertex or edge labels
26
b d
e c
f g
a
$y
$x
POPULAR PATTERN-MATCHING TASK:TRIANGLES
Find Triangles a ! b ! c ! a
The total number of triangles is another
measure of connectedness.
Lots of algorithms; a popular challenge
problem for computer scientists
Utility in practice is not so clear (to me)
Key idea to remember: a naïve algorithm
will find the same triangle multiple times.
a à b à c à a
b à c à a à b
c à a à b à c
You can extend this to k-cycles, etc.
27
b d
e c
f g
a
ANOTHER PATTERN MATCH EXAMPLE
Given a graph with edge labels
•  Drug X interferes with DrugY
•  DrugY regulates the expression of gene Z
•  gene Z is associated with disease w
Find drugs that interfere with another drug
involved in the treatment of a disease
28
$y$x $z
interferes_with regulates
$w
associated_with
NEED SOME KIND OF PATTERN EXPRESSION
LANGUAGE
SPARQL
29
SELECT ?x WHERE
?x interferes_with ?y .
?y regulates ?z .
?z associated_with?w .
subject predicate object
terazine interferes_with betamin
betamin regulates GT1234
GT1234 associated_with cancer
doratin regulates XY6789
SQL-like language over a
fixed schema:
subject, predicate, object
RDF = “Resource Description Framework”
Defines a formal data model for (subject, predicate, object) “triples.”
A triple is just a labeled edge in a graph.
NEED SOME KIND OF PATTERN EXPRESSION
LANGUAGE
Datalog
30
Ans(x) :-
R(x, interferes_with, y),
R(y, regulates, z),
R(z, associated_with, w)
Assume a relation R(subject, predicate, object)
NEED SOME KIND OF PATTERN EXPRESSION
LANGUAGE
SQL
31
SELECT i.subject
FROM R i, R r, R a
WHERE i.predicate = “interferes_with”
AND r.predicate = “regulates”
AND a.predicate = “associated_with”
AND i.object = r.subject
AND r.object = a.subject
Assume a relation R(subject, predicate, object)
WHEREWE ARE
•  Graphs
•  Traversals
•  Patterns
•  Pattern Matching
•  Pattern Expressions
•  Recursion
•  Graph Representations
32
NEED SOME KIND OF PATTERN EXPRESSION
LANGUAGE
Relational Algebra
33
σp=interferes_with(R) o=s σp=regulates(R) o=s σp=associated_with(R)
Assume a relation R(subject, predicate, object)
NEED SOME KIND OF PATTERN EXPRESSION
LANGUAGE
Datalog, 2nd formulation
34
Ans(x) :-
InterferesWith(x, y),
Regulates(y, z),
AssociatedWith(z, w)=
Assume relations
InterferesWith(drug1, drug2),
Regulates(drug, gene),
AssociatedWith(gene, disease)
NEED SOME KIND OF PATTERN EXPRESSION LANGUAGE
SQL, 2nd formulation
35
SELECT i.drug1
FROM InterferesWith i, Regulates r, AssociatedWith a
WHERE i.drug2 = r.drug
AND r.gene = a.gene
Assume relations
InterferesWith(drug1, drug2),
Regulates(drug, gene),
AssociatedWith(gene, disease)
ASIDE ON SPARQL
Designed for graph query
Not algebraically closed
•  Input is a graph, but output is a set of records
Limited expressiveness
•  Support for path expressions since v1.1
•  But not arbitrary recursion
If your input is tabular, you have to shred it into a
graph before using SPARQL
•  Often a 3x-6x blow up in size!
36
ASIDE:PRISM-LIKE EXAMPLE IN DATALOG
37
http://www.businessweek.com/magazine/palantir-the-vanguard-of-cyberterror-
security-11222011.html
“In October, a foreign national named
Mike Fikri purchased a one-way plane
ticket from Cairo to Miami, where he
rented a condo.”
“Over the previous few weeks, he’d
made a number of large withdrawals
from a Russian bank account and
placed repeated calls to a few people
in Syria.”
“More recently, he rented a truck,
drove to Orlando, and visited Walt
Disney World by himself.”
BoughtFlight(‘Fikri’,‘Cairo’,
‘Miami’,‘oneway’ 4/1/2013)
Withdrawal(‘Fikri’, 5000,‘some bank’,
4/4/2013)
Withdrawal(‘Fikri’, 2000,‘some bank’,
4/5/2013)
…
Rented(‘Fikri’,‘truck’,‘Miami’,‘Orlando’,
4/13/2013)
PRISM EXAMPLE MODELED IN
DATALOG
38
http://www.businessweek.com/magazine/palantir-the-
vanguard-of-cyberterror-security-11222011.html
flag(person, 1, date) :- BoughtFlight(person, origin, destination, oneway, date),
FlaggedAirports(origin),
USAirports(destination),
oneway=‘oneway’
alert(person, flagcnt, mindate, maxdate) :-
totalflags(person, flagcnt, mindate, maxdate),
maxdate – mindate < 10 days, flagcnt > 3
totalflags(person, sum(flag), min(date), max(date) :- flag(person, flag, date)
flag(person, 1, date) :- Rented(person, vehicle, origin, dest, date),
ImportantLocation(dest)
ForeignWithdrawal(person, sum(amount), date) :-
Withdrawal(person, amount, bank, date),
ForeignBank(bank), amount > 1000
flag(person, 1, date) :- ForeignWithdrawal(person, totamount, date), totamount>10k
ANOTHER DATALOG EXAMPLE
39
contacted(Person1, Person2,Time) :-
email(Person1, Person2, Message,Time, …)
contacted(Person1, Person2,Time) :-
called(Time,Voicemail, Person1, Person2, …)
contacted(Person1, Person2,Time) :-
text_message(Time, Message, Person1, Person2, …),
Who contacted who,when? We don’t care how.
ANOTHER DATALOG EXAMPLE
40
knew(“Sam”).
knew(Person2) :-
knew(Person1)
email(Person1, Person2, Message,Time),
Time < “June 3”.
knew(Person2) :-
knew(Person1),
met_with(Person1, Person2,Time),
Time < “June 3”.
Who could have known before June 3 that X was going to
happen?
self-reference!
WHEREWE ARE
41
1. Graph Tasks
3. Structural 5.Traversal 6. Patterns2. Ex: Histograms
4. Ex: PageRank
9. Ex: Datalog in MR
7. Pattern Languages
12. Ex: PageRank in Pregel
11. Ex: PageRank in MR
8. Ex: PRISM
10. Representations
WHEREWE ARE
•  Graphs
•  Traversals
•  Patterns
•  Recursion
•  What’s recursion?
•  Recursive Graphs
•  Graph Representations
42
A(y) :- Edge(‘w’, y)
A(y) :- A(x), Edge(x,y)
Reachability from a given node
wstep 0: A0 = all nodes connected to ‘w’
step 1: for each node in A0, find neighbors
step 2: for each node in A1, find neighbors
EVALUATION OF RECURSIVE PROGRAMS
43
EVALUATING RECURSIVE QUERIES
Problems with this approach
• What if there are cycles in the graph?
44
Bob
Sue Tom Lin
EXAMPLE OF SEMI-NAÏVE EVALUATION
45
Join Dupe-elim
A(y) :- R(‘a’, y)
A(y) :- A(x), R(x,y)
Reachability from ‘a’ in datalog
IN MAPREDUCE
46
i=i+1
Anything new?
Driver
done
(compute next
generation of nodes)
(remove the ones
we’ve already seen)
map
map
map
map
map
reduce
reduce
Join Difference
ΔAi-1
R(1)
R(0)
mapAi
(0)
Ai
(1)
reduce
reduce
map
WHEREWE ARE
•  Graphs
•  Traversals
•  Patterns
•  Recursion
•  What’s recursion?
•  Recursive Graphs
•  Graph Representations
47
IN MAPREDUCE
48
i=i+1
Anything new?
Driver
done
(compute next
generation of nodes)
(remove the ones
we’ve already seen)
map
map
map
map
map
reduce
reduce
Join Difference
ΔAi-1
R(1)
R(0)
mapAi
(0)
Ai
(1)
reduce
reduce
map
EVALUATING RECURSIVE QUERIES AT SCALE
49
(a) R is loop invariant,but gets loaded and shuffled on each iteration
(b) Ai grows slowly and monotonically,but is loaded and shuffled on each
iteration.
Bu et al.VLDB10,VLDBJ12
Shaw et al. Datalog12
map
map
map
map
map
reduce
reduce
Join Difference
ΔAi-1
R(1)
R(0)
mapAi
(0)
Ai
(1)(a) (b)
reduce
reduce
map
(compute the next
generation of answers)
(remove the ones
we’ve already seen)
IDEA:CACHE LOOP-INVARIANT DATA
50
map map
map
reduce
reduce
Join Difference
ΔAi-1 reduce
reduce
mapAi
(0)
Ai
(1)
map
R(1)
R(0)
Iteration i > 0:
map
mapR(1)
R(0)
Iteration i = 0: Load a distributed cache
A(1)
A(0)
51
0
100
200
300
400
500
600
700
0 2 4 6 8 10
time(s)
iteration # (3 jobs per iteration)
outer cache time (ms)
no cache
BTC2010, 680GB
Cache-enabled join
(query: transitive reachability from 7 nodes)
Takeaways:
•  10x improvement over no optimizations.
•  All optimizations are useful
•  We’re approaching the raw overhead of Hadoop (bottom gray line)
0

5000

10000

15000

20000

25000

30000

35000

0
 50
 100
 150
 200
 250

'me
(s)

itera'on
#

(a) all optimizations
raw Hadoop
(d): (a) w/o diff cache
(b): (a) w/o faster cache
(c): (a) w/o project opt.
(e): (a) w/o file combiner
no optimizations
EFFECT OFVARIOUS OPTIMIZATIONS FOR A
RECURSIVE GRAPH QUERY ON BTC 2010
52
NEW TUPLES DISCOVERED BY ITERATION
NUMBER
53
1
10
100
1,000
10,000
100,000
1,000,000
10,000,000
100,000,000
0 20 40 60 80 100 120 140 160 180
#oftuplesdiscovered
iteration #
frontier tuples
previously discovered tuples removed
WHEREWE ARE
•  Graphs
•  Traversals
•  Patterns
•  Recursion
•  Graph Representations
•  Examples of Graph Representations
•  Graphs with Big Data
54
GRAPH REPRESENTATIONS
Edge Table
55
Source Target
a b
b a
a f
b f
b e
b d
d e
d c
e g
g c
c g
b d
e c
f g
a
EXAMPLE USING EDGE TABLE
56
SELECT top 5 target, incount
FROM (
SELECT target, count(source) as incount
FROM edges
GROUP BY target
)
ORDER BY incount
Find top 5 highest in-degree vertices
GRAPH REPRESENTATIONS
Adjacency List
57
Source Target
a b, f
b a, d, e, f
d c, e
e g
g c
c g
b d
e c
f g
a
EXAMPLE USING ADJACENCY LIST (AND
MAPREDUCE)
58
Map:
Input key: vertex
Input value: adjacency list
Output key: adjacent vertex
Output value: vertex
Reduce:
Input key: vertex
Input value: list of vertices
Output key: vertex
Output value: length of list
Find top 5 highest in-degree vertices
b d
e c
f g
a
Source Target
a b, f
b a, d, e
d c, e
e g
g c
c g
Source Target
b a
f a
a b
d b
e b
e d
c d
g e
c g
g c
map
GRAPH REPRESENTATIONS
Adjacency Matrix
59
b d
e c
f g
a
a	 b	 c	 d	 e	 f	 g	
a	 0	 1	 0	 0	 0	 1	 0	
b	 1	 0	 0	 1	 1	 1	 0	
c	 0	 0	 0	 1	 0	 0	 1	
d	 0	 0	 1	 0	 1	 0	 0	
e	 0	 0	 0	 0	 0	 0	 1	
f	 0	 0	 0	 0	 0	 0	 0	
g	 0	 0	 1	 0	 0	 0	 0
EXAMPLE USING ADJACENCY MATRIX
60
Find top 5 highest in-
degree vertices
a	 b	 c	 d	 e f	 g	
a	 0	 1	 0	 0	 0 1	 0	
b	 1	 0	 0	 1	 1 1	 0	
c	 0	 0	 0	 1	 0 0	 1	
d	 0	 0	 1	 0	 1 0	 0	
e	 0	 0	 0	 0	 0 0	 1	
f	 0	 0	 0	 0	 0 0	 0	
g	 0	 0	 1	 0	 0 0	 0	
C = sum(A,1)
[B, IX] = sort(C)
[B, IX] = sort([5,6,4])
# B is [4,5,6]
# IX = [3,1,2]
Top = IX(1,5)
WHEREWE ARE
61
1. Graph
Tasks
3. Structural 5.Traversal 6. Patterns2. Ex: Histograms
4. Ex: PageRank
9-10. Ex: Loops in MR
7. Pattern
Languages
12. Ex: PageRank in
Pregel
12. Ex: PageRank in
MR
8. Ex: PRISM
11.
Representations
WHEREWE ARE
•  Graphs
•  Traversals
•  Patterns
•  Recursion
•  Graph Representations
•  Examples of Graph Representations
•  Graphs with Big Data
62
BIG GRAPHS
Social scale
•  1 billion vertices, 100 billion edges
Web scale
•  50 billion vertices, 1 trillion edges
Brain scale
•  100 billion vertices, 100 trillion edges
63
Gerhard et al, frontiers in
neuroinformatics, 2011
Web graph from the SNAP database
(http://snap.stanford.edu/data)
Paul Butler, Facebook, 2010
material adapted from
Paul Burkhardt,ChrisWaring
https://www.facebook.com/notes/facebook-
engineering/visualizing-friendships/
469716398919
MAPREDUCE FOR PAGERANK
class Mapper
method Map(id n, vertex N)
p " N.PAGERANK/|N.ADJACENCYLIST|
EMIT(id n, vertex N)
for all nodeid m in N.ADJACENCYLIST do
EMIT(id m, value p)
class Reducer
method REDUCE(id m, [p1, p2, …])
M " null, s " 0
for all p in [p1, p2, …] do
if ISVERTEX(p) then
M " p
else
s " s + p
M.PAGERANK " s * 0.85 + 0.15 / TOTALVERTICES
EMIT(id m, vertex M)
64
PROBLEMS
The entire state of the graph is shuffled on every iteration
We only need to shuffle the new rank contributions, not the graph
structure
Further, we have to control the iteration outside of MapReduce
65
PREGEL
Originally from Google
• Open source implementations
• Apache Giraph, Stanford GPS, Jpregel, Hama
Batch algorithms on large graphs
66
Malewicz et al. SIGMOD 10
while any vertex is active or max iterations not reached:
for each vertex:
process messages from neighbors from previous
iteration
send messages to neighbors
set active flag appropriately
this loop is run in parallel
67
class PageRankVertex: public Vertex<double, void, double> {
public:
virtual void Compute(MessageIterator* msgs) {
if (superstep() >= 1) {
double sum = 0;
for (; !msgs->Done(); msgs->Next())
sum += msgs->Value();
*MutableValue() = 0.15 / NumVertices() + 0.85 * sum;
}
if (superstep() < 30) {
const int64 n = GetOutEdgeIterator().size();
SendMessageToAllNeighbors(GetValue() / n);
} else {
VoteToHalt();
}
}
};
68
sum = sum(incoming values)
rank =
0.15
5+ 0.85*sum
0.2
0.2
0.2
0.2
0.2
69
0.1
0.1
0.066
0.066
0.066
0.2
0.2
0.2
0.2
0.2
0.2
0.2
0.2
70
0.1
0.1
0.066
0.066
0.066
0.2
0.2
0.172
0.03
0.426
0.34
0.03
0.2
sum = sum(incoming values)
rank =
0.15
5+ 0.85*sum
71
0.172
0.03
0.426
0.34
0.03
sum = sum(incoming values)
rank =
0.15
5+ 0.85*sum
72
0.172
0.03
0.426
0.34
0.03
0.015
0.015
0.01
0.01
0.01
0.172
0.34
sum = sum(incoming values)
rank =
0.15
5+ 0.85*sum
0.426
73
0.0513
0.03
0.69
0.197
0.03
0.015
0.015
0.01
0.01
0.01
0.172
0.34
0.426
sum = sum(incoming values)
rank =
0.15
5+ 0.85*sum
74
0.0513
0.03
0.69
0.197
0.03
sum = sum(incoming values)
rank =
0.15
5+ 0.85*sum
75
0.0513
0.03
0.69
0.197
0.03
0.015
0.015
0.01
0.01
0.01
0.0513
0.197
0.69
sum = sum(incoming values)
rank =
0.15
5+ 0.85*sum
76
0.0513
0.03
0.794
0.095
0.03
0.015 0.01
0.01
0.0513
0.197
sum = sum(incoming values)
rank =
0.15
5+ 0.85*sum
0.69
77
0.0513
0.03
0.794
0.095
0.03
0.015 0.01
0.01
0.0513
0.197
sum = sum(incoming values)
rank =
0.15
5+ 0.85*sum
0.69
78
0.0513
0.03
0.794
0.095
0.03
0.01
0.095
0.794
sum = sum(incoming values)
rank =
0.15
5+ 0.85*sum
79
0.0513
0.03
0.794
0.095
0.03
sum = sum(incoming values)
rank =
0.15
5+ 0.85*sum
SLIDES CAN BE FOUND AT:
TEACHINGDATASCIENCE.ORG

More Related Content

What's hot

What's hot (16)

Graph
GraphGraph
Graph
 
Graphs in Data Structure
 Graphs in Data Structure Graphs in Data Structure
Graphs in Data Structure
 
Map Coloring and Some of Its Applications
Map Coloring and Some of Its Applications Map Coloring and Some of Its Applications
Map Coloring and Some of Its Applications
 
Introduction to Graphs
Introduction to GraphsIntroduction to Graphs
Introduction to Graphs
 
Application Of Graph Data Structure
Application Of Graph Data StructureApplication Of Graph Data Structure
Application Of Graph Data Structure
 
SFC
SFCSFC
SFC
 
Graph Theory
Graph TheoryGraph Theory
Graph Theory
 
Graph Data Structure
Graph Data StructureGraph Data Structure
Graph Data Structure
 
Graphs data structures
Graphs data structuresGraphs data structures
Graphs data structures
 
Graphs data Structure
Graphs data StructureGraphs data Structure
Graphs data Structure
 
Data Structures - Lecture 10 [Graphs]
Data Structures - Lecture 10 [Graphs]Data Structures - Lecture 10 [Graphs]
Data Structures - Lecture 10 [Graphs]
 
Graphs in data structures
Graphs in data structuresGraphs in data structures
Graphs in data structures
 
On NURBS Geometry Representation in 3D modelling
On NURBS Geometry Representation in 3D modellingOn NURBS Geometry Representation in 3D modelling
On NURBS Geometry Representation in 3D modelling
 
Graph in Data Structure
Graph in Data StructureGraph in Data Structure
Graph in Data Structure
 
Graphs
GraphsGraphs
Graphs
 
LEC 12-DSALGO-GRAPHS(final12).pdf
LEC 12-DSALGO-GRAPHS(final12).pdfLEC 12-DSALGO-GRAPHS(final12).pdf
LEC 12-DSALGO-GRAPHS(final12).pdf
 

Similar to Bill howe 8_graphs

Graph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkXGraph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkXBenjamin Bengfort
 
141222 graphulo ingraphblas
141222 graphulo ingraphblas141222 graphulo ingraphblas
141222 graphulo ingraphblasMIT
 
141205 graphulo ingraphblas
141205 graphulo ingraphblas141205 graphulo ingraphblas
141205 graphulo ingraphblasgraphulo
 
An Introduction to Graph Databases
An Introduction to Graph DatabasesAn Introduction to Graph Databases
An Introduction to Graph DatabasesInfiniteGraph
 
Unit II_Graph.pptxkgjrekjgiojtoiejhgnltegjte
Unit II_Graph.pptxkgjrekjgiojtoiejhgnltegjteUnit II_Graph.pptxkgjrekjgiojtoiejhgnltegjte
Unit II_Graph.pptxkgjrekjgiojtoiejhgnltegjtepournima055
 
GraphX: Graph Analytics in Apache Spark (AMPCamp 5, 2014-11-20)
GraphX: Graph Analytics in Apache Spark (AMPCamp 5, 2014-11-20)GraphX: Graph Analytics in Apache Spark (AMPCamp 5, 2014-11-20)
GraphX: Graph Analytics in Apache Spark (AMPCamp 5, 2014-11-20)Ankur Dave
 
Can you trust the internet? An introduction to graph theory, computational co...
Can you trust the internet? An introduction to graph theory, computational co...Can you trust the internet? An introduction to graph theory, computational co...
Can you trust the internet? An introduction to graph theory, computational co...Denise Gosnell, Ph.D.
 
09_Graphs_handout.pdf
09_Graphs_handout.pdf09_Graphs_handout.pdf
09_Graphs_handout.pdfIsrar63
 
Graph Algorithms, Sparse Algebra, and the GraphBLAS with Janice McMahon
Graph Algorithms, Sparse Algebra, and the GraphBLAS with Janice McMahonGraph Algorithms, Sparse Algebra, and the GraphBLAS with Janice McMahon
Graph Algorithms, Sparse Algebra, and the GraphBLAS with Janice McMahonChristopher Conlan
 
Graph Introduction.ppt
Graph Introduction.pptGraph Introduction.ppt
Graph Introduction.pptFaruk Hossen
 
Embeddings the geometry of relational algebra
Embeddings  the geometry of relational algebraEmbeddings  the geometry of relational algebra
Embeddings the geometry of relational algebraNikolaos Vasiloglou
 
Graph Analysis over Relational Database. Roberto Franchini - Arcade Analytics
Graph Analysis over Relational Database. Roberto Franchini - Arcade AnalyticsGraph Analysis over Relational Database. Roberto Franchini - Arcade Analytics
Graph Analysis over Relational Database. Roberto Franchini - Arcade AnalyticsData Driven Innovation
 
Lecture 5b graphs and hashing
Lecture 5b graphs and hashingLecture 5b graphs and hashing
Lecture 5b graphs and hashingVictor Palmar
 
Skiena algorithm 2007 lecture10 graph data strctures
Skiena algorithm 2007 lecture10 graph data strcturesSkiena algorithm 2007 lecture10 graph data strctures
Skiena algorithm 2007 lecture10 graph data strctureszukun
 
Lecture-31.pptx
Lecture-31.pptxLecture-31.pptx
Lecture-31.pptxAliZaib71
 
Andrew Goldberg. Highway Dimension and Provably Efficient Shortest Path Algor...
Andrew Goldberg. Highway Dimension and Provably Efficient Shortest Path Algor...Andrew Goldberg. Highway Dimension and Provably Efficient Shortest Path Algor...
Andrew Goldberg. Highway Dimension and Provably Efficient Shortest Path Algor...Computer Science Club
 
Inductive Triple Graphs: A purely functional approach to represent RDF
Inductive Triple Graphs: A purely functional approach to represent RDFInductive Triple Graphs: A purely functional approach to represent RDF
Inductive Triple Graphs: A purely functional approach to represent RDFJose Emilio Labra Gayo
 
SCALABLE PATTERN MATCHING OVER COMPRESSED GRAPHS VIA DE-DENSIFICATION
SCALABLE PATTERN MATCHING OVER COMPRESSED GRAPHS VIA DE-DENSIFICATIONSCALABLE PATTERN MATCHING OVER COMPRESSED GRAPHS VIA DE-DENSIFICATION
SCALABLE PATTERN MATCHING OVER COMPRESSED GRAPHS VIA DE-DENSIFICATIONaftab alam
 

Similar to Bill howe 8_graphs (20)

Graph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkXGraph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkX
 
141222 graphulo ingraphblas
141222 graphulo ingraphblas141222 graphulo ingraphblas
141222 graphulo ingraphblas
 
141205 graphulo ingraphblas
141205 graphulo ingraphblas141205 graphulo ingraphblas
141205 graphulo ingraphblas
 
An Introduction to Graph Databases
An Introduction to Graph DatabasesAn Introduction to Graph Databases
An Introduction to Graph Databases
 
Unit II_Graph.pptxkgjrekjgiojtoiejhgnltegjte
Unit II_Graph.pptxkgjrekjgiojtoiejhgnltegjteUnit II_Graph.pptxkgjrekjgiojtoiejhgnltegjte
Unit II_Graph.pptxkgjrekjgiojtoiejhgnltegjte
 
GraphX: Graph Analytics in Apache Spark (AMPCamp 5, 2014-11-20)
GraphX: Graph Analytics in Apache Spark (AMPCamp 5, 2014-11-20)GraphX: Graph Analytics in Apache Spark (AMPCamp 5, 2014-11-20)
GraphX: Graph Analytics in Apache Spark (AMPCamp 5, 2014-11-20)
 
Can you trust the internet? An introduction to graph theory, computational co...
Can you trust the internet? An introduction to graph theory, computational co...Can you trust the internet? An introduction to graph theory, computational co...
Can you trust the internet? An introduction to graph theory, computational co...
 
09_Graphs_handout.pdf
09_Graphs_handout.pdf09_Graphs_handout.pdf
09_Graphs_handout.pdf
 
Graph Algorithms, Sparse Algebra, and the GraphBLAS with Janice McMahon
Graph Algorithms, Sparse Algebra, and the GraphBLAS with Janice McMahonGraph Algorithms, Sparse Algebra, and the GraphBLAS with Janice McMahon
Graph Algorithms, Sparse Algebra, and the GraphBLAS with Janice McMahon
 
An introduction to R
An introduction to RAn introduction to R
An introduction to R
 
Graph Introduction.ppt
Graph Introduction.pptGraph Introduction.ppt
Graph Introduction.ppt
 
Embeddings the geometry of relational algebra
Embeddings  the geometry of relational algebraEmbeddings  the geometry of relational algebra
Embeddings the geometry of relational algebra
 
Graph Analysis over Relational Database. Roberto Franchini - Arcade Analytics
Graph Analysis over Relational Database. Roberto Franchini - Arcade AnalyticsGraph Analysis over Relational Database. Roberto Franchini - Arcade Analytics
Graph Analysis over Relational Database. Roberto Franchini - Arcade Analytics
 
Lecture 5b graphs and hashing
Lecture 5b graphs and hashingLecture 5b graphs and hashing
Lecture 5b graphs and hashing
 
Skiena algorithm 2007 lecture10 graph data strctures
Skiena algorithm 2007 lecture10 graph data strcturesSkiena algorithm 2007 lecture10 graph data strctures
Skiena algorithm 2007 lecture10 graph data strctures
 
Lecture-31.pptx
Lecture-31.pptxLecture-31.pptx
Lecture-31.pptx
 
Andrew Goldberg. Highway Dimension and Provably Efficient Shortest Path Algor...
Andrew Goldberg. Highway Dimension and Provably Efficient Shortest Path Algor...Andrew Goldberg. Highway Dimension and Provably Efficient Shortest Path Algor...
Andrew Goldberg. Highway Dimension and Provably Efficient Shortest Path Algor...
 
Inductive Triple Graphs: A purely functional approach to represent RDF
Inductive Triple Graphs: A purely functional approach to represent RDFInductive Triple Graphs: A purely functional approach to represent RDF
Inductive Triple Graphs: A purely functional approach to represent RDF
 
SCALABLE PATTERN MATCHING OVER COMPRESSED GRAPHS VIA DE-DENSIFICATION
SCALABLE PATTERN MATCHING OVER COMPRESSED GRAPHS VIA DE-DENSIFICATIONSCALABLE PATTERN MATCHING OVER COMPRESSED GRAPHS VIA DE-DENSIFICATION
SCALABLE PATTERN MATCHING OVER COMPRESSED GRAPHS VIA DE-DENSIFICATION
 
UNIT 4_DSA_KAVITHA_RMP.ppt
UNIT 4_DSA_KAVITHA_RMP.pptUNIT 4_DSA_KAVITHA_RMP.ppt
UNIT 4_DSA_KAVITHA_RMP.ppt
 

More from Mahammad Valiyev

More from Mahammad Valiyev (6)

Bill howe 7_machinelearning_2
Bill howe 7_machinelearning_2Bill howe 7_machinelearning_2
Bill howe 7_machinelearning_2
 
Bill howe 6_machinelearning_1
Bill howe 6_machinelearning_1Bill howe 6_machinelearning_1
Bill howe 6_machinelearning_1
 
Bill howe 5_statistics
Bill howe 5_statisticsBill howe 5_statistics
Bill howe 5_statistics
 
Bill howe 4_bigdatasystems
Bill howe 4_bigdatasystemsBill howe 4_bigdatasystems
Bill howe 4_bigdatasystems
 
Bill howe 3_mapreduce
Bill howe 3_mapreduceBill howe 3_mapreduce
Bill howe 3_mapreduce
 
Bill howe 2_databases
Bill howe 2_databasesBill howe 2_databases
Bill howe 2_databases
 

Recently uploaded

B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...Suhani Kapoor
 
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...shivangimorya083
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubaihf8803863
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改atducpo
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystSamantha Rae Coolbeth
 
Predicting Employee Churn: A Data-Driven Approach Project Presentation
Predicting Employee Churn: A Data-Driven Approach Project PresentationPredicting Employee Churn: A Data-Driven Approach Project Presentation
Predicting Employee Churn: A Data-Driven Approach Project PresentationBoston Institute of Analytics
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
Aminabad Call Girl Agent 9548273370 , Call Girls Service Lucknow
Aminabad Call Girl Agent 9548273370 , Call Girls Service LucknowAminabad Call Girl Agent 9548273370 , Call Girls Service Lucknow
Aminabad Call Girl Agent 9548273370 , Call Girls Service Lucknowmakika9823
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一ffjhghh
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Delhi Call girls
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingNeil Barnes
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...Pooja Nehwal
 

Recently uploaded (20)

B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 
Decoding Loan Approval: Predictive Modeling in Action
Decoding Loan Approval: Predictive Modeling in ActionDecoding Loan Approval: Predictive Modeling in Action
Decoding Loan Approval: Predictive Modeling in Action
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data Analyst
 
Predicting Employee Churn: A Data-Driven Approach Project Presentation
Predicting Employee Churn: A Data-Driven Approach Project PresentationPredicting Employee Churn: A Data-Driven Approach Project Presentation
Predicting Employee Churn: A Data-Driven Approach Project Presentation
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
Aminabad Call Girl Agent 9548273370 , Call Girls Service Lucknow
Aminabad Call Girl Agent 9548273370 , Call Girls Service LucknowAminabad Call Girl Agent 9548273370 , Call Girls Service Lucknow
Aminabad Call Girl Agent 9548273370 , Call Girls Service Lucknow
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
 

Bill howe 8_graphs

  • 1. WHEREWE ARE •  Graphs •  What is a graph? •  Graph Structures •  Graph Analytics •  Traversals •  Patterns •  Recursion •  Graph Representations 1
  • 2. WHAT IS A GRAPH? G = (V, E) V is a set of vertices (synonym: nodes) E is a set of edges. • Each edge is a pair (vsource,vtarget) • Edges may be considered directed or undirected 2 b d e c f g a
  • 3. GRAPHS IN THEWILD The Web The Internet Social networks Communication logs • Ex: PRISM many more – ubiquitous! (why?) 3 b d e c f g a
  • 4. GRAPH ANALYTICS Structural Algorithms Traversal Algorithms Pattern-Matching Algorithms 4 b d e c f g a Analyzing Analytics,Bordawekar 2012
  • 5. GRAPH ANALYTICS:SOME STRUCTURAL TASKS Basic Metrics •  |V|, |E| •  |E| is more interesting than |V| •  in-degree(v), out-degree(v) 5 Analyzing Analytics,Bordawekar 2012 “I have a big graph” “How many edges, and what is the highest in or out degree?” b d e c f g a
  • 6. WHEREWE ARE •  Graphs •  What is a graph? •  Graph Structures •  Graph Analytics •  Traversals •  Patterns •  Recursion •  Graph Representations 6
  • 7. STRUCTURAL TASK:THE HISTOGRAM OF A GRAPH Outdegree of a vertex = number of outgoing edges For each integer d, let n(d) = number of vertices with outdegree d The outdegree histogram of a graph = the scatterplot (d, n(d)) 7 0 2 4 2 1 1 1 d n(d) 0 1 1 3 2 2 3 0 4 1 0 1 2 3 4 0 1 2 3 4 5 d n Outdegree 1 is seen at 3 nodessrc: Dan Suciu
  • 8. HISTOGRAMS TELL US SOMETHING ABOUT THE GRAPH 8 What can you say about these graphs? 0 20 40 60 80 100 120 0 5 10 x10000 0 50 100 150 0 5 10 x10000 0 20 40 60 80 100 120 0 5 10 x10000 src: Dan Suciu
  • 9. EXPONENTIAL DISTRIBUTION 9 (generally, cxd, for some x < 1) A random graph has exponential distribution Best seen when n is on a log scale 1 100 10000 1000000 0 5 10 n 0 500000 1000000 1500000 0 5 10 n Quickly vanishing src: Dan Suciu src: Dan Suciu
  • 10. Long tail ZIPF DISTRIBUTION 10 for some value x>0 Human-generated data has Zipf distribution: letters in alphabet, words in vocabulary, etc. Best seen in a log-log scale 1000 10000 100000 1 4 16 n 10 100 1000 10000 100000 0 2 4 6 81012141618 n src: Dan Suciu
  • 11. A HISTOGRAM OF THEWEB 11 Late 1990’s 200M Webpages Exponential ? Zipf ? Broder et al. 2000
  • 12. THE BOWTIE STRUCTURE OF THEWEB 12 Broder et al. 2000
  • 13. WHEREWE ARE •  Graphs •  What is a graph? •  Graph Structures •  Graph Analytics •  Traversals •  Patterns •  Recursion •  Graph Representations 13
  • 14. GRAPH ANALYTICS:SOME STRUCTURAL TASKS Diameter • Longest of all shortest paths 14 b d e c f g a Analyzing Analytics,Bordawekar 2012 What is the diameter of this graph?
  • 15. GRAPH ANALYTICS:SOME STRUCTURAL TASKS Connectivity Coefficient •  Minimum number of vertices you need to remove that will disconnect the graph 15 b d e c f g a What is the connectivity coefficient of this graph? Why might you want to compute the connectivity coefficient? May depend on what is meant by “connectivity” x and y are strongly connected if x is reachable from y and y is reachable from x x and y are connected if x is reachable from y or y is reachable from x Analyzing Analytics,Bordawekar 2012
  • 16. GRAPH ANALYTICS:SOME STRUCTURAL TASKS We may want to understand the “importance” of a vertex Various notions of Centrality •  Closeness Centrality of a vertex: •  Average length of all its shortest paths •  Betweeness Centrality of a vertex v: •  the fraction of all shortest paths that pass through v 16 b d e c f g a What is the betweeness centrality of vertex e? Why might you want to compute betweenness centrality? Analyzing Analytics,Bordawekar 2012
  • 17. GRAPH ANALYTICS:SOME STRUCTURAL TASKS Degree Centrality • Degree centrality of v: degree(v) / |E| •  “Important” nodes are those with high degree But: say we both have 5 friends • We have the same degree centrality • But what if your 5 friends are Barack Obama, Larry Page, Bill Gates, the Dalai Lama, and Oprah Winfrey? • Shouldn’t you be considered more “important?” 17 Analyzing Analytics,Bordawekar 2012
  • 18. EIGENVECTOR CENTRALITY (I.E., PAGERANK) 18 Basic idea (but oversimplified) while not converged: for each vertex v rank(v) = sum of ranks from incoming edges Two problems: 1) If a page with millions of outgoing links links to me, that’s less valuable than a page with only a few outgoing links. 2) If I’m 27 hops away from Barack Obama, he shouldn’t really influence my rank. We need a damping factor. b d e c f g a
  • 19. PAGERANK:2ND ATTEMPT 19 •  Assume edges BàA, CàA, DàA, … •  PR(X) is the PageRank of vertex X •  L(X) is the number of outgoing links from X •  d is the damping factor •  N is the number of vertices while not converged: ensures ranks sum to 1
  • 20. WHEREWE ARE •  Graphs •  What is a graph? •  Graph Structures •  Graph Analytics •  Traversals •  Patterns •  Recursion •  Graph Representations 20
  • 21. TRAVERSAL TASKS Minimum Spanning Tree •  Find the smallest subset of edges that (weakly) connect the graph 4 b d e c f g a What is a minimum spanning tree of this graph? Why might you want to compute this?
  • 22. TRAVERSAL TASKS:EULER’S BRIDGES OF KONIGSBERG 5 Can we visit every vertex and cross each bridge only once? Observation: if you enter by a bridge, you must leave by a bridge Result: If you don’t care where you start and end, then at most two vertices can have an odd degree. Result: If you want to start and end in the same place, every vertex must have an even degree. Very easy to check!
  • 23. TRAVERSAL TASKS:HAMILTONIAN PATHS 6 Can we create a path that visits every vertex only once? A lot harder! Why? Intuition: Each vertex is involved in a lot of possible paths. But we can only “use” the vertex once. Which is the “best” way to use the vertex? Extension: Assume there is a cost to traversing each edge. Can we find the path that visits every vertex with the minimum cost? No efficient algorithm can exist. Heuristics and approximations are the best we can do. b d e c f g a
  • 24. TRAVERSAL TASKS Maximum Flow • Input: A graph with labeled edges indicating the “capacity” of that edge, and special vertices called sources and sinks • Find a subgraph that maximizes flow between sources and sinks 7 b d e c f g a 5 4 4 2 3 3 2 3 2 4 5 Condition: for each vertex, incoming flow must equal outgoing flow
  • 25. WHEREWE ARE •  Graphs •  Traversals •  Patterns •  Pattern Matching •  Pattern Expressions •  Recursion •  Graph Representations 25
  • 26. PATTERN-MATCHING TASKS Find all instances of a pattern May involve vertex or edge labels 26 b d e c f g a $y $x
  • 27. POPULAR PATTERN-MATCHING TASK:TRIANGLES Find Triangles a ! b ! c ! a The total number of triangles is another measure of connectedness. Lots of algorithms; a popular challenge problem for computer scientists Utility in practice is not so clear (to me) Key idea to remember: a naïve algorithm will find the same triangle multiple times. a à b à c à a b à c à a à b c à a à b à c You can extend this to k-cycles, etc. 27 b d e c f g a
  • 28. ANOTHER PATTERN MATCH EXAMPLE Given a graph with edge labels •  Drug X interferes with DrugY •  DrugY regulates the expression of gene Z •  gene Z is associated with disease w Find drugs that interfere with another drug involved in the treatment of a disease 28 $y$x $z interferes_with regulates $w associated_with
  • 29. NEED SOME KIND OF PATTERN EXPRESSION LANGUAGE SPARQL 29 SELECT ?x WHERE ?x interferes_with ?y . ?y regulates ?z . ?z associated_with?w . subject predicate object terazine interferes_with betamin betamin regulates GT1234 GT1234 associated_with cancer doratin regulates XY6789 SQL-like language over a fixed schema: subject, predicate, object RDF = “Resource Description Framework” Defines a formal data model for (subject, predicate, object) “triples.” A triple is just a labeled edge in a graph.
  • 30. NEED SOME KIND OF PATTERN EXPRESSION LANGUAGE Datalog 30 Ans(x) :- R(x, interferes_with, y), R(y, regulates, z), R(z, associated_with, w) Assume a relation R(subject, predicate, object)
  • 31. NEED SOME KIND OF PATTERN EXPRESSION LANGUAGE SQL 31 SELECT i.subject FROM R i, R r, R a WHERE i.predicate = “interferes_with” AND r.predicate = “regulates” AND a.predicate = “associated_with” AND i.object = r.subject AND r.object = a.subject Assume a relation R(subject, predicate, object)
  • 32. WHEREWE ARE •  Graphs •  Traversals •  Patterns •  Pattern Matching •  Pattern Expressions •  Recursion •  Graph Representations 32
  • 33. NEED SOME KIND OF PATTERN EXPRESSION LANGUAGE Relational Algebra 33 σp=interferes_with(R) o=s σp=regulates(R) o=s σp=associated_with(R) Assume a relation R(subject, predicate, object)
  • 34. NEED SOME KIND OF PATTERN EXPRESSION LANGUAGE Datalog, 2nd formulation 34 Ans(x) :- InterferesWith(x, y), Regulates(y, z), AssociatedWith(z, w)= Assume relations InterferesWith(drug1, drug2), Regulates(drug, gene), AssociatedWith(gene, disease)
  • 35. NEED SOME KIND OF PATTERN EXPRESSION LANGUAGE SQL, 2nd formulation 35 SELECT i.drug1 FROM InterferesWith i, Regulates r, AssociatedWith a WHERE i.drug2 = r.drug AND r.gene = a.gene Assume relations InterferesWith(drug1, drug2), Regulates(drug, gene), AssociatedWith(gene, disease)
  • 36. ASIDE ON SPARQL Designed for graph query Not algebraically closed •  Input is a graph, but output is a set of records Limited expressiveness •  Support for path expressions since v1.1 •  But not arbitrary recursion If your input is tabular, you have to shred it into a graph before using SPARQL •  Often a 3x-6x blow up in size! 36
  • 37. ASIDE:PRISM-LIKE EXAMPLE IN DATALOG 37 http://www.businessweek.com/magazine/palantir-the-vanguard-of-cyberterror- security-11222011.html “In October, a foreign national named Mike Fikri purchased a one-way plane ticket from Cairo to Miami, where he rented a condo.” “Over the previous few weeks, he’d made a number of large withdrawals from a Russian bank account and placed repeated calls to a few people in Syria.” “More recently, he rented a truck, drove to Orlando, and visited Walt Disney World by himself.” BoughtFlight(‘Fikri’,‘Cairo’, ‘Miami’,‘oneway’ 4/1/2013) Withdrawal(‘Fikri’, 5000,‘some bank’, 4/4/2013) Withdrawal(‘Fikri’, 2000,‘some bank’, 4/5/2013) … Rented(‘Fikri’,‘truck’,‘Miami’,‘Orlando’, 4/13/2013)
  • 38. PRISM EXAMPLE MODELED IN DATALOG 38 http://www.businessweek.com/magazine/palantir-the- vanguard-of-cyberterror-security-11222011.html flag(person, 1, date) :- BoughtFlight(person, origin, destination, oneway, date), FlaggedAirports(origin), USAirports(destination), oneway=‘oneway’ alert(person, flagcnt, mindate, maxdate) :- totalflags(person, flagcnt, mindate, maxdate), maxdate – mindate < 10 days, flagcnt > 3 totalflags(person, sum(flag), min(date), max(date) :- flag(person, flag, date) flag(person, 1, date) :- Rented(person, vehicle, origin, dest, date), ImportantLocation(dest) ForeignWithdrawal(person, sum(amount), date) :- Withdrawal(person, amount, bank, date), ForeignBank(bank), amount > 1000 flag(person, 1, date) :- ForeignWithdrawal(person, totamount, date), totamount>10k
  • 39. ANOTHER DATALOG EXAMPLE 39 contacted(Person1, Person2,Time) :- email(Person1, Person2, Message,Time, …) contacted(Person1, Person2,Time) :- called(Time,Voicemail, Person1, Person2, …) contacted(Person1, Person2,Time) :- text_message(Time, Message, Person1, Person2, …), Who contacted who,when? We don’t care how.
  • 40. ANOTHER DATALOG EXAMPLE 40 knew(“Sam”). knew(Person2) :- knew(Person1) email(Person1, Person2, Message,Time), Time < “June 3”. knew(Person2) :- knew(Person1), met_with(Person1, Person2,Time), Time < “June 3”. Who could have known before June 3 that X was going to happen? self-reference!
  • 41. WHEREWE ARE 41 1. Graph Tasks 3. Structural 5.Traversal 6. Patterns2. Ex: Histograms 4. Ex: PageRank 9. Ex: Datalog in MR 7. Pattern Languages 12. Ex: PageRank in Pregel 11. Ex: PageRank in MR 8. Ex: PRISM 10. Representations
  • 42. WHEREWE ARE •  Graphs •  Traversals •  Patterns •  Recursion •  What’s recursion? •  Recursive Graphs •  Graph Representations 42
  • 43. A(y) :- Edge(‘w’, y) A(y) :- A(x), Edge(x,y) Reachability from a given node wstep 0: A0 = all nodes connected to ‘w’ step 1: for each node in A0, find neighbors step 2: for each node in A1, find neighbors EVALUATION OF RECURSIVE PROGRAMS 43
  • 44. EVALUATING RECURSIVE QUERIES Problems with this approach • What if there are cycles in the graph? 44 Bob Sue Tom Lin
  • 45. EXAMPLE OF SEMI-NAÏVE EVALUATION 45 Join Dupe-elim A(y) :- R(‘a’, y) A(y) :- A(x), R(x,y) Reachability from ‘a’ in datalog
  • 46. IN MAPREDUCE 46 i=i+1 Anything new? Driver done (compute next generation of nodes) (remove the ones we’ve already seen) map map map map map reduce reduce Join Difference ΔAi-1 R(1) R(0) mapAi (0) Ai (1) reduce reduce map
  • 47. WHEREWE ARE •  Graphs •  Traversals •  Patterns •  Recursion •  What’s recursion? •  Recursive Graphs •  Graph Representations 47
  • 48. IN MAPREDUCE 48 i=i+1 Anything new? Driver done (compute next generation of nodes) (remove the ones we’ve already seen) map map map map map reduce reduce Join Difference ΔAi-1 R(1) R(0) mapAi (0) Ai (1) reduce reduce map
  • 49. EVALUATING RECURSIVE QUERIES AT SCALE 49 (a) R is loop invariant,but gets loaded and shuffled on each iteration (b) Ai grows slowly and monotonically,but is loaded and shuffled on each iteration. Bu et al.VLDB10,VLDBJ12 Shaw et al. Datalog12 map map map map map reduce reduce Join Difference ΔAi-1 R(1) R(0) mapAi (0) Ai (1)(a) (b) reduce reduce map (compute the next generation of answers) (remove the ones we’ve already seen)
  • 50. IDEA:CACHE LOOP-INVARIANT DATA 50 map map map reduce reduce Join Difference ΔAi-1 reduce reduce mapAi (0) Ai (1) map R(1) R(0) Iteration i > 0: map mapR(1) R(0) Iteration i = 0: Load a distributed cache A(1) A(0)
  • 51. 51 0 100 200 300 400 500 600 700 0 2 4 6 8 10 time(s) iteration # (3 jobs per iteration) outer cache time (ms) no cache BTC2010, 680GB Cache-enabled join
  • 52. (query: transitive reachability from 7 nodes) Takeaways: •  10x improvement over no optimizations. •  All optimizations are useful •  We’re approaching the raw overhead of Hadoop (bottom gray line) 0
 5000
 10000
 15000
 20000
 25000
 30000
 35000
 0
 50
 100
 150
 200
 250
 'me
(s)
 itera'on
#
 (a) all optimizations raw Hadoop (d): (a) w/o diff cache (b): (a) w/o faster cache (c): (a) w/o project opt. (e): (a) w/o file combiner no optimizations EFFECT OFVARIOUS OPTIMIZATIONS FOR A RECURSIVE GRAPH QUERY ON BTC 2010 52
  • 53. NEW TUPLES DISCOVERED BY ITERATION NUMBER 53 1 10 100 1,000 10,000 100,000 1,000,000 10,000,000 100,000,000 0 20 40 60 80 100 120 140 160 180 #oftuplesdiscovered iteration # frontier tuples previously discovered tuples removed
  • 54. WHEREWE ARE •  Graphs •  Traversals •  Patterns •  Recursion •  Graph Representations •  Examples of Graph Representations •  Graphs with Big Data 54
  • 55. GRAPH REPRESENTATIONS Edge Table 55 Source Target a b b a a f b f b e b d d e d c e g g c c g b d e c f g a
  • 56. EXAMPLE USING EDGE TABLE 56 SELECT top 5 target, incount FROM ( SELECT target, count(source) as incount FROM edges GROUP BY target ) ORDER BY incount Find top 5 highest in-degree vertices
  • 57. GRAPH REPRESENTATIONS Adjacency List 57 Source Target a b, f b a, d, e, f d c, e e g g c c g b d e c f g a
  • 58. EXAMPLE USING ADJACENCY LIST (AND MAPREDUCE) 58 Map: Input key: vertex Input value: adjacency list Output key: adjacent vertex Output value: vertex Reduce: Input key: vertex Input value: list of vertices Output key: vertex Output value: length of list Find top 5 highest in-degree vertices b d e c f g a Source Target a b, f b a, d, e d c, e e g g c c g Source Target b a f a a b d b e b e d c d g e c g g c map
  • 59. GRAPH REPRESENTATIONS Adjacency Matrix 59 b d e c f g a a b c d e f g a 0 1 0 0 0 1 0 b 1 0 0 1 1 1 0 c 0 0 0 1 0 0 1 d 0 0 1 0 1 0 0 e 0 0 0 0 0 0 1 f 0 0 0 0 0 0 0 g 0 0 1 0 0 0 0
  • 60. EXAMPLE USING ADJACENCY MATRIX 60 Find top 5 highest in- degree vertices a b c d e f g a 0 1 0 0 0 1 0 b 1 0 0 1 1 1 0 c 0 0 0 1 0 0 1 d 0 0 1 0 1 0 0 e 0 0 0 0 0 0 1 f 0 0 0 0 0 0 0 g 0 0 1 0 0 0 0 C = sum(A,1) [B, IX] = sort(C) [B, IX] = sort([5,6,4]) # B is [4,5,6] # IX = [3,1,2] Top = IX(1,5)
  • 61. WHEREWE ARE 61 1. Graph Tasks 3. Structural 5.Traversal 6. Patterns2. Ex: Histograms 4. Ex: PageRank 9-10. Ex: Loops in MR 7. Pattern Languages 12. Ex: PageRank in Pregel 12. Ex: PageRank in MR 8. Ex: PRISM 11. Representations
  • 62. WHEREWE ARE •  Graphs •  Traversals •  Patterns •  Recursion •  Graph Representations •  Examples of Graph Representations •  Graphs with Big Data 62
  • 63. BIG GRAPHS Social scale •  1 billion vertices, 100 billion edges Web scale •  50 billion vertices, 1 trillion edges Brain scale •  100 billion vertices, 100 trillion edges 63 Gerhard et al, frontiers in neuroinformatics, 2011 Web graph from the SNAP database (http://snap.stanford.edu/data) Paul Butler, Facebook, 2010 material adapted from Paul Burkhardt,ChrisWaring https://www.facebook.com/notes/facebook- engineering/visualizing-friendships/ 469716398919
  • 64. MAPREDUCE FOR PAGERANK class Mapper method Map(id n, vertex N) p " N.PAGERANK/|N.ADJACENCYLIST| EMIT(id n, vertex N) for all nodeid m in N.ADJACENCYLIST do EMIT(id m, value p) class Reducer method REDUCE(id m, [p1, p2, …]) M " null, s " 0 for all p in [p1, p2, …] do if ISVERTEX(p) then M " p else s " s + p M.PAGERANK " s * 0.85 + 0.15 / TOTALVERTICES EMIT(id m, vertex M) 64
  • 65. PROBLEMS The entire state of the graph is shuffled on every iteration We only need to shuffle the new rank contributions, not the graph structure Further, we have to control the iteration outside of MapReduce 65
  • 66. PREGEL Originally from Google • Open source implementations • Apache Giraph, Stanford GPS, Jpregel, Hama Batch algorithms on large graphs 66 Malewicz et al. SIGMOD 10 while any vertex is active or max iterations not reached: for each vertex: process messages from neighbors from previous iteration send messages to neighbors set active flag appropriately this loop is run in parallel
  • 67. 67 class PageRankVertex: public Vertex<double, void, double> { public: virtual void Compute(MessageIterator* msgs) { if (superstep() >= 1) { double sum = 0; for (; !msgs->Done(); msgs->Next()) sum += msgs->Value(); *MutableValue() = 0.15 / NumVertices() + 0.85 * sum; } if (superstep() < 30) { const int64 n = GetOutEdgeIterator().size(); SendMessageToAllNeighbors(GetValue() / n); } else { VoteToHalt(); } } };
  • 68. 68 sum = sum(incoming values) rank = 0.15 5+ 0.85*sum 0.2 0.2 0.2 0.2 0.2
  • 71. 71 0.172 0.03 0.426 0.34 0.03 sum = sum(incoming values) rank = 0.15 5+ 0.85*sum
  • 74. 74 0.0513 0.03 0.69 0.197 0.03 sum = sum(incoming values) rank = 0.15 5+ 0.85*sum
  • 76. 76 0.0513 0.03 0.794 0.095 0.03 0.015 0.01 0.01 0.0513 0.197 sum = sum(incoming values) rank = 0.15 5+ 0.85*sum 0.69
  • 77. 77 0.0513 0.03 0.794 0.095 0.03 0.015 0.01 0.01 0.0513 0.197 sum = sum(incoming values) rank = 0.15 5+ 0.85*sum 0.69
  • 79. 79 0.0513 0.03 0.794 0.095 0.03 sum = sum(incoming values) rank = 0.15 5+ 0.85*sum
  • 80. SLIDES CAN BE FOUND AT: TEACHINGDATASCIENCE.ORG