(Tentative) Network Analysis with networkX : Fundamentals of network theory-2
1. Kyunghoon Kim
Network Analysis with networkX
Fundamentals of network theory-2
2014. 05. 28.
UNIST Mathematical Sciences
Kyunghoon Kim ( Kyunghoon@unist.ac.kr )
5/28/2014 Fundamentals of network theory-2 1
2. Kyunghoon Kim
Indexing
Document 1: Google Glasses is a computer with a head-mounted display.
Document 2: He wore thick glasses. He worked in google corporation.
Document 3: He wore glasses to be able to read signs at a distance.
Index (word: documents that contain it)
google        1 2
glasses       1 2 3
is            1
a             1 3
computer      1
with          1
head-mounted  1
display       1
he            2 3
wore          2 3
thick         2
worked        2
in            2
corporation   2
to            3
be            3
able          3
read          3
…
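The index above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the slides: the helper names `tokenize`, `build_index`, and `search` are made up for this example, and the tokenizer (lowercase words, hyphens kept) is an assumption.

```python
import re
from collections import defaultdict

docs = {
    1: "Google Glasses is a computer with a head-mounted display.",
    2: "He wore thick glasses. He worked in google corporation.",
    3: "He wore glasses to be able to read signs at a distance.",
}

def tokenize(text):
    # Lowercase words; a hyphenated word such as "head-mounted" stays one token
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())

def build_index(docs):
    # word -> set of document numbers that contain the word
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index

def search(index, *words):
    # Documents that contain every query word
    postings = [index.get(word.lower(), set()) for word in words]
    return sorted(set.intersection(*postings)) if postings else []

index = build_index(docs)
print(sorted(index["glasses"]))
print(search(index, "google", "glasses"))
```

A query with several words is answered by intersecting the document lists of the individual words.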
4. Kyunghoon Kim
Indexing with position
Document 1: Google Glass is a computer with a head-mounted display.
Document 2: He wore thick glasses. He worked in google corporation.
Document 3: He wore glasses to be able to read signs at a distance.
Index with positions (word: document-position)
google        1-1 2-8
glasses       1-2 2-4 3-3
is            1-3
a             1-4 3-11
computer      1-5
with          1-6
head-mounted  1-7
display       1-8
he            2-1 2-5 3-1
wore          2-2 3-2
thick         2-3
worked        2-6
in            2-7
corporation   2-9
to            3-4 3-7
be            3-5
able          3-6
read          …
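Storing positions makes phrase queries possible: two words form a phrase when they occur in the same document at consecutive positions. The sketch below is only an illustration. The names `build_positional_index` and `phrase_search` are hypothetical, document 1 is taken in its "Google Glasses" wording, and positions are whatever this simple tokenizer produces, so some of them may differ from the numbers on the slide.

```python
import re
from collections import defaultdict

docs = {
    1: "Google Glasses is a computer with a head-mounted display.",
    2: "He wore thick glasses. He worked in google corporation.",
    3: "He wore glasses to be able to read signs at a distance.",
}

def tokenize(text):
    # Lowercase words; a hyphenated word counts as one token
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())

def build_positional_index(docs):
    # word -> list of (document, position) pairs, positions counted from 1
    index = defaultdict(list)
    for doc_id, text in docs.items():
        for position, word in enumerate(tokenize(text), start=1):
            index[word].append((doc_id, position))
    return index

def phrase_search(index, first, second):
    # Documents in which `second` directly follows `first`
    second_hits = set(index.get(second, []))
    return sorted({doc for doc, pos in index.get(first, [])
                   if (doc, pos + 1) in second_hits})

index = build_positional_index(docs)
print(index["glasses"])
print(phrase_search(index, "wore", "glasses"))
print(phrase_search(index, "google", "glasses"))
```

Both documents 2 and 3 contain "wore" and "glasses", but only document 3 contains them next to each other, which a plain document index cannot tell apart.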
6. Kyunghoon Kim
Indexing with position & metatag
Document 1: <title>New Google Glass</title><body>Google Glass is a computer with a head-mounted display.</body>
Document 2: <title>Daily life of David</title><body>He wore thick glasses. He worked in google corporation.</body>
Document 3: <title>Black Glasses</title><body>He wore glasses to be able to read signs at a distance.</body>
Index with positions and metatags (word: document-position)
google        1-1 2-8
glasses       1-2 2-4 3-3
is            1-3
a             1-4 3-11
computer      1-5
with          1-6
head-mounted  1-7
display       1-8
he            2-1 2-5 3-1
wore          2-2 3-2
thick         2-3
worked        2-6
in            2-7
corporation   2-9
<title>       #
</title>      #
<body>        #
</body>       #
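If the metatags are indexed like ordinary words, a query such as "word in the title" reduces to comparing positions. The following is a rough sketch under that assumption; `in_title` is a hypothetical helper, each page is assumed to have exactly one title, and the positions here count the tags as tokens.

```python
import re
from collections import defaultdict

pages = {
    1: "<title>New Google Glass</title><body>Google Glass is a computer "
       "with a head-mounted display.</body>",
    2: "<title>Daily life of David</title><body>He wore thick glasses. "
       "He worked in google corporation.</body>",
    3: "<title>Black Glasses</title><body>He wore glasses to be able to "
       "read signs at a distance.</body>",
}

def tokenize(html):
    # Tags such as <title> or </body> are kept as tokens of their own
    return re.findall(r"</?[a-z]+>|[a-z]+(?:-[a-z]+)*", html.lower())

def build_index(pages):
    # token -> list of (document, position) pairs
    index = defaultdict(list)
    for doc_id, html in pages.items():
        for position, token in enumerate(tokenize(html), start=1):
            index[token].append((doc_id, position))
    return index

def in_title(index, word):
    # Documents in which `word` lies between <title> and </title>
    opens = dict(index.get("<title>", []))    # document -> position of <title>
    closes = dict(index.get("</title>", []))  # document -> position of </title>
    return sorted({doc for doc, pos in index.get(word, [])
                   if doc in opens and doc in closes
                   and opens[doc] < pos < closes[doc]})

index = build_index(pages)
print(in_title(index, "google"))
print(in_title(index, "glasses"))
```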
7. Kyunghoon Kim
Indexing with position & metatag
Altavista
“Constrained Searching of an index”, 1999
8. Kyunghoon Kim
“uncanny knack for returning extremely
relevant results.” – PC Magazine
The technology that launched Google
Screenshot of “google.stanford.edu” 1997, http://blogoscoped.com/archive/2006-04-21-n63.html
9. Kyunghoon Kim
A hyperlink is a reference to data that the
reader can directly follow either by clicking or
by hovering or that is followed automatically.
– Merriam-Webster.com
Hyperlink
10. Kyunghoon Kim
Hyperlink Trick
Barny’s tomato pasta recipe: Cook pasta sheets, one at a time, for 1 minute each. And before serving, place pasta in the bottom of a soup bowl.
Tony’s tomato pasta: Bring to a boil and add pasta. Cook for 10 minutes. Drain on paper towel.
Pages that link to the recipes:
“I like Barny’s recipe.”
“I enjoyed Tony’s recipe.”
“Tony’s recipe is amazing!”
“I’m in admiration of Tony’s recipe.”
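The hyperlink trick amounts to counting incoming links. A small sketch with NetworkX follows; the node names `page1` to `page4` are placeholders for the four linking pages and do not come from the slide.

```python
import networkx as nx

G = nx.DiGraph()
G.add_edges_from([
    ("page1", "Barny"),  # "I like Barny's recipe."
    ("page2", "Tony"),   # "I enjoyed Tony's recipe."
    ("page3", "Tony"),   # "Tony's recipe is amazing!"
    ("page4", "Tony"),   # "I'm in admiration of Tony's recipe."
])

# Hyperlink trick: score of a page = number of incoming links
scores = dict(G.in_degree())
ranking = sorted(["Barny", "Tony"], key=lambda page: scores[page], reverse=True)
print(scores["Barny"], scores["Tony"])
print(ranking)
```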
12. Kyunghoon Kim
Authority Trick
Barny’s tomato pasta recipe (page A): Cook pasta sheets, one at a time, for 1 minute each. And before serving, place pasta in the bottom of a soup bowl.
Tony’s tomato pasta (page B): Bring to a boil and add pasta. Cook for 10 minutes. Drain on paper towel.
Pages that link to the recipes, with their authority scores:
“I like Barny’s recipe.” (score 100)
“I enjoyed Tony’s recipe.” (score 1)
“Tony’s recipe is amazing!” (score 1)
“I’m in admiration of Tony’s recipe.” (score 1)
Resulting authority scores: A = 100, B = 3
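With the authority trick, a page's score is the sum of the scores of the pages that link to it, rather than the number of links. The sketch below reproduces the numbers of the example; `authority_score` and the names `page1` to `page4` are illustrative only.

```python
import networkx as nx

G = nx.DiGraph()
G.add_edges_from([("page1", "A"), ("page2", "B"), ("page3", "B"), ("page4", "B")])

# Authority scores of the linking pages, as in the example
authority = {"page1": 100, "page2": 1, "page3": 1, "page4": 1}

def authority_score(G, page, authority):
    # Sum of the authority scores of all pages that link to `page`
    return sum(authority[source] for source in G.predecessors(page))

scores = {page: authority_score(G, page, authority) for page in ("A", "B")}
print(scores)  # {'A': 100, 'B': 3}
```

Page B has three times as many incoming links, yet page A ranks higher because its single link comes from a highly authoritative page.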
13. Kyunghoon Kim
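When the hyperlinks form a cycle, the authority scores keep changing
each time they are recomputed: the iterates will not converge no matter
how long the process is run.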
Cycle
[Figure: pages whose hyperlinks form a cycle]
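The problem can be seen on a three-page cycle. The update rule used here (each page's new score is 1 plus the scores of the pages linking to it) is an assumption made for this illustration; the slide does not state the exact rule.

```python
import networkx as nx

# Three pages linking in a cycle: 1 -> 2 -> 3 -> 1
G = nx.DiGraph([(1, 2), (2, 3), (3, 1)])

def update(G, scores):
    # Assumed rule: base score 1 plus the scores of all linking pages
    return {v: 1 + sum(scores[u] for u in G.predecessors(v)) for v in G}

scores = {v: 1 for v in G}
history = [scores]
for _ in range(5):
    scores = update(G, scores)
    history.append(scores)

for step, s in enumerate(history):
    print(step, s)
```

Every round adds 1 to every score, so the values grow without bound instead of settling.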
18. Kyunghoon Kim
There is a person who is randomly surfing the
internet.
The surfer starts off at a single web page selected
at random from the entire World Wide Web.
The surfer then examines all the hyperlinks on
the page, picks one of them at random, and
clicks on it.
The new page is then examined and one of its
hyperlinks is chosen at random.
This process continues…
Random Surfer Trick
19. Kyunghoon Kim
import networkx as nx
from matplotlib import pyplot as plt
G = nx.DiGraph()
G.add_edges_from([(1,2),(2,3),(2,12),(3,4),(3,8),(3,12),(4,8),(5,1),
(6,1),(6,2),(7,2),(7,12),(8,12),(8,13),(8,14),(8,15),(8,9),(9,16),
(10,5),(10,6),(10,7),(11,10),(12,16),(13,16),(14,16),(15,16),
(16,11)])
plt.clf()
nx.draw_spring(G)
Random Surfer Trick
Example
20. Kyunghoon Kim
spring layout – places nodes using the
Fruchterman-Reingold force-directed algorithm
circular layout – places nodes on a circle
random layout – positions nodes uniformly at
random in the unit square
shell layout – places nodes in concentric circles
spectral layout – positions nodes using
eigenvectors of the graph Laplacian
Layout of NetworkX
e.g., pos = nx.spectral_layout(G)
nx.draw(G, pos)
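Each layout function returns a dictionary that maps a node to its (x, y) coordinates, which is what `nx.draw(G, pos)` expects. A short sketch on a small example graph (the 6-node cycle is an arbitrary choice):

```python
import networkx as nx

G = nx.cycle_graph(6)  # small example graph with nodes 0..5

layouts = {
    "spring": nx.spring_layout(G),
    "circular": nx.circular_layout(G),
    "random": nx.random_layout(G),
    "shell": nx.shell_layout(G),
    "spectral": nx.spectral_layout(G),
}

for name, pos in layouts.items():
    # pos maps every node to a pair of coordinates
    print(name, pos[0])
```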
21. Kyunghoon Kim
nx.spring_layout gives different positions each
time it is run.
To keep the positions fixed, we use nx.spectral_layout.
Just be careful in the following case:
networkx.spectral_layout
22. Kyunghoon Kim
import numpy as np
def addvalue(values, node):
    # nodes are numbered from 1, array indices start at 0
    values[node-1] += 1
lennode = len(G.nodes())
values = np.zeros(lennode)  # number of visits to each node
Random Surfer Trick – Outline
25. Kyunghoon Kim
for i in range(20000):
    selectednode = nextstep(selectednode)
plt.clf()
nx.draw_networkx_nodes(G, pos, node_size=values)
nx.draw_networkx_edges(G, pos)
nx.draw_networkx_labels(G, pos)
Random Surfer Trick – Outline (Cont.)
26. Kyunghoon Kim
for i in range(10000000):
    selectednode = nextstep(selectednode)
plt.clf()
nx.draw_networkx_nodes(G, pos, node_size=10000*values/sum(values))
nx.draw_networkx_edges(G, pos)
nx.draw_networkx_labels(G, pos)
Random Surfer Trick – Outline (Cont.)
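The outline relies on two helpers, `randomstart` and `nextstep`, whose definitions are not part of this transcript. The sketch below fills them in with one plausible guess (uniform random choices, every visit counted through `addvalue`) so that the simulation runs end to end; treat the helper bodies as assumptions, not as the original code.

```python
import random

import networkx as nx
import numpy as np

G = nx.DiGraph()
G.add_edges_from([(1,2),(2,3),(2,12),(3,4),(3,8),(3,12),(4,8),(5,1),
                  (6,1),(6,2),(7,2),(7,12),(8,12),(8,13),(8,14),(8,15),(8,9),(9,16),
                  (10,5),(10,6),(10,7),(11,10),(12,16),(13,16),(14,16),(15,16),
                  (16,11)])

values = np.zeros(G.number_of_nodes())  # visit counter, one entry per node

def addvalue(values, node):
    values[node - 1] += 1  # nodes are numbered from 1

def randomstart(G):
    # Assumed helper: start at a node chosen uniformly at random
    node = random.choice(list(G.nodes()))
    addvalue(values, node)
    return node

def nextstep(node):
    # Assumed helper: follow one outgoing link chosen uniformly at random
    node = random.choice(list(G.successors(node)))
    addvalue(values, node)
    return node

random.seed(0)  # fixed seed so that the run is repeatable
selectednode = randomstart(G)
for i in range(20000):
    selectednode = nextstep(selectednode)

share = values / values.sum()  # fraction of all visits spent on each node
print(np.round(100 * share, 1))
```

In this graph every path eventually runs through 16, 11, and 10 in that order, so these three nodes collect almost identical visit counts.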
27. Kyunghoon Kim
There is one twist: a restart probability (15%).
With this probability, the surfer does not click on one of the available
hyperlinks.
Instead, he restarts the procedure by picking
another page at random from the whole web
(e.g., out of boredom or because of a browser error).
Twist
29. Kyunghoon Kim
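selectednode = randomstart(G)
for i in range(1000000):
    if np.random.rand(1)[0] > 0.15:
        # with probability 85%, follow a hyperlink
        selectednode = nextstep(selectednode)
    else:
        # with probability 15%, restart from a random page
        selectednode = randomstart(G)
    if i % 100000 == 0:
        print(i)  # progress report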
plt.clf()
nx.draw_networkx_nodes(G, pos, node_size=10000*values/sum(values))
nx.draw_networkx_edges(G, pos)
nx.draw_networkx_labels(G, pos)
Random Surfer Trick with Twist (Cont.)
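The same loop can be wrapped in a function so that the restart probability becomes a parameter. This is a sketch, not the code behind the slides: `random_surfer` is a hypothetical name, and restarting whenever a page has no outgoing links is an extra assumption (the example graph has no such page).

```python
import random

import networkx as nx

def random_surfer(G, steps, restart=0.15, seed=0):
    """Fraction of steps a random surfer spends on each node of G."""
    rng = random.Random(seed)
    nodes = list(G.nodes())
    visits = {node: 0 for node in nodes}
    node = rng.choice(nodes)
    for _ in range(steps):
        visits[node] += 1
        successors = list(G.successors(node))
        if successors and rng.random() > restart:
            node = rng.choice(successors)  # follow a hyperlink
        else:
            node = rng.choice(nodes)       # restart at a random page
    return {n: count / steps for n, count in visits.items()}

G = nx.DiGraph()
G.add_edges_from([(1,2),(2,3),(2,12),(3,4),(3,8),(3,12),(4,8),(5,1),
                  (6,1),(6,2),(7,2),(7,12),(8,12),(8,13),(8,14),(8,15),(8,9),(9,16),
                  (10,5),(10,6),(10,7),(11,10),(12,16),(13,16),(14,16),(15,16),
                  (16,11)])

share = random_surfer(G, steps=50000)
for node in sorted(share):
    print(node, round(100 * share[node], 1))
```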
30. Kyunghoon Kim
plt.clf()
pos = nx.spectral_layout(G)
nx.draw_networkx_nodes(G, pos, node_size=10000*values/sum(values))
nx.draw_networkx_edges(G, pos)
labels = 100*values/sum(values)  # share of visits in percent
labels = ['%.1f' % v for v in labels]  # short text labels
labels = dict(zip(range(1,lennode+1), labels))
nx.draw_networkx_labels(G, pos, labels)
Random Surfer Trick with Twist
33. Kyunghoon Kim
What is the connection between the random surfer
model and the authority trick that we would
like to use for ranking web pages?
The percentages calculated from random surfer
simulations turn out to be exactly what we
need to measure a page’s authority.
Surfer authority score
= percentage of time that a random surfer
would spend visiting that page
Connection
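The percentage of time spent on each page can also be computed without simulation, by repeatedly applying the surfer's move to a probability vector. The update rule below (follow a link with probability 0.85, otherwise jump to a uniformly random page) is an assumed formalization of the restart description, and it relies on every page of the example graph having at least one outgoing link.

```python
import numpy as np

edges = [(1,2),(2,3),(2,12),(3,4),(3,8),(3,12),(4,8),(5,1),
         (6,1),(6,2),(7,2),(7,12),(8,12),(8,13),(8,14),(8,15),(8,9),(9,16),
         (10,5),(10,6),(10,7),(11,10),(12,16),(13,16),(14,16),(15,16),
         (16,11)]
n = 16

# P[i, j] = probability of moving from page i+1 to page j+1 along a link
P = np.zeros((n, n))
for u, v in edges:
    P[u - 1, v - 1] = 1.0
P = P / P.sum(axis=1, keepdims=True)

restart = 0.15
p = np.full(n, 1.0 / n)  # start from the uniform distribution
for _ in range(200):
    p = (1 - restart) * (P.T @ p) + restart / n

print(np.round(100 * p, 1))               # percentage of time per page
print("top page:", int(np.argmax(p)) + 1)
```

The resulting percentages are the surfer authority scores; a long simulation run should approach the same values.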
35. Kyunghoon Kim
Hyperlink Trick
: the main idea was that a page with many
incoming links should receive a high ranking.
Tricks for ranking web pages
pos = nx.circular_layout(G)
37. Kyunghoon Kim
Authority Trick
: the main idea was that an incoming link from
a highly authoritative page should improve a
page’s ranking more than an incoming link from
a less authoritative page.
Connection
: An incoming link from a popular page will
have more opportunities to be followed than a
link from an unpopular page.
Tricks for ranking web pages
42. Kyunghoon Kim
The random surfer model takes into account both the
quantity (hyperlink trick) and the quality (authority
trick) of the incoming links at each page.
This model works perfectly well whether or not
there are cycles in the hyperlinks.
Random Surfer Model
[Figure: example graph with nodes 1 to 4]
44. Kyunghoon Kim
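Divide by out-degree, with constant term: PageRank, $\mathbf{x} = D(D - \alpha A)^{-1}\mathbf{1}$
Divide by out-degree, without constant term: Degree centrality, $\mathbf{x} = AD^{-1}\mathbf{x}$
No division, with constant term: Katz centrality, $\mathbf{x} = (I - \alpha A)^{-1}\mathbf{1}$
No division, without constant term: Eigenvector centrality, $\mathbf{x} = \kappa_1^{-1}A\mathbf{x}$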
PageRank with Linear Algebra
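The two formulas with a constant term can be evaluated directly with a linear solver. The sketch below makes several assumptions that the slide does not spell out: $A_{ij} = 1$ when there is a link from $j$ to $i$, $D$ is the diagonal matrix of out-degrees (with zeros replaced by 1), and the small graph and the values of $\alpha$ are arbitrary.

```python
import numpy as np

# Example graph, edges given as (source, target)
edges = [(0, 1), (1, 2), (2, 0), (0, 2)]
n = 3

# Assumed convention: A[i, j] = 1 if there is a link from j to i
A = np.zeros((n, n))
for u, v in edges:
    A[v, u] = 1.0

out_degree = A.sum(axis=0)
D = np.diag(np.maximum(out_degree, 1.0))
one = np.ones(n)

# PageRank: x = D (D - alpha A)^{-1} 1
alpha = 0.85
pagerank = D @ np.linalg.solve(D - alpha * A, one)

# Katz centrality: x = (I - alpha A)^{-1} 1, with alpha below 1 / (largest eigenvalue)
katz_alpha = 0.3
katz = np.linalg.solve(np.eye(n) - katz_alpha * A, one)

print(np.round(pagerank, 3))
print(np.round(katz, 3))
```

Both results satisfy the corresponding self-consistent equations, $\mathbf{x} = \alpha A D^{-1}\mathbf{x} + \mathbf{1}$ for PageRank and $\mathbf{x} = \alpha A\mathbf{x} + \mathbf{1}$ for Katz centrality.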
45. Kyunghoon Kim
Matrix Equation
Linear Transformation
Eigenvalue
Eigenvector
Eigenvector centrality
Katz centrality
PageRank centrality
Contents with Linear Algebra
46. Kyunghoon Kim
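If $A$ is an $m \times n$ matrix, with columns $\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_n$,
and if $\mathbf{x}$ is in $\mathbb{R}^n$, then the product of $A$ and $\mathbf{x}$ is
the linear combination of the columns of $A$
using the corresponding entries in $\mathbf{x}$ as weights;
that is,
$$A\mathbf{x} = \begin{bmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_n \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + \cdots + x_n\mathbf{a}_n$$
(matrix equation form on the left, vector equation form on the right)
The matrix equation Ax=b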
47. Kyunghoon Kim
The matrix equation Ax=b
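$$A\mathbf{x} = \begin{bmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_n \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + \cdots + x_n\mathbf{a}_n$$
$$\begin{bmatrix} 1 & 2 & -1 \\ 0 & -5 & 3 \end{bmatrix} \begin{bmatrix} 4 \\ 3 \\ 7 \end{bmatrix} = 4\begin{bmatrix} 1 \\ 0 \end{bmatrix} + 3\begin{bmatrix} 2 \\ -5 \end{bmatrix} + 7\begin{bmatrix} -1 \\ 3 \end{bmatrix} = \begin{bmatrix} 3 \\ 6 \end{bmatrix}$$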
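The equality of the matrix product and the linear combination of columns is easy to check numerically. The 2 × 3 matrix and the vector below are example values chosen for this sketch (with negative entries assumed where the arithmetic requires them).

```python
import numpy as np

A = np.array([[1, 2, -1],
              [0, -5, 3]])
x = np.array([4, 3, 7])

# Matrix equation form: the product A x
product = A @ x

# Vector equation form: x_1 a_1 + x_2 a_2 + x_3 a_3, using the columns of A
combination = sum(x[j] * A[:, j] for j in range(A.shape[1]))

print(product)      # [3 6]
print(combination)  # [3 6]
```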
48. Kyunghoon Kim
from matplotlib import pyplot as plt
import numpy as np
pointset = [[1,2],[2,2],[2,3],[3,4],[3,5],[2,5],[1,5],[1,3],[1,2]]
plt.plot(*np.transpose(pointset), marker='.')
plt.axis([0, 6, 0, 6])
Example – Plot of point set
49. Kyunghoon Kim
from matplotlib import pyplot as plt
import numpy as np
pointset = [[1,2],[2,2],[2,3],[3,4],[3,5],[2,5],[1,5],[1,3],[1,2]]
plt.plot(*np.transpose(pointset), marker='.')
plt.axis([0, 6, 0, 6])
newset = []
for i in pointset:
    x = np.matrix(i).transpose()  # point as a column vector
    A = np.matrix([[0,1],[1,0]])  # swaps the two coordinates (reflection across the line y = x)
    Ax = A*x
    Ax = list(np.array(Ax).reshape(-1,))  # back to a flat list [x, y]
    newset.append(Ax)
plt.plot(*np.transpose(newset), marker='.')
Example (Cont.) – Plot of reflection
50. Kyunghoon Kim
newset = []
for i in pointset:
    x = np.matrix(i).transpose()  # point as a column vector
    deg = np.pi*35/180 # 35 degrees in radians
    A = np.matrix([[np.cos(deg),-np.sin(deg)],[np.sin(deg),np.cos(deg)]]) # counterclockwise rotation
    Ax = A*x
    Ax = list(np.array(Ax).reshape(-1,))  # back to a flat list [x, y]
    newset.append(Ax)
plt.plot(*np.transpose(newset), marker='.')
plt.axis([-6, 6, -6, 6])
Example (Cont.) – Plot of rotation
51. Kyunghoon Kim
In two dimensions,
Rotation matrix
http://librairie.immateriel.fr/fr/read_book/9780596516130/ch11s02
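The formula from this slide is not reproduced in the transcript. For reference, the standard matrix for a counterclockwise rotation by an angle $\theta$ in two dimensions, which matches the matrix used in the rotation example code, is:

```latex
R(\theta) =
\begin{bmatrix}
\cos\theta & -\sin\theta \\
\sin\theta & \cos\theta
\end{bmatrix},
\qquad
R(\theta)\begin{bmatrix} x \\ y \end{bmatrix}
=
\begin{bmatrix}
x\cos\theta - y\sin\theta \\
x\sin\theta + y\cos\theta
\end{bmatrix}
```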
52. Kyunghoon Kim
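newset = []
for i in pointset:
    x = np.matrix(i).transpose()  # point as a column vector
    A = np.matrix([[1.5, 0],[0, 1]])  # stretch the x-coordinate by 1.5
    Ax = A*x
    Ax = list(np.array(Ax).reshape(-1,))  # back to a flat list [x, y]
    newset.append(Ax)
plt.plot(*np.transpose(newset), marker='.')
# Other matrices to try in place of A above:
A = np.matrix([[-1.8, 0],[0, 1]])  # stretch x by 1.8 and flip left to right
A = np.matrix([[1, 0],[0, -1]])  # reflection across the x-axis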
Example (Cont.) – Plot of dilation
55. Kyunghoon Kim
Linear Transformation
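$$A\mathbf{x} = \begin{bmatrix} 3 & -2 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 \\ 2 \end{bmatrix} = 2\begin{bmatrix} 2 \\ 1 \end{bmatrix} = 2\mathbf{x}$$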
The number 𝜆 is the eigenvalue. It tells
whether the special vector x is stretched or
shrunk or reversed or left unchanged – when it
is multiplied by 𝐴.
56. Kyunghoon Kim
The eigenvalues are a new way to see into the
heart of a matrix.
Almost all vectors change direction, when they
are multiplied by 𝐴. Certain exceptional vectors
x are in the same direction as 𝐴x. Those are the
eigenvectors.
Multiply an eigenvector by 𝐴, and the vector
𝐴x is a number 𝜆 times the original x.
Eigenvalues and Eigenvectors
57. Kyunghoon Kim
If (𝐴 − 𝜆𝐼)x=0 has a nonzero solution, then
(𝐴 − 𝜆𝐼) cannot have an inverse, i.e., is not
invertible. The determinant of 𝐴 − 𝜆𝐼 must be
zero.
Example – eigenvalue and eigenvector
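$$A\mathbf{x} = \begin{bmatrix} 3 & -2 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 \\ 2 \end{bmatrix}$$
$$A\mathbf{x} = \lambda\mathbf{x} \quad\Longleftrightarrow\quad (A - \lambda I)\mathbf{x} = \mathbf{0}$$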
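The eigenvalues and eigenvectors of a small matrix can be checked with NumPy. The matrix below is a 2 × 2 example for which $\det(A - \lambda I) = \lambda^2 - 3\lambda + 2$, so the eigenvalues are 1 and 2; the check of the vector $(2, 1)$ is included as an illustration.

```python
import numpy as np

A = np.array([[3.0, -2.0],
              [1.0, 0.0]])

eigenvalues, eigenvectors = np.linalg.eig(A)

for lam, v in zip(eigenvalues, eigenvectors.T):
    # Each column v of `eigenvectors` satisfies A v = lambda v,
    # and det(A - lambda I) vanishes for each eigenvalue
    print(lam, v, np.linalg.det(A - lam * np.eye(2)))

x = np.array([2.0, 1.0])
print(A @ x)  # [4. 2.], that is 2 * x
```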
58. Kyunghoon Kim
A natural extension of the simple degree
centrality is eigenvector centrality.
Not all neighbors are equivalent.
A vertex’s importance is increased by having
connections to other vertices that are
themselves important.
Eigenvector Centrality
59. Kyunghoon Kim
Let $x_i = 1$ for all $i$.
We define $x_i'$ to be the sum of the centralities of
$i$’s neighbors:
$$x_i' = \sum_j A_{ij} x_j$$
We can write this expression in matrix
notation as $\mathbf{x}' = A\mathbf{x}$.
Repeating this process to make better
estimates, we have after $t$ steps a vector $\mathbf{x}(t)$.
Eigenvector Centrality
60. Kyunghoon Kim
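$$\mathbf{x}(t) = A^t \mathbf{x}(0)$$
Let’s write $\mathbf{x}(0)$ as a linear combination of the
eigenvectors $\mathbf{v}_i$ of the adjacency matrix:
$$\mathbf{x}(0) = \sum_i c_i \mathbf{v}_i$$
Then
$$\mathbf{x}(t) = A^t \sum_i c_i \mathbf{v}_i = \sum_i c_i k_i^t \mathbf{v}_i = k_1^t \sum_i c_i \left(\frac{k_i}{k_1}\right)^t \mathbf{v}_i$$
where the $k_i$ are the eigenvalues of $A$, and $k_1$ is the largest eigenvalue.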
Eigenvector Centrality
61. Kyunghoon Kim
Since $k_i/k_1 < 1$ for all $i \neq 1$, all terms in the sum
other than the first decay exponentially as $t$
becomes large.
We get $\mathbf{x}(t) \to c_1 k_1^t \mathbf{v}_1$ as $t \to \infty$.
The limiting vector of centralities is simply
proportional to the leading eigenvector of the
adjacency matrix.
Equivalently we could say that the centrality $\mathbf{x}$
satisfies $A\mathbf{x} = k_1\mathbf{x}$.
Eigenvector Centrality
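The limiting argument can be tried out numerically: start from all ones, multiply by the adjacency matrix repeatedly, and compare the result with the leading eigenvector. The sketch rescales the vector at every step, which only changes its length and not its direction; the 5-node graph is an arbitrary example.

```python
import numpy as np

# Small undirected example graph: a triangle 0-1-2 with a tail 2-3-4
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
n = 5
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

# Repeated multiplication x(t) = A x(t-1), starting from x_i = 1
x = np.ones(n)
for t in range(200):
    x = A @ x
    x = x / np.linalg.norm(x)  # rescale to avoid overflow

k1 = x @ A @ x  # estimate of the largest eigenvalue (x has length 1)

# Reference: leading eigenvector from a direct eigendecomposition
eigenvalues, eigenvectors = np.linalg.eigh(A)
v1 = eigenvectors[:, -1]
v1 = v1 if v1.sum() > 0 else -v1  # fix the sign

print(np.round(x, 4))
print(np.round(v1, 4))
print(round(k1, 4), round(eigenvalues[-1], 4))
```

Node 2, which has the most neighbors and sits inside the triangle, ends up with the largest centrality.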
62. Kyunghoon Kim
$A\mathbf{x} = k_1\mathbf{x}$.
The centrality $x_i$ of vertex $i$ is proportional to
the sum of the centralities of $i$’s neighbors:
$$x_i = k_1^{-1} \sum_j A_{ij} x_j$$
It can be large for either of two reasons:
1. a vertex has many neighbors
2. a vertex has important neighbors
Eigenvector Centrality
65. Kyunghoon Kim
MacCormick, John. Nine Algorithms that
Changed the Future: The Ingenious Ideas that
Drive Today's Computers. Princeton University
Press, 2011.
Newman, Mark. Networks: an introduction.
Oxford University Press, 2010.
Strang, Gilbert. Introduction to linear algebra.
SIAM, 2003.
References