Lecture 5 - Qunatifying a Network.pdf

05. Network Properties
Lecturer: Dr. Reem Essameldin Ebrahim
Introduction to Social Networks
Based on CS224W Analysis of Networks Mining and Learning with Graphs: Stanford University
Copyright © Dr. Reem Essameldin 2023-2024

In this Lecture
Topics to be covered are:
Quantifying Networks
Key Network Properties
Social Networks Modeling

Quantifying Social Structure
You have learned the basic mechanics behind network analysis. Without a firm
understanding of those foundations, you cannot build the more advanced concepts
and associated measures that network analysts use to understand the social world.
Q: How do we quantify and model a network?
Given a graph, we have two questions in hand:
1. What are the properties of the graph?
2. Once we know them, how can we generate artificial graphs that mimic real graphs?
We generate artificial graphs to get a real sense of what processes might be
producing the networks that we see in real life (e.g. what is a good generative
model for how friendships form?).

Quantifying Social Structure
There are some fundamental measurements that we can use to quantify the
structure of the networks.
Key Network Properties
Degree distribution: 𝑝(𝑘).
Path length: ℎ.
Clustering coefficient: 𝐶.
Connected components: 𝑠.
Certain of these characteristics are shared among
different types of networks.

Degree Distribution
𝑷(𝒌) is simply a histogram that tells how many nodes have a given degree.
The Mathematical Definition
Degree distribution $P(k)$ (also written $p_k$): the probability that a
randomly chosen node has degree $k$. Since $P(k)$ is a probability, it must be
normalized:
$$\sum_{k=1}^{\infty} p_k = 1$$
For a network with $N$ nodes, the degree distribution is the normalized histogram:
$$p_k = \frac{N_k}{N}$$
where $N_k$ is the number of nodes with degree $k$.

Degree Distribution
For the histogram, we plot the degree on the x-axis and, on the y-axis, either the count or
the proportion of nodes having that degree. In the second case, the y-axis is normalized so
that the bar heights sum to 1; the histogram is then a distribution giving the fraction of
nodes with each degree. Otherwise we leave the y-axis as the raw count, as shown in the Figure.
For the given graph, to build the histogram we count how many nodes have a degree of
one ($k = 1$) and plot its bar, then repeat for each remaining degree.
One property of real-world networks is that they have what is called a skewed degree
distribution.
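The normalized histogram can be computed directly from an edge list. The following is a minimal sketch in Python; the function name `degree_distribution` and the four-node example graph are assumptions made for illustration and do not come from the lecture figures.

```python
from collections import Counter

def degree_distribution(edges, nodes):
    """Return {k: N_k / N} for an undirected graph given as an edge list."""
    degree = {v: 0 for v in nodes}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    counts = Counter(degree.values())  # N_k: number of nodes with degree k
    n = len(nodes)                     # N: total number of nodes
    return {k: counts[k] / n for k in sorted(counts)}

# Hypothetical graph: A is linked to B, C and D, and B is linked to C.
example_nodes = ["A", "B", "C", "D"]
example_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]

p = degree_distribution(example_edges, example_nodes)
print(p)  # {1: 0.25, 2: 0.5, 3: 0.25}
```

Leaving out the division by `n` gives the raw-count version of the histogram instead of the normalized one.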

Test Yourself
For the given graphs, find the degree distribution and the corresponding histograms.
[Figure: graphs (a) and (b), nodes labeled 1 to 4]
Solution:
a) $N = 4$, then $P_1 = 1/4$, $P_2 = 2/4 = 1/2$, $P_3 = 1/4$, $P_4 = 0$.
b) Each node has the same degree, $k = 2$.

Paths in a Graph
Paths let us ask how many edges lie between different pairs of nodes.
The Mathematical Definition
A path is a sequence of nodes in which each node is linked to the next one. A path
between nodes $i_0$ and $i_n$ is an ordered list of $n$ links:
$$P_n = \{(i_0, i_1), (i_1, i_2), (i_2, i_3), \ldots, (i_{n-1}, i_n)\}$$
Note that:
• A path can intersect itself and pass through the same edge multiple times, e.g. ACBDCDEG.
• In a directed graph, a path can only follow the direction of the “arrow”.
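The definition of a path as an ordered list of links can be checked mechanically. The sketch below assumes a graph stored as a dictionary of neighbor sets; the helper names `is_path` and `undirected` and both example graphs are hypothetical, chosen only so that the sequence ACBDCDEG is a valid path.

```python
def is_path(adj, nodes):
    """Return True if every consecutive pair (u, v) in `nodes` is a link u -> v."""
    return all(v in adj.get(u, ()) for u, v in zip(nodes, nodes[1:]))

def undirected(edges):
    """Build an adjacency dict in which each edge can be followed both ways."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

# Hypothetical undirected graph containing the links used by ACBDCDEG.
g = undirected([("A", "C"), ("C", "B"), ("B", "D"),
                ("D", "C"), ("D", "E"), ("E", "G")])
print(is_path(g, "ACBDCDEG"))  # True: the edge C-D is used more than once

# Hypothetical directed graph with a single arrow B -> D.
directed = {"B": {"D"}}
print(is_path(directed, "BD"), is_path(directed, "DB"))  # True False
```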

The Shortest Path
We are not interested in paths in general, but in the shortest path: the one with the fewest
hops (edges) needed to get from one node to the other. We quantify the distance $h$ between
a pair of nodes as the length of the shortest path between that pair.
Undirected: we have to traverse 2 edges to get from B to D; this is the minimum number of
edges we have to traverse to go from B to D. We can go through A, but that is longer.
If the graph is disconnected, then there is no shortest path between X and A because
there is no connection for us to traverse.
Directed: the idea is the same, but the path must follow the edge direction. Thus, in
undirected graphs distances are symmetric, while in directed graphs distances are not
symmetric. E.g. $h_{B,D} = 2$ but $h_{D,B} = \infty$, because we cannot traverse the
edges in the opposite direction.
Q: What is the distance from a node to itself?
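In an unweighted graph, shortest-path distances can be found with breadth-first search. The following sketch is one way to do it; the function names and the small directed graph are assumptions for illustration (the graph is built so that B reaches D in 2 edges directly and in 3 edges via A, in the spirit of the slide example, not copied from the figure).

```python
from collections import deque
import math

def bfs_distances(adj, source):
    """Return {node: hop count} for every node reachable from `source`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:          # first visit is along a shortest path
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def distance(adj, s, t):
    """Shortest-path distance h_{s,t}; infinity if t is unreachable from s."""
    return bfs_distances(adj, s).get(t, math.inf)

# Hypothetical directed graph: B -> C -> D, plus the detour B -> A -> C.
directed = {"A": {"C"}, "B": {"A", "C"}, "C": {"D"}, "D": set()}

print(distance(directed, "B", "D"))  # 2
print(distance(directed, "D", "B"))  # inf
```

Storing each edge in both directions turns the same code into the undirected case, where the two distances are always equal.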

Network Diameter
The network diameter is the largest distance in the network, i.e. the longest shortest
path that exists in the graph. This is the graph-theoretic definition, but in real data the
graph might be disconnected, in which case the diameter is infinite. So what we generally
do is quantify the network by its average shortest path length.
Average Shortest Path
We go over all pairs of nodes and ask what the average shortest path length between
them is. Here is how we could compute it:
$$\bar{h} = \frac{1}{2 E_{max}} \sum_{i,\, j \neq i} h_{ij}$$
where $h_{ij}$ is the distance (the length of the shortest path) from node $i$ to node $j$,
and the sum goes over all pairs $i, j$ with $i \neq j$. $2 E_{max}$ is the normalization
factor: $E_{max}$ is the maximum number of edges, which equals the total number of node
pairs, $n(n - 1)/2$, and the factor 2 accounts for the sum visiting each pair in both
orders. In practice the average is often taken only over pairs of nodes that are connected,
so that infinite distances are ignored.
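The diameter and the average shortest path length can both be obtained by running one breadth-first search per node. This is a minimal sketch assuming an undirected, connected graph with at least two nodes, stored as a dictionary of neighbor sets; the function names and the three-node path graph are hypothetical.

```python
from collections import deque

def bfs_distances(adj, source):
    """Return {node: hop count} for every node reachable from `source`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def average_path_length_and_diameter(adj):
    """Return (h_bar, diameter); assumes a connected graph with n >= 2 nodes."""
    n = len(adj)
    e_max = n * (n - 1) / 2            # number of node pairs
    total, diameter = 0, 0
    for i in adj:
        dist = bfs_distances(adj, i)
        if len(dist) < n:
            raise ValueError("graph is disconnected: some distances are infinite")
        total += sum(dist.values())    # h_ii = 0 adds nothing to the sum
        diameter = max(diameter, max(dist.values()))
    return total / (2 * e_max), diameter

# Hypothetical path graph A - B - C.
path_graph = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
h_bar, diam = average_path_length_and_diameter(path_graph)
print(h_bar, diam)  # average is 8/6 (about 1.33), diameter is 2
```

The sum over ordered pairs here is 2 * (1 + 1 + 2) = 8 and 2 * E_max = 6, which is where the value 8/6 comes from.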

Example

Clustering coefficient: C
This quantity has real applications in social network analysis. We define it by
asking whether edges cluster in the network. By clustering we mean: do edges appear
more densely in certain parts of the network, i.e. do social communities exist in
the network?
The Mathematical Definition
We can quantify this mathematically by asking what proportion of a node's neighbors
are connected among themselves. For a node $i$ with degree $k_i$, the local
clustering coefficient is defined as:
$$C_i = \frac{2 e_i}{k_i (k_i - 1)}, \qquad C_i \in [0, 1]$$
where $e_i$ represents the number of links between the $k_i$ neighbors of node $i$.
$C_i = 0$ if none of the neighbors of node $i$ link to each other. $C_i = 1$ if the
neighbors of node $i$ form a complete graph (i.e., they all link to each other).
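The local clustering coefficient translates almost line by line into code. The sketch below assumes an undirected graph stored as a dictionary of neighbor sets; the function name `local_clustering` and the example graph are hypothetical, and returning 0 for nodes with fewer than two neighbors is one possible convention rather than the only one.

```python
from itertools import combinations

def local_clustering(adj, i):
    """C_i = 2 * e_i / (k_i * (k_i - 1)) for node i of an undirected graph."""
    neighbors = adj[i]
    k = len(neighbors)
    if k < 2:
        return 0.0  # assumed convention: no neighbor pairs, so treat C_i as 0
    # e_i: number of links among the neighbors of i
    e = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2 * e / (k * (k - 1))

# Hypothetical graph: node "i" has neighbors a, b, c, d,
# and the neighbors are linked as a - b, b - c, c - d.
graph = {
    "i": {"a", "b", "c", "d"},
    "a": {"i", "b"},
    "b": {"i", "a", "c"},
    "c": {"i", "b", "d"},
    "d": {"i", "c"},
}
print(local_clustering(graph, "i"))  # 0.5, since e_i = 3 and k_i = 4
```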

Clustering coefficient: C
$C_i$ is the probability that two neighbors of a node link to each other. So for every
node we ask what fraction of its friends are also friends with each other. In social
networks this is known as triadic closure: if two people have a friend in common, they
are likely to become friends as well.
What we see in social networks is a high clustering coefficient: people tend to group
into densely connected communities where there are many friendships among the same
set of people.
This is how we define the clustering coefficient of a node. To quantify the whole
network, we compute the average over all the nodes $i$.
Average clustering coefficient:
$$C = \frac{1}{N} \sum_{i=1}^{N} C_i$$

Examples
What portion of $i$'s neighbors are connected?
$$C_i = \frac{2 e_i}{k_i (k_i - 1)}$$
Q: If a node has a clustering coefficient of 0, must it be a bridge? (Consider a cycle.)
Q: What about A, G, F (degree-1 nodes), which have no possibility of forming clusters?
We either define their coefficient as zero or ignore them.
[Figure: example graphs (a) and (b)]
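Both conventions for low-degree nodes can be compared when averaging over the network. The following sketch is an illustration only: the function names, the `count_low_degree_as_zero` flag, and the two example graphs (a triangle with one pendant node, and a four-node cycle) are assumptions and are not the graphs from the slide.

```python
from itertools import combinations

def local_clustering(adj, i):
    """Return C_i, or None when node i has fewer than two neighbors."""
    neighbors = adj[i]
    k = len(neighbors)
    if k < 2:
        return None
    e = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2 * e / (k * (k - 1))

def average_clustering(adj, count_low_degree_as_zero=True):
    """Average C_i over nodes; low-degree nodes count as 0 or are ignored."""
    values = []
    for i in adj:
        c = local_clustering(adj, i)
        if c is not None:
            values.append(c)
        elif count_low_degree_as_zero:
            values.append(0.0)
    return sum(values) / len(values) if values else 0.0

# Hypothetical graph: triangle A-B-C with a degree-1 node D attached to A.
g = {"A": {"B", "C", "D"}, "B": {"A", "C"}, "C": {"A", "B"}, "D": {"A"}}
print(average_clustering(g))                                  # (1/3 + 1 + 1 + 0) / 4
print(average_clustering(g, count_low_degree_as_zero=False))  # (1/3 + 1 + 1) / 3

# Hypothetical 4-cycle: every node has C_i = 0, yet no edge is a bridge.
cycle = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {3, 1}}
print(average_clustering(cycle))  # 0.0
```

The two conventions give different averages for the same graph, so it is worth stating which one is used when reporting a value.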

Connectivity
Connectivity is measured by the size of the largest connected component, i.e. the largest
set of vertices in which any two vertices can be joined by a path (largest component =
giant component).
How to find connected components:
• Start from a random node and perform Breadth First Search (BFS).
• Label the nodes that BFS visited.
• If all nodes are visited, the network is connected.
• Otherwise, find an unvisited node and repeat BFS.
Note that the BFS algorithm is used to search a graph data structure for a node that
meets a set of criteria. It starts at the root of the graph and visits all nodes at the
current depth level before moving on to the nodes at the next depth level.
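The BFS procedure for finding connected components can be sketched as follows. This is an illustrative implementation, not the lecture's own code: the function name and the example graph are hypothetical, and start nodes are taken in dictionary order rather than at random, which does not change the components found.

```python
from collections import deque

def connected_components(adj):
    """Return a list of node sets, one per connected component (undirected graph)."""
    visited = set()
    components = []
    for start in adj:
        if start in visited:
            continue                   # already labeled by an earlier BFS
        component = {start}
        visited.add(start)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in visited:
                    visited.add(v)
                    component.add(v)
                    queue.append(v)
        components.append(component)
    return components

# Hypothetical graph with three components: A-B-C, X-Y, and the isolated node Z.
g = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"},
     "X": {"Y"}, "Y": {"X"}, "Z": set()}

parts = connected_components(g)
giant = max(parts, key=len)
print(len(parts), sorted(giant))  # 3 ['A', 'B', 'C']
```

The network is connected exactly when the returned list contains a single component.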

Social Networks Modeling
Social networks can be represented and measured by using two basic mathematical models.
The random graph model
Various models of random graphs have been proposed for social networks, such as:
• Erdős–Rényi model
• Small-world model (SWM)
• Preferential attachment model
• Forest-fire model
The scale-free graph model
A network or graph is known as a scale-free network if its degree distribution follows
a power law, at least asymptotically. Examples of scale-free network models are:
• Barabási–Albert model (BAM)
• Bianconi–Barabási model (BBM)
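As one concrete instance of the random graph family listed above, the Erdős–Rényi model in its G(n, p) form links each of the n(n − 1)/2 node pairs independently with probability p. The lecture only names the model, so the sketch below is an assumption-laden illustration: the function names, the parameter values and the fixed seed are all hypothetical.

```python
import random
from itertools import combinations

def gnp_random_graph(n, p, seed=None):
    """Sample an undirected G(n, p) graph as a dict of neighbor sets."""
    rng = random.Random(seed)          # local generator, so runs are repeatable
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:           # include each pair with probability p
            adj[u].add(v)
            adj[v].add(u)
    return adj

def edge_count(adj):
    """Each undirected edge appears in two neighbor sets."""
    return sum(len(nb) for nb in adj.values()) // 2

g = gnp_random_graph(n=10, p=0.3, seed=42)
print(edge_count(g), "edges out of", 10 * 9 // 2, "possible")
```

A generated graph like this can be passed to the same degree, path length and clustering measurements used for real networks, which is how artificial graphs are compared against observed ones.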
