This document applies graph theory and graph algorithms to the Instagram social network. It examines the follow and like graphs of Instagram users and compares properties such as the average clustering coefficient and the average degree of separation against theoretical scale-free and small-world graphs. Sampling algorithms such as Biased Random Walk with Fly Back are also presented and used to sample subgraphs from the large Instagram graphs for analysis.
3. Dataset
Follow graph
• Two users are connected if they follow each other
Like graph
• Two users are connected if they like the same photo
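The two edge definitions above can be sketched in code. This is a minimal illustration only: the input shapes, (follower, followee) pairs and (user, photo) pairs, and the function names build_follow_graph and build_like_graph are assumptions, not the dataset format used in this work.

```python
from collections import defaultdict
from itertools import combinations


def build_follow_graph(follows):
    """follows: iterable of (follower, followee) pairs (assumed input shape).

    Returns an undirected adjacency dict with an edge only where
    both users follow each other.
    """
    follows = set(follows)
    graph = defaultdict(set)
    for a, b in follows:
        # Keep the edge only if the reverse follow also exists.
        if a != b and (b, a) in follows:
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def build_like_graph(likes):
    """likes: iterable of (user, photo) pairs (assumed input shape).

    Returns an undirected adjacency dict with an edge between any
    two users who liked the same photo.
    """
    likers = defaultdict(set)
    for user, photo in likes:
        likers[photo].add(user)
    graph = defaultdict(set)
    for users in likers.values():
        # Every pair of users who liked this photo becomes an edge.
        for a, b in combinations(sorted(users), 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)
```

Users without any qualifying edge do not appear in the returned dict; whether isolated users should be kept as vertices is a choice the slides do not state.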
5. Average Degree of Separation
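let dist be a |V| × |V| array initialized to ∞
for each vertex v: dist[v][v] = 0
for each edge (u, v): dist[u][v] = dist[v][u] = 1
for k from 1 to |V|
    for i from 1 to |V|
        for j from 1 to |V|
            if dist[i][j] > dist[i][k] + dist[k][j]
                dist[i][j] = dist[i][k] + dist[k][j]
average degree of separation = average of dist[i][j] over all pairs i ≠ j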
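The all-pairs shortest-path loop above (the Floyd-Warshall algorithm) could be turned into a runnable function along these lines. This is a sketch under assumptions: the graph is an undirected, unweighted adjacency dict in which every neighbor also appears as a key, and pairs with no connecting path are left out of the average. The function name is hypothetical.

```python
import math


def average_degree_of_separation(graph):
    """graph: dict mapping each vertex to a set of neighbors (assumed shape)."""
    vertices = list(graph)
    # Start with "infinitely far apart" for every pair.
    dist = {u: {v: math.inf for v in vertices} for u in vertices}
    for u in vertices:
        dist[u][u] = 0
        for v in graph[u]:
            dist[u][v] = 1  # every edge has length 1 (unweighted graph)
    # Floyd-Warshall: allow each vertex k in turn as an intermediate stop.
    for k in vertices:
        for i in vertices:
            for j in vertices:
                via_k = dist[i][k] + dist[k][j]
                if dist[i][j] > via_k:
                    dist[i][j] = via_k
    # Assumption: unreachable pairs are skipped rather than counted as infinite.
    finite = [dist[i][j] for i in vertices for j in vertices
              if i != j and dist[i][j] < math.inf]
    return sum(finite) / len(finite) if finite else 0.0
```

The triple loop takes time proportional to |V|³, which is one reason to work on sampled subgraphs rather than the full graph.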
6. Average Clustering Coefficient
for each vertex v in the graph {
    get the neighbors of v
    k = number of neighbors
    n = number of edges between the neighbors
    clustering coefficient = n / (k(k − 1) / 2)
}
calculate the average over all vertices
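The procedure above can be sketched as a short function. Two points here are assumptions rather than something the slide states: the graph is an undirected adjacency dict in which every neighbor also appears as a key, and a vertex with fewer than two neighbors is given a coefficient of 0 (the formula divides by zero in that case). The function name is hypothetical.

```python
def average_clustering_coefficient(graph):
    """graph: dict mapping each vertex to a set of neighbors (assumed shape)."""
    if not graph:
        return 0.0
    total = 0.0
    for v in graph:
        neighbors = list(graph[v])
        k = len(neighbors)
        if k < 2:
            continue  # assumption: coefficient counted as 0 for such vertices
        # n = number of edges that connect two neighbors of v
        n = sum(1
                for idx, a in enumerate(neighbors)
                for b in neighbors[idx + 1:]
                if b in graph[a])
        total += n / (k * (k - 1) / 2)
    return total / len(graph)
```

For a triangle with one extra vertex hanging off a corner, the coefficients are 1/3, 1, 1 and 0, so the average is 7/12.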
7. Different graphs
Small world graph
• High average clustering coefficient
• Low average degree of separation
Scale free graph
• Small average clustering coefficient
• Low average degree of separation
8. Comparing user graphs
[Figure: two charts. Clustering coefficient chart: average clustering coefficient against vertices. Degree of separation chart: average degree of separation against vertices. Each chart compares the Instagram, scale free, and small world graphs.]
9. Comparing photo graphs
[Figure: two charts. Clustering coefficient chart: average clustering coefficient against vertices. Degree of separation chart: average degree of separation against vertices. Each chart compares the Instagram, scale free, and small world graphs.]
10. Sampling from a large graph
• Biased Random Walk with Fly Back
Input: G, h (desired sample size), α
Output: S (sample graph)
Pick a random node p from G
while (number of vertices in S < h)
    foundANewNode = false
    while !foundANewNode
        choose a neighbor y of p with probability B(p, y)
        if y already exists in S
            p <- y
        else
            foundANewNode = true
            add y to S
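A minimal sketch of this sampling loop follows. It moves to y when y is already in S and adds y otherwise, and it includes no separate fly-back step because the pseudocode does not show one. Assumptions: the graph is an undirected, connected adjacency dict with at least two vertices, B(p, y) is proportional to the degree of y raised to α, and the function returns the sampled vertex set rather than a subgraph. The function name and the rng parameter are hypothetical.

```python
import random


def biased_random_walk_sample(graph, h, alpha, rng=None):
    """graph: dict vertex -> set of neighbors (assumed undirected, connected).

    h: desired number of vertices in the sample.
    alpha: bias exponent applied to the degree of each candidate neighbor.
    rng: optional random.Random instance, for reproducible runs.
    """
    rng = rng or random.Random()
    if h > len(graph):
        raise ValueError("h exceeds the number of vertices in the graph")
    p = rng.choice(sorted(graph))  # random starting node
    sample = set()
    while len(sample) < h:
        found_a_new_node = False
        while not found_a_new_node:
            neighbors = sorted(graph[p])
            # Unnormalized B(p, y): degree(y) ** alpha for each neighbor y.
            weights = [len(graph[y]) ** alpha for y in neighbors]
            y = rng.choices(neighbors, weights=weights, k=1)[0]
            if y in sample:
                p = y  # already sampled: keep walking from y
            else:
                found_a_new_node = True
                sample.add(y)
    return sample
```

The induced subgraph on the returned vertices could then be used as S. With α = 0 every neighbor is equally likely; a negative α favors low-degree neighbors.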
12. Sampling from a large graph
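$$B(x, y) = \frac{[\mathrm{degree}(y)]^{\alpha}}{\sum_{n \in \Gamma(x)} [\mathrm{degree}(n)]^{\alpha}}$$
where Γ(x) is the set of neighbors of x.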
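As a small worked illustration, B(x, y) can be computed for every neighbor y of x as below. This sketch assumes the sum in the denominator runs over the degrees of the neighbors n of x, and that the graph is an undirected adjacency dict; the function name is hypothetical.

```python
def transition_probabilities(graph, x, alpha):
    """Return {y: B(x, y)} for every neighbor y of x.

    graph: dict mapping each vertex to a set of neighbors (assumed shape).
    """
    # Numerator for each neighbor: degree(y) ** alpha.
    weights = {y: len(graph[y]) ** alpha for y in graph[x]}
    # Denominator: the same quantity summed over all neighbors of x.
    total = sum(weights.values())
    return {y: w / total for y, w in weights.items()}
```

For a vertex x with two neighbors of degree 1 and 3, α = 1 gives probabilities 1/4 and 3/4, α = 0 gives 1/2 each, and α = -1 gives 3/4 and 1/4.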
14. Sampling from a large graph
• Tiny sampler (by Harish Sethu and Xiaoyu Chu)
Input: G, h (desired sample size)
Output: S (sample graph)
D = degree exponent of MRW(G,h)
S0 = BRW-FB(G,h,0)
D0 = degree exponent of S0
S1 = BRW-FB(G,h,-1)
D1 = degree exponent of S1
α = -((D-D0)/(D1-D0))
S = BRW-FB(G,h,α)
• BRW-FB -> Biased Random Walk with Fly Back
• MRW -> Metropolized Random Walk
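The α formula in the Tiny sampler steps is a linear interpolation between the two probe runs: α = 0 produced degree exponent D0 and α = -1 produced D1, so the α expected to produce the target D lies proportionally between them. A minimal sketch of just that step follows; the function name is hypothetical, and raising an error when D1 equals D0 is an assumption, since the slide does not say what happens in that case.

```python
def interpolate_alpha(d_target, d0, d1):
    """Compute alpha = -((D - D0) / (D1 - D0)).

    d_target: degree exponent D that the final sample should match.
    d0: degree exponent of the sample drawn with alpha = 0.
    d1: degree exponent of the sample drawn with alpha = -1.
    """
    if d1 == d0:
        # Assumption: with identical probe exponents the formula is undefined.
        raise ValueError("D0 and D1 are equal; alpha cannot be interpolated")
    return -((d_target - d0) / (d1 - d0))
```

If D equals D0 the result is 0, and if D equals D1 the result is -1, matching the two probe settings.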