Social Network Analysis

Introductory presentation on static social network models.

Published in: Technology

  1. Social Network Analysis
     Enrico Franchi (efranchi@ce.unipr.it), AOT Lab, DII, UNIPR
  2. Outline
     • SNA = Complex Network Analysis on Social Networks
     • Notation & Metrics: Degree Distribution, Path Lengths, Transitivity
     • Models: Random Graphs, Small-Worlds, Preferential Attachment Models
     • Discussion
     • Conclusion
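  3. Network
     $G = (V, E)$, $E \subset V^2$, with no self-loops: $\{(x, x) \mid x \in V\} \cap E = \emptyset$.
     Adjacency matrix: $A_{ij} = 1$ if $(i, j) \in E$, $0$ otherwise.
     Directed network: $k_i^{out} = \sum_j A_{ij}$, $k_i^{in} = \sum_j A_{ji}$, $k_i = k_i^{in} + k_i^{out}$.
     Undirected network: $A$ is symmetric and $k_i = \sum_j A_{ji} = \sum_j A_{ij}$.
     Degree distribution: $p_x = \frac{1}{n} \#\{i \mid k_i = x\}$.
     Average degree: $\langle k \rangle = n^{-1} \sum_{x \in V} k_x$.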
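The degree definitions on this slide can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the graph is given as a 0/1 list-of-lists adjacency matrix, and the function names `degrees`, `degree_distribution` and `average_degree` are hypothetical helpers, not part of the presentation.

```python
from collections import Counter


def degrees(adj):
    """Return (k_in, k_out) for a 0/1 adjacency matrix where adj[i][j] == 1 iff (i, j) is an edge."""
    n = len(adj)
    k_out = [sum(adj[i][j] for j in range(n)) for i in range(n)]  # row sums
    k_in = [sum(adj[j][i] for j in range(n)) for i in range(n)]   # column sums
    return k_in, k_out


def degree_distribution(ks):
    """p_x = (1/n) * #{i : k_i = x}, returned as a dict x -> p_x."""
    n = len(ks)
    return {x: c / n for x, c in sorted(Counter(ks).items())}


def average_degree(ks):
    """<k> = (1/n) * sum of the degrees."""
    return sum(ks) / len(ks)


# Undirected example: a star centred on node 1 (the matrix is symmetric).
A = [
    [0, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]
k_in, k_out = degrees(A)
ks = k_out  # for an undirected network k_in == k_out, so either one is the degree
print(ks, degree_distribution(ks), average_degree(ks))
```

For a directed network the total degree would instead be the element-wise sum of `k_in` and `k_out`.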
  4. Measure of Transitivity
     Local clustering coefficient: $C_i = \binom{k_i}{2}^{-1} T(i)$, where $T(i)$ is the number of distinct triangles with $i$ as a vertex.
     Clustering coefficient: $C = \frac{1}{n} \sum_{i \in V} C_i$.
     Clustering coefficient as a ratio of paths: $C = \frac{\text{number of closed paths of length 2}}{\text{number of paths of length 2}} = \frac{(\text{number of triangles}) \times 3}{\text{number of connected triples}}$.
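As a rough sketch of both clustering measures, the following plain-Python code computes the average of the local coefficients and the triangle-based ratio on a tiny graph. The adjacency-set representation, the function names, and the convention $C_i = 0$ for nodes with fewer than two neighbours are assumptions made for the example.

```python
from itertools import combinations


def closed_pairs(neighbors, i):
    """Number of neighbour pairs of i that are themselves linked, i.e. T(i)."""
    return sum(1 for u, v in combinations(sorted(neighbors[i]), 2) if v in neighbors[u])


def local_clustering(neighbors, i):
    """C_i = T(i) / binom(k_i, 2); taken to be 0 when k_i < 2 (assumed convention)."""
    k = len(neighbors[i])
    if k < 2:
        return 0.0
    return closed_pairs(neighbors, i) / (k * (k - 1) / 2)


def average_clustering(neighbors):
    """C = (1/n) * sum of the local coefficients."""
    return sum(local_clustering(neighbors, i) for i in neighbors) / len(neighbors)


def transitivity(neighbors):
    """C = 3 * (number of triangles) / (number of connected triples)."""
    closed = sum(closed_pairs(neighbors, i) for i in neighbors)  # each triangle is counted 3 times
    triples = sum(len(n) * (len(n) - 1) // 2 for n in neighbors.values())
    return closed / triples if triples else 0.0


# Triangle 0-1-2 with a pendant node 3 attached to node 2.
G = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(average_clustering(G), transitivity(G))
```

On this example the two definitions give different values (7/12 and 3/5), which shows that they are distinct measures in general.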
  5. Shortest Path Length and Diameter
     The matrix product depends on the operations of the semi-ring $(\mathcal{A}, +, \cdot)$, where $\mathcal{A}$ is the set of adjacency matrices and $+$, $\cdot$ are the scalar operations:
     $AB = A\ {+}.{\cdot}\ B$, $[AB]_{ij} = \sum_k A_{ik} \cdot B_{kj}$.
     Other matrix products make sense, e.g., over $(\mathcal{A}, +, \wedge)$ or $(\mathcal{A}, \wedge, +)$, where $\wedge$ denotes min.
     We consider: S_k(M) = M +.^ M k ^.+ M k
     Shortest path lengths matrix: $L = (S_n \circ \dots \circ S_1)(M)$.
     Diameter: $d = \max_{ij} L_{ij}$.
     Average shortest path: $\ell = \langle L_{ij} \rangle$.
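The idea of swapping the semi-ring operations can be illustrated with the (min, +) product: replacing the sum with min and the scalar product with + turns matrix multiplication into a shortest-path relaxation. The sketch below uses repeated squaring of a distance matrix, which is a standard technique and not necessarily the exact $S_k$ composition of the slide; all function names are hypothetical.

```python
INF = float("inf")


def min_plus(A, B):
    """(min, +) matrix product: [A x B]_ij = min_k (A_ik + B_kj)."""
    n = len(A)
    return [[min(A[i][k] + B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def shortest_path_lengths(adj):
    """All-pairs shortest path lengths of an unweighted graph given as a 0/1 adjacency matrix."""
    n = len(adj)
    # Distance matrix for paths of at most one hop: 0 on the diagonal, 1 on edges, inf elsewhere.
    L = [[0 if i == j else (1 if adj[i][j] else INF) for j in range(n)] for i in range(n)]
    hops = 1
    while hops < n - 1:  # squaring doubles the maximum number of hops covered
        L = min_plus(L, L)
        hops *= 2
    return L


def diameter(L):
    """d = max_ij L_ij (infinite if the graph is disconnected)."""
    return max(max(row) for row in L)


def average_shortest_path(L):
    """Mean of L_ij over ordered pairs i != j (assumes a connected graph)."""
    n = len(L)
    return sum(L[i][j] for i in range(n) for j in range(n) if i != j) / (n * (n - 1))


# Path graph 0 - 1 - 2 - 3.
P = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
L = shortest_path_lengths(P)
print(L[0][3], diameter(L), average_shortest_path(L))
```

Each squaring costs $O(n^3)$ scalar operations in this naive form, and about $\log_2 n$ squarings are needed.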
  6. Computational Complexity of ASPL
     • All pairs shortest path, matrix based (parallelizable): $O(n^{3+\alpha})$, $\alpha \approx 3/4$
     • All pairs shortest path, Bellman-Ford: $O(n^3)$
     • All pairs shortest path, Dijkstra with Fibonacci heaps: $O(n^2 \log n + nm)$
     Computing the CPL
     • $x = M_q(S)$: $q \#S$ elements are $\le x$ and $(1-q) \#S$ are $> x$.
     • $x = L_{q\delta}(S)$: $q \#S (1-\delta)$ elements are $\le x$ and $(1-q) \#S (1-\delta)$ are $> x$.
     Huber Algorithm
     Let $R$ be a random sample of $S$ such that $\#R = s$, with $s = \frac{2}{q^2} \ln\frac{2}{\epsilon} \frac{(1-\delta)^2}{\delta^2}$; then $L_{q\delta}(S) = M_q(R)$ with probability $p = 1 - \epsilon$.
  7. $s = \frac{2}{q^2} \ln\frac{2}{\epsilon} \frac{(1-\delta)^2}{\delta^2}$
  8. Facebook Hugs Degree Distribution
     Nodes: 1322631; Edges: 1555597; m/n: 1.17; CPL: 11.74; Clustering Coefficient: 0.0527; Number of Components: 18987; Isles: 0; Largest Component Size: 1169456
     [Plot: degree distribution on log-log axes]
     • For large k we have statistical fluctuations.
     • For small k power-laws do not hold.
  9. Many networks have a power-law degree distribution: $p_k \propto k^{-\gamma}$, $\gamma > 1$; $\langle k^r \rangle = ?$
     • Citation networks
     • Biological networks
     • WWW graph
     • Internet graph
     • Social Networks
     [Plot: power-law with gamma = 3 on log-log axes]
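  10. Erdös-Rényi Random Graphs
      Connectedness threshold: $\log n / n$
      $G(n, p)$ and $G(n, m)$ are ensembles of graphs, with $\Pr(A_{ij} = 1) = p$.
      When we describe values of properties, we actually mean the expected value of the property:
      $d := \langle d \rangle = \sum_G \Pr(G) \cdot d(G) \propto \frac{\log n}{\log \langle k \rangle}$
      $\Pr(G) = p^m (1-p)^{\binom{n}{2} - m}$
      $\langle m \rangle = \binom{n}{2} p$, $\langle k \rangle = (n-1) p$, $C = \langle k \rangle (n-1)^{-1}$
      $p_k = \binom{n-1}{k} p^k (1-p)^{n-1-k}$; for $n \to \infty$: $p_k = e^{-\langle k \rangle} \frac{\langle k \rangle^k}{k!}$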
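A minimal sketch of sampling one graph from the $G(n, p)$ ensemble and comparing it with the expected values above. The function names, the seed, and the parameter values are illustrative assumptions; a single sample only approximates the ensemble averages.

```python
import random
from math import comb


def gnp(n, p, rng):
    """Sample an undirected G(n, p) graph: each pair {i, j} is an edge independently with probability p."""
    return {(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p}


def expected_edges(n, p):
    """<m> = binom(n, 2) * p."""
    return comb(n, 2) * p


def expected_degree(n, p):
    """<k> = (n - 1) * p."""
    return (n - 1) * p


rng = random.Random(42)  # fixed seed so the run is reproducible
n, p = 200, 0.05
E = gnp(n, p, rng)
mean_degree = 2 * len(E) / n  # every edge contributes to two degrees
print(len(E), expected_edges(n, p), mean_degree, expected_degree(n, p))
```

The quadratic loop over all pairs is fine for small graphs; for large sparse graphs one would normally use a skipping technique or a library generator instead.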
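  11. Watts-Strogatz Model
      In the modified model, we only add the edges.
      $k_i = \kappa + s_i$, where $\kappa$ counts the edges in the lattice and $s_i$ the number of added shortcuts.
      $p_s = e^{-\kappa p} \frac{(\kappa p)^s}{s!}$
      $p_k = e^{-\kappa p} \frac{(\kappa p)^{k-\kappa}}{(k-\kappa)!}$
      $C = \frac{3(\kappa - 2)}{4(\kappa - 1) + 8\kappa p + 4\kappa p^2}$
      $\ell \approx \frac{\log(np\kappa)}{\kappa^2 p}$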
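The modified model can be sketched as a ring lattice plus randomly added shortcuts. The construction below is an assumption about the details: $\kappa$ is even, one shortcut is drawn with probability $p$ per lattice edge, and shortcut endpoints are chosen uniformly at random (so self-loops and repeated shortcuts are possible, which keeps the mean number of shortcut ends per node at $\kappa p$). The function names are hypothetical.

```python
import random


def ring_lattice(n, kappa):
    """Ring of n nodes, each linked to its kappa/2 nearest neighbours on each side (kappa even, n > kappa)."""
    edges = set()
    for i in range(n):
        for d in range(1, kappa // 2 + 1):
            j = (i + d) % n
            edges.add((min(i, j), max(i, j)))
    return edges


def add_shortcuts(n, lattice_edges, p, rng):
    """For every lattice edge, add with probability p one shortcut between two uniformly random nodes."""
    shortcuts = []
    for _ in lattice_edges:
        if rng.random() < p:
            shortcuts.append((rng.randrange(n), rng.randrange(n)))
    return shortcuts


def shortcut_ends(n, shortcuts):
    """s_i: number of shortcut ends attached to each node, so that k_i = kappa + s_i."""
    s = [0] * n
    for u, v in shortcuts:
        s[u] += 1
        s[v] += 1
    return s


rng = random.Random(7)
n, kappa, p = 1000, 4, 0.1
lattice = ring_lattice(n, kappa)
shortcuts = add_shortcuts(n, lattice, p, rng)
s = shortcut_ends(n, shortcuts)
print(len(lattice), len(shortcuts), sum(s) / n, kappa * p)
```

The last two printed values compare the sampled mean of $s_i$ with its expected value $\kappa p$.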
  12. Watts-Strogatz Model - 10000 nodes, k = 4
      [Plot: CPL(p)/CPL(0) and C(p)/C(0) versus p, with p from 0 to 1; annotations mark the short CPL threshold and the large clustering coefficient threshold]
  13. [Image] © Matt Britt
  14. Barabási-Albert Model
      Connectedness threshold: $\log n / \log \log n$

      BARABASI-ALBERT-MODEL(G, M0, STEPS)
          FOR K FROM 1 TO STEPS
              N0 ← NEW-NODE(G)
              ADD-NODE(G, N0)
              A ← MAKE-ARRAY()
              FOR N IN NODES(G)
                  PUSH(A, N)
                  FOR J IN DEGREE(N)
                      PUSH(A, N)
              FOR J FROM 1 TO M
                  N ← RANDOM-CHOICE(A)
                  ADD-LINK(N0, N)

      $\Pr(V = x) = \sum_{e \in N(x)} \Pr(E = e) = \frac{k_x}{m} = \frac{2 k_x}{\sum_x k_x}$
      $p_k \propto k^{-3}$
      $\ell \approx \frac{\log n}{\log \log n}$: scale-free entails short CPL.
      $C \approx n^{-3/4}$: transitivity disappears with network size. No analytical proof available.
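The pseudocode on this slide can be turned into a small runnable Python sketch. It keeps the same urn idea (each node appears once plus once per incident edge, so it is chosen with weight $k + 1$), but deviates in two labeled ways: the new node is excluded from the urn so that no self-loops arise, and targets are redrawn until they are distinct so that no multi-edges arise. The function name and parameters (`initial_nodes`, `m`, `steps`) are assumptions for the example, with `m` playing the role of M.

```python
import random


def barabasi_albert(initial_nodes, m, steps, rng):
    """Grow a graph by preferential attachment; returns a dict node -> set of neighbours."""
    graph = {i: set() for i in range(initial_nodes)}
    for _ in range(steps):
        # Urn of candidates: every existing node once, plus once per incident edge.
        urn = []
        for node, nbrs in graph.items():
            urn.extend([node] * (len(nbrs) + 1))
        new = len(graph)          # id of the node being added
        existing = new            # number of nodes already in the graph
        graph[new] = set()
        # Assumption: draw until min(m, existing) distinct targets are found.
        targets = set()
        while len(targets) < min(m, existing):
            targets.add(rng.choice(urn))
        for t in targets:
            graph[new].add(t)
            graph[t].add(new)
    return graph


g = barabasi_albert(initial_nodes=3, m=2, steps=50, rng=random.Random(1))
num_edges = sum(len(nbrs) for nbrs in g.values()) // 2
print(len(g), num_edges, max(len(nbrs) for nbrs in g.values()))
```

Rebuilding the urn at every step mirrors the pseudocode but costs time proportional to the number of edges per step; an implementation meant for large graphs would update the urn incrementally.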
  15. Measured properties of online social networks
      OSN | Refs. | Users | Links | <k> | C | CPL | d | γ | r
      Club Nexus | Adamic et al | 2.5 K | 10 K | 8.2 | 0.2 | 4 | 13 | n.a. | n.a.
      Cyworld | Ahn et al | 12 M | 191 M | 31.6 | 0.2 | 3.2 | 16 | -0.1
      Cyworld T | Ahn et al | 92 K | 0.7 M | 15.3 | 0.3 | 7.2 | n.a. | n.a. | 0.4
      LiveJournal | Mislove et al | 5 M | 77 M | 17 | 0.3 | 5.9 | 20 | 0.2
      Flickr | Mislove et al | 1.8 M | 22 M | 12.2 | 0.3 | 5.7 | 27 | 0.2
      Twitter | Kwak et al | 41 M | 1700 M | n.a. | n.a. | 4 | 4.1 | n.a.
      Orkut | Mislove et al | 3 M | 223 M | 106 | 0.2 | 4.3 | 9 | 1.5 | 0.1
      Orkut | Ahn et al | 100 K | 1.5 M | 30.2 | 0.3 | 3.8 | n.a. | 3.7 | 0.3
      Youtube | Mislove et al | 1.1 M | 5 M | 4.29 | 0.1 | 5.1 | 21 | -0
      Facebook | Gjoka et al | 1 M | n.a. | n.a. | 0.2 | n.a. | n.a. | 0.23
      FB H | Nazir et al | 51 K | 116 K | n.a. | 0.4 | n.a. | 29 | n.a.
      FB GL | Nazir et al | 277 K | 600 K | n.a. | 0.3 | n.a. | 45 | n.a.
      BrightKite | Scellato et al | 54 K | 213 K | 7.88 | 0.2 | 4.7 | n.a. | n.a.
      FourSquare | Scellato et al | 58 K | 351 K | 12 | 0.3 | 4.6 | n.a. | n.a.
      LiveJournal | Scellato et al | 993 K | 29.6 M | 29.9 | 0.2 | 4.9 | n.a. | n.a.
      Twitter | Java et al | 87 K | 829 K | 18.9 | 0.1 | n.a. | 6 | 0.59
      Twitter | Scellato et al | 409 K | 183 M | 447 | 0.2 | 2.8 | n.a. | n.a.
  16. Model comparison
      Model | Static | Deg | C | Rigid
      ER | Yes | Poisson | Low | -
      WS | Yes | Poisson | Ok | Yes
      BA | No | PL γ=3 | Fixable | Yes
      Moreover:
      • Mostly no navigability
      • Uniformity assumption
      • Sometimes too complex for analytic study
      • Few features studied
      • Power-law?
  17. Alternative models for degree distributions
      Power-laws are difficult to fit. When they do, there are often better distributions.
      • Power-law with cutoff almost always fits better than plain power-law: $f(x; \gamma, \beta) = x^{-\gamma} e^{\beta x}$
      • Sometimes the log-normal distribution is more appropriate: $f(x; \sigma, m) = \frac{1}{x \sigma (2\pi)^{1/2}} \exp\left(\frac{-(\log(x/m))^2}{2\sigma^2}\right)$
      • Most of the time random and preferential attachment processes concur: $F(x; r) = 1 - (rm)^{1+r} (x + rm)^{-(1+r)}$, which is scale-free for $r \to 0$ and a negative exponential distribution for $r \to \infty$.
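The three alternative distributions can be written down directly as Python functions, which makes it easy to probe the limiting behaviour of the mixed random/preferential model. This is a sketch under assumptions: the function names are hypothetical, the power-law with cutoff is left unnormalised exactly as written on the slide, and the mixed-model CDF is evaluated in the algebraically equivalent form $1 - \left(\frac{rm}{x + rm}\right)^{1+r}$ to avoid overflow for large $r$.

```python
import math


def power_law_cutoff(x, gamma, beta):
    """Unnormalised f(x; gamma, beta) = x**(-gamma) * exp(beta * x); a negative beta gives a decaying cutoff."""
    return x ** (-gamma) * math.exp(beta * x)


def lognormal_pdf(x, sigma, m):
    """f(x; sigma, m) = exp(-(log(x/m))**2 / (2 sigma**2)) / (x * sigma * sqrt(2 pi))."""
    return math.exp(-(math.log(x / m) ** 2) / (2 * sigma ** 2)) / (x * sigma * math.sqrt(2 * math.pi))


def hybrid_cdf(x, r, m):
    """F(x; r) = 1 - (rm)**(1+r) * (x + rm)**(-(1+r)), computed as 1 - (rm / (x + rm))**(1+r)."""
    return 1 - (r * m / (x + r * m)) ** (1 + r)


m = 2.0
for r in (0.01, 1.0, 1e6):
    print(r, hybrid_cdf(1.0, r, m))
# For very large r the CDF approaches the negative exponential 1 - exp(-x / m).
print(1 - math.exp(-1.0 / m))
```

Comparing the last two printed lines illustrates the $r \to \infty$ limit named on the slide.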
  18. Milgram's Experiment ... and Kleinberg's Intuition
      • Random people from Omaha & Wichita were asked to send a postcard to a person in Boston:
      • Write the name on the postcard
      • Forward the message only to a personally known person who was more likely to know the target
      1st run: 64/296 arrived, most delivered to him by 2 men.
      2nd run: 24/160 arrived, 2/3 delivered by "Mr. Jacobs".
      2 ≤ hops ≤ 10; µ = 5.x
      6 Degrees. CPL, hubs, ...
      [Map: Omaha (Nebraska), Wichita (Kansas), Boston (Massachusetts)]
  19. Biased Preferential Attachment
      At each step:
      • A new node is added to the network and is assigned to one of the sets P, I and L according to a probability distribution $h$.
      • $e_0 \in \mathbb{N}^+$ edges are added to the network; for each edge $(u, v)$, $u$ is chosen with distribution $D_0$ and:
        if $u \in I$, $v$ is a new node and is assigned to P;
        if $u \in L$, $v$ is chosen according to $D_\gamma$.
      $D_\beta(u) \propto (\beta + 1)(k_u + 1)$ if $u \in L$; $k_u + 1$ if $u \in I$; $0$ if $u \in P$.
      No analytic results available.
  20. Transitive Linking Model [Davidsen 02]
      At each step:
      • TL: a random node is chosen, and it introduces two other nodes that are linked to it; if the node does not have 2 edges, it introduces itself to a random node.
      • RM: with probability p a node is chosen and removed along with its edges, and replaced with a node with one random edge.
      When $p \ll 1$ the TL dominates the process:
      • the degree distribution is a power-law with cutoff;
      • $1 - C = p(\langle k \rangle - 1)$, i.e., C is quite large in practice.
      For larger values of p the two different processes concur to form an exponential degree distribution; for $p \approx 1$ the degree distribution is essentially a Poisson distribution.
      Instead of p it would make sense to have distinct p and r parameters for nodes leaving and entering the network.
      Few analytic results available.
      (Embedded slide from: Bergenti, Franchi, Poggi (Univ. Parma), Models for Agent-based Simulation of SN, SNAMAS '11.)
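The two steps of the transitive linking model (TL and RM) can be sketched as a small simulation. The details below are assumptions where the slide is silent: the graph starts with no edges, nodes are picked uniformly at random, the replacement node gets a fresh integer id, and an introduction between two already linked nodes is a no-op. The function names are hypothetical.

```python
import random


def transitive_linking_step(graph, p, rng, next_id):
    """Perform one TL step and, with probability p, one RM step. Returns the next free node id."""
    nodes = list(graph)
    # TL: a random node introduces two of its neighbours to each other.
    i = rng.choice(nodes)
    if len(graph[i]) >= 2:
        u, v = rng.sample(sorted(graph[i]), 2)
        graph[u].add(v)
        graph[v].add(u)
    else:
        # Fewer than 2 edges: the node introduces itself to a random other node.
        j = rng.choice([x for x in nodes if x != i])
        graph[i].add(j)
        graph[j].add(i)
    # RM: remove a random node with its edges and replace it with a node with one random edge.
    if rng.random() < p:
        victim = rng.choice(nodes)
        for nb in graph.pop(victim):
            graph[nb].discard(victim)
        anchor = rng.choice(list(graph))
        graph[next_id] = {anchor}
        graph[anchor].add(next_id)
        next_id += 1
    return next_id


def simulate(n, p, steps, seed=0):
    """Run the model on n initially isolated nodes (n >= 2) and return the adjacency dict."""
    rng = random.Random(seed)
    graph = {i: set() for i in range(n)}
    next_id = n
    for _ in range(steps):
        next_id = transitive_linking_step(graph, p, rng, next_id)
    return graph


g = simulate(n=50, p=0.1, steps=500)
mean_degree = sum(len(nbrs) for nbrs in g.values()) / len(g)
print(len(g), mean_degree)
```

Since every removal is paired with an insertion, the number of nodes stays constant while the edge set keeps turning over.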
  21. References
      [1] Dorogovtsev, S. N. and Mendes, J. F. F. 2003. Evolution of Networks: From Biological Nets to the Internet and WWW (Physics). Oxford University Press, USA.
      [2] Watts, D. J. 2003. Small Worlds: The Dynamics of Networks between Order and Randomness (Princeton Studies in Complexity). Princeton University Press.
      [3] Jackson, M. O. 2010. Social and Economic Networks. Princeton University Press.
      [4] Newman, M. 2010. Networks: An Introduction. Oxford University Press, USA.
      [5] Wasserman, S. and Faust, K. 1994. Social Network Analysis: Methods and Applications (Structural Analysis in the Social Sciences). Cambridge University Press.
      [6] Scott, J. P. 2000. Social Network Analysis: A Handbook. Sage Publications Ltd.
      [7] Kepner, J. and Gilbert, J. 2011. Graph Algorithms in the Language of Linear Algebra (Software, Environments, and Tools). Society for Industrial & Applied Mathematics.
      [8] Cormen, T. H., Leiserson, C. E., Rivest, R. L., and Stein, C. 2009. Introduction to Algorithms. The MIT Press.
      [9] Skiena, S. S. 2010. The Algorithm Design Manual. Springer.
      [10] Bollobas, B. 1998. Modern Graph Theory. Springer.
      [11] Watts, D. J. and Strogatz, S. H. 1998. Collective dynamics of 'small-world' networks. Nature. 393, 6684, 440-442.
      [12] Barabási, A. L. and Albert, R. 1999. Emergence of scaling in random networks. Science. 286, 5439, 509.
      [13] Kleinberg, J. 2000. The small-world phenomenon: an algorithm perspective. Proceedings of the thirty-second annual ACM symposium on Theory of computing. 163-170.
      [14] Milgram, S. 1967. The small world problem. Psychology Today. 2, 1, 60-67.
  22. Thanks for your kind attention.
      Enrico Franchi (efranchi@ce.unipr.it), AOTLAB, Dipartimento Ingegneria dell'Informazione, Università di Parma
