Socialmedia

Published in: Health & Medicine

  1. 1. Statistical Analysis of Social Media. October 5, 2012. Sections: Social Media, Complex Networks, Assortativity.
  2. 2. Recap: 1 Social Media, 2 Complex Networks, 3 Assortativity.
  3. 3. Outline: 1 Social Media, 2 Complex Networks, 3 Assortativity.
  4. 4. M.E.J. Newman. "Six degrees of separation is the idea that everyone is on average approximately six steps away, by way of introduction, from any other person in the world, so that a chain of 'a friend of a friend' statements can be made, on average, to connect any two people in six steps or fewer." (Frigyes Karinthy [1929], Wikipedia)
  5. 5. Social Media: "What is Social Media?"
  6. 6. Social Media: "Social Media employ web- and mobile-based technologies to support interactive dialogue and introduce substantial and pervasive changes to communication between organizations, communities, and individuals." (Wikipedia)
  13. 13. Why you HAVE to understand Social Media: why bother with Social Media?
        • People spend their time here: 91% of today's online adults use Social Media; one in every nine people on Earth is on Facebook, and users spend 700 billion minutes per month on Facebook; YouTube generates 92 billion page views per month.
        • Companies spend their time here too.
  18. 18. How you can study Social Media: where to start.
        • Gather information about the real networks.
        • Because large complex networks cannot be usefully visualized, their statistical properties become important: clustering coefficients, degree distributions, assortativity, community structure, and a number of other measures.
  19. 19. Outline: 1 Social Media, 2 Complex Networks, 3 Assortativity.
  20. 20. Example of networks: school friendship network [James Moody]. Figure: Friendship network of children in a US school. Friendships are determined by asking the participants, and hence are directed, since A may say that B is their friend but not vice versa. Vertices are color coded according to race, as marked, and the split from left to right in the figure is clearly primarily along lines of race.
  22. 22. Complex networks: In the context of network theory, a complex network is a graph (network) with non-trivial topological features, that is, features that do not occur in simple networks such as lattices or random graphs but often occur in real graphs. Complex networks are often many orders of magnitude larger than the networks of graph theory and display properties on the edge of chaos (neither completely erratic nor perfectly structured).
  23. 23. Internet
  28. 28. Graph theory:
        • A graph is an object composed of vertices and edges.
        • Degree $k$: the number of edges (possibly oriented) attached to a vertex.
        • Distance $d$: the number of edges on a shortest path between two vertices (defined only within a connected component).
        • In terms of the adjacency matrix entries $a_{ij}$, the degree of vertex $i$ is $k_i = \sum_{j=1}^{n} a_{ij}$. (1)
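Equation (1) can be sketched in a few lines of Python. The 4-vertex adjacency matrix below is a made-up illustration, not data from the slides, and the function name `degrees` is a hypothetical helper.

```python
# Hypothetical 4-vertex undirected graph, given as a symmetric adjacency matrix.
# a[i][j] = 1 if vertices i and j share an edge, 0 otherwise.
adjacency = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]


def degrees(a):
    """Return the degree k_i = sum_j a[i][j] of every vertex i."""
    return [sum(row) for row in a]


print(degrees(adjacency))  # [2, 2, 3, 1]
```

For a directed graph the row sums would give out-degrees and the column sums in-degrees; the sketch only covers the undirected case.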
  30. 30. Degree distribution: We define $p_k$ to be the fraction of vertices in the network that have degree $k$. One way to visualize how degrees behave in a network is to inspect the degree frequency distribution $P(k)$.
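As a minimal sketch of this definition, the empirical $p_k$ can be tabulated from a degree sequence. The degree sequence used here is an invented example and `degree_distribution` is a hypothetical helper name.

```python
from collections import Counter


def degree_distribution(degree_sequence):
    """Return {k: p_k}, the fraction of vertices that have degree k."""
    n = len(degree_sequence)
    counts = Counter(degree_sequence)
    return {k: counts[k] / n for k in sorted(counts)}


# Hypothetical degree sequence of a 4-vertex graph.
p = degree_distribution([2, 2, 3, 1])
print(p)  # {1: 0.25, 2: 0.5, 3: 0.25}
```

Plotting `p` on log-log axes is one common way to check by eye whether a degree distribution looks like a power law.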
  31. 31. Large scale-free networks: Most of these real-world networks are scale-free, i.e., their degree distribution has huge variability and closely follows a power law. The fraction of nodes with degree $k$ is roughly proportional to $k^{-y-1}$, $y > 0$. (2) Let's consider correlations between the degrees of two nodes connected by an edge.
  32. 32. Outline: 1 Social Media, 2 Complex Networks, 3 Assortativity.
  33. 33. Vertex correlation: Assortativity. One way to probe the conditional probability $P(k \mid k_1)$ that a vertex of degree $k$ is connected to another vertex of degree $k_1$ is to measure the average degree of the neighbours.
  42. 42. Vertex correlation: Assortativity.
        • Consider a network, an undirected graph of $N$ vertices and $M$ edges, with degree distribution $p_k$.
        • The degree of a vertex reached by following a randomly chosen edge of the graph is not distributed according to $p_k$: it is biased in favor of vertices of high degree, since more edges end at a high-degree vertex than at a low-degree one.
        • The degree distribution at the end of a randomly chosen edge is proportional to $k p_k$.
        • We are interested not in the total degree of such a vertex but in its remaining degree: the number of edges leaving the vertex other than the one we arrived along.
        • The correctly normalized distribution $q_k$ of the remaining degree is then $q_k = \frac{(k+1)\,p_{k+1}}{\sum_j j\,p_j}$. (3)
        • Define $e_{jk}$ to be the joint probability distribution of the remaining degrees of the two vertices at either end of a randomly chosen edge. On an undirected graph it is symmetric in its indices, $e_{jk} = e_{kj}$, and obeys the sum rules $\sum_{jk} e_{jk} = 1$, $\sum_j e_{jk} = q_k$. (4)
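Equation (3) can be checked numerically. The sketch below assumes a made-up degree distribution $p_k$ (not taken from the slides) and uses exact fractions so that the normalization $\sum_k q_k = 1$ is visible; `remaining_degree_distribution` is a hypothetical helper name.

```python
from fractions import Fraction


def remaining_degree_distribution(p):
    """Return {k: q_k} with q_k = (k + 1) * p_{k+1} / sum_j j * p_j."""
    mean_degree = sum(j * pj for j, pj in p.items())
    # A vertex of total degree d has remaining degree d - 1.
    return {d - 1: d * pd / mean_degree for d, pd in p.items() if d >= 1}


# Hypothetical degree distribution: p_1 = 1/4, p_2 = 1/2, p_3 = 1/4.
p = {1: Fraction(1, 4), 2: Fraction(1, 2), 3: Fraction(1, 4)}
q = remaining_degree_distribution(p)
print({k: str(v) for k, v in q.items()})  # {0: '1/8', 1: '1/2', 2: '3/8'}
```

Note how $q_2 = 3/8$ exceeds $p_3 = 1/4$: following an edge favors high-degree vertices, which is the bias described above.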
  49. 49. Vertex correlation: Assortativity.
        • The amount of assortative mixing can be quantified by the connected degree-degree correlation function $\langle jk \rangle - \langle j \rangle \langle k \rangle = \sum_{jk} jk\,(e_{jk} - q_j q_k)$, where $\langle \cdots \rangle$ indicates an average over edges.
        • This correlation function is zero when there is no assortative mixing, positive for assortative mixing, and negative for disassortative mixing.
        • To compare different networks, normalize by the maximal value of the correlation function, attained on a perfectly assortative network, i.e., one with $e_{jk} = q_k \delta_{jk}$. This maximum equals the variance $\sigma_q^2 = \sum_k k^2 q_k - \left[\sum_k k\,q_k\right]^2$ of the distribution $q_k$, and hence the normalized correlation function is $r = \frac{1}{\sigma_q^2} \sum_{jk} jk\,(e_{jk} - q_j q_k)$. (5)
        • This is the Pearson correlation coefficient of the degrees at either end of an edge, and it lies in the range $-1 \le r \le 1$.
        • For the practical purpose of evaluating $r$ on an observed network: $r = \frac{M^{-1}\sum_i j_i k_i - \left[M^{-1}\sum_i \tfrac{1}{2}(j_i + k_i)\right]^2}{M^{-1}\sum_i \tfrac{1}{2}(j_i^2 + k_i^2) - \left[M^{-1}\sum_i \tfrac{1}{2}(j_i + k_i)\right]^2}$, (6) where $j_i$, $k_i$ are the degrees of the vertices at the ends of the $i$th edge, with $i = 1, \ldots, M$.
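Equation (6) translates directly into code. The following is a minimal sketch for an undirected graph given as an edge list; the two test graphs (a star and a path) are invented examples, and `assortativity` is a hypothetical helper name rather than a library function.

```python
def assortativity(edges):
    """Evaluate r from equation (6) for an undirected edge list [(u, v), ...]."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    m = len(edges)
    # The three edge averages that appear in equation (6).
    mean_product = sum(degree[u] * degree[v] for u, v in edges) / m
    mean_half_sum = sum(0.5 * (degree[u] + degree[v]) for u, v in edges) / m
    mean_half_squares = sum(0.5 * (degree[u] ** 2 + degree[v] ** 2) for u, v in edges) / m

    denominator = mean_half_squares - mean_half_sum ** 2
    if denominator == 0:
        # Every edge end has the same degree (e.g. a regular graph): r is undefined.
        raise ValueError("degree variance over edge ends is zero")
    return (mean_product - mean_half_sum ** 2) / denominator


star = [(0, 1), (0, 2), (0, 3)]   # hub 0 joined to three leaves
path = [(0, 1), (1, 2), (2, 3)]   # path on four vertices

print(assortativity(star))            # -1.0
print(round(assortativity(path), 6))  # -0.5
```

The star is perfectly disassortative because every edge joins the single high-degree hub to a degree-1 leaf.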
  50. 50. Correlation between random variables: The correlation coefficient $\rho$ for two random variables $X$ and $Y$ with $\mathrm{Var}(X), \mathrm{Var}(Y) < \infty$ is defined by $\rho = \frac{E[XY] - E[X]\,E[Y]}{\sqrt{\mathrm{Var}(X)}\,\sqrt{\mathrm{Var}(Y)}}$. (7) By Cauchy-Schwarz, $\rho \in [-1, 1]$, and $\rho$ measures the linear dependence between the random variables.
  56. 56. Sample correlation coefficient:
        • We can approximate $\rho$ from a sample by computing the sample correlation coefficient $\rho_n = \frac{\frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X}_n)(Y_i - \bar{Y}_n)}{S_n(X)\,S_n(Y)}$, (8) where $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$ and $\bar{Y}_n = \frac{1}{n}\sum_{i=1}^{n} Y_i$ are the sample averages, while $S_n^2(X) = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X}_n)^2$ and $S_n^2(Y) = \frac{1}{n-1}\sum_{i=1}^{n} (Y_i - \bar{Y}_n)^2$ denote the sample variances.
        • Under the assumption that $\mathrm{Var}(X), \mathrm{Var}(Y) < \infty$, the estimator $\rho_n$ of $\rho$ is consistent, i.e., $\rho_n \to \rho$ in probability as $n \to \infty$.
        • So let's see what happens to $\rho_n$ when $\mathrm{Var}(X), \mathrm{Var}(Y) = \infty$, and show that the use of $\rho_n$ in random graphs is uninformative and leads to deceptive behavior in the context of a linear dependence.
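A minimal sketch of equation (8) in plain Python follows. The data are a made-up, exactly linear example, and `sample_correlation` is a hypothetical helper name.

```python
import math


def sample_correlation(xs, ys):
    """Sample correlation coefficient rho_n from equation (8)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample covariance and sample variances, all with the 1/(n - 1) factor.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    var_x = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    return cov / (math.sqrt(var_x) * math.sqrt(var_y))


xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]  # ys = 2 * xs + 1, an exactly linear relation

print(round(sample_correlation(xs, ys), 6))        # 1.0
print(round(sample_correlation(xs, ys[::-1]), 6))  # -1.0
```

The sample variances computed here are finite for any finite sample, whatever the true variance of the underlying distribution.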
  57. 57. Sample correlation coefficient
  59. 59. Convergence in probability:
        • Assumption: In practice, however, we tend not to know whether $\mathrm{Var}(X), \mathrm{Var}(Y) < \infty$, since the sample variances always satisfy $S_n^2(X) < \infty$ and $S_n^2(Y) < \infty$.
        • Assumption: The assortativity coefficient is not the appropriate way to describe degree-degree dependencies in scale-free graphs.
  62. 62. Rank correlations:
        • For two-dimensional data $((X_i, Y_i))_{i=1}^{n}$, let $r_i^X$ and $r_i^Y$ be the ranks of the observations $X_i$ and $Y_i$, respectively.
        • The statistical correlation coefficient of the ranks, known as Spearman's rho, is $\rho_n^{\mathrm{rank}} = \frac{\sum_{i=1}^{n} (r_i^X - (n+1)/2)(r_i^Y - (n+1)/2)}{\sqrt{\sum_{i=1}^{n} (r_i^X - (n+1)/2)^2 \, \sum_{i=1}^{n} (r_i^Y - (n+1)/2)^2}} = 1 - \frac{6 \sum_{i=1}^{n} l_i^2}{n^3 - n}$, (9) where $l_i = r_i^X - r_i^Y$.
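The closed form on the right of equation (9) can be sketched as follows. It assumes the data contain no ties (the closed form is only exact in that case); the sample values are invented and the helper names `ranks` and `spearman_rho` are hypothetical.

```python
def ranks(values):
    """Rank each observation from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def spearman_rho(xs, ys):
    """Spearman's rho via 1 - 6 * sum(l_i^2) / (n^3 - n), l_i = r_i^X - r_i^Y."""
    n = len(xs)
    rank_x, rank_y = ranks(xs), ranks(ys)
    sum_squared_diff = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_squared_diff / (n ** 3 - n)


# A monotone but strongly nonlinear relation still gives rho_rank = 1.
print(spearman_rho([1, 2, 3, 4], [1, 8, 27, 1000]))          # 1.0
# Ranks (1, 2, 3, 4) against (2, 1, 4, 3): sum of l_i^2 is 4.
print(round(spearman_rho([1, 10, 100, 1000], [2, 1, 4, 3]), 6))  # 0.6
```

Because only the ordering of the observations enters, a single extreme value cannot dominate the statistic the way it can dominate $\rho_n$.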
  64. 64. Linear dependencies:
        • $\rho$ in general measures linear dependence between two random variables.
        • Suppose the relation between $X$ and $Y$ is described by the following linear model: $X = \alpha_1 U_1 + \cdots + \alpha_m U_m$, $Y = \beta_1 U_1 + \cdots + \beta_m U_m$, (10) where $U_j$, $j = 1, \ldots, m$, are independent and identically distributed.
        • A random variable $U$ is regularly varying with index $\gamma > 0$ if $P(U > x) = L(x)\,x^{-\gamma}$, where $L(x)$ is a slowly varying function, that is, for $u > 0$, $L(ux)/L(x) \to 1$ as $x \to \infty$; for instance, $L(u)$ may be equal to a constant or to $\log(u)$.
  66. 66. Weak convergence of correlation coefficient:
        • Theorem (Weak convergence of correlation coefficient). Let $((X_i, Y_i))_{i=1}^{n}$ be i.i.d. copies of the random variables $(X, Y)$ of the linear model (10), where $(U_j)_{j=1}^{m}$ are i.i.d. regularly varying random variables with index $\gamma \in (0, 2)$, so that $\mathrm{Var}(U_j) = \infty$. Then $\rho_n \to \rho \equiv \frac{\sum_{j=1}^{m} \alpha_j \beta_j Z_j}{\sqrt{\sum_{j=1}^{m} \alpha_j^2 Z_j}\,\sqrt{\sum_{j=1}^{m} \beta_j^2 Z_j}}$, where $(Z_j)_{j=1}^{m}$ are i.i.d. random variables having stable distributions with parameter $\gamma/2 \in (0, 1)$, and $\to$ denotes convergence in distribution. In particular, $\rho$ has a density on $[-1, 1]$, which is strictly positive on $(-1, 1)$ if there exist $k, l$ such that $\alpha_k \beta_k < 0 < \alpha_l \beta_l$, while the density is positive on $(a, 1)$ when $\alpha_k \beta_k \ge 0$ for every $k$, where $a = \min_{S \subset \{1, 2, \ldots, m\},\, |S| \ge 2} \frac{\sum_{j=1}^{m} \alpha_j \beta_j 1_{\{j \in S\}}}{\sqrt{\sum_{j=1}^{m} \alpha_j^2 1_{\{j \in S\}}}\,\sqrt{\sum_{j=1}^{m} \beta_j^2 1_{\{j \in S\}}}} \in (0, 1)$.
        • The random variables $X$ and $Y$ have the same distribution when $(\beta_1, \ldots, \beta_m)$ is a permutation of $(\alpha_1, \ldots, \alpha_m)$.
  68. 68. Asymptotics of sums in stable domain: To prove the theorem we need the following lemma.
        • Lemma (Asymptotics of sums in stable domain). Let $(U_{i,j})_{i \in [n],\, j \in [2]}$ be i.i.d. regularly varying random variables with index $\gamma \in (0, 2)$. Then there exists a sequence $a_n$ with $a_n = n^{2/\gamma} \ell(n)$, where $n \mapsto \ell(n)$ is slowly varying, such that $\frac{1}{a_n} \sum_{i=1}^{n} U_{i,1}^2 \to Z_1$ in distribution and $\frac{1}{a_n} \sum_{i=1}^{n} U_{i,1} U_{i,2} \to 0$ in probability, where $Z_1$ is stable with parameter $\gamma/2$.
  71. 71. Applications to random graph models:
        • Consider the example with the $U_j$'s drawn from a Pareto distribution.
        • We generate $N$ data samples and compute $\rho_n$ and $\rho_n^{\mathrm{rank}}$ for each of the $N$ samples.
        • We obtain the vectors $(\rho_{n,j})_{j=1}^{N}$ and $(\rho_{n,j}^{\mathrm{rank}})_{j=1}^{N}$ of $N$ independent realisations of $\rho_n$ and $\rho_n^{\mathrm{rank}}$, respectively.
        • We then compute $E_N(\rho_n) = \frac{1}{N} \sum_{j=1}^{N} \rho_{n,j}$, $E_N(\rho_n^{\mathrm{rank}}) = \frac{1}{N} \sum_{j=1}^{N} \rho_{n,j}^{\mathrm{rank}}$; (11) $\sigma_N(\rho_n) = \sqrt{\frac{1}{N-1} \sum_{j=1}^{N} (\rho_{n,j} - E_N(\rho_n))^2}$, $\sigma_N(\rho_n^{\mathrm{rank}}) = \sqrt{\frac{1}{N-1} \sum_{j=1}^{N} (\rho_{n,j}^{\mathrm{rank}} - E_N(\rho_n^{\mathrm{rank}}))^2}$. (12)
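The experiment described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the authors' code: the Pareto shape `gamma=1.1`, the sample sizes, the seed, and the names `pearson`, `ranks` and `simulate` are all choices made for this example. The slides do not state the Pareto parameter that was used.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Correlation coefficient; the 1/(n - 1) factors of equation (8) cancel."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Rank observations from 1 to n (ties are improbable for continuous data)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def simulate(alpha, beta, gamma, n, num_samples, seed=0):
    """Estimate mean and standard deviation of rho_n and rho_n^rank, as in (11)-(12)."""
    rng = random.Random(seed)
    rho, rho_rank = [], []
    for _ in range(num_samples):
        xs, ys = [], []
        for _ in range(n):
            # U_1, ..., U_m i.i.d. Pareto with P(U > x) = x^(-gamma) for x >= 1.
            u = [rng.paretovariate(gamma) for _ in alpha]
            xs.append(sum(a * uj for a, uj in zip(alpha, u)))
            ys.append(sum(b * uj for b, uj in zip(beta, u)))
        rho.append(pearson(xs, ys))
        rho_rank.append(pearson(ranks(xs), ranks(ys)))  # Spearman's rho
    return {
        "E_N(rho_n)": statistics.mean(rho),
        "sigma_N(rho_n)": statistics.stdev(rho),          # uses the 1/(N - 1) factor
        "E_N(rho_rank)": statistics.mean(rho_rank),
        "sigma_N(rho_rank)": statistics.stdev(rho_rank),
    }


# Assumed parameters: gamma = 1.1 gives infinite variance; n and N are kept small.
summary = simulate(alpha=(0.5, 0.5, 0.0), beta=(0.0, 0.5, 0.5),
                   gamma=1.1, n=200, num_samples=20)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Rerunning with larger `n` is a simple way to see whether the spread of each estimator shrinks as the sample grows.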
  72. 72. Results. Table: Estimated mean and standard deviation of $\rho_n$ and $\rho_n^{\mathrm{rank}}$ in $N$ samples with linear dependence (10). Number of samples N: 10^3, 10^2.

        Model parameters         Statistic          n = 10^2   n = 10^3   n = 10^4   n = 10^5
        α = (1/2, 1/2, 0)        E_N(ρ_n)             0.4395     0.4365     0.4458     0.4067
        β = (0, 1/2, 1/2)        σ_N(ρ_n)             0.3399     0.3143     0.3175     0.3106
                                 E_N(ρ_n^rank)        0.4508     0.4485     0.4504     0.4519
                                 σ_N(ρ_n^rank)        0.0922     0.0293     0.0091     0.0033
        α = (1/2, 1/3, 1/6)      E_N(ρ_n)             0.8251     0.7986     0.8289     0.8070
        β = (1/6, 1/3, 1/2)      σ_N(ρ_n)             0.1151     0.1125     0.1108     0.1130
                                 E_N(ρ_n^rank)        0.8800     0.8850     0.8858     0.8856
                                 σ_N(ρ_n^rank)        0.0248     0.0073     0.0023     0.0007
        α = (1/2, −1/3, 1/6)     E_N(ρ_n)            -0.3052    -0.3386    -0.3670    -0.3203
        β = (1/6, 1/2, −1/3)     σ_N(ρ_n)             0.6087     0.5841     0.5592     0.5785
                                 E_N(ρ_n^rank)       -0.3448    -0.3513    -0.3503    -0.3517
                                 σ_N(ρ_n^rank)        0.1202     0.0393     0.0120     0.0034
  75. 75. Applications to random graph models: Let $G = (V, E)$ be a graph with vertex set $V$ and edge set $E$. Then $\rho(G) = \frac{\frac{1}{|E|}\sum_{ij \in E} D_i D_j - \left(\frac{1}{|E|}\sum_{ij \in E} \tfrac{1}{2}(D_i + D_j)\right)^2}{\frac{1}{|E|}\sum_{ij \in E} \tfrac{1}{2}(D_i^2 + D_j^2) - \left(\frac{1}{|E|}\sum_{ij \in E} \tfrac{1}{2}(D_i + D_j)\right)^2}$, where the sum is over directed edges of $G$, i.e., $ij$ and $ji$ are two distinct edges, and $D_i$ is the degree of vertex $i$. The assortativity coefficient is equal to the correlation coefficient of the sequence of random variables $((D_i, D_j))_{ij \in E}$.
  76. 76. Results:
        • The random variables $X$ and $Y$ are the degrees at the two ends of a random undirected edge: for an edge whose ends have degrees $a$ and $b$, assign $[X = a, Y = b]$ or $[X = b, Y = a]$, each with probability 1/2.
        • Furthermore, many values of $X$ and $Y$ will be the same, because a degree $d$ appears at the end of an edge $d$ times. These ties are resolved by adding independent random variables, uniformly distributed on $[0, 1]$, to each value of $X$ and $Y$.
        Table: Estimated mean and standard deviation of $\rho(G)$ and $\rho^{\mathrm{rank}}(G)$ in random graphs. Number of samples N: 10^3, 10^2, 10.

        Model                                  Statistic              n = 10^2   n = 10^3   n = 10^4   n = 10^5
        Configuration model                    E_N(ρ(G_n))              0.0021    -0.0013     0.0001    -0.0003
        with self-loops and double edges       σ_N(ρ(G_n))              0.0672     0.0212     0.0068     0.0024
                                               E_N(ρ^rank(G_n))         0.0012    -0.0010    -0.0002    -0.0002
                                               σ_N(ρ^rank(G_n))         0.0656     0.0202     0.0066     0.0014
        Configuration model                    E_N(ρ(G_n))             -0.0785    -0.0346    -0.0115    -0.0046
        without self-loops and double edges    σ_N(ρ(G_n))              0.0686     0.0274     0.0102     0.0039
                                               E_N(ρ^rank(G_n))        -0.0615    -0.0151    -0.0040    -0.0002
                                               σ_N(ρ^rank(G_n))         0.0836     0.0337     0.0075     0.0024
        Configuration model                    E_N(ρ(Ḡ_n))             -0.2589    -0.1243    -0.0587    -0.0303
        with intermediate vertices             σ_N(ρ(Ḡ_n))              0.0872     0.0509     0.0255     0.0189
                                               E_N(ρ^rank(Ḡ_n))        -0.7482    -0.7499    -0.7498    -0.7501
                                               σ_N(ρ^rank(Ḡ_n))         0.0121     0.0036     0.0011     0.0006
        Preferential attachment                E_N(ρ(G_n))             -0.2597    -0.1302    -0.0607    -0.0294
                                               σ_N(ρ(G_n))              0.0550     0.0261     0.0127     0.0088
                                               E_N(ρ^rank(G_n))        -0.4167    -0.4151    -0.4166    -0.4158
                                               σ_N(ρ^rank(G_n))         0.0695     0.0202     0.0066     0.0022
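The tie-breaking procedure described on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the small edge list, the seed, and the function names are assumptions made for the example, and the random graph models from the table are not reproduced here.

```python
import math
import random


def pearson(xs, ys):
    """Correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Rank observations from 1 (smallest) to n (largest)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def graph_rank_correlation(edges, seed=0):
    """Spearman's rho of the degrees at the two ends of each undirected edge."""
    rng = random.Random(seed)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    xs, ys = [], []
    for u, v in edges:
        # Assign [X = a, Y = b] or [X = b, Y = a] with probability 1/2 each.
        if rng.random() < 0.5:
            u, v = v, u
        # Break ties between equal degrees with independent uniform noise.
        xs.append(degree[u] + rng.random())
        ys.append(degree[v] + rng.random())
    return pearson(ranks(xs), ranks(ys))


# Hypothetical small tree: hub 0 with leaves 1, 2, 3, plus a tail 0-4-5-6.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (4, 5), (5, 6)]
print(graph_rank_correlation(edges))
```

Since degrees are integers and the noise is below 1, the added noise only reorders observations that had equal degrees, so the ranking of distinct degrees is preserved.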
  77. 77. Conclusion:
        • Familiarized ourselves with social networks and scale-free graph networks.
        • Investigated dependency measures for power-law random variables.
        • Degree-degree correlations in random graphs with heavy-tailed degrees.
        • The correlation coefficient is inappropriate for describing dependencies between heavy-tailed random variables.
        • Further research is needed to study rank correlations on graphs.
  78. 78. References I:
        • Y. Volkovich, N. Litvak, and B. Zwart. Extremal Dependencies and Rank Correlations in Power Law Networks. In: J. Zhou, O. Akan, P. Bellavista et al. (Eds.), Complex Sciences, Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, 5:1642-1653, Springer Berlin Heidelberg, 2009.
        • M.E.J. Newman. The structure and function of complex networks. SIAM Rev., 45(2):167–256, 2003.
