Social Network Analysis <ul><li>Social Network Introduction </li></ul><ul><li>Statistics and Probability Theory </li></ul>...
Society Nodes : individuals  Links :  social relationship (family/work/friendship/etc.) S. Milgram  (1967)  Social network...
Communication networks The Earth is developing an electronic nervous system, a network with diverse  nodes  and  links  ar...
New York Times Complex systems Made of  many non-identical  elements  connected by diverse  interactions . NETWORK
“ Natural” Networks and Universality <ul><li>Consider many kinds of networks: </li></ul><ul><ul><li>social, technological,...
Some Interesting Quantities   <ul><li>Connected components : </li></ul><ul><ul><li>how many, and how large? </li></ul></ul...
A “Canonical” Natural Network has… <ul><li>Few   connected components: </li></ul><ul><ul><li>often only 1 or a small numbe...
Probabilistic Models of Networks <ul><li>All of the network generation models we will study are  probabilistic  or  statis...
Zipf’s Law <ul><li>Look at the frequency of English words: </li></ul><ul><ul><li>“ the” is the most common, followed by “o...
Zipf’s Law Linear scales on both axes  Logarithmic scales on both axes  The same data plotted on linear and logarithmic sc...
Social Network Analysis <ul><li>Social Network Introduction </li></ul><ul><li>Statistics and Probability Theory </li></ul>...
Some Models of Network Generation <ul><li>Random graphs (Erdös-Rényi   models): </li></ul><ul><ul><li>gives few components...
The Clustering Coefficient of a Network <ul><li>Let nbr(u) denote the set of  neighbors  of u in a graph </li></ul><ul><ul...
Clustering : My friends will likely know each other! Probability to be connected C   »   p C  = # of links between 1,2,…n ...
Erdos-Renyi: Clustering Coefficient <ul><li>Generate a network G according to G(N,p) </li></ul><ul><li>Examine a “typical”...
Scale-free Networks <ul><li>The number of nodes (N) is not fixed </li></ul><ul><ul><li>Networks continuously expand by add...
Scale-Free Networks <ul><li>Start with (say) two vertices connected by an edge </li></ul><ul><li>For i = 3 to N: </li></ul...
<ul><li>Preferential attachment explains </li></ul><ul><ul><li>heavy-tailed degree distributions </li></ul></ul><ul><ul><l...
Social Network Analysis <ul><li>Social Network Introduction </li></ul><ul><li>Statistics and Probability Theory </li></ul>...
Bio-Map protein-gene interactions protein-protein interactions PROTEOME GENOME Citrate Cycle METABOLISM Bio-chemical react...
Citrate Cycle METABOLISM Bio-chemical reactions Metabolic Network
Boehring-Mennheim
Metab-movie Nodes :  chemicals (substrates) Links :  bio-chemical reactions  Metabolic Network
Meta-P(k) Organisms from all three domains of life are  scale-free  networks! H. Jeong, B. Tombor, R. Albert, Z.N. Oltvai,...
Bio-Map protein-gene interactions protein-protein interactions PROTEOME GENOME Citrate Cycle METABOLISM Bio-chemical react...
Protein Network protein-protein interactions PROTEOME
Prot Interaction map Nodes :  proteins  Links :  physical interactions (binding)  P. Uetz,  et al.   Nature   403 , 623-7 ...
Prot P(k) H. Jeong, S.P. Mason, A.-L. Barabasi, Z.N. Oltvai, Nature 411, 41-42 (2001) Topology of the Protein Network
P53 … “ One way to understand the p53 network  is to compare it to the Internet.  The cell, like the Internet, appears to ...
P53 P(k) p53 Network (mammals)
Upcoming SlideShare
Loading in...5
×

Socialnetworkanalysis (Tin180 Com)

1,395

Published on

http://tin180.com - Trang tin tức văn hóa lành mạnh

Published in: Business, Technology, Education
0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
1,395
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
81
Comments
0
Likes
1
Embeds 0
No embeds

No notes for slide
  • MIGHT GIVE REAL EXAMPLES HERE? FROM WATTS?
  • Socialnetworkanalysis (Tin180 Com)

    1. 1. Social Network Analysis <ul><li>Social Network Introduction </li></ul><ul><li>Statistics and Probability Theory </li></ul><ul><li>Models of Social Network Generation </li></ul><ul><li>Networks in Biological System </li></ul>
    2. 2. Society Nodes : individuals Links : social relationship (family/work/friendship/etc.) S. Milgram (1967) Social networks: Many individuals with diverse social interactions between them. John Guare Six Degrees of Separation
    3. 3. Communication networks The Earth is developing an electronic nervous system, a network with diverse nodes and links are -computers -routers -satellites -phone lines -TV cables -EM waves Communication networks: Many non-identical components with diverse connections between them.
    4. 4. New York Times Complex systems Made of many non-identical elements connected by diverse interactions . NETWORK
    5. 5. “ Natural” Networks and Universality <ul><li>Consider many kinds of networks: </li></ul><ul><ul><li>social, technological, business, economic, content,… </li></ul></ul><ul><li>These networks tend to share certain informal properties: </li></ul><ul><ul><li>large scale; continual growth </li></ul></ul><ul><ul><li>distributed, organic growth: vertices “decide” who to link to </li></ul></ul><ul><ul><li>interaction restricted to links </li></ul></ul><ul><ul><li>mixture of local and long-distance connections </li></ul></ul><ul><ul><li>abstract notions of distance: geographical, content, social,… </li></ul></ul><ul><li>Do natural networks share more quantitative universals? </li></ul><ul><li>What would these “universals” be? </li></ul><ul><li>How can we make them precise and measure them? </li></ul><ul><li>How can we explain their universality? </li></ul><ul><li>This is the domain of social network theory </li></ul><ul><li>Sometimes also referred to as link analysis </li></ul>
    6. 6. Some Interesting Quantities <ul><li>Connected components : </li></ul><ul><ul><li>how many, and how large? </li></ul></ul><ul><li>Network diameter : </li></ul><ul><ul><li>maximum (worst-case) or average? </li></ul></ul><ul><ul><li>exclude infinite distances? (disconnected components) </li></ul></ul><ul><ul><li>the small-world phenomenon </li></ul></ul><ul><li>Clustering : </li></ul><ul><ul><li>to what extent that links tend to cluster “locally”? </li></ul></ul><ul><ul><li>what is the balance between local and long-distance connections? </li></ul></ul><ul><ul><li>what roles do the two types of links play? </li></ul></ul><ul><li>Degree distribution : </li></ul><ul><ul><li>what is the typical degree in the network? </li></ul></ul><ul><ul><li>what is the overall distribution? </li></ul></ul>
    7. 7. A “Canonical” Natural Network has… <ul><li>Few connected components: </li></ul><ul><ul><li>often only 1 or a small number, indep. of network size </li></ul></ul><ul><li>Small diameter: </li></ul><ul><ul><li>often a constant independent of network size (like 6) </li></ul></ul><ul><ul><li>or perhaps growing only logarithmically with network size or even shrink? </li></ul></ul><ul><ul><li>typically exclude infinite distances </li></ul></ul><ul><li>A high degree of clustering: </li></ul><ul><ul><li>considerably more so than for a random network </li></ul></ul><ul><ul><li>in tension with small diameter </li></ul></ul><ul><li>A heavy-tailed degree distribution: </li></ul><ul><ul><li>a small but reliable number of high-degree vertices </li></ul></ul><ul><ul><li>often of power law form </li></ul></ul>
    8. 8. Probabilistic Models of Networks <ul><li>All of the network generation models we will study are probabilistic or statistical in nature </li></ul><ul><li>They can generate networks of any size </li></ul><ul><li>They often have various parameters that can be set: </li></ul><ul><ul><li>size of network generated </li></ul></ul><ul><ul><li>average degree of a vertex </li></ul></ul><ul><ul><li>fraction of long-distance connections </li></ul></ul><ul><li>The models generate a distribution over networks </li></ul><ul><li>Statements are always statistical in nature: </li></ul><ul><ul><li>with high probability , diameter is small </li></ul></ul><ul><ul><li>on average , degree distribution has heavy tail </li></ul></ul><ul><li>Thus, we’re going to need some basic statistics and probability theory </li></ul>
    9. 9. Zipf’s Law <ul><li>Look at the frequency of English words: </li></ul><ul><ul><li>“ the” is the most common, followed by “of”, “to”, etc. </li></ul></ul><ul><ul><li>claim: frequency of the n-th most common ~ 1/n (power law, α = 1) </li></ul></ul><ul><li>General theme: </li></ul><ul><ul><li>rank events by their frequency of occurrence </li></ul></ul><ul><ul><li>resulting distribution often is a power law! </li></ul></ul><ul><li>Other examples: </li></ul><ul><ul><li>North America city sizes </li></ul></ul><ul><ul><li>personal income </li></ul></ul><ul><ul><li>file sizes </li></ul></ul><ul><ul><li>genus sizes (number of species) </li></ul></ul><ul><li>People seem to dither over exact form of these distributions (e.g. value of α ), but not heavy tails </li></ul>
    10. 10. Zipf’s Law Linear scales on both axes Logarithmic scales on both axes The same data plotted on linear and logarithmic scales. Both plots show a Zipf distribution with 300 datapoints
    11. 11. Social Network Analysis <ul><li>Social Network Introduction </li></ul><ul><li>Statistics and Probability Theory </li></ul><ul><li>Models of Social Network Generation </li></ul><ul><li>Networks in Biological System </li></ul><ul><li>Summary </li></ul>
    12. 12. Some Models of Network Generation <ul><li>Random graphs (Erdös-Rényi models): </li></ul><ul><ul><li>gives few components and small diameter </li></ul></ul><ul><ul><li>does not give high clustering and heavy-tailed degree distributions </li></ul></ul><ul><ul><li>is the mathematically most well-studied and understood model </li></ul></ul><ul><li>Watts-Strogatz models: </li></ul><ul><ul><li>give few components, small diameter and high clustering </li></ul></ul><ul><ul><li>does not give heavy-tailed degree distributions </li></ul></ul><ul><li>Scale-free Networks: </li></ul><ul><ul><li>gives few components, small diameter and heavy-tailed distribution </li></ul></ul><ul><ul><li>does not give high clustering </li></ul></ul><ul><li>Hierarchical networks: </li></ul><ul><ul><li>few components, small diameter, high clustering, heavy-tailed </li></ul></ul><ul><li>Affiliation networks: </li></ul><ul><ul><li>models group-actor formation </li></ul></ul>
    13. 13. The Clustering Coefficient of a Network <ul><li>Let nbr(u) denote the set of neighbors of u in a graph </li></ul><ul><ul><li>all vertices v such that the edge (u,v) is in the graph </li></ul></ul><ul><li>The clustering coefficient of u: </li></ul><ul><ul><li>let k = |nbr(u)| (i.e., number of neighbors of u) </li></ul></ul><ul><ul><li>choose(k,2): max possible # of edges between vertices in nbr(u) </li></ul></ul><ul><ul><li>c(u) = ( actual # of edges between vertices in nbr(u))/choose(k,2) </li></ul></ul><ul><ul><li>0 <= c(u) <= 1; measure of cliquishness of u’s neighborhood </li></ul></ul><ul><li>Clustering coefficient of a graph: </li></ul><ul><ul><li>average of c(u) over all vertices u </li></ul></ul>k = 4 choose(k,2) = 6 c(u) = 4/6 = 0.666…
    14. 14. Clustering : My friends will likely know each other! Probability to be connected C » p C = # of links between 1,2,…n neighbors n(n-1)/2 Networks are clustered [large C(p)] but have a small characteristic path length [small L(p)]. The Clustering Coefficient of a Network
    15. 15. Erdos-Renyi: Clustering Coefficient <ul><li>Generate a network G according to G(N,p) </li></ul><ul><li>Examine a “typical” vertex u in G </li></ul><ul><ul><li>choose u at random among all vertices in G </li></ul></ul><ul><ul><li>what do we expect c(u) to be? </li></ul></ul><ul><li>Answer: exactly p! </li></ul><ul><li>In G(N,m), expect c(u) to be 2m/N(N-1) </li></ul><ul><li>Both cases: c(u) entirely determined by overall density </li></ul><ul><li>Baseline for comparison with “more clustered” models </li></ul><ul><ul><li>Erdos-Renyi has no bias towards clustered or local edges </li></ul></ul>
    16. 16. Scale-free Networks <ul><li>The number of nodes (N) is not fixed </li></ul><ul><ul><li>Networks continuously expand by additional new nodes </li></ul></ul><ul><ul><ul><li>WWW: addition of new nodes </li></ul></ul></ul><ul><ul><ul><li>Citation: publication of new papers </li></ul></ul></ul><ul><li>The attachment is not uniform </li></ul><ul><ul><li>A node is linked with higher probability to a node that already has a large number of links </li></ul></ul><ul><ul><ul><li>WWW: new documents link to well known sites (CNN, Yahoo, Google) </li></ul></ul></ul><ul><ul><ul><li>Citation: Well cited papers are more likely to be cited again </li></ul></ul></ul>
    17. 17. Scale-Free Networks <ul><li>Start with (say) two vertices connected by an edge </li></ul><ul><li>For i = 3 to N: </li></ul><ul><ul><li>for each 1 <= j < i, d(j) = degree of vertex j so far </li></ul></ul><ul><ul><li>let Z = S d(j) (sum of all degrees so far) </li></ul></ul><ul><ul><li>add new vertex i with k edges back to {1, …, i-1}: </li></ul></ul><ul><ul><ul><li>i is connected back to j with probability d(j)/Z </li></ul></ul></ul><ul><li>Vertices j with high degree are likely to get more links! </li></ul><ul><li>“ Rich get richer” </li></ul><ul><li>Natural model for many processes: </li></ul><ul><ul><li>hyperlinks on the web </li></ul></ul><ul><ul><li>new business and social contacts </li></ul></ul><ul><ul><li>transportation networks </li></ul></ul><ul><li>Generates a power law distribution of degrees </li></ul><ul><ul><li>exponent depends on value of k </li></ul></ul>
    18. 18. <ul><li>Preferential attachment explains </li></ul><ul><ul><li>heavy-tailed degree distributions </li></ul></ul><ul><ul><li>small diameter (~log(N), via “hubs”) </li></ul></ul><ul><li>Will not generate high clustering coefficient </li></ul><ul><ul><li>no bias towards local connectivity, but towards hubs </li></ul></ul>Scale-Free Networks
    19. 19. Social Network Analysis <ul><li>Social Network Introduction </li></ul><ul><li>Statistics and Probability Theory </li></ul><ul><li>Models of Social Network Generation </li></ul><ul><li>Networks in Biological System </li></ul><ul><li>Mining on Social Network </li></ul><ul><li>Summary </li></ul>
    20. 20. Bio-Map protein-gene interactions protein-protein interactions PROTEOME GENOME Citrate Cycle METABOLISM Bio-chemical reactions
    21. 21. Citrate Cycle METABOLISM Bio-chemical reactions Metabolic Network
    22. 22. Boehring-Mennheim
    23. 23. Metab-movie Nodes : chemicals (substrates) Links : bio-chemical reactions Metabolic Network
    24. 24. Meta-P(k) Organisms from all three domains of life are scale-free networks! H. Jeong, B. Tombor, R. Albert, Z.N. Oltvai, and A.L. Barabasi, Nature, 407 651 (2000) Archaea Bacteria Eukaryotes Metabolic Network
    25. 25. Bio-Map protein-gene interactions protein-protein interactions PROTEOME GENOME Citrate Cycle METABOLISM Bio-chemical reactions
    26. 26. Protein Network protein-protein interactions PROTEOME
    27. 27. Prot Interaction map Nodes : proteins Links : physical interactions (binding) P. Uetz, et al. Nature 403 , 623-7 (2000). Yeast Protein Network
    28. 28. Prot P(k) H. Jeong, S.P. Mason, A.-L. Barabasi, Z.N. Oltvai, Nature 411, 41-42 (2001) Topology of the Protein Network
    29. 29. P53 … “ One way to understand the p53 network is to compare it to the Internet. The cell, like the Internet, appears to be a ‘ scale-free network ’.” p53 Network Nature 408 307 (2000)
    30. 30. P53 P(k) p53 Network (mammals)
    1. A particular slide catching your eye?

      Clipping is a handy way to collect important slides you want to go back to later.

    ×