Socialnetworkanalysis (Tin180 Com)
Upcoming SlideShare
Loading in...5

Like this? Share it with your network


Socialnetworkanalysis (Tin180 Com)


  • 1,904 views - Trang tin tức văn hóa lành mạnh - Trang tin tức văn hóa lành mạnh



Total Views
Views on SlideShare
Embed Views



1 Embed 1 1


Upload Details

Uploaded via as Microsoft PowerPoint

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
Post Comment
Edit your comment

Socialnetworkanalysis (Tin180 Com) Presentation Transcript

  • 1. Social Network Analysis
    • Social Network Introduction
    • Statistics and Probability Theory
    • Models of Social Network Generation
    • Networks in Biological System
  • 2. Society Nodes : individuals Links : social relationship (family/work/friendship/etc.) S. Milgram (1967) Social networks: Many individuals with diverse social interactions between them. John Guare Six Degrees of Separation
  • 3. Communication networks The Earth is developing an electronic nervous system, a network with diverse nodes and links are -computers -routers -satellites -phone lines -TV cables -EM waves Communication networks: Many non-identical components with diverse connections between them.
  • 4. New York Times Complex systems Made of many non-identical elements connected by diverse interactions . NETWORK
  • 5. “ Natural” Networks and Universality
    • Consider many kinds of networks:
      • social, technological, business, economic, content,…
    • These networks tend to share certain informal properties:
      • large scale; continual growth
      • distributed, organic growth: vertices “decide” who to link to
      • interaction restricted to links
      • mixture of local and long-distance connections
      • abstract notions of distance: geographical, content, social,…
    • Do natural networks share more quantitative universals?
    • What would these “universals” be?
    • How can we make them precise and measure them?
    • How can we explain their universality?
    • This is the domain of social network theory
    • Sometimes also referred to as link analysis
  • 6. Some Interesting Quantities
    • Connected components :
      • how many, and how large?
    • Network diameter :
      • maximum (worst-case) or average?
      • exclude infinite distances? (disconnected components)
      • the small-world phenomenon
    • Clustering :
      • to what extent that links tend to cluster “locally”?
      • what is the balance between local and long-distance connections?
      • what roles do the two types of links play?
    • Degree distribution :
      • what is the typical degree in the network?
      • what is the overall distribution?
  • 7. A “Canonical” Natural Network has…
    • Few connected components:
      • often only 1 or a small number, indep. of network size
    • Small diameter:
      • often a constant independent of network size (like 6)
      • or perhaps growing only logarithmically with network size or even shrink?
      • typically exclude infinite distances
    • A high degree of clustering:
      • considerably more so than for a random network
      • in tension with small diameter
    • A heavy-tailed degree distribution:
      • a small but reliable number of high-degree vertices
      • often of power law form
  • 8. Probabilistic Models of Networks
    • All of the network generation models we will study are probabilistic or statistical in nature
    • They can generate networks of any size
    • They often have various parameters that can be set:
      • size of network generated
      • average degree of a vertex
      • fraction of long-distance connections
    • The models generate a distribution over networks
    • Statements are always statistical in nature:
      • with high probability , diameter is small
      • on average , degree distribution has heavy tail
    • Thus, we’re going to need some basic statistics and probability theory
  • 9. Zipf’s Law
    • Look at the frequency of English words:
      • “ the” is the most common, followed by “of”, “to”, etc.
      • claim: frequency of the n-th most common ~ 1/n (power law, α = 1)
    • General theme:
      • rank events by their frequency of occurrence
      • resulting distribution often is a power law!
    • Other examples:
      • North America city sizes
      • personal income
      • file sizes
      • genus sizes (number of species)
    • People seem to dither over exact form of these distributions (e.g. value of α ), but not heavy tails
  • 10. Zipf’s Law Linear scales on both axes Logarithmic scales on both axes The same data plotted on linear and logarithmic scales. Both plots show a Zipf distribution with 300 datapoints
  • 11. Social Network Analysis
    • Social Network Introduction
    • Statistics and Probability Theory
    • Models of Social Network Generation
    • Networks in Biological System
    • Summary
  • 12. Some Models of Network Generation
    • Random graphs (Erdös-Rényi models):
      • gives few components and small diameter
      • does not give high clustering and heavy-tailed degree distributions
      • is the mathematically most well-studied and understood model
    • Watts-Strogatz models:
      • give few components, small diameter and high clustering
      • does not give heavy-tailed degree distributions
    • Scale-free Networks:
      • gives few components, small diameter and heavy-tailed distribution
      • does not give high clustering
    • Hierarchical networks:
      • few components, small diameter, high clustering, heavy-tailed
    • Affiliation networks:
      • models group-actor formation
  • 13. The Clustering Coefficient of a Network
    • Let nbr(u) denote the set of neighbors of u in a graph
      • all vertices v such that the edge (u,v) is in the graph
    • The clustering coefficient of u:
      • let k = |nbr(u)| (i.e., number of neighbors of u)
      • choose(k,2): max possible # of edges between vertices in nbr(u)
      • c(u) = ( actual # of edges between vertices in nbr(u))/choose(k,2)
      • 0 <= c(u) <= 1; measure of cliquishness of u’s neighborhood
    • Clustering coefficient of a graph:
      • average of c(u) over all vertices u
    k = 4 choose(k,2) = 6 c(u) = 4/6 = 0.666…
  • 14. Clustering : My friends will likely know each other! Probability to be connected C » p C = # of links between 1,2,…n neighbors n(n-1)/2 Networks are clustered [large C(p)] but have a small characteristic path length [small L(p)]. The Clustering Coefficient of a Network
  • 15. Erdos-Renyi: Clustering Coefficient
    • Generate a network G according to G(N,p)
    • Examine a “typical” vertex u in G
      • choose u at random among all vertices in G
      • what do we expect c(u) to be?
    • Answer: exactly p!
    • In G(N,m), expect c(u) to be 2m/N(N-1)
    • Both cases: c(u) entirely determined by overall density
    • Baseline for comparison with “more clustered” models
      • Erdos-Renyi has no bias towards clustered or local edges
  • 16. Scale-free Networks
    • The number of nodes (N) is not fixed
      • Networks continuously expand by additional new nodes
        • WWW: addition of new nodes
        • Citation: publication of new papers
    • The attachment is not uniform
      • A node is linked with higher probability to a node that already has a large number of links
        • WWW: new documents link to well known sites (CNN, Yahoo, Google)
        • Citation: Well cited papers are more likely to be cited again
  • 17. Scale-Free Networks
    • Start with (say) two vertices connected by an edge
    • For i = 3 to N:
      • for each 1 <= j < i, d(j) = degree of vertex j so far
      • let Z = S d(j) (sum of all degrees so far)
      • add new vertex i with k edges back to {1, …, i-1}:
        • i is connected back to j with probability d(j)/Z
    • Vertices j with high degree are likely to get more links!
    • “ Rich get richer”
    • Natural model for many processes:
      • hyperlinks on the web
      • new business and social contacts
      • transportation networks
    • Generates a power law distribution of degrees
      • exponent depends on value of k
  • 18.
    • Preferential attachment explains
      • heavy-tailed degree distributions
      • small diameter (~log(N), via “hubs”)
    • Will not generate high clustering coefficient
      • no bias towards local connectivity, but towards hubs
    Scale-Free Networks
  • 19. Social Network Analysis
    • Social Network Introduction
    • Statistics and Probability Theory
    • Models of Social Network Generation
    • Networks in Biological System
    • Mining on Social Network
    • Summary
  • 20. Bio-Map protein-gene interactions protein-protein interactions PROTEOME GENOME Citrate Cycle METABOLISM Bio-chemical reactions
  • 21. Citrate Cycle METABOLISM Bio-chemical reactions Metabolic Network
  • 22. Boehring-Mennheim
  • 23. Metab-movie Nodes : chemicals (substrates) Links : bio-chemical reactions Metabolic Network
  • 24. Meta-P(k) Organisms from all three domains of life are scale-free networks! H. Jeong, B. Tombor, R. Albert, Z.N. Oltvai, and A.L. Barabasi, Nature, 407 651 (2000) Archaea Bacteria Eukaryotes Metabolic Network
  • 25. Bio-Map protein-gene interactions protein-protein interactions PROTEOME GENOME Citrate Cycle METABOLISM Bio-chemical reactions
  • 26. Protein Network protein-protein interactions PROTEOME
  • 27. Prot Interaction map Nodes : proteins Links : physical interactions (binding) P. Uetz, et al. Nature 403 , 623-7 (2000). Yeast Protein Network
  • 28. Prot P(k) H. Jeong, S.P. Mason, A.-L. Barabasi, Z.N. Oltvai, Nature 411, 41-42 (2001) Topology of the Protein Network
  • 29. P53 … “ One way to understand the p53 network is to compare it to the Internet. The cell, like the Internet, appears to be a ‘ scale-free network ’.” p53 Network Nature 408 307 (2000)
  • 30. P53 P(k) p53 Network (mammals)