Computational Social Science, Lecture 06: Networks, Part II

Published in: Education
Transcript

• 1. Networks, Part II. Sharad Goel, Columbia University. Computational Social Science: Lecture 6, March 1, 2013
• 2. Corporate E-mail Communication [Adamic & Adar, 2004], via Easley & Kleinberg
• 3. Networks/Graphs. Nodes/vertices: people, organizations, webpages, computers. Edges: represent connections between pairs of nodes
• 4. Distance: length of the shortest path between two nodes
• 5. Distance: length of the shortest path between two nodes
• 6. Breadth-first Search: iteratively explore nodes one layer at a time
• 7.
    # initialize distances; None marks a node that has not been reached yet
    dist = {}
    for u in G:
        dist[u] = None
    dist[u0] = 0
    d = 0
    periphery = {u0}
    while len(periphery) > 0:
        # find unvisited nodes one step away from the periphery
        next_level = set()
        for u in periphery:
            next_level |= {w for w in neighbors[u] if dist[w] is None}
        # update distances
        d += 1
        for u in next_level:
            dist[u] = d
        # update periphery
        periphery = next_level
• 8. BFS @ scale (undirected network). Input: edge list, starting node u0. Output: distance to all nodes from u0
• 9. BFS @ scale (undirected network). Input: edge list, distances (u, d). 1. Join distances with edge list. 2. For each (u, d, w), output (w, d+1) [also output (u0, 0)]. 3. Group by w, and output min d
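The three steps on slide 9 describe a single round; repeating the round until the distance table stops changing gives the full BFS. A minimal in-memory sketch of that idea follows, assuming the edge list is a Python list of (u, w) pairs and that each undirected edge is listed once. The function names `bfs_round` and `bfs_at_scale` are hypothetical and do not come from the slides.

```python
def bfs_round(edges, distances, u0):
    """One round: join distances with edges, emit (w, d+1), keep the min per node."""
    # Step 2's extra record: always re-emit the starting node at distance 0.
    candidates = [(u0, 0)]
    # Steps 1 and 2: join on either endpoint, since the network is undirected.
    for u, w in edges:
        if u in distances:
            candidates.append((w, distances[u] + 1))
        if w in distances:
            candidates.append((u, distances[w] + 1))
    # Step 3: group by node and keep the minimum distance.
    best = {}
    for node, d in candidates:
        if node not in best or d < best[node]:
            best[node] = d
    return best


def bfs_at_scale(edges, u0):
    """Repeat rounds until no distance changes (assumed stopping rule)."""
    distances = {u0: 0}
    while True:
        updated = bfs_round(edges, distances, u0)
        if updated == distances:
            return distances
        distances = updated


example_edges = [(1, 2), (2, 3), (3, 4), (5, 6)]
result = bfs_at_scale(example_edges, 1)
```

In a real MapReduce or SQL setting each round would be one join plus one group-by; nodes that are unreachable from u0 (5 and 6 here) never enter the table.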
• 10. Connected Components (undirected network). Input: edge list. Output: list of nodes for each component
• 11. Connected Graph: there is a path between every pair of nodes
• 12. Connected Graph: there is a path between every pair of nodes
• 13. Connected Component: a connected subset of nodes that is not contained in any larger connected subset
• 14. Connected Components (undirected network). 1. Select a node u0 that has not yet been assigned. 2. BFS starting from u0. 3. Record nodes reached by BFS
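The procedure on slide 14 can be sketched in memory as follows, repeating the three steps until every node has been assigned. This assumes the graph is given as a dict mapping each node to the set of its neighbors; the function name `connected_components` is hypothetical.

```python
from collections import deque


def connected_components(neighbors):
    """Return a list of node sets, one per connected component."""
    assigned = set()
    components = []
    for u0 in neighbors:
        # Step 1: pick a node that has not been assigned to a component yet.
        if u0 in assigned:
            continue
        # Step 2: BFS starting from u0.
        component = {u0}
        queue = deque([u0])
        while queue:
            u = queue.popleft()
            for w in neighbors[u]:
                if w not in component:
                    component.add(w)
                    queue.append(w)
        # Step 3: record the nodes reached by the BFS.
        assigned |= component
        components.append(component)
    return components


example = {1: {2}, 2: {1, 3}, 3: {2}, 4: {5}, 5: {4}, 6: set()}
components = connected_components(example)
```

An isolated node such as 6 forms a component of its own, which matches the later observation that a person with no friends is cut off from the rest of the network.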
• 15. Consider the global human social network, with an edge between every pair of friends. Is this network connected?
• 16. Consider the global human social network, with an edge between every pair of friends. Is this network connected? No – there are people with no (living) friends, who are hence isolated from the rest of the network
• 17. Consider the global human social network, with an edge between every pair of friends. Is there a “giant” connected component?
• 18. Consider the global human social network, with an edge between every pair of friends. Is there more than one “giant” component?
• 19. Consider the global human social network, with an edge between every pair of friends. Is there more than one “giant” component? No – it is unlikely to have two large disconnected sets of people
• 20. Consider the global human social network, with an edge between every pair of friends. Is there more than one “giant” component? No – it is unlikely to have two large disconnected sets of people. Historically it was more likely, e.g., pre-Columbian America & Eurasia
• 21. Consider the global human social network, with an edge between every pair of friends. On average, how far are people from one another?
• 22. The Small-world Experiment (Stanley Milgram, 1967). 296 people were randomly selected in Omaha and Wichita. Packages were sent to the selected individuals with instructions to forward them to a particular stock broker in Boston through a chain of people they knew on a first-name basis.
• 23. The Small-world Experiment (Stanley Milgram, 1967). Of the 296 packages, 232 did not reach the target. Of the 64 that did arrive, the average path length was 6: “six degrees of separation”
• 24. Small-world phenomenon: is “six degrees” big or small?
• 25. Small-world phenomenon: navigational vs. topological
• 26. The Anatomy of the Facebook Social Graph (J. Ugander, B. Karrer, L. Backstrom, C. Marlow): 721 million users, 69 billion edges, 5 degrees of separation
• 27. Edge list → degree distribution (undirected network). Input: edge list. Output: degree distribution
• 28. Degree of node u: # of edges incident on u [figure: example graph with nodes labeled 1 to 7]
• 29. Edge list → degree distribution (undirected network). Map: input (u, w); output (u, w) with key := u; output (w, u) with key := w. Reduce: input u, {w1, …, wk}; output u, k
• 30. Edge list → degree distribution (undirected network). Map: input u, k; identity, key := k. Reduce: input k, {u1, …, um}; output k, m
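The two map/reduce passes on slides 29 and 30 can be imitated in memory to see what each stage produces. The sketch below is not a distributed implementation: it uses a dict to stand in for the shuffle step, and the names `degrees` and `degree_distribution` are hypothetical.

```python
from collections import Counter, defaultdict


def degrees(edges):
    """Pass 1: key each edge by both endpoints, then count neighbors per node."""
    grouped = defaultdict(list)
    for u, w in edges:
        grouped[u].append(w)  # map output (u, w), key := u
        grouped[w].append(u)  # map output (w, u), key := w
    # Reduce: input u, {w1, ..., wk}; output u, k
    return {u: len(ws) for u, ws in grouped.items()}


def degree_distribution(edges):
    """Pass 2: key each node by its degree k, then count nodes per degree."""
    # Map is the identity with key := k; reduce outputs k, m.
    return dict(Counter(degrees(edges).values()))


example_edges = [(1, 2), (1, 3), (1, 4), (2, 3)]
node_degrees = degrees(example_edges)
distribution = degree_distribution(example_edges)
```

Nodes with degree 0 never appear in an edge list, so they are absent from the result unless they are added separately.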
• 31. An email network of 130M users. Edges indicate reciprocated communication
• 32. An email network of 130M users. Edges indicate reciprocated communication (log-log plot)
• 33. Clustering
• 34. Clustering
• 35. Triadic closure: 1. Opportunity 2. Incentive 3. Commonality
• 36. Counting Triangles (undirected network). Input: adjacency list. Output: number of triangles incident on each node
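• 37. Counting Triangles, in memory:
    triangles = {}
    for u in nodes:
        triangles[u] = 0
        for w in neighbors[u]:
            # each common neighbor of u and w closes a triangle through u
            triangles[u] += len(neighbors[w] & neighbors[u])
        # every triangle at u was counted twice, once from each of its other two corners
        triangles[u] = triangles[u] // 2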
• 38. Counting Triangles @ scale: every node needs to know to which nodes it is connected, and to which nodes its neighbors are connected
• 39. Counting Triangles @ scale. Map: input u, {w1, …, wk}; for each wi, output wi, (u, {w1, …, wk}). Reduce: in-memory triangle count
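To make the map and reduce steps on slide 39 concrete, here is a small in-memory sketch. It assumes the adjacency list is a dict from node to a set of neighbors; a dict of lists stands in for the shuffle, and the names `map_phase`, `reduce_phase`, and `count_triangles` are hypothetical.

```python
from collections import defaultdict


def map_phase(adjacency):
    """For each node u, send u's full neighbor set to every neighbor wi."""
    for u, nbrs in adjacency.items():
        for w in nbrs:
            yield w, (u, nbrs)


def reduce_phase(values):
    """In-memory triangle count for one node from its neighbors' neighbor sets."""
    neighbor_sets = dict(values)       # neighbor -> that neighbor's neighbors
    my_neighbors = set(neighbor_sets)  # the senders are exactly this node's neighbors
    total = sum(len(my_neighbors & nbrs) for nbrs in neighbor_sets.values())
    return total // 2                  # each triangle was counted twice


def count_triangles(adjacency):
    grouped = defaultdict(list)
    for key, value in map_phase(adjacency):
        grouped[key].append(value)
    counts = {u: 0 for u in adjacency}  # isolated nodes receive nothing
    for node, values in grouped.items():
        counts[node] = reduce_phase(values)
    return counts


example = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
triangle_counts = count_triangles(example)
```

The reducer never needs the node's own adjacency list as a separate input: the set of neighbors that sent it a message is that list.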
• 40. Homophily: the tendency of individuals to associate with similar others; “birds of a feather flock together”
• 41. Birds of a Feather: Homophily in Social Networks (McPherson, Smith-Lovin, Cook): race, sex, age, religion, education, occupation, social class, behaviors, attitudes, abilities, aspirations
• 42. Homophily: 1. Preference 2. Influence 3. Opportunity
• 43. Fantasy Football
• 44. Computing Homophily. Input: edge list, race of each individual. Output: distribution of race among friends, as a table with rows and columns White, Black, Latino, Asian
• 45. Computing Homophily. 1. Join edges (u, v) by u with demographics (w, race) by w. 2. Join edges (u, v, urace) by v with demographics (w, race) by w
• 46. Computing Homophily. 1. Join edges (u, v) by u with demographics (w, race) by w. 2. Join edges (u, v, urace) by v with demographics (w, race) by w. 3. Group edges (u, v, urace, vrace) by sorted([urace, vrace])
• 47. Computing Homophily. 1. Join edges (u, v) by u with demographics (w, race) by w. 2. Join edges (u, v, urace) by v with demographics (w, race) by w. 3. Group edges (u, v, urace, vrace) by sorted([urace, vrace]). 4. Count edges in each group. 5. Normalize the table
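The five steps above can be sketched in memory with dict lookups standing in for the two joins. The slides do not say how the table is normalized; this sketch assumes each cell is divided by the total number of edges, and that choice, the function name `homophily_table`, and the placeholder group labels "a" and "b" are all assumptions for illustration.

```python
from collections import Counter


def homophily_table(edges, group_of):
    """Share of edges falling in each unordered pair of groups."""
    counts = Counter()
    for u, v in edges:
        ugroup = group_of[u]  # step 1: join on u
        vgroup = group_of[v]  # step 2: join on v
        # steps 3 and 4: group by the sorted pair and count
        counts[tuple(sorted([ugroup, vgroup]))] += 1
    # step 5: normalize (assumed here: divide by the total edge count)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


example_edges = [(1, 2), (1, 3), (3, 4), (2, 4)]
example_groups = {1: "a", 2: "a", 3: "b", 4: "b"}
table = homophily_table(example_edges, example_groups)
```

Sorting the pair means an edge between groups "a" and "b" is counted in one cell no matter which endpoint is listed first. Normalizing each row instead would give the distribution of friends' groups conditional on a person's own group.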
• 48. How do ideas and products spread through society?
• 49. The structure of diffusion [figure labels: 93%, 5%, 1%, 0.3%, 0.3%]
• 50. Computing the structure of diffusion. Input: Twitter network; time-stamped “adoptions” of 1B URLs. Output: distribution of cascade structures
• 51. Computing the structure of diffusion: we assume v influenced u to adopt link t if v is the last of u’s contacts to adopt before u [figure labels: 2, 3, 5, 9]
• 52. Computing the structure of diffusion: draw a labeled edge from v to u [figure labels: 3, t, 5]
• 53. Computing the structure of diffusion: group edges by their labels (URL)
• 54. Computing the structure of diffusion: compute the connected components for each forest corresponding to a URL
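The attribution rule from slide 51 and the per-URL grouping from slide 53 can be sketched as below. The input shapes are assumptions: `contacts` maps each user to the set of users whose posts they see, and `adoptions` is a list of (user, url, time) records with at most one record per user and URL. The function name `influence_edges` is hypothetical.

```python
from collections import defaultdict


def influence_edges(contacts, adoptions):
    """Return {url: [(v, u), ...]}, where v is u's last contact to adopt before u."""
    adopted_at = defaultdict(dict)  # url -> {user: adoption time}
    for user, url, time in adoptions:
        adopted_at[url][user] = time

    forest = defaultdict(list)
    for url, times in adopted_at.items():
        for u, t_u in times.items():
            # contacts of u who adopted the same URL strictly before u
            earlier = [(times[v], v) for v in contacts.get(u, ())
                       if v in times and times[v] < t_u]
            if earlier:
                _, parent = max(earlier)  # the most recent earlier adopter
                forest[url].append((parent, u))
    return dict(forest)


example_contacts = {"b": {"a"}, "c": {"a", "b"}, "d": {"c"}}
example_adoptions = [("a", "x", 2), ("b", "x", 3), ("c", "x", 5),
                     ("d", "x", 9), ("d", "y", 1)]
edges_by_url = influence_edges(example_contacts, example_adoptions)
```

Each adopter gets at most one incoming edge, so the edges for a URL form a forest; adopters with no earlier adopting contact are the roots, and each connected component is one cascade.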
• 55. Computing the structure of diffusion. Definition. Two (rooted) trees are isomorphic if they are identical under a relabeling of the vertices.
• 56. Example canonical names: x, (x), (x, x), ((x)), (x, (x)). Basis. The canonical name c(T) for the one-node tree T is x. Induction. If T has more than one node, let T1, …, Tk denote the subtrees of the root, indexed such that c(T1) ≤ c(T2) ≤ … ≤ c(Tk) under the lexicographic order. Then the canonical name for T is (c(T1), …, c(Tk)). [Aho et al., 1974]
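The recursive definition above translates directly into code. The sketch below assumes a rooted tree is given as a dict from each node to a list of its children. It sorts subtrees using Python's ordering on nested tuples, with a leaf as the empty tuple, which is an assumption rather than the exact order from the slide; any fixed total order yields the same name for isomorphic trees. The function names are hypothetical.

```python
def canonical_form(children, root):
    """Nested-tuple canonical form; a one-node tree is the empty tuple."""
    return tuple(sorted(canonical_form(children, c)
                        for c in children.get(root, ())))


def render(form):
    """Write a canonical form in the slide's notation, with x for a leaf."""
    if not form:
        return "x"
    return "(" + ", ".join(render(sub) for sub in form) + ")"


def canonical_name(children, root):
    return render(canonical_form(children, root))


# Two trees that differ only in vertex labels and child order.
tree_1 = {"a": ["b", "c"], "b": ["d"]}
tree_2 = {"r": ["p", "q"], "q": ["s"]}
name_1 = canonical_name(tree_1, "a")
name_2 = canonical_name(tree_2, "r")
```

Because isomorphic trees receive identical names, counting the trees of each type reduces to counting equal strings, for example with `collections.Counter`.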
• 57. Computing the structure of diffusion: compute the canonical name for each tree in the URL forests
• 58. Computing the structure of diffusion: count the number of trees of each type
• 59. Computing the structure of diffusion: we assume v influenced u to adopt link t if v is the last of u’s contacts to adopt before u [figure labels: 2, 3, 5, 9]
• 60. Computing the structure of diffusion: draw a labeled edge from v to u [figure labels: 3, t, 5]