
Mathematics and Social Networks


These slides are for my talk for the Somerville College Mathematics Reunion ("Somerville Maths Reunion", 6/24/17): http://www.some.ox.ac.uk/event/somerville-maths-reunion/

Published in: Science


  1. 1. Mathematics and Social Networks Mason A. Porter (@masonporter) Department of Mathematics, UCLA (2007–2016: Tutor in Applied Mathematics, Somerville College)
  2. 2. Outline • Introduction and a Few Ideas • Pictures of social networks • Types of networks and mathematical representations • Some old questions about social networks • What is a complex system? • Small worlds • Which nodes are important? • Multilayer Networks • Introduction • Which nodes are important? • Conclusions
  3. 3. Introduction • Some motivation, general ideas, and examples
  4. 4. • A network has nodes (representing entities), which are connected by edges (representing ties between the entities). The simplest type of network is a graph. • Example: • Members of a karate club connected to others in the club with whom they hung out (left figure) • “The harmonic oscillator of network science” Zachary Karate Club
  5. 5. Gangs in Los Angeles
  6. 6. Rabbit Warren
  7. 7. Types of Networks • Binary networks: 1 if there is a connection and 0 if there isn’t • Weighted networks: Some value if there is a connection (representing strength of connection) and otherwise 0 • Directed networks: awkward? • Bipartite networks: only nodes of different types are connected to each other (e.g. an actor connected to a movie in which he/she appeared) • Time-dependent networks: nodes and/or edges (existence and/or weights) are time-dependent • Multiplex networks: more than one type of edge • Spatial networks: embedded in space • More…
  8. 8. Representing a Network • Adjacency matrix A • This example: binary (“unweighted”) • Aij = 1 if there is a connection between nodes i and j • Aij = 0 if there is no connection • How do we generalize this representation to weighted, directed, and bipartite examples?
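One possible answer to the question on this slide, as a minimal sketch rather than the speaker's own construction. The helper name `adjacency_matrix`, the toy node labels, and the weights are illustrative assumptions. For directed edges, the sketch assumes the convention that A[i][j] is nonzero when there is an edge from node j to node i; other texts use the transpose.

```python
def adjacency_matrix(n, edges, directed=False):
    """Build an n x n adjacency matrix from (source, target, weight) triples.

    Assumed convention: A[target][source] holds the weight of the edge
    source -> target. Undirected edges are stored in both directions.
    """
    A = [[0] * n for _ in range(n)]
    for source, target, weight in edges:
        A[target][source] = weight
        if not directed:
            A[source][target] = weight
    return A


# Binary (unweighted), undirected: entries are 0 or 1 and A is symmetric.
binary = adjacency_matrix(3, [(0, 1, 1), (1, 2, 1)])

# Weighted, undirected: entries store the strength of each tie.
weighted = adjacency_matrix(3, [(0, 1, 2.5), (1, 2, 0.5)])

# Directed: A is no longer symmetric in general.
directed = adjacency_matrix(3, [(0, 1, 1), (1, 2, 1)], directed=True)

# Bipartite: nodes 0-1 are actors and nodes 2-4 are movies (toy data),
# so the actor-actor and movie-movie blocks stay zero.
bipartite = adjacency_matrix(5, [(0, 2, 1), (0, 3, 1), (1, 3, 1), (1, 4, 1)])
```

In each case the matrix keeps the same shape; only the allowed values (weights), the symmetry (direction), or the block structure (bipartite) change.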
  9. 9. A Couple of Longstanding Questions in the Study of Social Networks • I. de Sola Pool & M. Kochen [1978–79], “Contacts and Influence”, Social Networks, 1: 5–51 (though a preprint for 2 decades) • Excerpt: “… very stuff of politics. Influence is in large part the ability to reach the crucial man through the right channels, and the more channels one has in reserve, the better. Prominent politicians count their acquaintances by the thousands. They run into people they know everywhere they go. The experience of casual contact and the practice of influence are not unrelated. A common theory of human contact nets might help clarify them both. No such theory exists at present. Sociologists talk of social stratification; political scientists of influence. These quantitative concepts ought to lend themselves to a rigorous metric based upon the elementary social events of man-to-man contact. ‘Stratification’ expresses the probability of two people in the same stratum meeting and the improbability of two people from different strata meeting. Political access may be expressed as the probability that there exists an easy chain of contacts leading to the power holder. Yet such measures of stratification and influence as functions of contacts do not exist. What is it that we should like to know about human contact nets? - For any individual we should like to know how many other people he knows, i.e. his acquaintance volume. - For a population we want to know the distribution of acquaintance volumes, the mean and the range between the extremes. - We want to know what kinds of people they are who have many contacts and whether those people are also the influentials. - We want to know how the lines of contact are stratified; what is the structure of the network? If we know the answers to these questions about individuals and about the whole population, we can pose questions about the implications for paths between pairs of individuals. - How great is the probability that two persons chosen at random from the population will know each other? - How great is the chance that they will have a friend in common? - How great is the chance that the shortest chain between them requires two intermediaries; i.e., a friend of a friend?”
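The questions in this excerpt translate directly into computations on a graph. The following is a small sketch on a hypothetical six-person acquaintance network (the edge list is invented for illustration and is not data from the talk). It computes each person's acquaintance volume (degree), the probability that two randomly chosen people know each other, the probability that they share a friend, and the probability that the shortest chain between them passes through exactly two intermediaries.

```python
from collections import deque
from itertools import combinations

# Hypothetical acquaintance ties among six people labeled 0..5.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5)]
n = 6

neighbors = {v: set() for v in range(n)}
for u, v in edges:
    neighbors[u].add(v)
    neighbors[v].add(u)

# "Acquaintance volume" of each person, and its mean and range.
volumes = [len(neighbors[v]) for v in range(n)]
mean_volume = sum(volumes) / n
volume_range = (min(volumes), max(volumes))


def distances_from(source):
    """Breadth-first search: geodesic distance from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


pairs = list(combinations(range(n), 2))
all_dist = {u: distances_from(u) for u in range(n)}

# Probability that two random people know each other (edge density).
p_know = sum(v in neighbors[u] for u, v in pairs) / len(pairs)

# Probability that two random people have at least one friend in common.
p_common = sum(bool(neighbors[u] & neighbors[v]) for u, v in pairs) / len(pairs)

# Probability that the shortest chain has exactly two intermediaries
# (geodesic distance 3).
p_two_intermediaries = sum(all_dist[u].get(v) == 3 for u, v in pairs) / len(pairs)
```

On real data the same quantities are usually estimated from samples, because enumerating all pairs scales quadratically with the population size.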
  10. 10. Spread of “Fake News” on Social Networks
  11. 11. What is a Complex System? • Complexity*: • “The behaviour shown by Complex Systems” • Complex System*: • “A System Whose Behaviour Exhibits Complexity” • I wrote a brief introduction for a general audience on Quora: https://www.quora.com/How-do-I-explain-to-non-mathematical-people-what-non-linear-and-complex-systems-mean * Neil Johnson, Two’s Company, Three is Complexity, Oneworld Publications, 2007.
  12. 12. What is a Complex System? My definition follows the way that the US Supreme Court once defined pornography: “I shall not today attempt further to define the kinds of material I understand to be embraced within that shorthand description [‘complex systems’]; and perhaps I could never succeed in intelligibly doing so. But I know it when I see it.” (adapted from Justice Potter Stewart)
  13. 13. What are complex systems? No rigorous definition; some potential features/components*: • The system has a collection of many interacting objects or “agents” • These objects may be affected by memory or feedback • The system may be “open” (influenced by environment) • The system exhibits “emergent phenomena” • Emergent phenomena arise without a central controller * Neil Johnson, Two’s Company, Three is Complexity, Oneworld Publications, 2007.
  14. 14. Two Fundamental Ideas in Complex Systems • Emergence: Finding (possibly unexpected) order in high-dimensional systems (large systems of interacting components) • We seek examples of “emergence” in the study of networks. • Chaos: Finding (possibly unexpected) disorder in low-dimensional systems • E.g. Lorenz attractor
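To make the "disorder in low-dimensional systems" point concrete, here is a minimal sketch that integrates the three-variable Lorenz system from two nearly identical initial conditions and tracks how far apart the trajectories drift. The classic parameter values (sigma = 10, rho = 28, beta = 8/3), the step size, the initial conditions, and the hand-written Runge–Kutta integrator are choices made for this illustration and are not taken from the talk.

```python
def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz equations (classic parameter values assumed)."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(f, state, dt):
    """One fourth-order Runge-Kutta step for an autonomous system."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def separation(p, q):
    """Euclidean distance between two states."""
    return sum((u - v) ** 2 for u, v in zip(p, q)) ** 0.5


# Two initial conditions that differ by 1e-8 in the first coordinate.
a = (1.0, 1.0, 1.0)
b = (1.0 + 1e-8, 1.0, 1.0)
initial_gap = separation(a, b)

max_gap = initial_gap
for _ in range(4000):  # 4000 steps of size 0.01, i.e. 40 time units
    a = rk4_step(lorenz, a, 0.01)
    b = rk4_step(lorenz, b, 0.01)
    max_gap = max(max_gap, separation(a, b))
```

Both trajectories stay on the same bounded attractor, yet the tiny initial difference grows by many orders of magnitude, which is the sensitive dependence on initial conditions that characterizes chaos.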
  15. 15. Networks are Complex Systems • We want to somehow summarize the information in networks to learn something about them. • Important entities? • Important interactions? • Dense sets (“communities”) of entities? • Perhaps connected sparsely to other dense sets? • Sets of behaviorally similar entities? • Structural bottlenecks to dynamical processes (e.g. disease or rumor spreading) • Where should you start a rumor to maximize how far it spreads? And how does this depend on the network structure? • Voting dynamics on networks (“echo chambers” and “majority illusion”) • Etc.
  16. 16. 6 Degrees of Karin Erdmann
  17. 17. 6 Degrees of Karin Erdmann
  18. 18. 6 Degrees of Karin Erdmann
  19. 19. 6 Degrees of Karin Erdmann
  20. 20. Watts–Strogatz “Small World” Model • Watts–Strogatz ‘small world’ model (reviewed in MAP, Scholarpedia, 2012) • Start with a 1D ring of nodes and connect each node to its k nearest neighbors on each side. Then either rewire each edge with probability p (original model) or add a ‘shortcut’ edge with probability p (Newman–Watts variant). • What happens as p increases from 0?
  21. 21. Regime with both short mean geodesic path length (“small world”) and high mean clustering coefficient • A small number of shortcuts has only a small effect on local clustering, but it very quickly shortens the mean geodesic distance, whose scaling with the number of nodes changes from linear to logarithmic. • Note: Efficient navigation of networks is even more amazing than the fact that the world is often small.
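The small-world regime described on the last two slides can be checked numerically. The sketch below is a simplified version of the Newman–Watts variant (shortcuts are added and no edges are removed, so the network stays connected). It compares the mean clustering coefficient and the mean geodesic distance at p = 0 and p = 0.1. The function names, the network size (n = 200, k = 2), and the random seed are assumptions chosen for illustration.

```python
import random
from collections import deque


def ring_lattice(n, k):
    """Ring of n nodes, each connected to its k nearest neighbors on each side."""
    nbrs = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k + 1):
            u = (v + step) % n
            nbrs[v].add(u)
            nbrs[u].add(v)
    return nbrs


def newman_watts(n, k, p, seed=0):
    """Ring lattice plus shortcuts: for each of the n*k lattice edges,
    add an edge between a uniformly random pair of nodes with probability p."""
    rng = random.Random(seed)
    nbrs = ring_lattice(n, k)
    for _ in range(n * k):
        if rng.random() < p:
            u, v = rng.sample(range(n), 2)
            nbrs[u].add(v)
            nbrs[v].add(u)
    return nbrs


def mean_clustering(nbrs):
    """Mean over nodes of the fraction of neighbor pairs that are themselves connected."""
    total = 0.0
    for v, nv in nbrs.items():
        d = len(nv)
        if d < 2:
            continue
        links = sum(1 for a in nv for b in nv if a < b and b in nbrs[a])
        total += 2.0 * links / (d * (d - 1))
    return total / len(nbrs)


def mean_path_length(nbrs):
    """Mean geodesic distance over all ordered pairs of distinct, connected nodes."""
    total, count = 0, 0
    for s in nbrs:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in nbrs[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        count += len(dist) - 1
    return total / count


lattice = newman_watts(200, 2, 0.0)
small_world = newman_watts(200, 2, 0.1)

C0, L0 = mean_clustering(lattice), mean_path_length(lattice)
C1, L1 = mean_clustering(small_world), mean_path_length(small_world)
print(f"p = 0.0: clustering {C0:.3f}, mean path length {L0:.2f}")
print(f"p = 0.1: clustering {C1:.3f}, mean path length {L1:.2f}")
```

With these settings, the clustering coefficient should drop only modestly while the mean path length falls sharply, which is the regime the slide highlights.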
  22. 22. Determining Important (“Central”) Nodes Example: Hubs and Authorities • J. M. Kleinberg, Journal of the ACM, Vol. 46: 604–632 (1999) • Intuition: A Web page (node) is a good hub if it has many hyperlinks (out-edges) to important nodes, and a node is a good authority if many important nodes have hyperlinks to it (in-edges) • Imagine a random walker surfing the Web. It should spend a lot of time on important Web pages. Equilibrium populations of an ensemble of walkers satisfy an eigenvalue problem: • x = aAy ; y = bA^T x ⇒ A^T A y = λy & A A^T x = λx, where λ = 1/(ab) • Leading eigenvalue λ_1 (strictly positive) gives strictly positive authority vector x and hub vector y (leading eigenvectors) • Node i has authority centrality x_i and hub centrality y_i
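The eigenvalue problem on this slide can be solved by alternating the two update rules until they settle, which is a form of power iteration. The sketch below is one such implementation under stated assumptions, not the code behind the results in the talk: it treats x as the authority vector and y as the hub vector (as in the eigenvector bullet above), assumes A[i][j] = 1 when node j links to node i, and uses a hypothetical four-page Web graph.

```python
def hits(A, iterations=100):
    """Alternate x = A y (authorities) and y = A^T x (hubs), normalizing each step.

    Assumed convention: A[i][j] = 1 if there is a hyperlink from node j to node i.
    """
    n = len(A)
    x = [1.0] * n  # authority scores
    y = [1.0] * n  # hub scores
    for _ in range(iterations):
        # Authority of i: sum of hub scores of the nodes that link to i.
        x = [sum(A[i][j] * y[j] for j in range(n)) for i in range(n)]
        # Hub score of i: sum of authority scores of the nodes that i links to.
        y = [sum(A[j][i] * x[j] for j in range(n)) for i in range(n)]
        # Normalize so that each vector sums to 1 (absorbs the constants a and b).
        sx, sy = sum(x), sum(y)
        x = [v / sx for v in x]
        y = [v / sy for v in y]
    return x, y


# Hypothetical Web graph: page 0 links to pages 2 and 3; page 1 links to page 2.
links = [(0, 2), (0, 3), (1, 2)]
A = [[0] * 4 for _ in range(4)]
for source, target in links:
    A[target][source] = 1

authority, hub = hits(A)
```

In this toy graph, page 2 is the best authority (two in-links from hubs) and page 0 is the best hub. Pages with no in-links get zero authority here; the strict positivity mentioned on the slide needs extra connectivity assumptions that this tiny example does not satisfy.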
  23. 23. Application: Ranking Mathematics Programs • Apply the same idea to mathematics departments based on the flow of Ph.D. students • S. A. Myers, P. J. Mucha, & MAP, “Mathematical genealogy and department prestige”, Chaos, Vol. 21: 041104 (2011) • One-page paper in Gallery of Nonlinear Images • Data from Mathematics Genealogy Project
  24. 24. Hubs and Authorities Among US Mathematics Programs • We consider MGP data in the US from 1973–2010 (data from 10/09) • Example: I earned a PhD from Cornell and subsequently supervised students at Oxford and UCLA. • → Directed edge of unit weight from Cornell to Oxford (and also from Cornell to UCLA) • A university is a good authority if it hires students from good hubs, and a university is a good hub if its students are hired by good authorities. • Caveats • Our measurement has a time delay (we only have the Cornell → UCLA edge after I supervise a PhD student there; normally there’s a delay) • Eventually, there will be an edge from Oxford to University of Vermont when Puck Rombach graduates her first Ph.D. student. (I want grandstudents!) • Hubs and authorities should change in time
  25. 25. Geographically-Inspired Visualization • [Figure from S. A. Myers, P. J. Mucha, & M. A. Porter, “Mathematical genealogy and department prestige”, Chaos 21, 041104 (2011): FIG. 1. (Color) Visualizations of a mathematics genealogy network.] • Hubs: node size • Authorities: node color
  26. 26. How do our rankings do? • We use a “geographically inspired” layout to balance node locations and node overlap. • [Figure: FIG. 2. (Color) Rankings versus authority scores.]
  27. 27. Multilayer Networks Review Article: M. Kivelä, A. Arenas, M. Barthelemy, J. P. Gleeson, Y. Moreno, & MAP, “Multilayer Networks”, Journal of Complex Networks, 2(3): 203–271, 2014.
  28. 28. What is a Multilayer Network?
  29. 29. General Form of a Multilayer Network • Definition of a multilayer network M – M = (V_M, E_M, V, L) • V: set of nodes (“entities”) – As in ordinary graphs • L: sequence of sets of possible layers – One set for each additional “aspect” d ≥ 0 beyond an ordinary network (examples: d = 1 in schematic on this page; d = 2 on last page) • V_M: set of tuples that represent node-layers • E_M: multilayer edge set that connects these tuples • Note 1: allow weighted multilayer networks by mapping edges to real numbers with w: E_M → R • Note 2: d = 0 yields the usual single-layer (“monolayer”) networks
  30. 30. Supra-Adjacency Matrices (from “flattening” tensors) • Schematic from M. Bazzi, MAP, S. Williams, M. McDonald, D. J. Fenn, & S. D. Howison [2016] Multiscale Modeling and Simulation: A SIAM Interdisciplinary Journal, 14(1): 1–41 • [Figure: three layers (Layer 1, Layer 2, Layer 3), each with node-layers 1_s, 2_s, 3_s in layer s and inter-layer edges of weight ω, mapped to the matrix]
      $$
      \begin{bmatrix}
      0 & 1 & 1 & \omega & 0 & 0 & 0 & 0 & 0 \\
      1 & 0 & 0 & 0 & \omega & 0 & 0 & 0 & 0 \\
      1 & 0 & 0 & 0 & 0 & \omega & 0 & 0 & 0 \\
      \omega & 0 & 0 & 0 & 1 & 1 & \omega & 0 & 0 \\
      0 & \omega & 0 & 1 & 0 & 1 & 0 & \omega & 0 \\
      0 & 0 & \omega & 1 & 1 & 0 & 0 & 0 & \omega \\
      0 & 0 & 0 & \omega & 0 & 0 & 0 & 1 & 0 \\
      0 & 0 & 0 & 0 & \omega & 0 & 1 & 0 & 1 \\
      0 & 0 & 0 & 0 & 0 & \omega & 0 & 1 & 0
      \end{bmatrix}
      $$
      • Fig. 3.1. Example of (left) a multilayer network with unweighted intra-layer connections (solid lines) and uniformly weighted inter-layer connections (dashed curves) and (right) its corresponding adjacency matrix. (The adjacency matrix that corresponds to a multilayer network is sometimes called a “supra-adjacency matrix” in the network-science literature [39].)
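As a rough sketch of how such a supra-adjacency matrix can be assembled, the code below places each layer's adjacency matrix on the block diagonal and couples every node to its own copy in the adjacent layer with weight ω. The three 3-node layers are transcribed from the matrix shown on this slide. The function name `supra_adjacency` and the numerical value ω = 0.5 are assumptions made for illustration.

```python
def supra_adjacency(layers, omega):
    """Flatten a sequence of same-sized layers into one (n*T) x (n*T) matrix.

    Intra-layer edges fill the diagonal blocks. Each node is coupled to its
    copy in the next layer with weight omega (ordinal, "diagonal" coupling).
    """
    n = len(layers[0])
    T = len(layers)
    S = [[0] * (n * T) for _ in range(n * T)]
    for t, A in enumerate(layers):
        for i in range(n):
            for j in range(n):
                S[t * n + i][t * n + j] = A[i][j]
    for t in range(T - 1):
        for i in range(n):
            S[t * n + i][(t + 1) * n + i] = omega
            S[(t + 1) * n + i][t * n + i] = omega
    return S


# Intra-layer adjacency matrices transcribed from the slide's example.
layer1 = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
layer2 = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
layer3 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]

omega = 0.5  # hypothetical inter-layer coupling strength
S = supra_adjacency([layer1, layer2, layer3], omega)
```

Row and column t*n + i of `S` correspond to node i in layer t, so standard single-layer tools (eigenvectors, random walks) can be applied to the flattened matrix.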
  31. 31. Example: Multiplex Network (e.g. multirelational social network) • Monster movement in the game “Munchkin Quest”
  32. 32. My Ego-Centric Multiplex Network (edge-colored multigraph)
  33. 33. Example: Interconnected Network (e.g. UK infrastructure) (Courtesy of Scott Thacker, ITRC, University of Oxford)
  34. 34. Time-dependent centrality from multilayer representation of time-dependent network • Math department rankings change in time, so we need centrality measures that change in time • E.g. via a multilayer representation of a temporal network • Multilayer network with adjacency tensor elements Aijt • Directed intralayer edge from university i to university j at time t for a specific person’s PhD granted at time t for a person who later advised a student at i (multi-edges give weights) • E.g. I would yield an Oxford → Cornell edge and a UCLA → Cornell edge for t = 2002 • Though neither of these is actually in the analyzed data set • Use a multilayer network with “diagonal” and ordinal interlayer coupling • 231 US universities, T = 65 time layers (1946–2010)
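A minimal sketch of how the tensor elements Aijt described on this slide might be tallied from genealogy records. The record layout, the person labels, and the two MIT/Berkeley entries are hypothetical. The sketch also assumes that each person contributes one unit of weight per (advising university, PhD university, PhD year) triple, which is one reading of "multi-edges give weights" and not necessarily the paper's exact rule.

```python
from collections import Counter

# Hypothetical records: (person, PhD university j, PhD year t, university i
# where that person later advised a student).
records = [
    ("person_a", "Cornell", 2002, "Oxford"),   # the slide's own example
    ("person_a", "Cornell", 2002, "UCLA"),
    ("person_b", "MIT", 1990, "Berkeley"),     # invented for illustration
    ("person_c", "MIT", 1990, "Berkeley"),     # invented for illustration
]

# A[(i, j, t)] = weight of the directed edge i -> j in time layer t.
# Using a set first ensures each person counts once per (i, j, t) triple.
unique_contributions = {(person, i, j, t) for person, j, t, i in records}
A = Counter((i, j, t) for _, i, j, t in unique_contributions)
```

Here `A[("Oxford", "Cornell", 2002)]` and `A[("UCLA", "Cornell", 2002)]` each have unit weight, and the two hypothetical MIT graduates give the Berkeley → MIT edge in layer 1990 a weight of 2.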
  35. 35. Ranking Math Programs: Best Authorities • Note: We construct a “supra-centrality matrix” and do a singular perturbation expansion (and examine the coefficients of the expansion). • Table 4.1 (S. A. Myers et al.). Top centralities and first-order movers for universities in the MGP [4].
      Rank   Top Time-Averaged Centralities (α_i)   Top First-Order Mover Scores (m_i)
      1      MIT         0.6685                     MIT            688.62
      2      Berkeley    0.2722                     Berkeley       299.07
      3      Stanford    0.2295                     Princeton      248.72
      4      Princeton   0.1803                     Stanford       241.71
      5      Illinois    0.1645                     Georgia Tech   189.34
      6      Cornell     0.1642                     Maryland       186.65
      7      Harvard     0.1628                     Harvard        185.34
      8      UW          0.1590                     CUNY           182.59
      9      Michigan    0.1521                     Cornell        180.50
      10     UCLA        0.1456                     Yale           159.11
      • Excerpt from the paper: “We extend our previous consideration of this data [71] by keeping the year that each faculty member graduated with his/her Ph.D. degree. We thus construct a multilayer network of the MGP Ph.D. exchange using elements Aijt that indicate a directed edge from university i to university j at time t to represent a doctoral degree …”
  36. 36. Conclusions
  37. 37. Conclusions • Mathematical ideas have long been used in the study of social networks. • “Network science” is a vibrant and ever-growing area of mathematics. • Note: major connections to data science, graph theory, probability, statistics, dynamical systems, statistical mechanics, optimization, computer science, and more • The study of “multilayer networks”, currently the most prominent area of network science, allows one to examine (and integrate) heterogeneous types of information.
