Network Flow and Network Formation: A Social Network Analysis Perspective


Lecture series (Ringvorlesung) of the Research School Business & Economics (RSBE)
University of Siegen, Germany
June 28, 2011

Ralf Klamma
RWTH Aachen University


  • 1. Informatik 5 (DBIS), RWTH Aachen University. TeLLNet, GALA. Network Flow and Network Formation: A Social Network Analysis Perspective. Ralf Klamma, RWTH Aachen University. Ringvorlesung der Research School Business & Economics (RSBE), Siegen, June 28, 2011. Lehrstuhl Informatik 5 (Informationssysteme), Prof. Dr. M. Jarke. I5-KL-111010-1
  • 2. Agenda
    • Network Science
    • Network Flow
    • Network Formation
    • Conclusions and Outlook
  • 3. RWTH Aachen University
    • 260 institutes in 9 faculties, among Europe's leading institutions for science and research
    • Currently around 31,400 students are enrolled in over 100 academic programs
    • Over 5,000 of them are international students hailing from 120 different countries
    • 1,250 spin-off businesses have created around 30,000 jobs in the greater Aachen region over the past 20 years
    • IDEA League
    • Germany's Excellence Initiative: 3 clusters of excellence, a graduate school, and the institutional strategy "RWTH Aachen 2020: Meeting Global Challenges"
  • 4. Community Information Systems Research Group
    • Established at the DBIS chair, RWTH Aachen University
    • 3 postdocs, 7 PhD students, plus paid student workers and thesis workers
  • 5. Network Science
  • 6. Questions within Network Science
    • How well positioned is an agent to receive and disseminate information?
      – experts (centrality measures) [Wasserman & Faust, 1997]
    • Do users communicate only within their groups, or also with agents from other groups?
      – innovation stars (boundary spanners, brokers, high betweenness centrality) [Burt, 2005]
    • Who and what affects an agent?
      – influence networks [Lewis, 2008]
    • Which groups/communities does an agent belong to?
      – community mining [Clauset et al., 2004]
  • 7. Executive Board Networks
    • A prototype as of 2004
    • What is the connection between Motorola and Whirlpool?
    • What does the network of academic institutes and companies look like?
  • 8. Who rules 3M, Motorola, AT&T, Coca-Cola, PepsiCo, and McDonald's?
  • 9. Spread of Contagion
  • 10. Network Science Paradigms
    • Merge of analytic and engineering paradigms
    • In an analytic discipline
      – To find laws (computing paradigms)
      – To generate phenomena
      – To explain observed phenomena
    • In an engineering discipline
      – To realize and implement the paradigms of networks
      – To understand the cases when particular technologies should be used
      – To store network data efficiently (MediaBase)
    • Figure labels: communication serves a purpose (scientific disciplines, commerce, entertainment, politics)
  • 11. Web Science: The Long Tail & Fragments
    • Figure labels: IN Continent, Central Core, OUT Continent, Tunnels, Tendrils, Island [Barabasi, 2002]; Long Tail [Anderson, 2006]
    • The Web is a scale-free, fragmented network
      – The power law (Pareto distribution etc.)
      – 95% of users are located in the Long Tail (communities)
      – Trust- and passion-based cooperation
  • 12. Principal Analytic Approach
    • Interdisciplinary multidimensional model of networks
      – Social network analysis (SNA) defines measures for social relations
      – Actor-network theory (ANT) connects human and media agents
      – The i* framework defines strategic goals and dependencies
      – The theory of media transcriptions studies cross-media knowledge
    • Figure: media networks
      – Social software: wiki, blog, podcast, IM, chat, email, newsgroup, …
      – Network of artifacts: microcontent, blog entry, message, burst, thread, comment, conversation, feedback (rating)
      – i* dependencies (structural, cross-media)
      – Network of members: communities of practice
      – Members (social network analysis: centrality, efficiency)
  • 13. MediaBase
    • Collection of social software artifacts with parameterized Perl scripts
      – Mailing lists
      – Newsletters
      – Web sites
      – RSS feeds
      – Blogs
    • Database support by IBM DB2, eXist, Oracle, ...
    • Web interface based on Firefox plugin, Plone/Zope, widgets, ...
    • Strategies of visualization
      – Tree maps
      – Cross-media graphs
    • Klamma et al.: Pattern-Based Cross Media Social Network Analysis for Technology Enhanced Learning in Europe, EC-TEL 2006
  • 14. Network Flow
  • 15. Fundamentals: Definitions of Network
    • A network $\Gamma = (N, L)$, where
      – $N = \{1, 2, \ldots, n\}$ is a (finite) set of nodes (vertices)
      – $L \subseteq N \times N$ is a set of links (edges)
    • Assumed:
      – Unweighted
      – No multiple links: at most one link exists between two given nodes; two nodes joined by a link are neighbors (adjacent)
      – Directed or undirected
  • 16. Definitions in a Network
    • Degree of a node: $z_i = |\{ j \in N : ij \in L \}|$, the number of incoming and outgoing links
    • A path is a sequence of nodes $v_0, \ldots, v_{n-1}$ with $(v_i, v_{i+1}) \in L$ for $0 \le i < n-1$; equivalently, a path is a set of connected links
    • Length of a path: number of links on the path
    • A path is a simple path if all vertices on the path are pairwise different
    • A cycle is a path with $v_0 = v_{n-1}$ and length $n \ge 2$
    • A subnetwork of a network $\Gamma = (N, L)$ is a graph $\Gamma' = (N', L')$ with $N' \subseteq N$ and $L' \subseteq L$
  • 17. Representation of Networks
    • Adjacency matrix representation: an $n \times n$ matrix $A$, where $a_{ij} = 1$ if $(i, j) \in L$ and $a_{ij} = 0$ otherwise
    • Neighborhood: $N^i \equiv \{ j \in N : (i, j) \in L \}$
    • Any network is the collection of its neighborhoods: $\Gamma = \{ N^i \}_{i \in N}$
  • 18. Boolean Adjacency Matrix Example
    • For network $\Gamma_1$ (nodes 0 to 4), the adjacency matrix is as follows: true = 1 if there exists a link between two nodes, false = 0 otherwise
    • Rows give outgoing links (outgoing degree), columns give incoming links (incoming degree):

            0  1  2  3  4
        0   0  1  0  1  0
        1   1  0  0  1  0
        2   0  0  1  0  1
        3   0  0  0  0  1
        4   0  0  1  0  0
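The matrix above can be checked with a few lines of plain Python. This is a minimal sketch, not part of the original slides: the function names are illustrative, and it assumes that row i lists the links leaving node i, so that row sums give the outgoing degree and column sums the incoming degree.

```python
# Adjacency matrix of the example network from slide 18 (nodes 0..4).
# Assumption: A[i][j] == 1 means there is a link from node i to node j.
A = [
    [0, 1, 0, 1, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 0, 1, 0, 0],
]

def out_degrees(matrix):
    """Row sums: number of links leaving each node."""
    return [sum(row) for row in matrix]

def in_degrees(matrix):
    """Column sums: number of links entering each node."""
    return [sum(col) for col in zip(*matrix)]

def neighborhood(matrix, i):
    """Nodes j with a link (i, j), i.e. the neighborhood of node i."""
    return {j for j, a in enumerate(matrix[i]) if a == 1}

print(out_degrees(A))
print(in_degrees(A))
print(neighborhood(A, 0))
```

Collecting `neighborhood(A, i)` for every i gives the neighborhood-list view of the same network described on slide 17.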
  • 19. Important Types of Degree Distribution
    • For any network $\Gamma$, its (kth-order) degree distribution $p(\cdot)$ specifies, for each $k = 0, 1, \ldots, n-1$, the fraction of nodes with degree $k$: $p(k) = \frac{1}{n} |\{ i \in N : z_i = k \}|$
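A short sketch of the degree distribution p(k), assuming an undirected network stored as a dictionary of neighbor sets. The example graph (a triangle with one pendant node) is hypothetical and chosen only for illustration.

```python
from collections import Counter

def degree_distribution(neighbors):
    """Return p(k) = |{i : z_i = k}| / n for k = 0, ..., n-1."""
    n = len(neighbors)
    counts = Counter(len(adjacent) for adjacent in neighbors.values())
    return {k: counts.get(k, 0) / n for k in range(n)}

# Hypothetical undirected example: triangle 0-1-2 plus pendant node 3.
neighbors = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
p = degree_distribution(neighbors)
print(p)
```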
  • 20. Network Characteristics: Geodesic Distances
    • The geodesic distance $d(i, j)$ is defined as the minimum number of links that connect $i$ and $j$; if no such path exists, $d(i, j) = +\infty$
    • The distribution $\varpi$ specifies the fraction $\varpi(r)$ of node pairs at distance $r$: $\varpi(r) = \frac{|\{ (i, j) \in N \times N : d(i, j) = r \}|}{n(n-1)}$, where $\sum_{r>0} \varpi(r) = 1$
    • The average network distance: $\bar{d} = \sum_{0<r<\infty} r \, \varpi(r)$
    • The diameter of the network: $\hat{d} = \max\{ r : \varpi(r) > 0 \}$
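The distance measures on this slide can be sketched with breadth-first search. The code below is an illustration under stated assumptions: the network is unweighted and given as neighbor lists, the example (a path of four nodes) is hypothetical, and unreachable pairs are simply left out of the distribution, so the shares sum to 1 only for a connected network.

```python
from collections import Counter, deque

def geodesic_distances(neighbors, source):
    """Breadth-first search: minimum number of links from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def distance_distribution(neighbors):
    """Fraction of ordered node pairs (i, j), i != j, at each finite distance r."""
    n = len(neighbors)
    counts = Counter()
    for i in neighbors:
        for j, r in geodesic_distances(neighbors, i).items():
            if j != i:
                counts[r] += 1
    return {r: c / (n * (n - 1)) for r, c in counts.items()}

# Hypothetical example: the undirected path 0 - 1 - 2 - 3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
w = distance_distribution(path)
average_distance = sum(r * share for r, share in w.items())
diameter = max(w)
print(w, average_distance, diameter)
```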
  • 21. Network Characteristics: Density
  • 22. Network Characteristics: Closeness & Clustering
    • The total distance of node $i$: $\sum_{j \in N} d(i, j)$
    • The closeness is defined as: $c(i) \equiv \frac{1}{\sum_{j \in N} d(i, j)}$
    • For each node $i$ having at least two neighbors, the clustering is $C^i \equiv \frac{|\{ jk \in L : ij \in L \wedge ik \in L \}|}{z_i (z_i - 1) / 2}$
    • For each node $j$ having fewer than two neighbors, $C^j = 0$
    • Clustering index of the network $\Gamma$: $C = \frac{1}{n} \sum_{i=1}^{n} C^i$
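A minimal sketch of closeness and clustering for an undirected network stored as neighbor sets. The example graph is hypothetical, and the closeness function assumes a connected network with at least two nodes (otherwise the total distance is zero or infinite).

```python
from collections import deque
from itertools import combinations

def distances_from(neighbors, source):
    """Geodesic distances from source via breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def closeness(neighbors, i):
    """c(i) = 1 / sum_j d(i, j); assumes every node is reachable from i."""
    return 1 / sum(distances_from(neighbors, i).values())

def clustering(neighbors, i):
    """Share of pairs of i's neighbors that are themselves linked."""
    z = len(neighbors[i])
    if z < 2:
        return 0.0
    linked = sum(1 for j, k in combinations(neighbors[i], 2) if k in neighbors[j])
    return linked / (z * (z - 1) / 2)

def clustering_index(neighbors):
    """Average of the node clustering values over all n nodes."""
    return sum(clustering(neighbors, i) for i in neighbors) / len(neighbors)

# Hypothetical example: triangle 0-1-2 plus pendant node 3 attached to node 2.
g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(closeness(g, 2), clustering(g, 2), clustering_index(g))
```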
  • 23. Network Characteristics: Cohesiveness & Betweenness
    • Given a network $\Gamma = (N, L)$, let $M \subset N$; for each node $i \in M$, the fraction of its connections that stay within $M$ is $H^i(M) = \frac{|\{ ij \in L : j \in M \}|}{z_i}$
    • The overall cohesiveness of the set $M$ is defined as $H(M) = \min_{i \in M} H^i(M)$
    • If the network $\Gamma$ is connected, let $v(j, k)$ be the number of shortest paths between $j$ and $k$ for each $j \neq k$, and $v^i(j, k)$ the number of those paths that pass through $i$; the betweenness of node $i$ is $b^i \equiv \sum_{j \neq k} \frac{v^i(j, k)}{v(j, k)}$
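The cohesiveness definition translates almost directly into code. The sketch below assumes an undirected network given as neighbor sets in which every node of M has at least one link; the example graph and the sets M are hypothetical.

```python
def node_cohesiveness(neighbors, i, M):
    """Fraction of node i's links that lead to nodes inside M."""
    return len(neighbors[i] & M) / len(neighbors[i])

def cohesiveness(neighbors, M):
    """Overall cohesiveness of M: the minimum node cohesiveness over M."""
    return min(node_cohesiveness(neighbors, i, M) for i in M)

# Hypothetical example: triangle 0-1-2 plus pendant node 3 attached to node 2.
g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(cohesiveness(g, {0, 1, 2}))  # node 2 has one of its three links leaving M
print(cohesiveness(g, {2, 3}))
```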
  • 24. Shortest-path Betweenness: Example
    • Shortest-path betweenness: $b^i \equiv \sum_{j \neq k} \frac{v^i(j, k)}{v(j, k)}$
    • Nodes A and B will have high (shortest-path) betweenness in this configuration, while node C will not
    • A measure of the extent to which an actor has control over information flowing between others
    • In a network in which flow is entirely or at least mostly along geodesic paths, the betweenness of a node measures how much flow will pass through that particular node
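Shortest-path betweenness can be sketched by counting shortest paths with breadth-first search. This is a brute-force illustration rather than an efficient algorithm, and it makes two assumptions beyond the slide: the sum runs over ordered pairs (j, k), and the endpoints j and k themselves are not credited. The two example graphs are hypothetical; the configuration with nodes A, B, and C from the slide figure is not reproduced here.

```python
from collections import deque

def shortest_path_counts(neighbors, source):
    """BFS returning distances and the number of shortest paths from source."""
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                count[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                count[v] += count[u]
    return dist, count

def betweenness(neighbors):
    """Sum over ordered pairs (j, k) of the share of shortest j-k paths through i."""
    info = {s: shortest_path_counts(neighbors, s) for s in neighbors}
    b = {i: 0.0 for i in neighbors}
    for j in neighbors:
        dist_j, count_j = info[j]
        for k in neighbors:
            if k == j or k not in dist_j:
                continue
            for i in neighbors:
                if i in (j, k) or i not in dist_j:
                    continue
                dist_i, count_i = info[i]
                # i lies on a shortest j-k path iff the distances add up.
                if k in dist_i and dist_j[i] + dist_i[k] == dist_j[k]:
                    b[i] += count_j[i] * count_i[k] / count_j[k]
    return b

# Hypothetical examples: a 4-cycle, and a triangle with a pendant node.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
pendant = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(betweenness(cycle))
print(betweenness(pendant))
```

In the 4-cycle each opposite pair has two shortest paths, so every node receives half credit per ordered pair; in the second graph only node 2 lies between other nodes.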
  • 25. Flow Betweenness
    • Flow betweenness of a node $i$ is defined as the amount of flow through node $i$ when the maximum flow is transmitted from $s$ to $t$, averaged over all $s$ and $t$: $b^i_{mf} \equiv \sum_{s, t \in N,\; i \neq s,\; i \neq t,\; f_{st} > 0} \frac{f_{st}(i)}{f_{st}}$
    • While calculating flow betweenness, vertices A and B will get high scores while vertex C will not
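A rough sketch of flow betweenness follows. Because a maximum flow is generally not unique, the flow through a node has to be pinned down somehow; the sketch assumes one common convention, namely that the flow through i is the maximum s-t flow in the full network minus the maximum s-t flow after removing i. It further assumes an undirected network with unit capacity per link, sums over ordered pairs (s, t) as in the formula, and uses a plain Edmonds-Karp augmenting-path search. None of these choices is confirmed by the slides.

```python
from collections import deque

def max_flow(adj, s, t, removed=None):
    """Edmonds-Karp maximum flow with unit capacity on every undirected link."""
    nodes = [v for v in adj if v != removed]
    cap = {u: {v: 1 for v in adj[u] if v != removed} for u in nodes}
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        v = t
        while parent[v] is not None:  # push one unit along the path found
            u = parent[v]
            cap[u][v] -= 1
            cap[v][u] += 1
            v = u
        flow += 1

def flow_betweenness(adj):
    """Sum over ordered pairs (s, t) of the share of maximum flow that needs node i."""
    result = {}
    for i in adj:
        total = 0.0
        for s in adj:
            for t in adj:
                if len({i, s, t}) < 3:
                    continue
                f_st = max_flow(adj, s, t)
                if f_st > 0:
                    total += (f_st - max_flow(adj, s, t, removed=i)) / f_st
        result[i] = total
    return result

# Hypothetical examples: a path 0 - 1 - 2 and a 4-cycle.
path = {0: [1], 1: [0, 2], 2: [1]}
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(flow_betweenness(path))
print(flow_betweenness(cycle))
```

Unlike shortest-path betweenness, this measure also credits nodes on longer routes: in the 4-cycle every node carries half of the maximum flow for each pair of other nodes.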
  • 26. Case: AERCS - Recommendation of Venues for Young Computer Scientists
    • DBLP (http://www.informatik.uni-…)
      – 788,259 author names
      – 1,226,412 publications
      – 3,490 venues (conferences, workshops, journals)
    • CiteSeerX
      – 7,385,652 publications
      – 22,735,240 citations
      – Over 4 million author names
    • Combination
      – Canopy clustering [McCallum 2000]
      – Result: 864,097 matched pairs
      – On average, venues cite 2,306 times and are cited 2,037 times
    • Pham, Klamma, Jarke: Development of Computer Science Disciplines – A Social Network Analysis Approach, SNAM, 2011
  • 27. Properties of Collaboration and Citation Graphs of Venues
  • 28. User-based CF: Author Clustering
    • Data: DBLP
    • Two test cases, for the years 2005 and 2006
      – Clustering of co-authorship networks
      – Prediction of the venue
    • Clustering algorithm
      – Density-based algorithm [Clauset 2004]
      – Obtained modularity: 0.829 and 0.82
    • Cluster size distribution follows a power law
  • 29. User-based CF: Precision and Recall
    • Precision for 1,000 randomly chosen authors
    • Precision computed at the 11 standard recall levels 0%, 10%, …, 100%
    • Results
      – Clustering performs better
      – The improvement is not significant
      – Better efficiency
    • Further improvement
      – Different networks: citation
      – Overlapping clustering
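Precision at the 11 standard recall levels is usually reported as interpolated precision: at each recall level, the best precision reached at that recall or any higher one. The sketch below assumes this convention, which the slides do not spell out, and a hypothetical ranked recommendation list in which each entry is marked relevant or not (for example, whether the author actually published at the recommended venue).

```python
def eleven_point_precision(ranked_relevance, total_relevant):
    """Interpolated precision at recall levels 0.0, 0.1, ..., 1.0."""
    points = []  # (recall, precision) after each rank
    hits = 0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
        points.append((hits / total_relevant, hits / rank))
    levels = [i / 10 for i in range(11)]
    return [
        max((p for r, p in points if r >= level), default=0.0)
        for level in levels
    ]

# Hypothetical ranking: relevant items at ranks 1 and 3, two relevant items overall.
result = eleven_point_precision([True, False, True, False], total_relevant=2)
print(result)
```

Averaging these 11-point curves over all sampled authors gives one precision-recall curve per recommender.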
  • 30. Item-based CF: Venue Network Creation and Clustering
    • Knowledge network
      – Aggregate bibliographic coupling counts at venue level
      – Undirected graph $G(V, E)$, where $V$: venues, $E$: edges weighted by cosine similarity $C_{i,j} = \frac{B_i \cdot B_j}{|B_i| \times |B_j|} = \frac{\sum_{k=1}^{n} B_{i,k} B_{j,k}}{\sqrt{\sum_{k=1}^{n} B_{i,k}^2} \, \sqrt{\sum_{k=1}^{n} B_{j,k}^2}}$
      – Threshold: $C_{i,j} \ge 0.1$
      – Clustering: density-based algorithm [Newman 2004, Clauset 2004]
      – Network visualization: force-directed paradigm [Fruchterman 1991]
    • Knowledge flow network
      – Aggregate bibliographic coupling counts at venue level
      – Threshold: citation counts >= 50
    • Domains from Microsoft Academic Search
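The edge weighting of the knowledge network can be sketched as follows. The venue names and coupling vectors are made up for illustration; only the cosine formula and the 0.1 threshold come from the slide.

```python
from itertools import combinations
from math import sqrt

def cosine_similarity(b_i, b_j):
    """Cosine of the angle between two coupling-count vectors."""
    dot = sum(x * y for x, y in zip(b_i, b_j))
    norm = sqrt(sum(x * x for x in b_i)) * sqrt(sum(y * y for y in b_j))
    return dot / norm if norm else 0.0

def knowledge_network(coupling, threshold=0.1):
    """Weighted undirected edges between venues with similarity >= threshold."""
    edges = []
    for u, v in combinations(sorted(coupling), 2):
        c = cosine_similarity(coupling[u], coupling[v])
        if c >= threshold:
            edges.append((u, v, c))
    return edges

# Hypothetical bibliographic coupling counts of three venues.
coupling = {
    "venue_a": [3, 0, 4],
    "venue_b": [4, 0, 3],
    "venue_c": [0, 5, 0],
}
edges = knowledge_network(coupling)
print(edges)
```

Only the pair venue_a / venue_b survives the threshold here, because venue_c shares no coupling with either of them.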
  • 31. Knowledge Network: the Visualization
  • 32. Interdisciplinary Venues: Top Betweenness Centrality
  • 33. High Prestige Series: Top PageRank
  • 34. Network Formation
  • 35. Case: TeLLNet - SNA for European Teachers' Lifelong Learning
    • How to manage and handle large-scale data on social networks?
    • How to analyse social network data in order to develop teachers' competence, e.g. to facilitate better project collaboration?
    • How to make the network visualization useful for teachers' lifelong learning?
  • 36. Analysis and Visualization of Lifelong Learner Data
    • Performance data on projects
    • Network structures and patterns
  • 37. Network Formation Strategies
    • Homophily: love of the same [LaMe54, MSK01]
      – similar socioeconomic status
      – thinking in a similar way
    • Contagion
      – being influenced by others
    • How to represent strategies for lifelong learners?
  • 38. Game Theory Basics
    • Every situation as a game [Borel38, NeMo44]
    • A player makes decisions in a game
    • Players choose their best strategies based on payoff functions
    • Payoffs represent the motivations of players
    • A strategy defines a set of moves or actions a player will follow in a given game (mixed strategy, pure strategy)
  • 39. Game Theory
    • A game is a tuple $G = \langle N, (A_i)_{i \in N}, (u_i)_{i \in N} \rangle$, where $N$ is a nonempty, finite set of players
    • Each player $i \in N$ has
      1. a set of actions (strategy space) $A_i$
      2. a payoff function $u_i : A \to \mathbb{R}$
      3. a payoff matrix:

                                 Player B chooses white   Player B chooses black
        Player A chooses white   1,1                      1,0
        Player A chooses black   0,1                      0,0
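To make the payoff matrix concrete, the following sketch enumerates the pure-strategy Nash equilibria of the two-player game on this slide. It assumes that each cell lists player A's payoff first and player B's second; the function and variable names are illustrative.

```python
from itertools import product

actions = ["white", "black"]

# payoffs[(action of A, action of B)] = (payoff of A, payoff of B)
payoffs = {
    ("white", "white"): (1, 1),
    ("white", "black"): (1, 0),
    ("black", "white"): (0, 1),
    ("black", "black"): (0, 0),
}

def pure_nash_equilibria(actions, payoffs):
    """Action pairs from which neither player gains by deviating alone."""
    equilibria = []
    for a, b in product(actions, repeat=2):
        u_a, u_b = payoffs[(a, b)]
        a_ok = all(payoffs[(alt, b)][0] <= u_a for alt in actions)
        b_ok = all(payoffs[(a, alt)][1] <= u_b for alt in actions)
        if a_ok and b_ok:
            equilibria.append((a, b))
    return equilibria

print(pure_nash_equilibria(actions, payoffs))
```

Under that reading of the matrix, choosing white is strictly better for each player whatever the other does, so (white, white) is the only equilibrium.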
  • 40. Network Formation Games
    • Social networks are formed by individual decisions
      – Cost: write an e-mail
      – Utility: cooperate with others
    • Social networks between pupils
      – Cost: make a joke
      – Utility: get appreciation from others
    • Lifelong learner networks
      – Cost: take a learning course
      – Utility: find learners with a similar way of reasoning
  • 41. Network Formation
    • A set of agents $N = \{1, \ldots, n\}$, which are the actors of a network; $i$ and $j$ are typical members of the set
    • A strategy of an agent $i \in N$ is a vector $a_i = (a_{i,1}, \ldots, a_{i,i-1}, a_{i,i+1}, \ldots, a_{i,n})$, where $a_{i,j} \in \{0, 1\}$ for each $j \in N \setminus \{i\}$
    • Actors $i$ and $j$ are connected if $a_{i,j} = 1$
  • 42. Nash Network: Win-Win Situation
    • Every agent changes its strategy until all agents are satisfied with their strategies and would not benefit from changing them (the network is stable): Nash equilibrium
    • A network is a Nash network if each agent is in Nash equilibrium
    • Chosen strategies defeat others for the good of all players [Nash51, FuTi91]
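The stability condition can be illustrated with a brute-force check that no agent gains by switching to any other strategy vector. The payoff function here is purely hypothetical and not the one used in TeLLNet: each agent earns a fixed benefit per neighbor and pays a fixed cost per link it sponsors, and a link is assumed to exist as soon as one side sets its entry to 1.

```python
from itertools import product

def links(profile):
    """Undirected links implied by a strategy profile (assumed one-sided sponsoring)."""
    return {
        frozenset((i, j))
        for i, a_i in profile.items()
        for j, bit in a_i.items()
        if bit
    }

def payoff(i, profile, benefit=1.0, cost=0.5):
    """Hypothetical payoff: benefit per neighbor minus cost per sponsored link."""
    neighbors = sum(1 for link in links(profile) if i in link)
    sponsored = sum(profile[i].values())
    return benefit * neighbors - cost * sponsored

def is_nash_network(profile):
    """True if no agent can raise its payoff by changing only its own strategy."""
    for i, a_i in profile.items():
        others = sorted(a_i)
        current = payoff(i, profile)
        for bits in product((0, 1), repeat=len(others)):
            deviation = dict(profile)
            deviation[i] = dict(zip(others, bits))
            if payoff(i, deviation) > current + 1e-12:
                return False
    return True

# profile[i][j] is the strategy entry a_ij of agent i towards agent j.
stable = {0: {1: 1, 2: 1}, 1: {0: 0, 2: 1}, 2: {0: 0, 1: 0}}
empty = {0: {1: 0, 2: 0}, 1: {0: 0, 2: 0}, 2: {0: 0, 1: 0}}
print(is_nash_network(stable), is_nash_network(empty))
```

Enumerating all strategy vectors is exponential in the number of agents, so this check is only practical for toy networks.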
  • 43. Epistemic Frame for TeLLNet
    • Identity: the way members of a community see themselves in the community
      – institution role, country
    • Skills: tasks that community members perform
      – languages, subjects, and tools from projects
    • Knowledge: the understanding shared by members of a community
      – languages, subjects
    • Values: beliefs of members
      – experiences from projects (partners)
    • Epistemology: warrants that justify members' actions as legitimate
      – quality labels, prizes, European quality labels
  • 44. Multi-Agent Simulation System
    • A multi-agent system is a collection of heterogeneous and diverse intelligent agents that interact with each other and their environment [SiAi08]
      – Recommendations: Yenta [Foner97], looking for users with similar interests based on data from Web media
      – Market-binding mechanisms: looking for the best item (a reward agent, a set of items, and user agents) [WMJe05]
      – Team formation: forming teams for performing a task in a dynamic environment [GaJa05]
  • 45. Multi-Agent Simulation Questions
    • Which kind of behavior can be expected under arbitrarily given parameter combinations and initial conditions?
    • Which kind of behavior will a given target system display in the future?
    • Which state will the target system reach in the future? [Troitzsch2000]
  • 46. Agent-Based Simulation
    • Heterogeneous, autonomous, and pro-active actors, such as human-centered systems
      – Agents are capable of acting without human intervention
      – Agents possess goal-directed behavior
      – Each agent has its own incentives and motives
    • Suited for modeling organizations: most work is based on cooperation and communication [Gazendam, 1993]
  • 47. Inputs for simulation model
    • Agent = Teacher
    • Teacher properties:
      – Languages
      – Subjects
      – Country
      – Institution role
      – Any awards? (European Quality Label or Prize)
    • Project properties:
      – Languages
      – Tools
      – Subjects
      – Number of pupils in a project
      – Age of pupils in a project
      – Any award? (Quality Label)
  • 48. Network Formation Game Simulation
    • Payoff definition: the payoff matrix is calculated dynamically based on the Epistemic Frame vector:
      – teachers' subjects, subjects of projects (experiences)
      – teachers' languages, languages of projects (experiences)
      – tools used in projects (experiences)
      – countries past collaborators are coming from (beliefs)
      – ...
    • Strategy definition: homophily or contagion
    • Looking for a suitable network for a teacher, not for a suitable partner!
  • 49. Nash Equilibrium for Network Formation
    • Finding a Nash Equilibrium (NE) is NP-hard
    • Computer scientists deal with finding appropriate techniques for calculating NE with a lot of agents
    • We are not interested in the best solution but in a better solution
  • 50. Conclusions & Outlook
    • Network Science is an interdisciplinary approach between computer science and other disciplines
    • MediaBase framework based on modeling & reflection support
    • Two case studies
      – Network Flow: analysis and visualization of large digital libraries; identification of basic flow parameters
      – Network Formation: analysis and visualization of large learner networks; performance indicators and visual analytics
    • Application of tools to entrepreneurial problems: Causation and Effectuation (Excellence Project OBIP at RWTH Aachen University)
    • Researching network dynamics by time series analysis and multi-agent simulation