Network Flow and Network Formation: A Social Network Analysis Perspective

Ringvorlesung der Research School Business & Economics (RSBE)
University of Siegen, Germany
June 28, 2011

Ralf Klamma
RWTH Aachen University

Transcript

  • 1. Network Flow and Network Formation: A Social Network Analysis Perspective. Ralf Klamma, RWTH Aachen University, Informatik 5 (DBIS). Ringvorlesung der Research School Business & Economics (RSBE), Siegen, June 28, 2011. Lehrstuhl Informatik 5 (Informationssysteme), Prof. Dr. M. Jarke. (TeLLNet, GALA)
  • 2. Agenda: Network Science; Network Flow; Network Formation; Conclusions and Outlook
  • 3. RWTH Aachen University
      • 260 institutes in 9 faculties; among Europe's leading institutions for science and research
      • Currently around 31,400 students are enrolled in over 100 academic programs
      • Over 5,000 of them are international students from 120 different countries
      • 1,250 spin-off businesses have created around 30,000 jobs in the greater Aachen region over the past 20 years
      • IDEA League
      • Germany's Excellence Initiative: 3 clusters of excellence, a graduate school, and the institutional strategy “RWTH Aachen 2020: Meeting Global Challenges”
  • 4. Community Information Systems Research Group
      • Established at the DBIS chair, RWTH Aachen University
      • 3 postdocs, 7 PhD students, plus paid student workers and thesis workers
  • 5. Network Science
  • 6. Questions within Network Science
      • How well positioned is an agent to receive and disseminate information?
          – experts (centrality measures) [Wasserman & Faust, 1997]
      • Do users communicate only within their groups, or with agents from other groups as well?
          – innovation stars (boundary spanners, brokers, high betweenness centrality) [Burt, 2005]
      • Who and what affects an agent?
          – influence networks [Lewis, 2008]
      • Which groups/communities does an agent belong to?
          – community mining [Clauset et al., 2004]
  • 7. Executive Board Networks: TheyRule.net
      • A prototype as of 2004
      • What is the connection between Motorola and Whirlpool?
      • What does the network of academic institutes and companies look like?
  • 8. Who rules 3M, Motorola, AT&T, Coca-Cola, PepsiCo, and McDonald's?
  • 9. Spread of Contagion (Source: orgnet.com)
  • 10. Network Science Paradigms
      • Merge of analytic and engineering paradigms
      • In an analytic discipline
          – To find laws (computing paradigms)
          – To generate phenomena
          – To explain observed phenomena
      • In an engineering discipline
          – To realize and implement the paradigms of networks
          – To understand the cases when particular technologies should be used
          – To store network data efficiently (Mediabase)
      • Diagram labels: Communication serves a purpose; Scientific disciplines, Commerce, Entertainment, Politics
  • 11. Web Science: The Long Tail & Fragments
      • Diagram labels: IN Continent, Central Core, OUT Continent, Tunnels, Tendrils, Island [Barabasi, 2002]; Long Tail [Anderson, 2006]
      • The Web is a scale-free, fragmented network
          – The power law (Pareto distribution etc.)
          – 95% of users are located in the Long Tail (communities)
          – Trust- and passion-based cooperation
  • 12. Principle Analytic Approach
      • Interdisciplinary multidimensional model of networks
          – Social network analysis (SNA) defines measures for social relations
          – Actor network theory (ANT) connects human and media agents
          – The i* framework defines strategic goals and dependencies
          – The theory of media transcriptions studies cross-media knowledge
      • Diagram (Media Networks):
          – Social software: Wiki, Blog, Podcast, IM, Chat, Email, Newsgroup …
          – Network of artifacts: Microcontent, Blog entry, Message, Burst, Thread, Comment, Conversation, Feedback (Rating)
          – i*-Dependencies (Structural, Cross-media)
          – Network of members: Members (Social Network Analysis: Centrality, Efficiency), Communities of practice
  • 13. MediaBase
      • Collection of Social Software artifacts with parameterized Perl scripts
          – Mailing lists
          – Newsletters
          – Web sites
          – RSS feeds
          – Blogs
      • Database support by IBM DB2, eXist, Oracle, ...
      • Web interface based on Firefox plugin, Plone/Zope, widgets, ...
      • Strategies of visualization
          – Tree maps
          – Cross-media graphs
      • Source: Klamma et al.: Pattern-Based Cross Media Social Network Analysis for Technology Enhanced Learning in Europe, EC-TEL 2006
  • 14. Network Flow
  • 15. Fundamentals: Definitions of Network
      • A network $\Gamma = (N, L)$, where $N = \{1, 2, \dots, n\}$ is a (finite) set of nodes (vertices) and $L \subseteq N \times N$ is a set of links (edges)
      • Assumed:
          – Unweighted
          – No multiple links: at most one link exists between two given nodes; two linked nodes are neighbors (adjacent)
          – Directed or undirected
  • 16. Definitions in a Network
      • Degree of a node: $z_i = |\{ j \in N : ij \in L \}|$, the number of incoming and outgoing links
      • A path is a sequence of nodes $v_0, \dots, v_{n-1}$ with $(v_i, v_{i+1}) \in L$ for $0 \le i < n-1$; a path is a set of connected links
      • Length of a path: number of links on the path
      • A path is a simple path if all vertices on the path are pairwise different
      • A cycle is a path with $v_0 = v_{n-1}$ and length $n \ge 2$
      • A subnetwork of a network $\Gamma = (N, L)$ is a graph $\Gamma' = (N', L')$ with $N' \subseteq N$ and $L' \subseteq L$
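The path definitions on this slide translate almost directly into code. The following is a minimal sketch, assuming a directed network stored as a set of `(i, j)` tuples; the example links and the function names are hypothetical and not taken from the talk.

```python
# Hypothetical directed network, not taken from the slides.
links = {(0, 1), (1, 2), (2, 0), (2, 3)}


def is_path(links, nodes):
    """True if every consecutive pair of nodes is joined by a link."""
    return all((u, v) in links for u, v in zip(nodes, nodes[1:]))


def path_length(nodes):
    """Length of a path = number of links, i.e. one less than the node count."""
    return len(nodes) - 1


def is_simple_path(links, nodes):
    """A simple path visits pairwise different nodes."""
    return is_path(links, nodes) and len(set(nodes)) == len(nodes)


print(is_path(links, [0, 1, 2, 3]))         # True
print(path_length([0, 1, 2, 3]))            # 3
print(is_simple_path(links, [0, 1, 2, 0]))  # False: node 0 is repeated
```

Because the links are directed, `[0, 2]` is not a path here even though `(2, 0)` is a link.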
  • 17. Representation of Networks
      • Adjacency matrix representation: an $n \times n$ matrix $A$, where
        $$a_{ij} = \begin{cases} 1 & \text{if } (i, j) \in L \\ 0 & \text{otherwise} \end{cases}$$
      • Neighborhood: $N^i \equiv \{ j \in N : (i, j) \in L \}$
      • Any network is the collection of neighborhoods: $\Gamma = \{ N^i \}_{i \in N}$
  • 18. Boolean Adjacency Matrix Example
      • For network Γ1 (nodes 0 to 4), the adjacency matrix is as follows: true = 1 if there exists a link between two nodes, false = 0 otherwise
      • Column labels: incoming degree; row labels: outgoing degree

              |  0  1  2  3  4
            --+---------------
            0 |  0  1  0  1  0
            1 |  1  0  0  1  0
            2 |  0  0  1  0  1
            3 |  0  0  0  0  1
            4 |  0  0  1  0  0
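As a rough illustration, the degrees and neighborhoods of the example network can be read off the matrix in a few lines. This sketch assumes that row i, column j holds the link from i to j (rows labelled with outgoing degree, columns with incoming degree); that reading is inferred from the slide labels, and the function names are hypothetical.

```python
# Adjacency matrix of the example network, copied from the slide.
# Assumption: A[i][j] == 1 means there is a link from node i to node j.
A = [
    [0, 1, 0, 1, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 0, 1, 0, 0],
]


def out_degrees(matrix):
    """Row sums: number of outgoing links per node."""
    return [sum(row) for row in matrix]


def in_degrees(matrix):
    """Column sums: number of incoming links per node."""
    return [sum(column) for column in zip(*matrix)]


def neighborhoods(matrix):
    """Neighborhood of i = set of nodes j with a link (i, j)."""
    return {i: {j for j, a in enumerate(row) if a} for i, row in enumerate(matrix)}


print(out_degrees(A))  # [2, 2, 2, 1, 1]
print(in_degrees(A))   # [1, 1, 2, 2, 2]
```

Note that the matrix as transcribed has a 1 on the diagonal for node 2, so that node appears in its own neighborhood.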
  • 19. Important Types of Degree Distribution
      • For any network $\Gamma$, its (kth-order) degree distribution $p(\cdot)$ specifies, for each $k = 0, 1, \dots, n-1$, the fraction of nodes with degree $k$:
        $$p(k) = \frac{1}{n} \, |\{ i \in N : z_i = k \}|$$
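A small sketch of the degree distribution, assuming an undirected network stored as a dictionary that maps each node to its set of neighbors. The star network used as input is a hypothetical example, not one from the talk.

```python
from collections import Counter


def degree_distribution(neighbors):
    """Return [p(0), p(1), ..., p(n-1)], the fraction of nodes with each degree."""
    n = len(neighbors)
    counts = Counter(len(adjacent) for adjacent in neighbors.values())
    return [counts.get(k, 0) / n for k in range(n)]


# Hypothetical star network: hub 0 linked to three leaves.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
print(degree_distribution(star))  # [0.0, 0.75, 0.0, 0.25]
```

Three of the four nodes have degree 1 and one node has degree 3, so p(1) = 0.75 and p(3) = 0.25.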
  • 20. Network Characteristics: Geodesic Distances
      • The geodesic distance $d(i, j)$ is defined as the minimum number of links that connect $i$ and $j$; if no such path exists, $d(i, j) = +\infty$
      • The distribution $\varpi$ specifies the fraction $\varpi(r)$ of node pairs at distance $r$:
        $$\varpi(r) = \frac{|\{ (i, j) \in N \times N : d(i, j) = r \}|}{n(n-1)}, \quad \text{where } \sum_{r > 0} \varpi(r) = 1$$
      • The average network distance: $\bar{d} = \sum_{0 < r < \infty} r \, \varpi(r)$
      • The diameter of the network: $\hat{d} = \max \{ r : \varpi(r) > 0 \}$
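The distance measures on this slide can be sketched with a breadth-first search, which finds geodesic distances in an unweighted network. The code below is an illustration under stated assumptions, not the method used in the talk: the network is a dictionary of neighbor sets, the path network is a hypothetical example, and unreachable pairs are simply left out of the distribution.

```python
from collections import Counter, deque


def geodesic_distances(neighbors, source):
    """Breadth-first search: minimum number of links from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def distance_distribution(neighbors):
    """Fraction of ordered node pairs (i, j), i != j, at each finite distance r."""
    n = len(neighbors)
    counts = Counter()
    for i in neighbors:
        for j, r in geodesic_distances(neighbors, i).items():
            if j != i:
                counts[r] += 1
    return {r: c / (n * (n - 1)) for r, c in counts.items()}


def average_distance(distribution):
    return sum(r * w for r, w in distribution.items())


def diameter(distribution):
    return max(distribution)


# Hypothetical undirected path network 0 - 1 - 2 - 3.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
w = distance_distribution(path)
print(diameter(w))  # 3
```

For this path network half of the ordered pairs are at distance 1, a third at distance 2, and a sixth at distance 3, which gives an average distance of 5/3.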
  • 21. Network Characteristics: Density
  • 22. Network Characteristics: Closeness & Clustering
      • The total distance: $\sum_{j \in N} d(i, j)$
      • The closeness is defined as: $c(i) \equiv \dfrac{1}{\sum_{j \in N} d(i, j)}$
      • For each node $i$ having at least two neighbors, the clustering is
        $$C^i \equiv \frac{|\{ jk \in L : ij \in L \wedge ik \in L \}|}{z_i (z_i - 1) / 2}$$
      • For each node $j$ having fewer than two neighbors, $C^j = 0$
      • Clustering index of the network $\Gamma$: $C = \dfrac{1}{n} \sum_{i=1}^{n} C^i$
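Closeness and clustering can be sketched as follows, assuming an undirected network stored as a dictionary of neighbor sets. The example network (a triangle with one pendant node) is hypothetical, and returning closeness 0 for a node that cannot reach every other node is a convention chosen here to reflect infinite distances.

```python
from collections import deque
from itertools import combinations


def bfs_distances(neighbors, source):
    """Geodesic distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def closeness(neighbors, i):
    """c(i) = 1 / total distance; 0 if some node is unreachable (infinite distance)."""
    dist = bfs_distances(neighbors, i)
    total = sum(dist.values())
    if len(dist) < len(neighbors) or total == 0:
        return 0.0
    return 1 / total


def clustering(neighbors, i):
    """Fraction of pairs of i's neighbors that are themselves linked."""
    z = len(neighbors[i])
    if z < 2:
        return 0.0
    linked = sum(1 for j, k in combinations(neighbors[i], 2) if k in neighbors[j])
    return linked / (z * (z - 1) / 2)


def clustering_index(neighbors):
    """Average of the node clustering values over all nodes."""
    return sum(clustering(neighbors, i) for i in neighbors) / len(neighbors)


# Hypothetical network: triangle 0-1-2 with a pendant node 3 attached to node 2.
g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(closeness(g, 0))   # 0.25
print(clustering(g, 0))  # 1.0
```

Node 2 reaches every other node in one step, so its closeness is 1/3, but only one of the three pairs among its neighbors is linked, so its clustering is 1/3 as well.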
  • 23. Network Characteristics: Cohesiveness & Betweenness
      • Given a network $\Gamma = (N, L)$, let $M \subset N$; for each node $i \in M$, the fraction of its connections that stay within $M$ is
        $$H^i(M) = \frac{|\{ ij \in L : j \in M \}|}{z_i}$$
      • The overall cohesiveness of the set $M$ is defined as $H(M) = \min_{i \in M} H^i(M)$
      • If the network $\Gamma$ is connected, let $v(j, k)$ be the number of shortest paths between $j$ and $k$ (for each $j \neq k$) and $v^i(j, k)$ the number of those paths that pass through $i$; the betweenness of node $i$ is
        $$b^i \equiv \sum_{j \neq k} \frac{v^i(j, k)}{v(j, k)}$$
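The cohesiveness of a node set is short enough to sketch directly. This assumes an undirected network stored as a dictionary of neighbor sets; the example network and the choice to give an isolated node a fraction of 0 are assumptions made for illustration.

```python
def node_cohesiveness(neighbors, i, members):
    """Fraction of node i's links that lead to nodes inside the set."""
    degree = len(neighbors[i])
    if degree == 0:
        return 0.0  # assumption: an isolated node contributes 0
    return len(neighbors[i] & members) / degree


def cohesiveness(neighbors, members):
    """Overall cohesiveness of a set = minimum over its members."""
    members = set(members)
    return min(node_cohesiveness(neighbors, i, members) for i in members)


# Hypothetical network: triangle 0-1-2 with a pendant node 3 attached to node 2.
g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(cohesiveness(g, {0, 1, 2}))  # 0.6666666666666666
```

The triangle is fairly cohesive because only one of node 2's three links leaves it, whereas the set {2, 3} scores 1/3 because most of node 2's links point outside.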
  • 24. Shortest-path Betweenness: Example
      • Shortest-path betweenness: $b^i \equiv \sum_{j \neq k} \dfrac{v^i(j, k)}{v(j, k)}$
      • Nodes A and B will have high (shortest-path) betweenness in this configuration, while node C will not
      • A measure of the extent to which an actor has control over information flowing between others
      • In a network in which flow is entirely or at least mostly along geodesic paths, the betweenness of a node measures how much flow will pass through that particular node
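Shortest-path betweenness can be sketched by counting shortest paths with a breadth-first search from every node. The code below is a simple illustration, not an efficient algorithm and not the implementation used in the talk. Assumptions: the sum runs over ordered pairs (j, k) with i excluded as an endpoint, and the example network (two triangles joined by a bridge between nodes `A` and `B`) is hypothetical and only resembles the idea of the slide's figure.

```python
from collections import deque


def shortest_path_counts(neighbors, source):
    """Return (dist, sigma): geodesic distance and number of shortest paths from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]  # every shortest path to u extends to v
    return dist, sigma


def betweenness(neighbors, i):
    """Sum over ordered pairs (j, k) of the share of shortest j-k paths through i."""
    info = {s: shortest_path_counts(neighbors, s) for s in neighbors}
    dist_i, sigma_i = info[i]
    total = 0.0
    for j in neighbors:
        if j == i:
            continue
        dist_j, sigma_j = info[j]
        if i not in dist_j:
            continue
        for k in neighbors:
            if k in (i, j) or k not in dist_j or k not in dist_i:
                continue
            # i lies on a shortest j-k path exactly when the distances add up.
            if dist_j[i] + dist_i[k] == dist_j[k]:
                total += sigma_j[i] * sigma_i[k] / sigma_j[k]
    return total


# Hypothetical network: two triangles joined by the bridge A - B.
g = {
    "a1": {"a2", "A"}, "a2": {"a1", "A"}, "A": {"a1", "a2", "B"},
    "B": {"A", "b1", "b2"}, "b1": {"B", "b2"}, "b2": {"B", "b1"},
}
print(betweenness(g, "A"))   # 12.0
print(betweenness(g, "a1"))  # 0.0
```

Every shortest path between the two triangles crosses the bridge, so both bridge nodes score high while the other nodes score zero. Halving the result gives the value for unordered pairs in an undirected network.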
  • 25. Flow Betweenness
      • Flow betweenness of a node $i$ is defined as the amount of flow through node $i$ when the maximum flow is transmitted from $s$ to $t$, averaged over all $s$ and $t$:
        $$b^i_{mf} \equiv \sum_{s, t \in N,\; i \neq s,\; i \neq t,\; f_{st} > 0} \frac{f_{st}(i)}{f_{st}}$$
      • When calculating flow betweenness, vertices A and B will get high scores while vertex C will not
  • 26. Case: AERCS - Recommendation of Venues for Young Computer Scientists
      • DBLP (http://www.informatik.uni-trier.de/~ley/db/)
          – 788,259 author names
          – 1,226,412 publications
          – 3,490 venues (conferences, workshops, journals)
      • CiteSeerX (http://citeseerx.ist.psu.edu/)
          – 7,385,652 publications
          – 22,735,240 citations
          – Over 4 million author names
      • Combination
          – Canopy clustering [McCallum 2000]
          – Result: 864,097 matched pairs
          – On average: venues cite 2306 and are cited 2037 times
      • Source: Pham, Klamma, Jarke: Development of Computer Science Disciplines – A Social Network Analysis Approach, SNAM, 2011
  • 27. Properties of Collaboration and Citation Graphs of Venues
  • 28. User-based CF: Author Clustering
      • Data: DBLP
      • Perform 2 test cases for the years 2005 and 2006
          – Clustering of co-authorship networks
          – Prediction of the venue
      • Clustering algorithm
          – Density-based algorithm [Clauset 2004]
          – Obtained modularity: 0.829 and 0.82
      • Cluster size distribution follows a power law
  • 29. User-based CF: Precision and Recall
      • Precision for 1000 randomly chosen authors
      • Precision computed at 11 standard recall levels: 0%, 10%, …, 100%
      • Results
          – Clustering performs better
          – Improvement not significant
          – Better efficiency
      • Further improvement
          – Different networks: citation
          – Overlapping clustering
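Precision at the 11 standard recall levels is usually reported as interpolated precision: at each level, the highest precision reached at that recall or any higher recall. The sketch below follows that common information-retrieval convention; whether the talk's evaluation interpolated in exactly this way is an assumption, and the ranked list and relevant set are hypothetical.

```python
def eleven_point_precision(ranked, relevant):
    """Interpolated precision at recall levels 0.0, 0.1, ..., 1.0."""
    relevant = set(relevant)
    points = []  # (recall, precision) after each relevant item in the ranking
    hits = 0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            points.append((hits / len(relevant), hits / rank))
    levels = [i / 10 for i in range(11)]
    return [
        max((p for r, p in points if r >= level), default=0.0)
        for level in levels
    ]


# Hypothetical ranking of recommended venues; "a" and "b" are the relevant ones.
curve = eleven_point_precision(["a", "x", "b", "y"], {"a", "b"})
print(curve[0], curve[5])  # 1.0 1.0
```

In this example the first relevant venue is found at rank 1 and the second at rank 3, so precision is 1.0 up to 50% recall and 2/3 beyond it.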
  • 30. Item-based CF: Venue Network Creation and Clustering
      • Knowledge network
          – Aggregate bibliographic coupling counts at venue level
          – Undirected graph $G(V, E)$, where $V$: venues, $E$: edges weighted by cosine similarity
            $$C_{i,j} = \frac{B_i \cdot B_j}{\lVert B_i \rVert \, \lVert B_j \rVert} = \frac{\sum_{k=1}^{n} B_{i,k} B_{j,k}}{\sqrt{\sum_{k=1}^{n} B_{i,k}^2} \, \sqrt{\sum_{k=1}^{n} B_{j,k}^2}}$$
          – Threshold: $C_{i,j} \ge 0.1$
          – Clustering: density-based algorithm [Newman 2004, Clauset 2004]
          – Network visualization: force-directed paradigm [Fruchterman 1991]
      • Knowledge flow network
          – Aggregate bibliographic coupling counts at venue level
          – Threshold: citation counts >= 50
      • Domains from Microsoft Academic Search (http://academic.research.microsoft.com/)
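The edge-weighting step of the knowledge network can be sketched as follows. This is an illustration under assumptions rather than the AERCS implementation: each venue is represented by a vector of bibliographic coupling counts, the venue names and counts are made up, and the 0.1 threshold is the one stated on the slide.

```python
import math


def cosine_similarity(b_i, b_j):
    """Cosine of the angle between two coupling-count vectors."""
    dot = sum(x * y for x, y in zip(b_i, b_j))
    norm_i = math.sqrt(sum(x * x for x in b_i))
    norm_j = math.sqrt(sum(y * y for y in b_j))
    if norm_i == 0 or norm_j == 0:
        return 0.0  # a venue with no coupling counts is similar to nothing
    return dot / (norm_i * norm_j)


def knowledge_network_edges(coupling, threshold=0.1):
    """Undirected weighted edges between venues whose similarity reaches the threshold."""
    venues = sorted(coupling)
    edges = {}
    for index, i in enumerate(venues):
        for j in venues[index + 1:]:
            weight = cosine_similarity(coupling[i], coupling[j])
            if weight >= threshold:
                edges[(i, j)] = weight
    return edges


# Hypothetical coupling-count vectors for three venues.
coupling = {"V1": [3, 0, 4], "V2": [6, 0, 8], "V3": [0, 5, 0]}
edges = knowledge_network_edges(coupling)
print(sorted(edges))  # [('V1', 'V2')]
```

V1 and V2 point in the same direction and are linked with weight 1, while V3 shares no coupling with either and stays unconnected.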
  • 31. Knowledge Network: the Visualization
  • 32. Interdisciplinary Venues: Top Betweenness Centrality
  • 33. High Prestige Series: Top PageRank
  • 34. Network Formation
  • 35. Case: TeLLNet - SNA for European Teachers' Lifelong Learning
      • How to manage and handle large-scale data on social networks?
      • How to analyse social network data in order to develop teachers' competence, e.g. to facilitate better project collaboration?
      • How to make the network visualization useful for teachers' lifelong learning?
  • 36. Analysis and Visualization of Lifelong Learner Data
      • Performance data on projects
      • Network structures and patterns
  • 37. Network Formation Strategies
      • Homophily: love of the same [LaMe54, MSK01]
          – similar socio-economic status
          – thinking in a similar way
      • Contagion
          – being influenced by others
      • How to represent strategies for lifelong learners?
  • 38. Game Theory Basics
      • Every situation as a game [Borel38, NeMo44]
      • A player makes decisions in a game
      • Players choose best strategies based on payoff functions
      • Payoffs: motivations of players
      • A strategy defines a set of moves or actions a player will follow in a given game (mixed strategy, pure strategy)
  • 39. Game Theory
      • A game is a tuple $G = \langle N, (A_i)_{i \in N}, (u_i)_{i \in N} \rangle$, where $N$ is a nonempty, finite set of players
      • Each player $i \in N$ has
          1. a set of actions (strategy space) $A_i$
          2. payoff functions $u_i : A \to \mathbb{R}$
          3. a payoff matrix

                                       Player B chooses white   Player B chooses black
            Player A chooses white              1,1                      1,0
            Player A chooses black              0,1                      0,0
  • 40. Network Formation Games
      • Social networks are formed by individual decisions
          – Cost: write an e-mail
          – Utility: cooperate with others
      • Social networks between pupils
          – Cost: make a joke
          – Utility: get appreciation from others
      • Lifelong learner networks
          – Cost: take a learning course
          – Utility: find learners with a similar way of reasoning
  • 41. Network Formation
      • Set of agents $N = \{1, \dots, n\}$, which are the actors of a network; $i$ and $j$ are typical members of the set
      • A strategy of an agent $i \in N$ is a vector $a_i = (a_{i,1}, \dots, a_{i,i-1}, a_{i,i+1}, \dots, a_{i,n})$, where $a_{i,j} \in \{0, 1\}$ for each $j \in N \setminus \{i\}$
      • Actors $i$ and $j$ are connected if $a_{i,j} = 1$
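The step from strategy vectors to a network can be sketched in a few lines. This is a minimal illustration in which each agent's strategy is a dictionary from the other agents to 0 or 1, and a link (i, j) is created whenever agent i sets its entry for j to 1; the three-agent example is hypothetical, and treating the link as directed from i to j is an assumption.

```python
def network_from_strategies(strategies):
    """Build the set of links (i, j) with a_ij == 1 from all agents' strategy vectors."""
    links = set()
    for i, choices in strategies.items():
        for j, a_ij in choices.items():
            if j == i:
                raise ValueError("a strategy has no entry for the agent itself")
            if a_ij not in (0, 1):
                raise ValueError("strategy entries must be 0 or 1")
            if a_ij == 1:
                links.add((i, j))
    return links


# Hypothetical strategies of three agents.
strategies = {
    1: {2: 1, 3: 0},
    2: {1: 0, 3: 1},
    3: {1: 0, 2: 0},
}
print(sorted(network_from_strategies(strategies)))  # [(1, 2), (2, 3)]
```

Agent 3 links to nobody but is still connected to agent 2, because agent 2 chose the link.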
  • 42. Nash Network: Win-Win Situation
      • Every agent changes its strategy until all agents are satisfied with their strategies and will not benefit if they change strategies (the network is stable): Nash equilibrium
      • A network is a Nash network if each agent is in Nash equilibrium
      • Chosen strategies defeat others for the good of all players [Nash51, FuTi91]
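The equilibrium condition (no player benefits from changing strategy alone) can be checked by brute force on a small game. The sketch below uses the two-player white/black payoff matrix from slide 39 rather than a full network formation game, and assumes the first number in each cell is Player A's payoff; the function name is hypothetical.

```python
from itertools import product

# Payoff matrix from slide 39: (Player A's payoff, Player B's payoff).
# Assumption: the first number in each cell belongs to Player A.
payoffs = {
    ("white", "white"): (1, 1),
    ("white", "black"): (1, 0),
    ("black", "white"): (0, 1),
    ("black", "black"): (0, 0),
}
actions = ["white", "black"]


def pure_nash_equilibria(actions_a, actions_b, payoffs):
    """Action pairs from which neither player gains by deviating unilaterally."""
    equilibria = []
    for a, b in product(actions_a, actions_b):
        u_a, u_b = payoffs[(a, b)]
        a_is_best = all(payoffs[(alt, b)][0] <= u_a for alt in actions_a)
        b_is_best = all(payoffs[(a, alt)][1] <= u_b for alt in actions_b)
        if a_is_best and b_is_best:
            equilibria.append((a, b))
    return equilibria


print(pure_nash_equilibria(actions, actions, payoffs))  # [('white', 'white')]
```

Enumerating all action profiles like this only works for very small games, since the number of profiles grows exponentially with the number of agents.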
  • 43. Epistemic Frame for TeLLNet
      • Identity: the way members of a community see themselves in the community → institution role, country
      • Skills: tasks community members perform → languages, subjects, and tools from projects
      • Knowledge: the understanding shared by members of a community → languages, subjects
      • Values: beliefs of members → experiences from projects (partners)
      • Epistemology: warrants that justify members' actions as legitimate → quality labels, prizes, European quality labels
  • 44. Multi-Agent Simulation System
      • A multi-agent system is a collection of heterogeneous and diverse intelligent agents that interact with each other and their environment [SiAi08]
          – Recommendations: Yenta [Foner97], looking for users with similar interests based on data from Web media
          – Market-binding mechanisms: looking for the best item (a reward agent, set of items and user agents) [WMJe05]
          – Team formation: forming teams for performing a task in a dynamic environment [GaJa05]
  • 45. Multi-Agent Simulation Questions
      • Which kind of behavior can be expected under arbitrarily given parameter combinations and initial conditions?
      • Which kind of behavior will a given target system display in the future?
      • Which state will the target system reach in the future? [Troitzsch2000]
      • Figure labels: 2008, 2009, 2010
  • 46. Agent Based Simulation
      • Heterogeneous, autonomous and pro-active actors, such as human-centered systems
          – Agents are capable of acting without human intervention
          – Agents possess goal-directed behavior
          – Each agent has its own incentives and motives
      • Suited for modeling organizations: most work is based on cooperation and communication [Gazendam, 1993]
  • 47. Inputs for simulation model
      • Agent = Teacher
      • Teacher properties:
          – Languages
          – Subjects
          – Country
          – Institution role
          – Any awards? (European Quality Label or Prize)
      • Project properties:
          – Languages
          – Tools
          – Subjects
          – Number of pupils in a project
          – Age of pupils in a project
          – Any award? (Quality Label)
  • 48. Network Formation Game Simulation
      • Payoff definition: the payoff matrix is calculated dynamically based on the Epistemic Frame vector:
          – teachers' subjects, subjects of projects (experiences)
          – teachers' languages, languages of projects (experiences)
          – tools used in projects (experiences)
          – countries past collaborators are coming from (beliefs)
          – ...
      • Strategy definition: homophily or contagion
      • Looking for a suitable network for a teacher and not for a suitable partner!
  • 49. Nash Equilibrium for Network Formation
      • Finding a Nash Equilibrium (NE) is NP-hard
      • Computer scientists deal with finding appropriate techniques for calculating NE with a lot of agents
      • We are not interested in the best solution but in a better solution
  • 50. Conclusions & Outlook
      • Network Science is an interdisciplinary approach between computer science and other disciplines
      • Mediabase framework based on modeling & reflection support
      • Two case studies
          – Network Flow: analysis and visualization of large digital libraries; identification of basic flow parameters
          – Network Formation: analysis and visualization of large learner networks; performance indicators and visual analytics
      • Application of tools on entrepreneurial problems: Causation and Effectuation (Excellence Project OBIP at RWTH Aachen University)
      • Researching network dynamics by time series analysis and multi-agent simulation