Social Network Analysis for
Global Software Engineering
Exploring relationships from a fine-grained level
Sabrina Marczak
P...
N. Novielli, S. Marczak | ICGSE 2013 | Bari, Italy
Opening a .gephi project
.gephi
UCINet
https://sites.google.com/site/ucinetsoftware/home
NetMiner
http://www.netminer.com
Hands-on Exercises
• Time to practice and do it yourself!
Exercise 1
• Let's explore what the visualization of a
social network c...
Exercise 2
• Let's explore a measure [15 min]
• Calculate the centrali...
Exercise 3
• Let's highlight the results [15 min]
• Choose how to repr...
Exercise 4
• Exporting the results [5 min]
• Save the results in PDF f...
 Final Remarks
• What one wants to learn from the social
networks
• Pla...
Recommended reading
• Rob Cross and Andrew
Parker. The Hidden
Power of...
• John Scott. Social Network
Analysis: A Handbook....
• Stanley Wasserman and
Katherine Faust. Social
N...
• Kate Ehrlich and Klarissa Chang. Leveraging Exp...
References
• [Mitchell, 1969] J. Clyde Mitchell. Social Networks in U...
• [Tsai, 2002] Wenpin Tsai. Social Structure of Coopetitio...
• [Cain et al., 1996] Brendan Cain, James Coplien, and Nei...
• [Ehrlich et al., 2008] Kate Ehrlich, Mary Helander, Gius...
Thank you for your interest!
Questions? Comments? Suggestions? 
Sabrina Marczak
PUCRS, Porto Alegre, Brazil
sabrina.marczak...
ICGSE2013 Social Network Analysis for Global Software Engineering: Exploring relationships from a fine-grained view

This tutorial given at ICGSE '13, Bari, Italy, presents the basic concepts of social network analysis and discusses examples from global software engineering literature. It also includes a sample of how to do social network analysis in practice.

Transcript of "ICGSE2013 Social Network Analysis for Global Software Engineering: Exploring relationships from a fine-grained view"

  1. Social Network Analysis for Global Software Engineering: Exploring relationships from a fine-grained level. Sabrina Marczak (PUCRS, Porto Alegre, Brazil, sabrina.marczak@pucrs.br) and Nicole Novielli (Uniba, Bari, Italy, nicole.novielli@uniba.it). ICGSE 2013, 8th IEEE International Conference on Global Software Engineering, Bari, Italy, August 26-29, 2013. www.icgse.org
  2. Software Development
  3. Software Development
  4. Software Development • Collaboration • Coordination • Communication; Goals, Tasks, Dependencies, Deadlines
  5. Software Development • Who talks with whom? • Who receives help from whom? • Who is aware of whom? • Who are the experts? • Who are the most active contributors?
  6. Software Development • Are the team members following the organizational structure? • Are the team members coordinating with those their work depends on? • Are the next builds going to fail?
  7. Software Development • How can we answer these questions? Social Network Analysis
  8. Social Network Analysis • It provides techniques to examine the structure of social relationships in a group to uncover patterns of behavior and interaction among people [Mitchell, 1969]
  9. Learning Goals • When we complete this tutorial, you will be able to: • Understand the main concepts related to SNA • Better understand the SNA research literature • Identify which SNA data you need for your own research • Run basic SNA measures and interpret the results
  10. Our Agenda • [9:00-10:30] Introduction to Social Network Analysis: Theory and Application from Research • [10:30-11:00] Coffee Break • [11:00-12:30] SNA in Practice: Tools and Hands-on Exercises
  11. Introduction to SNA • Terminology • Representation • Measures • Data collection
  12. Terminology: Actor. Actor = Node = Vertex
  13. Terminology: Tie. Tie = Link = Edge
  14. Terminology: Directional tie
  15. Terminology: Dyad
  16. Terminology: Triad
  17. Representation • Sociogram
  18. Representation • Matrix representation of network data (cell labels: Absent, Present)
  19. Representation • Actors' attributes
      Actor    Role  Country  Work exp.
      Andrew   1     1        3
      Bob      1     2        3
      Charles  1     1        1
      David    1     2        2
      Emma     1     1        1
      Fynn     2     1        3
      Greg     2     1        1
      Hannah   2     1        1
      Iris     2     1        2
      John     2     2        3
      Kevin    2     2        2
      Lucas    2     1        2
      Role: 1. Tester, 2. Developer. Country: 1. Canada, 2. Ireland. Work experience: 1. 1-6 months, 2. 6-12 months, 3. 18+ months
  20. Representation • Sociogram with actors' attributes
  21. Representation • Tie weight • Strength • Frequency • Etc.
  22. Measures • Overall network characterization • Network size • Network density • Ties statistics
  23. Measures • Network size: the number of actors in the social network. In the example: size = 12 actors. Size can be larger or smaller than the team size. Herbsleb and Mockus (2003) found that distributed communication networks are significantly smaller than same-site networks.
  24. Measures • Network density: the proportion of ties that exist in the network out of the total possible ties. It can vary from 0 to 1. In the example: possible ties = 12 × (12 - 1) / 2 = 66; density = 20 / 66 = 0.30. Hinds and McGrath (2006) found that geographic distribution is associated with less dense work ties and less dense information sharing, suggesting that social ties are not particularly important in distributed as compared with collocated teams as a means of coordinating work and improving performance.
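The density arithmetic on this slide can be checked with a few lines of Python. This is a minimal sketch for an undirected network; the function name `density` is an illustrative choice, and the inputs (12 actors, 20 ties) are the figures quoted on the slide.

```python
def density(n_actors: int, n_ties: int) -> float:
    """Density of an undirected network: existing ties / possible ties."""
    if n_actors < 2:
        raise ValueError("density needs at least two actors")
    possible = n_actors * (n_actors - 1) // 2
    return n_ties / possible


# Figures from the slide: 12 actors, 20 observed ties.
possible_ties = 12 * (12 - 1) // 2
example_density = density(12, 20)
print(possible_ties)              # 66
print(round(example_density, 2))  # 0.3
```

For a directed network the number of possible ties would be n × (n - 1) instead, since each pair can be tied in both directions.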
  25. Measures • Ties statistics: uses the actors' attributes to reveal overall network characteristics, e.g., 5 testers and 7 developers. By counting the number of ties within and across sites, Herbsleb and Mockus (2003) found that there is much more frequent communication with local colleagues in a distributed project than with remote ones. Damian et al. (2007) found that notification of changes is the main reason for communication in requirements-centric social networks.
  26. Measures • Information exchange • Reachability • Component • Centrality • Brokerage • Cutpoint
  27. Measures • Reachability: one actor is reachable by another actor if there exists any set of ties that connects both actors, regardless of how many others fall in between them [Wasserman and Faust, 1994]. In the example: all actors are reachable. If some actors cannot reach others, there is a potential division in the network and thus information cannot reach everyone.
  28. Measures • Component: indicates whether a social network is connected. A network is connected if there is a path between every pair of actors; otherwise it is disconnected. The actors in a disconnected network may be partitioned into subsets called components [Wasserman and Faust, 1994]. In the example: one component. The component test indicates whether there is a group of people connected to each other and disconnected from the rest, while the clique test indicates whether a subset of actors is completely connected.
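Reachability and components can both be computed with a breadth-first search. The sketch below assumes an undirected network; the five actors and three ties are hypothetical and are not the tutorial's example network.

```python
from collections import deque


def components(actors, ties):
    """Return the connected components of an undirected network as a list of sets."""
    neighbours = {actor: set() for actor in actors}
    for a, b in ties:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, result = set(), []
    for start in actors:
        if start in seen:
            continue
        # Breadth-first search collects everyone reachable from `start`.
        component, queue = {start}, deque([start])
        while queue:
            current = queue.popleft()
            for nxt in neighbours[current]:
                if nxt not in component:
                    component.add(nxt)
                    queue.append(nxt)
        seen |= component
        result.append(component)
    return result


def reachable(a, b, actors, ties):
    """True if some chain of ties connects a and b."""
    return any(a in part and b in part for part in components(actors, ties))


# Hypothetical data: two groups with no tie between them.
actors = ["Andrew", "Bob", "Charles", "David", "Emma"]
ties = [("Andrew", "Bob"), ("Bob", "Charles"), ("David", "Emma")]
parts = components(actors, ties)
print(len(parts))                                     # 2
print(reachable("Andrew", "Charles", actors, ties))   # True
print(reachable("Andrew", "Emma", actors, ties))      # False
```

A network is connected exactly when this function returns a single component.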
  29. Measures • Centrality: it serves the purpose of indicating power and influence in a network. Who has more power and is, as a consequence, more influential or more important in our example? Hard to say with the naked eye!
  30. Measures • And now: which actor has more power in each network? (a) Star, (b) Circle, (c) Line [figure: three networks, each drawn with the actors Andrew, Bob, Charles, David, Emma, Fynn, and Greg]
  31. Measures • Centrality can be measured in different ways, each with different implications • Degree • Closeness • Betweenness
  32. Measures • Degree centrality: indicates the number of ties of a certain actor. When the ties are directional, we have out-degree, which counts the ties from a certain actor to others, and in-degree, which counts the ties from others to a certain actor [Freeman and colleagues, 1979]. In the example: Fynn is the member with the highest out- and in-degree. Hossain et al. (2006) found that highly centralized members coordinate better than others. Bird et al. (2006) found that degree centrality indicated that developers who actually committed changes played much more significant roles in the email community than non-developers.
  33. Measures • Degree centrality: if we go back to the star network (a), one can see that all lines of communication lead to Andrew at the center of the network. What can we conclude from this example? Is Andrew the most important member of this network? Can we conclude this from his position in the network? What if Andrew is a janitor who has the keys to every office and no power whatsoever? And what if the CEO does not need a key to the office because others open the door for him?
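A short Python sketch of degree counting on the star network from the slides, with Andrew at the center as stated above. The directed ties at the end are hypothetical and only illustrate how in-degree and out-degree differ.

```python
from collections import Counter


def degree(ties):
    """Number of ties per actor in an undirected network."""
    counts = Counter()
    for a, b in ties:
        counts[a] += 1
        counts[b] += 1
    return counts


def in_out_degree(directed_ties):
    """In-degree and out-degree per actor for (source, target) ties."""
    in_deg, out_deg = Counter(), Counter()
    for source, target in directed_ties:
        out_deg[source] += 1
        in_deg[target] += 1
    return in_deg, out_deg


# Star network (a): every tie leads to Andrew.
others = ["Bob", "Charles", "David", "Emma", "Fynn", "Greg"]
star = [("Andrew", other) for other in others]
deg = degree(star)
print(deg["Andrew"], deg["Bob"])  # 6 1

# Hypothetical directed ties: who asks whom for help.
asks = [("Bob", "Andrew"), ("Charles", "Andrew"), ("Andrew", "Bob")]
in_deg, out_deg = in_out_degree(asks)
print(in_deg["Andrew"], out_deg["Andrew"])  # 2 1
```

As the janitor example on the slide warns, a high count only says that many ties touch the actor; it says nothing about what those ties mean.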
  34. Measures • Closeness centrality: an actor is considered important if he is relatively close to all other actors. Closeness is based on the inverse of the distance of each actor to every other actor in the network. It measures how fast information propagates from a node to the others. In the example: Fynn is the member with the highest closeness centrality, followed by Andrew. Hansen (2002) found that projects whose members have a higher degree of closeness centrality (i.e., short path lengths) in their knowledge network were more likely to be completed more quickly than those whose members have a lower degree of closeness centrality.
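One common way to turn "inverse of the distance" into a number is (n - 1) divided by the sum of shortest-path distances to all other actors. The sketch below uses that normalization, which is an assumption since the slide does not fix a formula, and it assumes a connected undirected network. The order of the actors along the line network is also assumed.

```python
from collections import deque


def bfs_distances(source, neighbours):
    """Shortest-path distance (in ties) from source to every reachable actor."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbours[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def closeness(ties):
    """Closeness per actor: (n - 1) / sum of distances. Assumes a connected network."""
    neighbours = {}
    for a, b in ties:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    n = len(neighbours)
    return {
        actor: (n - 1) / sum(bfs_distances(actor, neighbours).values())
        for actor in neighbours
    }


# Line network (c); the left-to-right order of the actors is an assumption.
names = ["Andrew", "Bob", "Charles", "David", "Emma", "Fynn", "Greg"]
line = list(zip(names, names[1:]))
line_scores = closeness(line)
print(line_scores["David"])                   # 0.5
print(max(line_scores, key=line_scores.get))  # David
```

David sits in the middle of the line, so his distances (3, 2, 1, 1, 2, 3) sum to 12 and his score is 6 / 12; the actors at the ends score 6 / 21.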
  35. Measures • Betweenness centrality: it measures how central a person is in a network by examining the fraction of shortest paths between individual pairs of team members that pass through that person. In the example: Andrew and Fynn have the highest betweenness centrality. A person who lies on the path of others can control the communication flow, and thus becomes an important and influential member of the network. It points out those who act as communication bottlenecks.
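Betweenness can be computed with Brandes' algorithm, which is swapped in here as a standard technique and is not named in the tutorial. The sketch below handles unweighted, undirected networks and returns unnormalized scores (the number of actor pairs whose shortest paths pass through the actor, with fractional credit when several shortest paths exist).

```python
from collections import deque


def betweenness(ties):
    """Unnormalized betweenness for an unweighted, undirected network (Brandes)."""
    neighbours = {}
    for a, b in ties:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    score = {v: 0.0 for v in neighbours}
    for s in neighbours:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        sigma = {v: 0 for v in neighbours}
        dist = {v: -1 for v in neighbours}
        preds = {v: [] for v in neighbours}
        sigma[s], dist[s] = 1, 0
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in neighbours[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest actors, accumulating dependencies.
        delta = {v: 0.0 for v in neighbours}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: value / 2 for v, value in score.items()}


names = ["Andrew", "Bob", "Charles", "David", "Emma", "Fynn", "Greg"]
star = [("Andrew", other) for other in names[1:]]
line = list(zip(names, names[1:]))  # order along the line is an assumption
star_scores = betweenness(star)
line_scores = betweenness(line)
print(star_scores["Andrew"], star_scores["Bob"])  # 15.0 0.0
print(line_scores["David"])                       # 9.0
```

In the star, all 15 pairs of peripheral actors must pass through Andrew, which is what makes him a potential bottleneck.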
  36. Combined interpretation
      High Degree, Low Closeness: embedded in a portion of the network that is far from the rest of the network
      High Degree, Low Betweenness: the node's connections are redundant and communication bypasses the node itself
      High Closeness, Low Degree: key player tied to important or active nodes
      High Closeness, Low Betweenness: probably multiple paths in the network; the node is near many people, but so are many others
      High Betweenness, Low Degree: the node's few ties are crucial for network flow (of information, exchanges, collaborations, etc.)
      High Betweenness, Low Closeness: very rare cell; the node monopolizes the ties from a small number of people to many others
  37. Measures • Brokerage: indicates when an actor, named broker, connects two otherwise unconnected actors or subgroups. Brokerage occurs when, in a triad of actors A, B, and C, A has a tie to B, B has a tie to C, but A has no tie to C. A needs B to reach C, therefore B is a broker. The actors need to be partitioned into subgroups per attribute [Gould and Fernandez, 1989]. In the example: Fynn brokers information among his developer colleagues. Hinds and McGrath (2006) found that brokers effectively disseminate information between distributed sites when maintaining direct relationships is not practical. Ehrlich et al. (2008) found that brokers are usually the most knowledgeable members of a team regardless of geographical location. Marczak et al. (2008) confirmed the findings of Ehrlich et al. (2008) in a study of multiple distributed teams of a large IT multinational.
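The triad condition above (A tied to B, B tied to C, no tie between A and C) can be searched for directly. This is a simplified sketch for undirected ties that ignores the subgroup partition used by Gould and Fernandez; the ties are hypothetical.

```python
from itertools import combinations


def brokered_pairs(ties):
    """Map each broker B to the pairs (A, C) that are tied to B but not to each other."""
    neighbours = {}
    for a, b in ties:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    result = {}
    for broker, contacts in neighbours.items():
        pairs = [
            (a, c)
            for a, c in combinations(sorted(contacts), 2)
            if c not in neighbours[a]
        ]
        if pairs:
            result[broker] = pairs
    return result


# Hypothetical ties: Andrew only reaches Iris and John through Fynn.
ties = [("Andrew", "Fynn"), ("Fynn", "Iris"), ("Fynn", "John"), ("Iris", "John")]
brokers = brokered_pairs(ties)
print(brokers)  # {'Fynn': [('Andrew', 'Iris'), ('Andrew', 'John')]}
```

Iris and John are tied to each other directly, so Fynn does not broker between them.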
  38. Measures • Cutpoint: indicates a weak point in the network. If this actor were removed along with his connections, the network would become divided into unconnected parts. A set of cutpoints is called a cutset. In the example: Andrew and Fynn are the cutset. In communication networks a cutpoint indicates disruption of information flow.
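A direct way to find cutpoints is to remove each actor in turn and check whether the number of components grows. The sketch below does exactly that; it is a brute-force illustration, not an efficient algorithm, and the two-triangle network is hypothetical.

```python
def count_components(neighbours, removed=None):
    """Number of connected components, optionally ignoring one removed actor."""
    seen, count = set(), 0
    for start in neighbours:
        if start == removed or start in seen:
            continue
        count += 1
        seen.add(start)
        stack = [start]
        while stack:
            u = stack.pop()
            for v in neighbours[u]:
                if v != removed and v not in seen:
                    seen.add(v)
                    stack.append(v)
    return count


def cutpoints(ties):
    """Actors whose removal splits the network into more components."""
    neighbours = {}
    for a, b in ties:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    baseline = count_components(neighbours)
    return {
        actor
        for actor in neighbours
        if count_components(neighbours, removed=actor) > baseline
    }


# Hypothetical network: two triangles joined only by the Andrew-Fynn tie.
ties = [
    ("Andrew", "Bob"), ("Bob", "Charles"), ("Charles", "Andrew"),
    ("Andrew", "Fynn"),
    ("Fynn", "Iris"), ("Iris", "John"), ("John", "Fynn"),
]
cutset = cutpoints(ties)
print(sorted(cutset))  # ['Andrew', 'Fynn']
```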
  39. Measures • Network structure • Network centralization • Core-periphery • Ties reciprocity • Clique
  40. Measures • Network centralization: quantifies the sum of the differences between the number of ties of the most central node and that of every other node, divided by the maximum possible sum of differences. A centralized network structure (index = 1) will have many of its ties concentrated around one or a few actors, while a decentralized network structure (index = 0) is one in which there is little variation between the number of ties each actor possesses [Freeman, 1978]. In the example: centralization index = 0.39. Tsai (2002) found that a formal hierarchical structure in the form of centralization has a significant negative effect on knowledge sharing among organizational units.
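For degree, the index described above can be sketched as the sum of (highest degree - each actor's degree) divided by (n - 1)(n - 2), the value that sum reaches in a star. The sketch assumes an undirected network without isolated actors; the star and circle reuse the actor names from the earlier slide, with Andrew at the center of the star.

```python
from collections import Counter


def degree_centralization(ties):
    """Freeman-style degree centralization for an undirected network (0 to 1)."""
    degree = Counter()
    for a, b in ties:
        degree[a] += 1
        degree[b] += 1
    n = len(degree)
    if n < 3:
        raise ValueError("centralization needs at least three actors")
    top = max(degree.values())
    # (n - 1) * (n - 2) is the sum of differences obtained in a star network.
    return sum(top - d for d in degree.values()) / ((n - 1) * (n - 2))


names = ["Andrew", "Bob", "Charles", "David", "Emma", "Fynn", "Greg"]
star = [("Andrew", other) for other in names[1:]]
circle = [(names[i], names[(i + 1) % len(names)]) for i in range(len(names))]
star_index = degree_centralization(star)
circle_index = degree_centralization(circle)
print(star_index)    # 1.0
print(circle_index)  # 0.0
```

The star and the circle mark the two extremes; the example network's 0.39 lies between them.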
  41. Measures • Core-periphery: indicates the extent to which the structure of a network consists of two classes of actors: a cohesive subnetwork, the core, in which the actors are connected to each other in some maximal sense; and a class of actors that are more loosely connected to the cohesive subnetwork but lack any maximal cohesion with the core, the peripheral actors. A high core value (close to 1) indicates a strong core-periphery structure [Borgatti and Everett, 1999]. In the example: core-periphery index = 0.47. Hinds and McGrath (2006) found that communication networks with a strong core-periphery structure lead to fewer coordination problems than loosely connected networks.
  42. Measures • Ties reciprocity: when the relationship is considered directional (e.g., friendship, trust), the reciprocity index can be calculated using either the dyad method, the ratio of the number of pairs of actors with reciprocated ties to the number of pairs with any tie between the actors; or the arc method, the ratio of the number of ties that are involved in reciprocal relationships to the total number of actual ties [Hanneman and Riddle, 2005]. In the example: dyad method index = 0.85. The higher the index of reciprocal ties, the more stable or equal the network structure is [Rao and Bandyopadhyay, 1987]. A higher reciprocity index suggests a more horizontal structure, while the opposite suggests a more hierarchical network [Hanneman and Riddle, 2005].
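The two reciprocity methods differ only in what they count, which a small sketch makes concrete. The directed ties below are hypothetical (single-letter actors, "who trusts whom") and are not the tutorial's example network.

```python
def reciprocity(directed_ties):
    """Return (dyad method index, arc method index) for (source, target) ties."""
    ties = {(s, t) for s, t in directed_ties if s != t}
    # Dyad method: pairs with ties in both directions / pairs with any tie.
    dyads = {frozenset(tie) for tie in ties}
    mutual_dyads = {frozenset((s, t)) for s, t in ties if (t, s) in ties}
    # Arc method: ties that are part of a mutual pair / all ties.
    mutual_arcs = [(s, t) for s, t in ties if (t, s) in ties]
    return len(mutual_dyads) / len(dyads), len(mutual_arcs) / len(ties)


# Hypothetical trust ties: A and B trust each other, C and D trust each other,
# and A trusts C without being trusted back.
trust = [("A", "B"), ("B", "A"), ("A", "C"), ("C", "D"), ("D", "C")]
dyad_index, arc_index = reciprocity(trust)
print(round(dyad_index, 2), round(arc_index, 2))  # 0.67 0.8
```

Here 2 of the 3 tied pairs are mutual (dyad method), while 4 of the 5 individual ties belong to a mutual pair (arc method), so the arc index is never lower than the dyad index.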
  43. Measures • Clique: consists of a subset of at least 3 actors in which every possible pair of actors is directly connected by a tie and there are no other actors that are also directly connected to all members of the clique [Wasserman and Faust, 1994]. In the example: (Andrew, Bob, Charles, and David); (Andrew, David, and Emma); (Fynn, Iris, John, and Kevin). Cain et al. (1996) found 3 large cliques consisting of team members developing 3 major activities (architecture design, code development, and code review) in the communication networks of development teams.
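Maximal cliques can be enumerated with the Bron-Kerbosch algorithm, used here as a standard technique that the tutorial itself does not name. The ties below are reconstructed only from the three cliques listed on the slide plus one assumed Andrew-Fynn tie, so they are a hypothetical subset of the real example network.

```python
from itertools import combinations


def maximal_cliques(ties, min_size=3):
    """Maximal cliques with at least min_size actors (basic Bron-Kerbosch)."""
    neighbours = {}
    for a, b in ties:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    found = []

    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            # Nobody else is tied to every member, so the clique is maximal.
            if len(clique) >= min_size:
                found.append(sorted(clique))
            return
        for v in list(candidates):
            expand(clique | {v}, candidates & neighbours[v], excluded & neighbours[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(neighbours), set())
    return sorted(found)


# Ties implied by the cliques on the slide, plus one assumed bridging tie.
groups = [
    ["Andrew", "Bob", "Charles", "David"],
    ["Andrew", "David", "Emma"],
    ["Fynn", "Iris", "John", "Kevin"],
]
ties = {tuple(sorted(pair)) for group in groups for pair in combinations(group, 2)}
ties.add(("Andrew", "Fynn"))  # assumption: not taken from the slide
cliques = maximal_cliques(ties)
for clique in cliques:
    print(clique)
```

The Andrew-Fynn pair is itself fully connected but is filtered out because the definition above requires at least 3 actors.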
  44. Measures • Network structure and evolution • Triadic closure • Clustering coefficient • Structural holes
  45. Measures • Network structure and evolution: what are the mechanisms by which nodes arrive and depart, and by which ties form and vanish?
  46. Measures • Triadic closure: if two people in a social network have a friend in common, then there is an increased likelihood that they will become friends themselves at some point in the future [Skyrms, 2003]. This pattern can be identified when one observes the network behavior over a long time window. Two developers who do not know each other but who seek information from the same third developer are likely to quickly help each other when put to work together, provided the third developer is also allocated to the project [Easley and Kleinberg, 2010].
  47. Measures • Reasons for having triadic closure: • Opportunity: one reason why B and C are more likely to become friends, when they have a common friend A, is simply based on the opportunity for B and C to meet • Trust: the fact that each of B and C is friends with A (provided they are mutually aware of this) gives them a basis for trusting each other • Incentive: if A is friends with B and C, then it becomes a source of latent stress in these relationships if B and C are not friends with each other [Easley and Kleinberg, 2010]
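The clustering coefficient listed among these measures gives a number to triadic closure: for an actor A, it is the fraction of pairs of A's friends who are themselves friends. The sketch below uses the A, B, C naming from the slide; the before and after ties are hypothetical.

```python
from itertools import combinations


def build_neighbours(ties):
    """Adjacency sets for an undirected network."""
    neighbours = {}
    for a, b in ties:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    return neighbours


def clustering_coefficient(actor, neighbours):
    """Fraction of pairs of the actor's friends that are tied to each other."""
    friends = neighbours[actor]
    if len(friends) < 2:
        return 0.0
    pairs = list(combinations(sorted(friends), 2))
    closed = sum(1 for b, c in pairs if c in neighbours[b])
    return closed / len(pairs)


# Before closure: A is friends with B and C, who do not know each other.
before = build_neighbours([("A", "B"), ("A", "C")])
# After closure: B and C have become friends too.
after = build_neighbours([("A", "B"), ("A", "C"), ("B", "C")])
coefficient_before = clustering_coefficient("A", before)
coefficient_after = clustering_coefficient("A", after)
print(coefficient_before, coefficient_after)  # 0.0 1.0
```

Comparing the coefficient across snapshots of the same network is one way to observe closure over a time window, as the slide suggests.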
  48. Measures • Structural holes: • Span asymmetric information: love triangle, also known as 'forbidden triad'; brokerage activity of entrepreneurs, bankers, brokers, or real-estate agents • Bridge entire communities: we will see the local bridges and their role in connecting graph components and spreading novelty. [figure: structural hole] B holds an advantageous position, based on his position in the network. People like B are network bridges. Pawlowski and Robey (2004) examine knowledge brokering as an aspect of the work of information technology professionals. [Tsvetovat and Kouznetsov, 2011]
  49. Data collection • Manual • Survey • Work diary • Observation
  50. Data collection
  51. Data collection • Automatic • Mining software repositories • E.g.: source code, bug trackers
  52. Coffee Break: we are back in 30 minutes
  53. Tools • Gephi • UCINet • NetMiner
  54. Gephi (https://gephi.org/) • It is a tool for the interactive visualization and exploration of networks and graphs
  55. Gephi • Graph files: defining the input file (.gexf) that describes the network: definition of type and idtype; number of nodes and their IDs; definition of ties (weight is optional)
      <gexf …>
        <graph defaultedgetype="undirected" idtype="string">
          <nodes count="77">
            <node id="0.0" label="Myriel"/>
            <node id="1.0" label="Napoleon"/>
            …
          </nodes>
          <edges …>
            <edge id="235" source="72.0" target="27.0"/>
            <edge id="237" source="73.0" target="48.0" weight="2.0"/>
            …
          </edges>
        </graph>
      </gexf>
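A file with this structure can also be generated from a script, for example when the ties come from a mined repository. The sketch below uses Python's standard `xml.etree.ElementTree`; the namespace URI and `version` attribute are assumptions based on the GEXF 1.2 draft format, the two node labels come from the slide, and the single edge between them is hypothetical. Whether a given Gephi release accepts the output should be checked against the Gephi documentation.

```python
import xml.etree.ElementTree as ET

# Assumed namespace and version of the GEXF 1.2 draft format.
GEXF_NS = "http://www.gexf.net/1.2draft"


def build_gexf(nodes, edges):
    """Serialize (id, label) nodes and (source, target, weight) edges as GEXF text."""
    root = ET.Element("gexf", {"xmlns": GEXF_NS, "version": "1.2"})
    graph = ET.SubElement(
        root, "graph", {"defaultedgetype": "undirected", "idtype": "string"}
    )
    nodes_el = ET.SubElement(graph, "nodes", {"count": str(len(nodes))})
    for node_id, label in nodes:
        ET.SubElement(nodes_el, "node", {"id": node_id, "label": label})
    edges_el = ET.SubElement(graph, "edges")
    for edge_id, (source, target, weight) in enumerate(edges):
        attrs = {"id": str(edge_id), "source": source, "target": target}
        if weight is not None:  # weight is optional, as on the slide
            attrs["weight"] = str(weight)
        ET.SubElement(edges_el, "edge", attrs)
    return ET.tostring(root, encoding="unicode")


nodes = [("0.0", "Myriel"), ("1.0", "Napoleon")]
edges = [("0.0", "1.0", 2.0)]  # hypothetical tie, not taken from the slide
xml_text = build_gexf(nodes, edges)
print(xml_text)
```

Writing `xml_text` to a file with a `.gexf` extension gives an input file of the kind described above.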
  56. Opening a .gexf graph file • When the file is opened, the report summarizes the data found and any issues: • Number of nodes • Number of edges • Type of graph
  57. Opening a .gexf graph file • Preliminary overview of the network
  58. Lay out the graph • Force-based algorithm: linked nodes attract each other and non-linked nodes repel each other • Select 'Force Atlas'
  59. The goal • To obtain a meaningful representation of the network (i.e., with respect to context, available information about nodes, the 'meaning' and strength of connections, the goals of our study, etc.) [figure: labeled network graph] Visual representation of node degree, cliques, and edge weight as a preliminary step to a deeper analysis using SNA metrics
  60. 60. N. Novielli, S. Marczak | ICGSE 2013 | Bari, Italy Layout the graph •  Set the Repulsion strength (eg. to 10000) and Run
  61. 61. N. Novielli, S. Marczak | ICGSE 2013 | Bari, Italy Node ranking (colors) •  Choose a rank parameter (eg. Node degree) and set the colors •  Nodes will be colored according to the color range between yellow (lowest degree =1) and dark orange (higher degree)
  62. 62. N. Novielli, S. Marczak | ICGSE 2013 | Bari, Italy Edge ranking (colors) You can do the same with edge weight
  63. 63. N. Novielli, S. Marczak | ICGSE 2013 | Bari, Italy Other options •  You can control the thickness of edges, the visbility and size of node labels •  And manage different dragging modes
  64. 64. N. Novielli, S. Marczak | ICGSE 2013 | Bari, Italy Metrics •  Metrics are available in the right section of the Gephi interface •  Eg. Click on Run here, to calculate the average path length of the network
  65. 65. N. Novielli, S. Marczak | ICGSE 2013 | Bari, Italy New node values •  Metrics generates reports but also new information available for each node. •  Thus, by launching the Average path length algorithm, we now have Betweeness Centrality, Closeness Centrality and Eccentricity for each node •  Let s try to rank again nodes according to Betweeness Centrality
  66. Node size • Let's express each node's Betweenness Centrality using node size • Colors remain the indicator of the node's Degree Centrality
  67. You should now see a colored and sized graph • Color expresses Degree • Size expresses Betweenness
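Mapping a metric to node size amounts to a simple rescaling. The sketch below linearly maps betweenness values onto a size range; the function name scale_to_sizes, the size bounds of 10 and 50, and the betweenness values are assumptions made for illustration, not Gephi defaults.

```python
def scale_to_sizes(scores, min_size=10.0, max_size=50.0):
    """Map each score linearly onto the interval [min_size, max_size]."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0  # if all scores are equal, every node gets min_size
    return {node: min_size + (value - lo) / span * (max_size - min_size)
            for node, value in scores.items()}

# Hypothetical betweenness values, e.g. copied from a metrics run.
betweenness = {"A": 0.0, "B": 4.0, "C": 1.0, "D": 0.0}
sizes = scale_to_sizes(betweenness)
```

A linear scale keeps the drawing easy to read when values are spread evenly; when a few brokers dominate, a square-root or logarithmic scale may be a better choice.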
  68. Node labels
  69. Save and export [Figure: sociogram of the example network] • Input: .gexf (text), .gephi (project) • Output: .gephi (project), .pdf (export)
  70. [Figure: sociogram of the example network] • Opening a .gexf file • Creating a .gephi project • Exporting to .pdf
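A .gexf file is plain XML, so a small network can be prepared for opening in Gephi with a few lines of Python. The sketch below builds a minimal GEXF 1.2 document with the standard library. The node and edge data and the function name build_gexf are hypothetical, and only the basic elements (nodes, edges, weights) are covered.

```python
import xml.etree.ElementTree as ET

GEXF_NS = "http://www.gexf.net/1.2draft"

def build_gexf(nodes, edges):
    """Return a minimal GEXF 1.2 document as a string.

    nodes: list of (id, label) pairs
    edges: list of (source_id, target_id, weight) triples
    """
    root = ET.Element("gexf", {"xmlns": GEXF_NS, "version": "1.2"})
    graph = ET.SubElement(root, "graph",
                          {"mode": "static", "defaultedgetype": "undirected"})
    nodes_el = ET.SubElement(graph, "nodes")
    for node_id, label in nodes:
        ET.SubElement(nodes_el, "node", {"id": str(node_id), "label": label})
    edges_el = ET.SubElement(graph, "edges")
    for i, (source, target, weight) in enumerate(edges):
        ET.SubElement(edges_el, "edge", {
            "id": str(i),
            "source": str(source),
            "target": str(target),
            "weight": str(weight),
        })
    return ET.tostring(root, encoding="unicode")

# Hypothetical three-person network with weighted ties.
people = [(0, "Ana"), (1, "Bob"), (2, "Carla")]
ties = [(0, 1, 2.0), (1, 2, 1.0)]
gexf_text = build_gexf(people, ties)
```

Writing gexf_text to a file with a .gexf extension gives a text file that can then be opened in the tool and saved as a .gephi project.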
  71. Opening a .gephi project
  72. UCINet https://sites.google.com/site/ucinetsoftware/home
  73. NetMiner http://www.netminer.com
  74. Hands-on Exercises • Time to practice and do it yourself!
  75. Exercise 1 • Let's explore what the visualization of a social network can offer us [20 min] • Load the provided dataset into the Gephi tool and brainstorm with others about what insights the visual representation gives you about the network • Save the visualization in a separate file • Share what you have learned with the participants
  76. Exercise 2 • Let's explore a measure [15 min] • Calculate the centrality measures (degree, closeness, and betweenness) for the network loaded in Exercise 1 • Share what you have learned with the participants
  77. Exercise 3 • Let's highlight the results [15 min] • Choose how to represent the node attributes, e.g., by coloring the nodes according to the attributes and/or by showing the role in the node labels • Share what you have learned with the participants
  78. Exercise 4 • Exporting the results [5 min] • Save the results in PDF format • Save the results again, now as a Gephi project
  79. Final Remarks • Decide what you want to learn from the social network • Plan ahead: design the study to collect the proper data • Use a tool to support your understanding of the collected dataset • Contextual information is necessary to comprehend what the tool points out
  80. Recommended reading • Rob Cross and Andrew Parker. The Hidden Power of Social Networks: Understanding How Work Really Gets Done in Organizations. Harvard Business School Press, Boston, United States, June 2004.
  81. Recommended reading • John Scott. Social Network Analysis: A Handbook. Sage Publications, London, England, 2nd edition, March 2000.
  82. Recommended reading • Stanley Wasserman and Katherine Faust. Social Network Analysis: Methods and Applications. Cambridge University Press, Cambridge, United Kingdom, 1994.
  83. Recommended reading • Kate Ehrlich and Klarissa Chang. Leveraging Expertise in Global Software Teams: Going Outside Boundaries. In IEEE Proc. of the International Conference on Global Software Engineering, 149–158, Florianópolis, Brazil, October 2006. • Marcelo Cataldo, Patrick Wagstrom, James Herbsleb, and Kathleen Carley. Identification of Coordination Requirements: Implications for the Design of Collaboration and Awareness Tools. In ACM Proc. of the Conference on Computer Supported Cooperative Work, 353–362, Banff, Canada, November 2006.
  84. References • [Mitchell, 1969] J. Clyde Mitchell. Social Networks in Urban Situations: Analyses of Personal Relationships in Central African Towns. Manchester University Press, Manchester, United Kingdom, November 1969. • [Herbsleb and Mockus, 2003] James Herbsleb and Audris Mockus. An Empirical Study of Speed and Communication in Globally Distributed Software Development. IEEE Transactions on Software Engineering, 29(6):481–494, June 2003. • [Hinds and McGrath, 2006] Pamela Hinds and Cathleen McGrath. Structures that Work: Social Structure, Work Structure and Coordination Ease in Geographically Distributed Teams. In ACM Proc. of the Conference on Computer Supported Cooperative Work, 343–352, Banff, Canada, November 2006. • [Damian et al., 2007] Daniela Damian, Sabrina Marczak, and Irwin Kwan. Collaboration Patterns and the Impact of Distance on Awareness in Requirements-Centred Social Networks. In IEEE Proc. of the Int'l Requirements Engineering Conference, 59–68, New Delhi, India, October 2007. • [Freeman, 1978] Linton Freeman. Centrality in Social Networks: Conceptual Clarification. Social Networks, 1(3):215–239, 1978/1979.
  85. References • [Tsai, 2002] Wenpin Tsai. Social Structure of Coopetition Within a Multiunit Organization: Coordination, Competition, and Intraorganizational Knowledge Sharing. Organization Science, 13(2):179–190, March 2002. • [Borgetti and Everett, 1999] Stephen Borgatti and Martin Everett. Models of Core/Periphery Structures. Social Networks, 21(4):375–395, October 1999. • [Hanneman and Riddle, 2005] Robert Hanneman and Mark Riddle. Introduction to Social Network Methods. University of California, Riverside, United States, 2005. • [Rao and Bandyopadhyay, 1987] Ramachandra Rao and Sura Bandyopadhyay. Measures of Reciprocity in a Social Network. Sankhya: The Indian Journal of Statistics, Series A, 49(2):141–188, June 1987. • [Wasserman and Faust, 1994] Stanley Wasserman and Katherine Faust. Social Network Analysis: Methods and Applications. Cambridge University Press, Cambridge, United Kingdom, 1994.
  86. References • [Cain et al., 1996] Brendan Cain, James Coplien, and Neil Harrison. Social Patterns in Productive Software Development Organizations. Annals of Software Engineering, 2(1):259–286, 1996. • [Freeman et al., 1979] Linton Freeman, Douglas Roeder, and Robert Mulholland. Centrality in Social Networks: II. Experimental Results. Social Networks, 2(2):119–141, 1979/1980. • [Hossain et al., 2006] Liaquat Hossain, Andre Wu, and Kenneth Chung. Actor Centrality Correlates to Project Based Coordination. In ACM Proc. of the Conference on Computer Supported Cooperative Work, 363–372, Banff, Canada, November 2006. • [Bird et al., 2006] Christian Bird, Alex Gourley, Premkumar Devanbu, Michael Gertz, and Anand Swaminathan. Mining Email Social Networks. In ACM Proc. of the Int'l Workshop on Mining Software Repositories, 137–143, Shanghai, China, May 2006. • [Hansen, 2002] Morten Hansen. Knowledge Networks: Explaining Effective Knowledge Sharing in Multiunit Companies. Organization Science, 13(3):232–248, June 2002.
  87. References • [Ehrlich et al., 2008] Kate Ehrlich, Mary Helander, Giuseppe Valetto, Stephen Davies, and Clay Williams. An Analysis of Congruence Gaps and Their Effect on Distributed Software Development. In Workshop on Socio-Technical Congruence, in conj. with the Int'l Conf. on Software Eng., Leipzig, Germany, May 2008. ACM. • [RE 08] Sabrina Marczak, Daniela Damian, Ulrike Stege, and Adrian Schroeter. Information Brokers in Requirements-Dependency Social Networks. In IEEE Proc. of the Int'l Requirements Engineering Conference, 53–62, Barcelona, Spain, September 2008. • [Skyrms, 2003] Brian Skyrms. The Stag Hunt and Evolution of Social Structure. Cambridge University Press, 2003. • [Easley and Kleinberg, 2010] David Easley and Jon Kleinberg. Networks, Crowds, and Markets: Reasoning About a Highly Connected World. Cambridge University Press, New York, NY, 2010. • [Tvesovat and Kouznetsov, 2011] Maksim Tsvetovat and Alexander Kouznetsov. Social Network Analysis for Startups – Finding Connections on the Social Web. O'Reilly, 2011.
  88. Thank you for your interest! Questions? Comments? Suggestions? • Sabrina Marczak, PUCRS, Porto Alegre, Brazil, sabrina.marczak@pucrs.br • Nicole Novielli, Uniba, Bari, Italy, nicole.novielli@uniba.it • ICGSE 2013, 8th IEEE International Conference on Global Software Engineering, Bari, Italy | August 26-29, 2013 | www.icgse.org
