Social Network Analysis for
Global Software Engineering
Exploring relationships from a fine-grained level
Sabrina Marczak
...
N. Novielli, S. Marczak | ICGSE 2013 | Bari,
Italy
> Recommended reading
• Rob Cross and Andrew
Parker. The Hidden
Power of...
> Recommended reading
• John Scott. Social Network
Analysis: A Handbook....
> Recommended reading
• Stanley Wasserman and
Katherine Faust. Social
Ne...
> Recommended reading
• Kate Ehrlich and Klarissa Chang. Leveraging Exp...
> References
• [Mitchell, 1969] J. Clyde Mitchell. Social Networks in U...
> References
• [Tsai, 2002] Wenpin Tsai. Social Structure of ”Coopetiti...
> References
• [Cain et al., 1996] Brendan Cain, James Coplien, and Nei...
> References
• [Ehrlich et al., 2008] Kate Ehrlich, Mary Helander, Gius...
Thank you for your interest!
Questions? Comments? Suggestions?
Sabrina Marczak
PUCRS, Porto Alegre, Brazil
sabrina.marczak@...
    1. Social Network Analysis for Global Software Engineering: Exploring relationships from a fine-grained level
        Sabrina Marczak, PUCRS, Porto Alegre, Brazil, sabrina.marczak@pucrs.br
        Nicole Novielli, Uniba, Bari, Italy, nicole.novielli@uniba.it
        ICGSE 2013, 8th IEEE International Conference on Global Software Engineering, Bari, Italy | August 26-29, 2013 | www.icgse.org
    2. Software Development
    3. Software Development
    4. Software Development
        • Collaboration
        • Coordination
        • Communication
        Goals, tasks, dependencies, deadlines
    5. Software Development
        • Who talks with whom?
        • Who receives help from whom?
        • Who is aware of whom?
        • Who are the experts?
        • Who are the most active contributors?
    6. Software Development
        • Are the team members following the organizational structure?
        • Are the team members coordinating with those their work is dependent on?
        • Are the next builds going to fail?
    7. Software Development
        • How to answer these questions? Social Network Analysis
    8. Social Network Analysis
        • It provides techniques to examine the structure of social relationships in a group to uncover patterns of behavior and interaction among people [Mitchell, 1969]
    9. Learning Goals
        • When we complete this tutorial, you will be able to:
          • Understand the main concepts related to SNA
          • Better understand SNA research literature
          • Have basic knowledge to identify which SNA data you need for your own research
          • Run basic SNA measures and interpret the results
    10. Our Agenda
        • [9:00-10:30] Introduction to Social Network Analysis: Theory and Application from Research
        • [10:30-11:00] Coffee Break
        • [11:00-12:30] SNA in Practice: Tools and Hands-on Exercises
    11. > Introduction to SNA
        • Terminology
        • Representation
        • Measures
        • Data collection
    12. Terminology
        Actor = Node = Vertex
    13. Terminology
        Tie = Link = Edge
    14. Terminology
        Directional tie
    15. Terminology
        Dyad
    16. Terminology
        Triad
    17. Representation
        • Sociogram
    18. Representation
        • Matrix representation of network data (figure labels: Absent, Present)
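As a rough illustration of the matrix representation, the sketch below builds an adjacency matrix in Python for four of the tutorial's actors. The ties are hypothetical, since the slide's matrix is an image that is not reproduced in this transcript; 1 marks a present tie and 0 an absent one.

```python
# Actor names come from the tutorial; the ties are illustrative assumptions.
actors = ["Andrew", "Bob", "Charles", "David"]
ties = [("Andrew", "Bob"), ("Andrew", "Charles"),
        ("Bob", "Charles"), ("Charles", "David")]

index = {name: i for i, name in enumerate(actors)}
matrix = [[0] * len(actors) for _ in actors]
for a, b in ties:
    matrix[index[a]][index[b]] = 1  # 1 = tie present
    matrix[index[b]][index[a]] = 1  # undirected tie: mirror the entry

for name, row in zip(actors, matrix):
    print(f"{name:<8}", *row)
```

For directional ties, the mirrored assignment would be dropped so that row = sender and column = receiver.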
    19. Representation
        • Actors’ attributes

        Actor    Role  Country  Work exp.
        Andrew   1     1        3
        Bob      1     2        3
        Charles  1     1        1
        David    1     2        2
        Emma     1     1        1
        Fynn     2     1        3
        Greg     2     1        1
        Hannah   2     1        1
        Iris     2     1        2
        John     2     2        3
        Kevin    2     2        2
        Lucas    2     1        2

        Role: 1. Tester, 2. Developer
        Country: 1. Canada, 2. Ireland
        Work experience: 1. 1-6 months, 2. 6-12 months, 3. 18+ months
    20. Representation
        • Sociogram with actors’ attributes
    21. Representation
        • Tie weight
          • Strength
          • Frequency
          • Etc.
    22. Measures
        • Overall network characterization
          • Network size
          • Network density
          • Ties statistics
    23. Measures
        • Network size: the number of actors in the social network
        Size: 12 actors
        Size can be larger or smaller than the team size
        Herbsleb and Mockus (2003) found that distributed communication networks are significantly smaller than same-site networks
    24. Measures
        • Network density: the proportion of ties that exist in the network out of the total possible ties. It can vary from 0 to 1.
        Possible ties: 12 × (12 - 1) / 2 = 66
        Density: 20 / 66 = 0.30
        Hinds and McGrath (2006) found that geographic distribution is associated with less dense work ties and less dense information sharing, suggesting that social ties are not particularly important in distributed as compared with collocated teams as a means of coordinating work and improving performance
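The density arithmetic on this slide can be checked with a small Python sketch. The function names are illustrative, and the formula assumes undirected ties with no self-ties, which is what the slide's n × (n - 1) / 2 implies.

```python
def possible_ties(n_actors: int) -> int:
    """Number of distinct undirected pairs among n_actors."""
    return n_actors * (n_actors - 1) // 2


def density(n_actors: int, n_ties: int) -> float:
    """Proportion of possible ties that are actually present (0 to 1)."""
    return n_ties / possible_ties(n_actors)


print(possible_ties(12))          # 66
print(round(density(12, 20), 2))  # 0.3
```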
    25. Measures
        • Ties statistics: uses the actors’ attributes to reveal overall network characteristics
        E.g.: 5 testers and 7 developers
        By counting the number of ties within and across sites, Herbsleb and Mockus (2003) found that communication in a distributed project is much more frequent with local colleagues than with remote ones
        Damian et al. (2007) found that notification of changes is the main reason for communication in requirements-centric social networks
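A minimal sketch of ties statistics in Python, assuming the role and country codes from the actors' attributes slide. The list of ties is hypothetical (the tutorial's sociogram is not reproduced in the transcript), so only the tester/developer counts correspond to the slides.

```python
from collections import Counter

# Attribute values taken from the actors' attributes slide.
role = {"Andrew": "Tester", "Bob": "Tester", "Charles": "Tester",
        "David": "Tester", "Emma": "Tester", "Fynn": "Developer",
        "Greg": "Developer", "Hannah": "Developer", "Iris": "Developer",
        "John": "Developer", "Kevin": "Developer", "Lucas": "Developer"}
country = {"Andrew": "Canada", "Bob": "Ireland", "Charles": "Canada",
           "David": "Ireland", "Emma": "Canada", "Fynn": "Canada",
           "Greg": "Canada", "Hannah": "Canada", "Iris": "Canada",
           "John": "Ireland", "Kevin": "Ireland", "Lucas": "Canada"}

# Hypothetical ties, for illustration only.
ties = [("Andrew", "Bob"), ("Andrew", "Charles"), ("Bob", "David"),
        ("Fynn", "John"), ("Fynn", "Iris")]

role_counts = Counter(role.values())


def tie_kind(a: str, b: str) -> str:
    """Classify a tie by whether both actors work in the same country."""
    return "within-site" if country[a] == country[b] else "cross-site"


site_counts = Counter(tie_kind(a, b) for a, b in ties)
print(dict(role_counts))
print(dict(site_counts))
```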
    26. Measures
        • Information exchange
          • Reachability
          • Component
          • Centrality
          • Brokerage
          • Cutpoint
    27. Measures
        • Reachability: one actor is reachable by another actor if there exists any set of ties that connects both actors, regardless of how many others fall in between them [Wasserman and Faust, 1994].
        All actors are reachable
        If some actors cannot reach others, there is a potential division in the network and thus information cannot reach everyone
    28. Measures
        • Component: indicates whether a social network is connected. A network is connected if there is a path between every pair of actors; otherwise it is disconnected. The actors in a disconnected network may be partitioned into subsets called components [Wasserman and Faust, 1994].
        One component
        The component test indicates whether there is a group of people connected to each other and disconnected from the rest, while the clique test indicates whether a subset of actors is completely connected
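Reachability and components can be sketched with a breadth-first search. The example below is a plain-Python illustration on a hypothetical five-actor network with undirected ties; it is not the procedure any particular SNA tool uses.

```python
from collections import deque


def components(actors, ties):
    """Partition actors into groups whose members can all reach each other."""
    neighbours = {a: set() for a in actors}
    for a, b in ties:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen, result = set(), []
    for start in actors:
        if start in seen:
            continue
        group, queue = {start}, deque([start])
        while queue:
            current = queue.popleft()
            for nxt in neighbours[current]:
                if nxt not in group:
                    group.add(nxt)
                    queue.append(nxt)
        seen |= group
        result.append(group)
    return result


# Hypothetical ties: David and Emma are cut off from the other three.
actors = ["Andrew", "Bob", "Charles", "David", "Emma"]
ties = [("Andrew", "Bob"), ("Bob", "Charles"), ("David", "Emma")]
groups = components(actors, ties)
print(len(groups), "components:", groups)
```

A network is connected exactly when this function returns a single group; two actors are mutually reachable when they fall in the same group.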
    29. Measures
        • Centrality: indicates power and influence in a network
        Who has more power and is, as a consequence, more influential or more important in our example? Hard to say with the naked eye!
    30. Measures
        And now: which actor has more power in each network?
        (a) Star, (b) Circle, (c) Line; each network connects Andrew, Bob, Charles, David, Emma, Fynn, and Greg
    31. Measures
        • Centrality can be measured in different ways, each with different implications
          • Degree
          • Closeness
          • Betweenness
    32. Measures
        • Degree centrality: indicates the number of ties of a certain actor. When the ties are directional, we have out-degree, which counts the ties from a certain actor to others, and in-degree, which counts the ties from others to a certain actor [Freeman and colleagues, 1979].
        Fynn is the member with the highest out- and in-degree
        Hossain et al. (2006) found that highly centralized members coordinate better than others
        Bird et al. (2006) found that degree centrality indicated that developers who actually committed changes played much more significant roles in the email community than non-developers
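In-degree and out-degree can be counted directly from a list of directional ties. The sketch below uses hypothetical ties written as (sender, receiver) pairs; only the actor names come from the tutorial.

```python
from collections import Counter

# Hypothetical directional ties: (from, to).
ties = [("Fynn", "Iris"), ("Fynn", "John"), ("Iris", "Fynn"),
        ("John", "Fynn"), ("Kevin", "Fynn")]

out_degree = Counter(source for source, _ in ties)  # ties sent
in_degree = Counter(target for _, target in ties)   # ties received

for actor in sorted(set(out_degree) | set(in_degree)):
    print(actor, "out:", out_degree[actor], "in:", in_degree[actor])
```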
    33. Measures
        • Degree centrality: if we go back to the star network, one can see that all lines of communication lead to Andrew at the center of the network
        (a) Star: Andrew at the center; Bob, Charles, David, Emma, Fynn, and Greg around him
        What can we conclude from this example? Is Andrew the most important member of this network? Can we conclude this from his position in the network?
        What if Andrew is a janitor who has the keys to every office and no power whatsoever?
        And what if the CEO does not need a key to the office because others open the door for him?
    34. Measures
        • Closeness centrality: an actor is considered important if he is relatively close to all other actors. Closeness is based on the inverse of the distance of each actor to every other actor in the network. It measures how fast information propagates from a node to the others.
        Fynn is the member with the highest closeness centrality, followed by Andrew
        Hansen (2002) found that projects whose members have a higher degree of closeness centrality (i.e., short path lengths) in their knowledge network were more likely to be completed quickly than those whose members have a lower degree of closeness centrality
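A minimal sketch of closeness centrality on the star network from the earlier slide, with Andrew at the center. It assumes one common normalization, (n - 1) divided by the sum of shortest-path distances, and a connected, undirected network; tools may normalize differently.

```python
from collections import deque


def distances(neighbours, source):
    """Shortest-path length (in ties) from source to every reachable actor."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for nxt in neighbours[current]:
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    return dist


def closeness(neighbours, actor):
    """(reachable actors) / (sum of distances); 1.0 means one step from all."""
    dist = distances(neighbours, actor)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


others = ["Bob", "Charles", "David", "Emma", "Fynn", "Greg"]
star = {"Andrew": set(others)}
star.update({name: {"Andrew"} for name in others})

print(closeness(star, "Andrew"))          # 1.0
print(round(closeness(star, "Bob"), 3))   # 0.545
```

Bob is one step from Andrew and two steps from the five others, so his distances sum to 11 and his closeness is 6 / 11.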
    35. Measures
        • Betweenness centrality: measures how central a person is in a network by examining the fraction of shortest paths between individual pairs of team members that pass through that person.
        Andrew and Fynn have the highest betweenness centrality
        A person who lies on the path of others can control the communication flow, and thus becomes an important and influential member of the network
        It points out those who act as communication bottlenecks
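Betweenness can be computed by counting shortest paths. The sketch below uses Brandes' algorithm for unweighted, undirected networks; this is an assumption about how one might compute it, since the slides do not say which algorithm the tools use. Scores are unnormalized counts of actor pairs.

```python
from collections import deque


def betweenness(neighbours):
    """Unnormalized betweenness for an unweighted, undirected network."""
    score = {v: 0.0 for v in neighbours}
    for s in neighbours:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in neighbours}
        sigma = {v: 0 for v in neighbours}
        dist = {v: -1 for v in neighbours}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in neighbours[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest actors, accumulating path shares.
        delta = {v: 0.0 for v in neighbours}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every undirected pair was visited from both ends.
    return {v: value / 2 for v, value in score.items()}


others = ["Bob", "Charles", "David", "Emma", "Fynn", "Greg"]
star = {"Andrew": set(others)}
star.update({name: {"Andrew"} for name in others})
result = betweenness(star)
print(result["Andrew"], result["Bob"])
```

In the star, Andrew lies on the only path between each of the 15 pairs of other actors, while nobody else lies between any pair.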
    36. Combined interpretation
        High Degree
          • with Low Closeness: embedded in a portion of the network that is far from the rest of the network
          • with Low Betweenness: the node’s connections are redundant and communication bypasses the node itself
        High Closeness
          • with Low Degree: key player tied to important or active nodes
          • with Low Betweenness: probably multiple paths in the network; the node is near many people, but so are many others
        High Betweenness
          • with Low Degree: the node’s few ties are crucial for network flow (of information, exchanges, collaborations, etc.)
          • with Low Closeness: very rare cell; the node monopolizes the ties from a small number of people to many others
    37. Measures
        • Brokerage: indicates when an actor, named broker, connects two otherwise unconnected actors or subgroups. Brokerage occurs when, in a triad of actors A, B, and C, A has a tie to B, B has a tie to C, but A has no tie to C. A needs B to reach C, therefore B is a broker. The actors need to be partitioned into subgroups per attribute [Gould and Fernandez, 1989].
        Fynn brokers information among his developer colleagues
        Hinds and McGrath (2006) found that brokers effectively disseminate information between distributed sites when maintaining direct relationships is not practical
        Ehrlich et al. (2008) found that brokers are usually the most knowledgeable members of a team regardless of geographical location
        Marczak et al. (2008) confirm the findings of Ehrlich et al. (2008) in a study of multiple distributed teams of a large IT multinational
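The triad condition for brokerage (A tied to B, B tied to C, no tie between A and C) can be sketched as a count of open pairs around each actor. This is a simplification: it uses undirected ties and ignores the partition into subgroups that the Gould and Fernandez measure requires. The ties are hypothetical.

```python
from itertools import combinations


def broker_counts(neighbours):
    """For each actor, count pairs of contacts that have no direct tie."""
    counts = {}
    for broker, contacts in neighbours.items():
        open_pairs = sum(1 for a, c in combinations(sorted(contacts), 2)
                         if c not in neighbours[a])
        if open_pairs:
            counts[broker] = open_pairs
    return counts


# Hypothetical ties among four developers.
neighbours = {
    "Fynn": {"Iris", "John", "Kevin"},
    "Iris": {"Fynn", "John"},
    "John": {"Fynn", "Iris"},
    "Kevin": {"Fynn"},
}
print(broker_counts(neighbours))
```

Here Kevin can reach Iris and John only through Fynn, so Fynn brokers two pairs, while the Iris-John pair is already directly tied.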
    38. Measures
        • Cutpoint: indicates a weak point in the network. If this actor were removed along with his connections, the network would become divided into unconnected parts. A set of cutpoints is called a cutset.
        Andrew and Fynn are the cutset
        In communication networks a cutpoint indicates disruption of information flow
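One simple way to find cutpoints, sketched below, is to remove each actor in turn and test whether the rest of the network stays connected. This brute-force check assumes the network is connected to begin with and is fine for small networks; the ties are hypothetical, chosen so that Andrew and Fynn come out as the cutset.

```python
def is_connected(neighbours, removed=None):
    """True if all actors other than `removed` can reach each other."""
    nodes = [v for v in neighbours if v != removed]
    if not nodes:
        return True
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        current = stack.pop()
        for nxt in neighbours[current]:
            if nxt != removed and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == len(nodes)


def cutpoints(neighbours):
    """Actors whose removal splits an initially connected network."""
    return [v for v in neighbours if not is_connected(neighbours, removed=v)]


# Hypothetical ties for illustration.
ties = [("Andrew", "Bob"), ("Andrew", "Charles"), ("Andrew", "Fynn"),
        ("Fynn", "Iris"), ("Fynn", "John"), ("Iris", "John")]
neighbours = {}
for a, b in ties:
    neighbours.setdefault(a, set()).add(b)
    neighbours.setdefault(b, set()).add(a)

print(cutpoints(neighbours))
```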
    39. Measures
        • Network structure
          • Network centralization
          • Core-periphery
          • Ties reciprocity
          • Clique
    40. Measures
        • Network centralization: sums the differences between the number of ties of the most central node and the number of ties of every other node, divided by the maximum possible sum of differences. A centralized network structure (index = 1) will have many of its ties concentrated around one or a few actors, while a decentralized network structure (index = 0) is one in which there is little variation between the number of ties each actor possesses [Freeman, 1978].
        Centralization index = 0.39
        Tsai (2002) found that a formal hierarchical structure in the form of centralization has a significant negative effect on knowledge sharing among organizational units
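A sketch of the centralization index based on degree, assuming undirected ties and at least three actors. For degree, the maximum possible sum of differences is (n - 1)(n - 2), reached by a star. The order of actors around the circle is an assumption; it does not affect the result.

```python
def degree_centralization(neighbours):
    """Degree-based centralization: 1.0 for a star, 0.0 when all degrees match."""
    n = len(neighbours)
    degrees = [len(contacts) for contacts in neighbours.values()]
    largest = max(degrees)
    return sum(largest - d for d in degrees) / ((n - 1) * (n - 2))


names = ["Andrew", "Bob", "Charles", "David", "Emma", "Fynn", "Greg"]

# Star: everyone is tied only to Andrew.
star = {"Andrew": set(names[1:])}
star.update({name: {"Andrew"} for name in names[1:]})

# Circle: each actor is tied to the previous and next name in the list.
circle = {name: {names[i - 1], names[(i + 1) % len(names)]}
          for i, name in enumerate(names)}

print(degree_centralization(star))    # 1.0
print(degree_centralization(circle))  # 0.0
```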
    41. Measures
        • Core-periphery: indicates the extent to which the structure of a network consists of two classes of actors: a cohesive subnetwork, the core, in which the actors are connected to each other in some maximal sense; and a class of actors that are more loosely connected to the cohesive subnetwork but lack any maximal cohesion with the core, the peripheral actors. A high core value (close to 1) indicates a strong core-periphery structure [Borgatti and Everett, 1999].
        Core-periphery index = 0.47
        Hinds and McGrath (2006) found that communication networks with a strong core-periphery structure lead to fewer coordination problems than loosely connected networks
    42. Measures
        • Ties reciprocity: when the relationship is considered directional (e.g., friendship, trust), the reciprocity index can be calculated using the dyad method, the ratio of the number of pairs of actors with a reciprocated tie to the number of pairs with any tie between the actors; or the arc method, the ratio of the number of ties that are involved in reciprocal relationships to the total number of actual ties [Hanneman and Riddle, 2005].
        Dyad method index = 0.85
        The higher the index of reciprocal ties, the more stable or equal the network structure is [Rao and Bandyopadhyay, 1987].
        A higher reciprocity index suggests a more horizontal structure, while the opposite suggests a more hierarchical network [Hanneman and Riddle, 2005].
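The two reciprocity indices can be sketched in a few lines. The directional ties below are hypothetical and written as (from, to) pairs; the function name is illustrative and self-ties are assumed to be absent.

```python
def reciprocity(ties):
    """Return (dyad index, arc index) for a list of directional ties."""
    arcs = set(ties)
    pairs = {frozenset(arc) for arc in arcs}             # pairs with any tie
    mutual = {frozenset((a, b)) for a, b in arcs if (b, a) in arcs}
    reciprocated_arcs = sum(1 for a, b in arcs if (b, a) in arcs)
    return len(mutual) / len(pairs), reciprocated_arcs / len(arcs)


# Hypothetical ties: Andrew<->Bob and Charles<->David are mutual,
# Andrew->Charles is not returned.
ties = [("Andrew", "Bob"), ("Bob", "Andrew"), ("Andrew", "Charles"),
        ("Charles", "David"), ("David", "Charles")]
dyad_index, arc_index = reciprocity(ties)
print(round(dyad_index, 2), arc_index)  # 0.67 0.8
```

Two of the three tied pairs are reciprocated (dyad method), and four of the five ties belong to a reciprocal relationship (arc method).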
    43. Measures
        • Clique: a subset of at least 3 actors in which every possible pair of actors is directly connected by a tie and there are no other actors that are also directly connected to all members of the clique [Wasserman and Faust, 1994].
          - Andrew, Bob, Charles, and David
          - Andrew, David, and Emma
          - Fynn, Iris, John, and Kevin
        Cain et al. (1996) found, in the communication networks of development teams, 3 large cliques of team members carrying out 3 major activities: architecture design, code development, and code review
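Maximal cliques can be enumerated with the Bron-Kerbosch algorithm; this is one standard technique, not necessarily what any given tool implements. In the sketch below the network contains only the ties implied by the three cliques listed on the slide plus one hypothetical Andrew-Fynn tie, because the full sociogram is not available in the transcript.

```python
from itertools import combinations


def maximal_cliques(neighbours, min_size=3):
    """Basic Bron-Kerbosch enumeration of maximal cliques."""
    found = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            if len(current) >= min_size:
                found.append(current)
            return
        for v in list(candidates):
            expand(current | {v},
                   candidates & neighbours[v],
                   excluded & neighbours[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(frozenset(), frozenset(neighbours), frozenset())
    return found


groups = [("Andrew", "Bob", "Charles", "David"),
          ("Andrew", "David", "Emma"),
          ("Fynn", "Iris", "John", "Kevin")]
neighbours = {}
for group in groups:
    for a, b in combinations(group, 2):
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
# Hypothetical bridging tie between the two halves of the network.
neighbours["Andrew"].add("Fynn")
neighbours["Fynn"].add("Andrew")

cliques = sorted(sorted(clique) for clique in maximal_cliques(neighbours))
for clique in cliques:
    print(clique)
```

The Andrew-Fynn tie is a maximal clique of size 2, so the `min_size=3` threshold from the slide's definition filters it out.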
    44. Measures
        • Network structure and evolution
          • Triadic closure
          • Clustering coefficient
          • Structural holes
    45. Measures
        • Network structure and evolution
        “What are the mechanisms by which nodes arrive and depart, and by which ties form and vanish?”
    46. Measures
        • Triadic closure: if two people in a social network have a friend in common, then there is an increased likelihood that they will become friends themselves at some point in the future [Skyrms, 2003]
        This pattern can be identified when one observes the network behavior over a long time window
        Two developers who do not know each other and who seek information from the same third developer are likely to quickly help each other when put to work together, once the third developer is also allocated to the project [Easley and Kleinberg, 2010]
    47. Measures
        • Reasons for having triadic closure [Easley and Kleinberg, 2010]:
          • Opportunity: one reason why B and C are more likely to become friends when they have a common friend A is simply the opportunity for B and C to meet
          • Trust: the fact that each of B and C is friends with A (provided they are mutually aware of this) gives them a basis for trusting each other
          • Incentive: if A is friends with B and C, then it becomes a source of latent stress in these relationships if B and C are not friends with each other
    48. Measures
        • Structural holes [Tsvetovat and Kouznetsov, 2011]:
          • Span asymmetric information: love triangle, also known as ‘forbidden triad’; brokerage activity of entrepreneurs, bankers, brokers or real-estate agents
          • Bridge entire communities: we will see the local bridges and their role in connecting graph components and spreading novelty
        Structural hole: advantageous position of B, based on his position in the network. People like B are network bridges
        Pawlowski and Robey (2004) examine knowledge brokering as an aspect of the work of information technology professionals.
    49. Data collection
        • Manual
          • Survey
          • Work diary
          • Observation
    50. Data collection
    51. Data collection
        • Automatic
          • Mining software repositories
          • E.g.: source code, bug trackers
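As an illustration of automatic collection, the sketch below derives weighted ties from repository data: two developers are tied when they changed the same file, and the tie weight is the number of files they share. Both the tie rule and the commit records are assumptions made for this example; a real study would define ties according to its research question.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical (author, file) records, e.g. extracted from a commit log.
commits = [("Fynn", "parser.c"), ("Iris", "parser.c"),
           ("Fynn", "lexer.c"), ("Iris", "lexer.c"), ("John", "lexer.c")]

authors_by_file = defaultdict(set)
for author, path in commits:
    authors_by_file[path].add(author)

# Tie weight = number of files both developers have changed.
weights = Counter()
for authors in authors_by_file.values():
    for a, b in combinations(sorted(authors), 2):
        weights[(a, b)] += 1

for (a, b), weight in sorted(weights.items()):
    print(a, b, weight)
```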
    52. Coffee Break
        We are back in 30 minutes
    53. > Tools
        • Gephi
        • UCINet
        • NetMiner
    54. Gephi
        https://gephi.org/
        • It is a tool for the interactive visualization and exploration of networks and graphs
    55. Gephi
        Graph files: defining the input file (.gexf) that describes the network
          - Definition of type and idtype
          - Number of nodes and their IDs
          - Definition of ties (weight is optional)

        <gexf>
          <graph defaultedgetype="undirected" idtype="string">
            <nodes count="77">
              <node id="0.0" label="Myriel"/>
              <node id="1.0" label="Napoleon"/>
              …
            </nodes>
            <edges>
              <edge id="235" source="72.0" target="27.0"/>
              <edge id="237" source="73.0" target="48.0" weight="2.0"/>
              …
            </edges>
          </graph>
        </gexf>
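To make the file structure concrete, the following sketch reads a tiny GEXF-like document with Python's standard `xml.etree.ElementTree` module. The two nodes reuse labels from the slide, but the single edge and its weight are illustrative. Files exported by real tools usually declare an XML namespace on the root element, in which case the tag lookups would need namespace-qualified names; this example assumes no namespace, as on the slide.

```python
import xml.etree.ElementTree as ET

# Minimal illustrative document, without an XML namespace.
GEXF = """<gexf>
  <graph defaultedgetype="undirected" idtype="string">
    <nodes>
      <node id="0.0" label="Myriel"/>
      <node id="1.0" label="Napoleon"/>
    </nodes>
    <edges>
      <edge id="0" source="1.0" target="0.0" weight="2.0"/>
    </edges>
  </graph>
</gexf>"""

root = ET.fromstring(GEXF)
graph = root.find("graph")
labels = {node.get("id"): node.get("label") for node in root.iter("node")}
# Weight is optional in the file; fall back to 1.0 when it is missing.
edges = [(labels[edge.get("source")], labels[edge.get("target")],
          float(edge.get("weight", "1.0")))
         for edge in root.iter("edge")]

print(graph.get("defaultedgetype"), labels, edges)
```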
    56. Opening a .gexf graph file
        When the file is opened, the report sums up the data found and any issues:
        • Number of nodes
        • Number of edges
        • Type of graph
    57. Opening a .gexf graph file
        Preliminary overview of the network
    58. Layout the graph
        • Force-based algorithm: linked nodes attract each other and non-linked nodes are pushed apart
        Select ‘Force Atlas’
    59. The goal
        • To obtain a meaningful representation of the network (i.e. with respect to context, available information about nodes, ‘meaning’ and strength of connections, goals of our study, etc.)
        Visual representation of node degree, cliques, and edge weight as a preliminary step to a deeper analysis using SNA metrics
    60. Layout the graph
        • Set the Repulsion strength (e.g. to 10000) and Run
    61. Node ranking (colors)
        • Choose a rank parameter (e.g. node degree) and set the colors
        • Nodes will be colored according to the color range between yellow (lowest degree = 1) and dark orange (highest degree)
    62. Edge ranking (colors)
        You can do the same with edge weight
    63. Other options
        • You can control the thickness of edges and the visibility and size of node labels
        • And manage different dragging modes
    64. Metrics
        • Metrics are available in the right section of the Gephi interface
        • E.g. click on ‘Run’ here to calculate the average path length of the network
    65. New node values
        • Metrics generate reports but also new information available for each node.
        • Thus, by launching the ‘Average path length’ algorithm, we now have Betweenness Centrality, Closeness Centrality and Eccentricity for each node
        • Let’s try to rank the nodes again according to Betweenness Centrality
    66. Node size
        • Let’s express the node Betweenness Centrality using node size
        • Colors will remain the indicator of the node Degree Centrality
    67. You should now see a colored and sized graph
        • Color expresses Degree
        • Size expresses Betweenness
    68. Node labels
    69. Save and export
        Input: .gexf (text), .gephi (project)
        Output: .gephi (project), .pdf (export)
    70. Opening a .gexf file • Creating a .gephi project • Exporting to .pdf
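If your data is not yet in .gexf form, a file can be generated with a few lines of code. The following is a minimal sketch, assuming the GEXF 1.2 draft layout (a `gexf` root, one `graph` element, then `nodes` and `edges`); the helper name `to_gexf` and the sample actors are hypothetical, and richer features such as node attributes are left out.

```python
import xml.etree.ElementTree as ET

GEXF_NS = "http://www.gexf.net/1.2draft"

def to_gexf(nodes, edges):
    """Serialize (id, label) nodes and (source, target, weight) edges as a GEXF string."""
    root = ET.Element("gexf", {"xmlns": GEXF_NS, "version": "1.2"})
    graph = ET.SubElement(root, "graph", {"mode": "static", "defaultedgetype": "undirected"})
    nodes_el = ET.SubElement(graph, "nodes")
    for node_id, label in nodes:
        ET.SubElement(nodes_el, "node", {"id": str(node_id), "label": label})
    edges_el = ET.SubElement(graph, "edges")
    for i, (src, dst, weight) in enumerate(edges):
        ET.SubElement(edges_el, "edge", {
            "id": str(i), "source": str(src), "target": str(dst), "weight": str(weight),
        })
    return ET.tostring(root, encoding="unicode")

# Hypothetical three-person team; weights could be, e.g., the number of messages exchanged.
NODES = [(0, "Ana"), (1, "Bo"), (2, "Cy")]
EDGES = [(0, 1, 3.0), (1, 2, 1.0)]
DOC = to_gexf(NODES, EDGES)

# To save it for opening in the tool (file name is an example):
# with open("team.gexf", "w", encoding="utf-8") as fh:
#     fh.write(DOC)
```

The edge `weight` attribute is what an edge-weight ranking would draw on, so it is worth including even in a minimal file.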
    71. Opening a .gephi project
    72. UCINet https://sites.google.com/site/ucinetsoftware/home
    73. NetMiner http://www.netminer.com
    74. > Hands-on Exercises • Time to practice and do it yourself!
    75. Exercise 1 • Let’s explore what the visualization of a social network can offer us [20 min] • Load the dataset made available into the Gephi tool and brainstorm with others what insights you can gain about the network from its visual representation • Save the visualization in a separate file • Share what you have learned with the participants
    76. Exercise 2 • Let’s explore a measure [15 min] • Calculate the centrality measures (degree, closeness, and betweenness) for the network loaded in Exercise 1 • Share what you have learned with the participants
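To cross-check one of the values from this exercise by hand, closeness centrality can be computed with a short script. This is a sketch under assumptions: it uses the common definition (n - 1) divided by the sum of shortest-path distances to all other nodes, on a connected, undirected, unweighted graph; the names `closeness` and `ADJ` are hypothetical, and the tool may report a differently scaled value.

```python
from collections import deque

def closeness(adj):
    """Closeness centrality: (n - 1) / sum of shortest-path distances (connected graph assumed)."""
    result = {}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:  # breadth-first search from s
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        result[s] = (len(dist) - 1) / total if total else 0.0
    return result

# Hypothetical chain of four actors: A - B - C - D
ADJ = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
cc = closeness(ADJ)
```

In the chain, the two middle actors are closer to everyone else than the two endpoints, which is the pattern to look for when comparing against the tool's output.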
    77. Exercise 3 • Let’s highlight the results [15 min] • Choose how to represent the node attributes, e.g., by coloring the nodes according to the attributes and/or by showing the role in the node labels • Share what you have learned with the participants
    78. Exercise 4 • Exporting the results [5 min] • Save the results in PDF format • Save the results again, now as a Gephi project
    79. > Final Remarks • Decide what you want to learn from the social networks • Plan ahead: design the study to collect proper data • Use a tool to support understanding of the collected dataset • Contextual information is necessary to comprehend what the tool points out
    80. > Recommended reading • Rob Cross and Andrew Parker. The Hidden Power of Social Networks: Understanding How Work Really Gets Done in Organizations. Harvard Business School Press, Boston, United States, June 2004.
    81. > Recommended reading • John Scott. Social Network Analysis: A Handbook. Sage Publications, London, England, 2nd edition, March 2000.
    82. > Recommended reading • Stanley Wasserman and Katherine Faust. Social Network Analysis: Methods and Applications. Cambridge University Press, Cambridge, United Kingdom, 1994.
    83. > Recommended reading • Kate Ehrlich and Klarissa Chang. Leveraging Expertise in Global Software Teams: Going Outside Boundaries. In IEEE Proc. of the International Conference on Global Software Engineering, 149–158, Florianópolis, Brazil, October 2006. • Marcelo Cataldo, Patrick Wagstrom, James Herbsleb, and Kathleen Carley. Identification of Coordination Requirements: Implications for the Design of Collaboration and Awareness Tools. In ACM Proc. of the Conference on Computer Supported Cooperative Work, 353–362, Banff, Canada, November 2006.
    84. > References • [Mitchell, 1969] J. Clyde Mitchell. Social Networks in Urban Situations: Analyses of Personal Relationships in Central African Towns. Manchester University Press, Manchester, United Kingdom, November 1969. • [Herbsleb and Mockus, 2003] James Herbsleb and Audris Mockus. An Empirical Study of Speed and Communication in Globally Distributed Software Development. IEEE Transactions on Software Engineering, 29(6): 481–494, June 2003. • [Hinds and McGrath, 2006] Pamela Hinds and Cathleen McGrath. Structures that Work: Social Structure, Work Structure and Coordination Ease in Geographically Distributed Teams. In ACM Proc. of the Conference on Computer Supported Cooperative Work, 343–352, Banff, Canada, November 2006. • [Damian et al., 2007] Daniela Damian, Sabrina Marczak, and Irwin Kwan. Collaboration Patterns and the Impact of Distance on Awareness in Requirements-Centred Social Networks. In IEEE Proc. of the Int’l Requirements Engineering Conference, 59–68, New Delhi, India, October 2007. • [Freeman, 1978] Linton Freeman. Centrality in Social Networks: Conceptual Clarification. Social Networks, 1(3): 215–239, 1978/1979.
    85. > References • [Tsai, 2002] Wenpin Tsai. Social Structure of “Coopetition” Within a Multiunit Organization: Coordination, Competition, and Intraorganizational Knowledge Sharing. Organization Science, 13(2): 179–190, March 2002. • [Borgatti and Everett, 1999] Stephen Borgatti and Martin Everett. Models of Core/Periphery Structures. Social Networks, 21(4): 375–395, October 1999. • [Hanneman and Riddle, 2005] Robert Hanneman and Mark Riddle. Introduction to Social Network Methods. University of California, Riverside, United States, 2005. • [Rao and Bandyopadhyay, 1987] Ramachandra Rao and Sura Bandyopadhyay. Measures of Reciprocity in a Social Network. Sankhya: The Indian Journal of Statistics, Series A, 49(2): 141–188, June 1987. • [Wasserman and Faust, 1994] Stanley Wasserman and Katherine Faust. Social Network Analysis: Methods and Applications. Cambridge University Press, Cambridge, United Kingdom, 1994.
    86. > References • [Cain et al., 1996] Brendan Cain, James Coplien, and Neil Harrison. Social Patterns in Productive Software Development Organizations. Annals of Software Engineering, 2(1): 259–286, 1996. • [Freeman et al., 1979] Linton Freeman, Douglas Roeder, and Robert Mulholland. Centrality in Social Networks: II. Experimental Results. Social Networks, 2(2): 119–141, 1979/1980. • [Hossain et al., 2006] Liaquat Hossain, Andre Wu, and Kenneth Chung. Actor Centrality Correlates to Project Based Coordination. In ACM Proc. of the Conference on Computer Supported Cooperative Work, 363–372, Banff, Canada, November 2006. • [Bird et al., 2006] Christian Bird, Alex Gourley, Premkumar Devanbu, Michael Gertz, and Anand Swaminathan. Mining Email Social Networks. In ACM Proc. of the Int’l Workshop on Mining Software Repositories, 137–143, Shanghai, China, May 2006. • [Hansen, 2002] Morten Hansen. Knowledge Networks: Explaining Effective Knowledge Sharing in Multiunit Companies. Organization Science, 13(3): 232–248, June 2002.
    87. > References • [Ehrlich et al., 2008] Kate Ehrlich, Mary Helander, Giuseppe Valetto, Stephen Davies, and Clay Williams. An Analysis of Congruence Gaps and Their Effect on Distributed Software Development. In Workshop on Socio-Technical Congruence, in conj. with the Int’l Conf. on Software Eng., Leipzig, Germany, May 2008. ACM. • [RE ‘08] Sabrina Marczak, Daniela Damian, Ulrike Stege, and Adrian Schroeter. Information Brokers in Requirements-Dependency Social Networks. In IEEE Proc. of the International Requirements Engineering Conference, 53–62, Barcelona, Spain, September 2008. • [Skyrms, 2003] Brian Skyrms. The Stag Hunt and Evolution of Social Structure. Cambridge University Press, 2003. • [Easley and Kleinberg, 2010] David Easley and Jon Kleinberg. Networks, Crowds, and Markets: Reasoning About a Highly Connected World. Cambridge University Press, New York, NY, 2010. • [Tsvetovat and Kouznetsov, 2011] Maksim Tsvetovat and Alexander Kouznetsov. Social Network Analysis for Startups – Finding Connections on the Social Web. O’Reilly, 2011.
    88. Thank you for your interest! Questions? Comments? Suggestions? Sabrina Marczak PUCRS, Porto Alegre, Brazil sabrina.marczak@pucrs.br ICGSE 2013 8th IEEE International Conference on Global Software Engineering Bari, Italy | August 26-29, 2013 www.icgse.org Nicole Novielli Uniba, Bari, Italy nicole.novielli@uniba.it
