This is a short tutorial for beginners on social network analysis applied to software engineering. The main social network analysis concepts and measures are presented along with examples of their application from the literature. Reading recommendations are provided. This material was presented at the Workshop on Agile Methods for Distributed Teams organized by Prof. Tayana Conte, UFAM, Manaus, Brazil, in late November 2012.
On the Understanding of Requirements-Driven Collaboration
An Introduction to Social Network Analysis and Its Application in Software Engineering
1. Workshop de Métodos Ágeis para Desenvolvimento Distribuído de Software
Manaus, November 2012
An Introduction to Social Network Analysis and Its Application in Software Engineering
Sabrina Marczak
sabrina.marczak@pucrs.br
2. Software Development 2
Workshop de Métodos Ágeis para Desenvolvimento Distribuído de Software
Sabrina Marczak - Manaus, November 2012
4. Software Development 3
[Figure: development phases and their roles: Conception (R. Analyst), Planning (P. Manager), Design (Architect), Development (Developer), Testing (Tester), Deployment (Developer). A "Requirement" is shown together with the roles R. Analyst, Architect, Tester, Developer, and P. Manager]
5. Software Development 4
• Collaboration
• Coordination
• Communication
[Figure labels: Goals, Tasks, Dependencies, Deadlines]
6. Software Development 5
• Who talks with whom?
• Who receives help from whom?
• Who is aware of whom?
• Who are the experts?
• Who are the most active contributors?
7. Software Development 6
• Are the team members following the organizational structure?
• Are the team members coordinating with those on whom their work depends?
• Are the next builds going to fail?
9. Software Development 7
• How can we answer these questions?
Social Network Analysis
10. Social network analysis 8
• It provides techniques to examine the structure of social relationships in a group to uncover patterns of behavior and interaction among people [Mitchell, 1969]
11. Agenda 9
• Introduction to social network analysis
• My research on collaboration using SNA
• Recommended reading
12. > Introduction to SNA 10
• Terminology
• Representation
• Measures
• Data collection
• Tools
13. Terminology 11
[Figure: sociogram of the example network with one node labeled "Actor"; the actors are Andrew, Bob, Charles, David, Emma, Fynn, Greg, Hannah, Iris, John, Kevin, and Lucas]
Actor = Node = Vertex
14. Terminology 12
[Figure: sociogram of the example network with one connection labeled "Tie"]
Tie = Link = Edge
15. Terminology 13
[Figure: sociogram of the example network with a pair of actors labeled "Dyad"]
16. Terminology 14
[Figure: sociogram of the example network with a group of three actors labeled "Triad"]
17. Representation 15
• Sociogram
[Figure: sociogram of the example network]
18. Representation 16
• Matrix representation of network data
[Figure: matrix of the example network; legend: tie absent, tie present]
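The matrix representation can be sketched in a few lines of Python. This is only an illustration: the transcript does not list the ties of the example network, so the four actors and four ties below are hypothetical, and the ties are assumed to be undirected.

```python
# Hypothetical undirected ties among four of the actors (not taken from the slides).
actors = ["Andrew", "Bob", "Charles", "David"]
ties = [("Andrew", "Bob"), ("Andrew", "Charles"),
        ("Bob", "Charles"), ("Charles", "David")]

index = {name: i for i, name in enumerate(actors)}
matrix = [[0] * len(actors) for _ in actors]
for a, b in ties:
    # 1 = tie present, 0 = tie absent; both cells are set because the tie is undirected.
    matrix[index[a]][index[b]] = 1
    matrix[index[b]][index[a]] = 1

for name, row in zip(actors, matrix):
    print(f"{name:<8}", *row)
```

For directional ties only the cell (sender, receiver) would be set, so the matrix would no longer need to be symmetric.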
19. Representation 17
• Actors’ attributes
Actor    Role  Country  Work exp.
Andrew   1     1        3
Bob      1     2        3
Charles  1     1        1
David    1     2        2
Emma     1     1        1
Fynn     2     1        3
Greg     2     1        1
Hannah   2     1        1
Iris     2     1        2
John     2     2        3
Kevin    2     2        2
Lucas    2     1        2
Role: 1 = Tester, 2 = Developer
Country: 1 = Canada, 2 = Ireland
Work experience: 1 = 1-6 months, 2 = 6-12 months, 3 = 18+ months
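As a minimal sketch, the coded attribute table above can be stored as a dictionary and decoded with lookup tables. The attribute values are the ones shown on the slide; the variable names are illustrative choices.

```python
from collections import Counter

# Code books from the slide legend.
ROLE = {1: "Tester", 2: "Developer"}
COUNTRY = {1: "Canada", 2: "Ireland"}
EXPERIENCE = {1: "1-6 months", 2: "6-12 months", 3: "18+ months"}

# actor -> (role, country, work experience), as coded on the slide.
attributes = {
    "Andrew": (1, 1, 3), "Bob": (1, 2, 3), "Charles": (1, 1, 1),
    "David": (1, 2, 2), "Emma": (1, 1, 1), "Fynn": (2, 1, 3),
    "Greg": (2, 1, 1), "Hannah": (2, 1, 1), "Iris": (2, 1, 2),
    "John": (2, 2, 3), "Kevin": (2, 2, 2), "Lucas": (2, 1, 2),
}

decoded = {
    name: (ROLE[r], COUNTRY[c], EXPERIENCE[e])
    for name, (r, c, e) in attributes.items()
}
role_counts = Counter(role for role, _, _ in decoded.values())
country_counts = Counter(country for _, country, _ in decoded.values())

print(role_counts)     # number of actors per role
print(country_counts)  # number of actors per country
```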
21. Representation 18
• Sociogram with actors’ attributes
[Figure: sociogram of the example network with a legend for role (Developer, Tester), country (Canada, Ireland), and work experience (1-6 months, 6-12 months, 18+ months)]
22. Representation 19
• Tie weight
• Strength
• Frequency
• Etc...
23. Measures 20
• Overall network characterization
• Network size
• Network density
• Ties statistics
27. Measures 21
• Network size: the number of actors in the social network
[Figure: sociogram of the example network]
Size: 12 actors
Size can be larger or smaller than the team size
Herbsleb and Mockus (2003) found that distributed communication networks are significantly smaller than same-site networks
31. Measures 22
• Network density: the proportion of ties that exist in the network out of the total possible ties. It can vary from 0 to 1.
[Figure: sociogram of the example network]
Possible ties: 12 × (12 - 1) / 2 = 66
Density: 20 / 66 = 0.30
Hinds and McGrath (2006) found that geographic distribution is associated with less dense work ties and less dense information sharing, suggesting that social ties are not particularly important in distributed as compared with collocated teams as a means of coordinating work and improving performance
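The density arithmetic on this slide can be checked with a short sketch. The numbers (12 actors, 20 ties) come from the slide; the function names and the `directed` parameter are illustrative additions.

```python
def possible_ties(n_actors, directed=False):
    """Maximum number of ties among n_actors, ignoring self-ties."""
    pairs = n_actors * (n_actors - 1)
    # An undirected tie A-B is the same as B-A, so each pair counts once.
    return pairs if directed else pairs // 2


def density(n_ties, n_actors, directed=False):
    """Proportion of existing ties out of all possible ties (0 to 1)."""
    return n_ties / possible_ties(n_actors, directed)


print(possible_ties(12))          # 66
print(round(density(20, 12), 2))  # 0.3
```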
34. Measures 23
• Ties statistics: uses the actors’ attributes to reveal overall network characteristics
[Figure: sociogram of the example network]
E.g.: 5 testers and 7 developers
By counting the number of ties within and across sites, Herbsleb and Mockus (2003) found that there is much more frequent communication with local colleagues in a distributed project
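Counting ties within and across sites can be sketched as follows. The countries of the four actors match the attribute table shown earlier in the deck, but the ties themselves are hypothetical, since the transcript does not list the ties of the example network.

```python
# Country per actor (from the attribute table); ties are hypothetical.
country = {"Andrew": "Canada", "Bob": "Ireland",
           "Charles": "Canada", "David": "Ireland"}
ties = [("Andrew", "Bob"), ("Andrew", "Charles"),
        ("Bob", "David"), ("Charles", "David")]

# A tie is "within site" when both actors share the same country.
within_site = sum(1 for a, b in ties if country[a] == country[b])
cross_site = len(ties) - within_site

print(f"within-site ties: {within_site}, cross-site ties: {cross_site}")
```

The same pattern works for any attribute, for example counting ties between testers and developers.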
35. Measures 24
• Network structure
• Network centralization
• Core-periphery
• Ties reciprocity
• Clique
38. Measures 25
• Network centralization: the sum of the differences between the number of ties of the most connected actor and the number of ties of every other actor, divided by the maximum possible sum of differences. A centralized network structure (index = 1) will have many of its ties concentrated around one or a few actors, while a decentralized network structure (index = 0) is one in which there is little variation between the number of ties each actor possesses [Freeman, 1978].
[Figure: sociogram of the example network]
Centralization index = 0.39
Tsai (2002) found that a formal hierarchical structure in the form of centralization has a significant negative effect on knowledge sharing among organizational units
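A minimal sketch of a degree-based centralization index for undirected ties is given below. It assumes the common formulation in which the maximum possible sum of differences is (n - 1)(n - 2), reached by a star network; analysis tools may use other variants, and the index of 0.39 on the slide cannot be reproduced here because the ties of the example network are not listed. The two test networks are hypothetical.

```python
def build_adjacency(ties):
    """Turn a list of undirected ties into {actor: set of neighbors}."""
    adj = {}
    for a, b in ties:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def degree_centralization(adj):
    """Sum of (max degree - degree) over actors, divided by (n-1)(n-2)."""
    n = len(adj)
    if n < 3:
        raise ValueError("centralization needs at least 3 actors")
    degrees = [len(neighbors) for neighbors in adj.values()]
    max_degree = max(degrees)
    return sum(max_degree - d for d in degrees) / ((n - 1) * (n - 2))


# Hypothetical networks: a star (one hub) and a ring (everyone has 2 ties).
star = build_adjacency([("Hub", "A"), ("Hub", "B"), ("Hub", "C"), ("Hub", "D")])
ring = build_adjacency([("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "A")])

print(degree_centralization(star))  # 1.0, fully centralized
print(degree_centralization(ring))  # 0.0, fully decentralized
```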
41. Measures 26
• Core-periphery: indicates the extent to which the structure of a network consists of two classes of actors: a cohesive subnetwork, the core, in which the actors are connected to each other in some maximal sense; and a class of actors that are more loosely connected to the cohesive subnetwork but lack any maximal cohesion with the core, the peripheral actors. A high core value (close to 1) indicates a strong core-periphery structure [Borgatti and Everett, 1999].
[Figure: sociogram of the example network]
Core-periphery index = 0.47
Hinds and McGrath (2006) found that communication networks with a strong core-periphery structure lead to fewer coordination problems than loosely connected networks
45. Measures 27
• Ties reciprocity: when the relationship is considered directional (e.g., friendship, trust), the reciprocity index can be calculated using the dyad method, the ratio of the number of pairs of actors with reciprocated ties relative to the number of pairs with any tie between the actors; or the arc method, the ratio of the number of ties that are involved in reciprocal relationships relative to the total number of actual ties [Hanneman and Riddle, 2005].
[Figure: sociogram of the example network]
Dyad method index = 0.85
The higher the index of reciprocal ties, the more stable or equal the network structure is [Rao and Bandyopadhyay, 1987].
A higher reciprocity index suggests a more horizontal structure, while the opposite suggests a more hierarchical network [Hanneman and Riddle, 2005].
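The two reciprocity methods can be compared on a small example. This is a sketch under stated assumptions: the directed ties below are hypothetical (the slide's index of 0.85 depends on ties not listed in the transcript), and self-ties are assumed not to occur.

```python
# Hypothetical directed ties (sender, receiver).
ties = {("A", "B"), ("B", "A"), ("A", "C"),
        ("C", "D"), ("D", "C"), ("B", "D")}


def reciprocity(ties):
    """Return (dyad index, arc index) for a set of directed ties."""
    # Pairs of actors with at least one tie in either direction.
    connected_pairs = {frozenset(tie) for tie in ties}
    # Pairs where the tie exists in both directions.
    mutual_pairs = {frozenset((a, b)) for a, b in ties if (b, a) in ties}
    dyad_index = len(mutual_pairs) / len(connected_pairs)
    # Individual ties that have a tie going back the other way.
    reciprocated_ties = sum(1 for a, b in ties if (b, a) in ties)
    arc_index = reciprocated_ties / len(ties)
    return dyad_index, arc_index


dyad_index, arc_index = reciprocity(ties)
print(dyad_index)           # 2 mutual pairs out of 4 connected pairs
print(round(arc_index, 2))  # 4 reciprocated ties out of 6 ties
```

The arc index is never lower than the dyad index, because a mutual pair contributes two ties to the numerator while a one-way pair contributes a single tie to the denominator.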
48. Measures 28
• Clique: consists of a subset of at least 3 actors in which every possible pair of actors is directly connected by a tie and there are no other actors that are also directly connected to all members of the clique [Wasserman and Faust, 1994].
[Figure: sociogram of the example network]
Cliques in the example network:
- Andrew, Bob, Charles, and David
- Andrew, David, and Emma
- Fynn, Iris, John, and Kevin
- Fynn, John, Kevin, and Lucas
Cain and colleagues (1996) found 3 large cliques consisting of team members developing 3 major activities: architecture design, code development, and code review in the communication networks of a certain development team
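A sketch of how cliques can be enumerated is shown below, using the basic Bron-Kerbosch procedure (named here as a swapped-in standard technique; the slides do not say which algorithm the tools use). The ties are an assumption: only the ties implied by the four cliques listed on the slide are included, which is a partial reconstruction of the example network rather than its full tie list.

```python
from itertools import combinations

# Cliques listed on the slide; the ties below are only those implied by them.
slide_cliques = [
    ["Andrew", "Bob", "Charles", "David"],
    ["Andrew", "David", "Emma"],
    ["Fynn", "Iris", "John", "Kevin"],
    ["Fynn", "John", "Kevin", "Lucas"],
]

adj = {}
for members in slide_cliques:
    for a, b in combinations(members, 2):
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)


def maximal_cliques(adj, min_size=3):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    found = []

    def expand(current, candidates, excluded):
        # No actor left that could extend the clique: it is maximal.
        if not candidates and not excluded:
            if len(current) >= min_size:
                found.append(sorted(current))
            return
        for actor in list(candidates):
            expand(current | {actor},
                   candidates & adj[actor],
                   excluded & adj[actor])
            candidates = candidates - {actor}
            excluded = excluded | {actor}

    expand(set(), set(adj), set())
    return sorted(found)


cliques = maximal_cliques(adj)
for members in cliques:
    print(", ".join(members))
```

On this reconstructed tie list the procedure returns the same four cliques as the slide, which illustrates the "no other actor is connected to all members" condition: Iris and Lucas are not tied to each other, so the two developer cliques stay separate.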
49. Measures 29
• Information exchange
• Reachability
• Component
• Degree centrality
52. Measures 30
• Reachability: one actor is reachable by another actor if there exists any set of ties that connects both actors, regardless of how many others fall in between them [Wasserman and Faust, 1994].
[Figure: sociogram of the example network]
All actors are reachable
If some actors cannot reach others, there is a potential division in the network and thus information cannot reach everyone
55. Measures 31
• Component: indicates whether a social network is connected. A network is connected if there is a path between every pair of actors; otherwise it is disconnected. The actors in a disconnected network may be partitioned into subsets called components [Wasserman and Faust, 1994].
[Figure: sociogram of the example network]
One component
The component test indicates whether there is a group of people connected to each other and disconnected from the rest, while the clique test indicates whether a subset of actors is completely connected
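Components and reachability can both be computed with a breadth-first search. The sketch below uses a hypothetical undirected network of five actors split into two groups; it is not the example network from the slides, which has a single component.

```python
from collections import deque

# Hypothetical undirected network: {actor: set of neighbors}.
network = {
    "A": {"B"}, "B": {"A", "C"}, "C": {"B"},
    "D": {"E"}, "E": {"D"},
}


def components(adj):
    """Partition the actors into connected components using BFS."""
    seen = set()
    result = []
    for start in adj:
        if start in seen:
            continue
        group = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            actor = queue.popleft()
            for neighbor in adj[actor]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    group.add(neighbor)
                    queue.append(neighbor)
        result.append(sorted(group))
    return result


def reachable(adj, a, b):
    """Two actors can reach each other if they share a component."""
    return any(a in group and b in group for group in components(adj))


print(components(network))           # two components
print(reachable(network, "A", "C"))  # A reaches C through B
print(reachable(network, "A", "D"))  # no path between the two groups
```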
59. Measures 32
• Degree centrality: indicates the number of ties of an actor and is indicative of activity. When the ties are directional, we have out-degree, which counts the ties from a certain actor to others, and in-degree, which counts the ties from others to a certain actor [Freeman and colleagues, 1979].
[Figure: sociogram of the example network]
Fynn is the member with the highest out- and in-degree
Hossain and colleagues (2006) found that highly centralized members coordinate better than others
Bird and colleagues (2006) found that degree centrality indicated that developers who actually committed changes played much more significant roles in the email community than non-developers
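Out-degree and in-degree can be counted directly from a list of directed ties. The ties below are hypothetical and chosen only so that Fynn comes out on top, as on the slide; they are not the ties of the example network.

```python
from collections import Counter

# Hypothetical directed ties (sender, receiver).
ties = [("Fynn", "Greg"), ("Fynn", "Hannah"), ("Greg", "Fynn"),
        ("Hannah", "Fynn"), ("Iris", "Fynn")]

out_degree = Counter(sender for sender, _ in ties)      # ties sent to others
in_degree = Counter(receiver for _, receiver in ties)   # ties received from others

for actor in sorted(set(out_degree) | set(in_degree)):
    print(f"{actor:<7} out={out_degree[actor]} in={in_degree[actor]}")
```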
63. Measures 33
• Brokerage: indicates when an actor, named broker, connects two otherwise unconnected actors or subgroups. Brokerage occurs when, in a triad of actors A, B, and C, A has a tie to B, B has a tie to C, but A has no tie to C. A needs B to reach C, therefore B is a broker. The actors need to be partitioned into subgroups per attribute [Gould and Fernandez, 1989].
[Figure: sociogram of the example network]
Fynn brokers information among his developer colleagues
Hinds and McGrath (2006) found that brokers effectively disseminate information between distributed sites when maintaining direct relationships is not practical
Ehrlich and colleagues (2008) found that brokers are usually the most knowledgeable members of a team regardless of geographical location
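The triad condition described above (A tied to B, B tied to C, A not tied to C) can be checked with a short sketch. It is a simplification: ties are treated as undirected and the partition of actors into subgroups is not modeled, so it only finds brokers, not the subgroup-based brokerage types. The network is hypothetical.

```python
from itertools import combinations

# Hypothetical undirected network: {actor: set of neighbors}.
network = {
    "A": {"B"},
    "B": {"A", "C", "D"},
    "C": {"B", "D"},
    "D": {"B", "C"},
}


def brokered_pairs(adj):
    """Map each broker to the pairs of its neighbors that have no direct tie."""
    brokers = {}
    for broker, neighbors in adj.items():
        pairs = [(a, c) for a, c in combinations(sorted(neighbors), 2)
                 if c not in adj[a]]
        if pairs:
            brokers[broker] = pairs
    return brokers


print(brokered_pairs(network))  # B stands between A and C, and between A and D
```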
66. Measures 34
• Cutpoint: indicates a weak point in the network. If this actor were removed along with his connections, the network would become divided into unconnected parts. A set of cutpoints is called a cutset.
[Figure: sociogram of the example network]
Andrew and Fynn are the cutset
In communication networks a cutpoint indicates disruption of information flow
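A simple way to find cutpoints is to remove each actor in turn and test whether the remaining network is still connected. The sketch below assumes the network is connected to begin with and uses a hypothetical network of two triangles joined by a single tie; it is not the example network from the slides.

```python
# Hypothetical undirected network: triangles A-B-C and D-E-F joined by C-D.
network = {
    "A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"},
    "D": {"C", "E", "F"}, "E": {"D", "F"}, "F": {"D", "E"},
}


def is_connected(adj, removed=None):
    """Check connectivity with a depth-first search, skipping `removed`."""
    actors = [actor for actor in adj if actor != removed]
    if not actors:
        return True
    seen = {actors[0]}
    stack = [actors[0]]
    while stack:
        actor = stack.pop()
        for neighbor in adj[actor]:
            if neighbor != removed and neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return len(seen) == len(actors)


def cutpoints(adj):
    """Actors whose removal splits an initially connected network."""
    return sorted(actor for actor in adj if not is_connected(adj, removed=actor))


print(cutpoints(network))  # the two actors on the bridging tie
```

This brute-force check runs one search per actor, which is fine for team-sized networks; larger networks would usually call for a single-pass articulation-point algorithm.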
67. Data collection 35
• Manual
• Survey
• Work diary
• Observation
69. Data collection 37
• Automatic
• Mining software repositories
• E.g.: source-code, bug trackers
70. Tools 38
• UCINet
https://sites.google.com/site/ucinetsoftware/home
71. Tools 39
• NetMiner
http://www.netminer.com
72. Tools 40
• Gephi
https://gephi.org/
73. > My research 41
• RE ’07: Patterns
• RE ’08: Brokerage
• Book Ch. ’10: RDC framework
• RE ’11: Roles and communication
• ICSE ’12: Domain knowledge
74. > My research 42
• RE ’07: Collaboration patterns and impact of
distance on awareness
75. > My research 43
• RE ’08: Brokerage
- Brokerage was predominant in certain types of communication
- Distance did not matter
- Knowledge and experience were determinants of brokerage
76. > My research 44
• Book ch. ’10: RDC framework
77. > My research 45
• RE ’11: Roles and communication structures
78. > My research 46
• ICSE ’13: Domain knowledge and hierarchical
control structures in coordination
Communication ties do not follow task assignments but do follow the hierarchical structure
79. > Recommended reading 47
Rob Cross and Andrew Parker.
The Hidden Power of Social
Networks: Understanding How
work Really Gets Done in
Organizations. Harvard Business
School Press, Boston, United
States, June 2004.
80. > Recommended reading 48
John Scott. Social Network
Analysis: A Handbook. Sage
Publications, London,
England, 2nd edition, March
2000.
81. > Recommended reading 49
Stanley Wasserman and
Katherine Faust. Social
Network Analysis: Methods and
Applications. Cambridge
University Press, Cambridge,
United Kingdom, 1994.
82. > Recommended reading 50
• Kate Ehrlich and Klarissa Chang. Leveraging Expertise in Global
Software Teams: Going Outside Boundaries. In IEEE Proc. of the
International Conference on Global Software Engineering, 149–
158, Florianópolis, Brazil, October 2006.
• Marcelo Cataldo, Patrick Wagstrom, James Herbsleb, and
Kathleen Carley. Identification of Coordination Requirements:
Implications for the Design of Collaboration and Awareness
Tools. In ACM Proc. of the Conference on Computer Supported
Cooperative Work, 353–362, Banff, Canada, November 2006.
83. > References 51
[Mitchell, 1969] J. Clyde Mitchell. Social Networks in Urban Situations: Analyses of
Personal Relationships in Central African Towns. Manchester University Press, Manchester,
United Kingdom, November 1969.
[Herbsleb and Mockus, 2003] James Herbsleb and Audris Mockus. An Empirical
Study of Speed and Communication in Globally Distributed Software Development. IEEE
Transactions on Software Engineering, 29(6): 481–494, June 2003.
[Hinds and McGrath, 2006] Pamela Hinds and Cathleen McGrath. Structures that
Work: Social Structure, Work Structure and Coordination Ease in Geographically
Distributed Teams. In ACM Proc. of the Conference on Computer Supported Cooperative
Work, 343–352, Banff, Canada, November 2006.
[Freeman, 1978] Linton Freeman. Centrality in Social Networks: Conceptual
Clarification. Social Networks, 1(3): 215–239, 1978/1979.
[Tsai, 2002] Wenpin Tsai. Social Structure of “Coopetition” Within a Multiunit
Organization: Coordination, Competition, and Intraorganizational Knowledge Sharing.
Organization Science, 13(2):179–190, March 2002.
84. > References 52
[Borgatti and Everett, 1999] Stephen Borgatti and Martin Everett. Models of Core/
Periphery Structures. Social Networks, 21(4): 375–395, October 1999.
[Hanneman and Riddle, 2005] Robert Hanneman and Mark Riddle. Introduction to
Social Network Methods. University of California, Riverside, United States, 2005.
[Rao and Bandyopadhyay, 1987] Ramachandra Rao and Suraj Bandyopadhyay.
Measures of Reciprocity in a Social Network. Sankhya: The Indian Journal of Statistics,
Series A, 49(2): 141–188, June 1987.
[Wasserman and Faust, 1994] Stanley Wasserman and Katherine Faust. Social
Network Analysis: Methods and Applications. Cambridge University Press, Cambridge,
United Kingdom, 1994.
[Cain and colleagues, 1996] Brendan Cain, James Coplien, and Neil Harrison. Social
Patterns in Productive Software Development Organizations. Annals of Software
Engineering, 2(1): 259–286, 1996.
85. > References 53
[Freeman and colleagues, 1979] Linton Freeman, Douglas Roeder, and Robert
Mulholland. Centrality in Social Networks: II. Experimental Results. Social Networks, 2(2):
119–141, 1979/1980.
[Hossain and colleagues, 2006] Liaquat Hossain, Andre Wu, and Kenneth Chung.
Actor Centrality Correlates to Project Based Coordination. In ACM Proc. of the
Conference on Computer Supported Cooperative Work, 363–372, Banff, Canada,
November 2006.
[Bird and colleagues, 2006] Christian Bird, Alex Gourley, Premkumar Devanbu,
Michael Gertz, and Anand Swaminathan. Mining Email Social Networks. In ACM Proc. of
the Int’l Workshop on Mining Software Repositories, 137–143, Shanghai, China, May 2006.
[Ehrlich and colleagues, 2008] Kate Ehrlich, Mary Helander, Giuseppe Valetto,
Stephen Davies, and Clay Williams. An Analysis of Congruence Gaps and Their Effect on
Distributed Software Development. In Workshop on Socio-Technical Congruence, in conj.
with the Int’l Conference on Software Engineering, Leipzig, Germany, May 2008. ACM.
86. > References 54
[RE ’07] Daniela Damian, Sabrina Marczak, and Irwin Kwan, “Collaboration Patterns and the
Impact of Distance on Awareness in Requirements-Centred Social Networks”, In: IEEE Proc.
International Requirements Engineering Conference, New Delhi, India, 59-68, 2007.
[RE ’08] Sabrina Marczak, Daniela Damian, Ulrike Stege, and Adrian Schroeter, “Information
Brokers in Requirements-Dependency Social Networks”, In: IEEE Proc. International
Requirements Engineering Conference, Barcelona, Spain, 53-62, September 2008.
[Book ch. ’10] Daniela Damian, Irwin Kwan, and Sabrina Marczak, Requirements-Driven
Collaboration: Leveraging the Invisible Relationships between Requirements and People,
Collaborative Software Engineering, Mistrik, I., Grundy, J., van der Hoek, A, Whitehead, J. (Eds.),
Chapter 3, pages 57-76, Springer-Verlag, London, England, March 2010.
[RE ’11] Sabrina Marczak and Daniela Damian, “How Interaction Between Roles Shapes the
Communication Structure in Requirements-Driven Collaboration”, In: IEEE Proc. International
Requirements Engineering Conference, Trento, Italy, 47-56, 2011.
[ICSE ’13] Daniela Damian, Remko Helms, Irwin Kwan, Sabrina Marczak, and Benjamin
Koelewijn, “The Role of Domain Knowledge and Hierarchical Control Structures in Socio-
Technical Coordination”, In: IEEE International Conference on Software Engineering, San
Francisco, USA, May 2013 (To appear).
87. Workshop de Métodos Ágeis para Desenvolvimento Distribuído de Software
Manaus, November 2012
Thank you for your attention!
Questions?
Sabrina Marczak
sabrina.marczak@pucrs.br
http://www.inf.pucrs.br/sabrina.marczak/