The Anatomy of Developer Social Networks

Qiaona HONG
Supervisor: Prof. Shing-Chi Cheung

1

Social Network
General Social Network (GSN)
• Study the Topological Structure of Social Network
– Y. Y. Ahn @WWW '07; A. Mislove @IMC '07
• Study the Community Structure of Social Network
– V. D. Blondel @Journal of Statistical Mechanics: Theory and Experiment; Y. R. Lin @WI '07
• Techniques to visualize the social network
– Jeffrey Heer @InfoVis '05
• Influential People & Information Diffusion
– Kimura, M. @InfoVis '07
• Friend Recommendation
– Nitai B. Silva @WCCI '10
2

Research Questions
• Q1: What are the similarities and differences
between DSNs and GSNs?

3

Research Questions
• Q2: How do DSNs evolve over time?
• Q3: How do communities evolve in DSNs?
• Q4: What are the similarities and differences
between DSNs extracted using different social
linkage indicators?

4

Research Questions
• Q2: How do DSNs evolve over time?
• Q3: How do communities evolve in DSNs?
• Q4: What are the similarities and differences
between DSNs extracted using different social
linkage indicators?
Qiaona HONG, Sunghun Kim, S.C. Cheung and
Christian Bird, “Understanding a Developer Social
Network and its Evolution”, in Proceedings of the
27th IEEE International Conference on Software
Maintenance, 2011.
5

Subjects
• Mozilla Bug Report: 2000-2009
– 496,692 bug reports
– 3,893,025 comments
• Mozilla CVS Log: 2000-2009
– 44,394 revisions
• Eclipse Bug Report: 2002-2009
– 294,938 bug reports
– 1,618,667 comments
• Eclipse CVS Log: 2002-2009
– 22,493 revisions

6

DSN Extraction Approach
[Figure, slide 7: four bug reports (Bug Report 1 to Bug Report 4), each listing comments by some of David, Bob, Jack and Bill, mapped to a graph with the nodes David, Bill, Bob and Jack]
[Figure, slide 8: the same graph with edge weights 1, 2, 2, 2, 2 and 4]
[Figure, slide 9: the graph with only the weight-4 edge labelled, shown next to Bob and Jack]
[Figure, slide 10: only Bob and Jack remain]

10
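The extraction figures can be read as a weighted co-commenting graph. The sketch below is a minimal illustration under that reading, not the authors' implementation. It assumes that two developers are linked when they comment on the same bug report, that the edge weight is the number of shared reports, and that weak edges are dropped with a threshold. The names `build_dsn` and `prune`, the toy data and the threshold value are all hypothetical.

```python
from collections import Counter
from itertools import combinations

def build_dsn(bug_reports):
    """Return {(dev_a, dev_b): weight}, where weight is the number of
    bug reports both developers commented on (an assumed linkage rule)."""
    weights = Counter()
    for commenters in bug_reports.values():
        # Count each developer once per report, in a stable order.
        for a, b in combinations(sorted(set(commenters)), 2):
            weights[(a, b)] += 1
    return dict(weights)

def prune(dsn, min_weight):
    """Keep only edges whose weight is at least min_weight (assumed filter)."""
    return {edge: w for edge, w in dsn.items() if w >= min_weight}

# Hypothetical toy data, not the exact content of the slide figure.
reports = {
    "Bug Report 1": ["David", "Bob", "Jack"],
    "Bug Report 2": ["Bob", "Jack"],
    "Bug Report 3": ["Bob", "Jack", "Bill"],
    "Bug Report 4": ["Jack", "Bob", "Bill"],
}
dsn = build_dsn(reports)
strong = prune(dsn, 4)
```

With this toy input, Bob and Jack share four reports, so they are the only pair that survives a threshold of 4.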

Metrics
• Degree Distribution
– The distribution of node degree, i.e. the number of edges connected to a node
• Degree of Separation
– The length of the shortest path between two nodes
• Modularity
– Measures the quality of a division of nodes into communities
• Community Size
– The number of nodes within a community
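The first two metrics can be computed directly from an adjacency structure. The sketch below is illustrative only. It assumes an undirected, unweighted graph stored as a dict of neighbor sets; the function names and the four-developer path graph are hypothetical.

```python
from collections import Counter, deque

def degree_distribution(adj):
    """Map each degree k to the fraction of nodes with exactly k edges."""
    counts = Counter(len(neighbors) for neighbors in adj.values())
    n = len(adj)
    return {k: c / n for k, c in sorted(counts.items())}

def separation_distribution(adj):
    """Map each shortest-path length d to the fraction of connected
    node pairs that are exactly d hops apart (BFS from every node)."""
    counts = Counter()
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        counts.update(d for d in dist.values() if d > 0)
    total = sum(counts.values())
    return {d: c / total for d, c in sorted(counts.items())}

# Hypothetical path graph: David - Bob - Jack - Bill.
adj = {
    "David": {"Bob"},
    "Bob": {"David", "Jack"},
    "Jack": {"Bob", "Bill"},
    "Bill": {"Jack"},
}
deg = degree_distribution(adj)
sep = separation_distribution(adj)
```

Running BFS from every node is quadratic in the worst case, which is one reason large networks are often sampled before measuring separation.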

11

Modularity

[Figure: two example networks, A (0.51) and B (0.176)]
• According to A. Clauset’s work, a modularity above about 0.3 is
a good indicator of significant community structure
in a network
• When the modularity is 0, the community structure
is no stronger than that of a randomly generated
network
12
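As a worked illustration of the modularity score, the sketch below uses the standard definition for an undirected, unweighted graph: for each community, the fraction of edges inside it minus the squared fraction of edge ends attached to it. The function name and the two-triangle example are assumptions for illustration, not data from the study.

```python
def modularity(edges, community_of):
    """Newman modularity of an undirected, unweighted graph.
    edges: iterable of (u, v) pairs; community_of: node -> community label."""
    edges = list(edges)
    m = len(edges)
    internal = {}    # edges with both ends inside each community
    degree_sum = {}  # total degree of the nodes in each community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )

# Hypothetical graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
merged = {node: "all" for node in range(6)}
q_split = modularity(edges, split)    # 5/14, roughly 0.357
q_merged = modularity(edges, merged)  # 0.0
```

Splitting the graph into its two triangles scores above the 0.3 guideline, while putting every node in one community scores exactly 0.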

Communities in DSN
• Identified Communities in DSN
– Louvain Algorithm (by optimizing modularity)
– 50 different input orderings of nodes
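Node order matters because Louvain visits nodes greedily, so different orders can end in different partitions. The sketch below is a simplified illustration, not the study's code: it implements only the first (local-moving) phase of Louvain on an unweighted graph, omits the aggregation phase, and keeps the best partition over several shuffled orders. All function names, the seed and the toy graph are hypothetical.

```python
import random

def local_moving(adj, order):
    """One Louvain-style local-moving phase on a simple unweighted graph.
    adj: node -> set of neighbors; order: sequence in which nodes are visited."""
    two_m = sum(len(nb) for nb in adj.values())
    comm = {v: v for v in adj}           # start from singleton communities
    tot = {v: len(adj[v]) for v in adj}  # total degree per community
    moved = True
    while moved:
        moved = False
        for v in order:
            k_v, old = len(adj[v]), comm[v]
            tot[old] -= k_v              # take v out of its community
            links = {}                   # edges from v to each neighboring community
            for u in adj[v]:
                links[comm[u]] = links.get(comm[u], 0) + 1
            # Integer-scaled modularity gain keeps comparisons exact.
            best, best_gain = old, links.get(old, 0) * two_m - tot[old] * k_v
            for c, k_in in links.items():
                gain = k_in * two_m - tot[c] * k_v
                if gain > best_gain:
                    best, best_gain = c, gain
            comm[v] = best
            tot[best] += k_v
            moved = moved or best != old
    return comm

def partition_modularity(adj, comm):
    """Modularity of a partition given as node -> community label."""
    two_m = sum(len(nb) for nb in adj.values())
    inside, tot = {}, {}
    for v, nb in adj.items():
        c = comm[v]
        tot[c] = tot.get(c, 0) + len(nb)
        # Each intra-community edge is counted once from each endpoint.
        inside[c] = inside.get(c, 0) + sum(1 for u in nb if comm[u] == c)
    return sum(inside[c] / two_m - (tot[c] / two_m) ** 2 for c in tot)

def best_of_orderings(adj, runs=50, seed=0):
    """Run local moving on `runs` shuffled node orders; keep the best partition."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    best_comm, best_q = None, float("-inf")
    for _ in range(runs):
        order = nodes[:]
        rng.shuffle(order)
        comm = local_moving(adj, order)
        q = partition_modularity(adj, comm)
        if q > best_q:
            best_comm, best_q = comm, q
    return best_comm, best_q

# Hypothetical graph: two triangles joined by the bridge edge 2-3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
fixed = local_moving(adj, [0, 1, 2, 3, 4, 5])
best_comm, best_q = best_of_orderings(adj)
```

A full Louvain run would also collapse each community into a single node and repeat the local moves on the reduced graph; that step is left out here for brevity.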

13

Q1: What are the similarities
and differences between
DSNs and GSNs?

Degree Distribution    Degree of Separation

Modularity    Community Size

14

Q1: What are the similarities and differences between DSNs and GSNs?

Degree Distribution
[Figure: degree distribution plots for (1) MozillaDSN-BR, (2) MozillaDSN-CL, (3) EclipseDSN-BR, (4) EclipseDSN-CL]
15


Degree Distribution


16


Degree Distribution

• Quantitative power law fit test
– An approach to analyzing power-law distributed
data introduced by A. Clauset et al.
• P-value: the likelihood that the degree
distribution actually follows a power law
– If the p-value is less than 0.1, the power law is
rejected.
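The p-value test can be sketched as a goodness-of-fit bootstrap: fit a power law, measure how far the data sit from the fit, and count how often synthetic power-law samples sit at least as far away. The code below is a simplified illustration and not the full procedure of Clauset et al. It assumes a continuous power law with a fixed, user-supplied `xmin`, whereas the full method also estimates `xmin` and has a discrete form suited to integer degrees. Function names, seeds and sample data are hypothetical.

```python
import math
import random

def fit_alpha(xs, xmin):
    """Maximum-likelihood exponent of a continuous power law on x >= xmin."""
    return 1 + len(xs) / sum(math.log(x / xmin) for x in xs)

def ks_distance(xs, xmin, alpha):
    """Largest gap between the empirical CDF and the fitted power-law CDF."""
    xs = sorted(xs)
    n = len(xs)
    worst = 0.0
    for i, x in enumerate(xs):
        model = 1 - (x / xmin) ** (1 - alpha)
        worst = max(worst, abs(model - i / n), abs((i + 1) / n - model))
    return worst

def power_law_p_value(xs, xmin=1.0, trials=100, seed=0):
    """Fraction of synthetic power-law samples that fit no better than xs."""
    rng = random.Random(seed)
    alpha = fit_alpha(xs, xmin)
    observed = ks_distance(xs, xmin, alpha)
    worse = 0
    for _ in range(trials):
        # Inverse-transform sampling from the fitted power law.
        synth = [xmin * (1 - rng.random()) ** (-1 / (alpha - 1)) for _ in xs]
        if ks_distance(synth, xmin, fit_alpha(synth, xmin)) >= observed:
            worse += 1
    return worse / trials

# Hypothetical samples: one drawn from a power law (alpha = 2.5), one uniform.
rng = random.Random(1)
power_like = [(1 - rng.random()) ** (-1 / 1.5) for _ in range(1000)]
uniform = [rng.uniform(1, 100) for _ in range(500)]

alpha_hat = fit_alpha(power_like, 1.0)
p_power = power_law_p_value(power_like)
p_uniform = power_law_p_value(uniform)
reject_uniform = p_uniform < 0.1  # the rejection rule stated on the slide
```

The uniform sample is far from any power law, so its p-value falls below the 0.1 threshold and the power-law hypothesis is rejected for it.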

17


Degree Distribution
[Figure annotations: "P-value<0.1" and "some<0.1, other>0.1"; visible panel labels: (3) EclipseDSN-BR, (4) EclipseDSN-CL]
Different from GSNs, DSNs do not follow a power law
18

Degree of Separation
[Figure: probability distribution of the distance between two developers. Panels: MozillaDSN-BR, MozillaDSN-CL, EclipseDSN-BR, EclipseDSN-CL. Series: 1-month, 3-month, 6-month, 1-year, 2-year and 4-year DSN; twitter (8000 sample); cyworld (3000 sample). X axis: Distance between two developers (0 to 18). Y axis: Probability.]
19

Degree of Separation
[Figure: the degree-of-separation plots of the previous slide (panels MozillaDSN-BR, MozillaDSN-CL, EclipseDSN-BR, EclipseDSN-CL; X axis: Distance between two developers; Y axis: Probability), with the annotations "4.12" and "90% (6)"]
20

Degree of Separation
[Figure: the degree-of-separation plots of the previous slides (panels MozillaDSN-BR, MozillaDSN-CL, EclipseDSN-BR, EclipseDSN-CL; X axis: Distance between two developers; Y axis: Probability), with the annotations "3.0", "2.1", "4.0" and "2.5"]
Developers in DSN are much closer to each other than participants in GSN.
21


Modularity
[Figure: modularity (Y axis, 0.3 to 0.7) per network (X axis) in four panels: MozillaDSN-CL, MozillaDSN-BR, EclipseDSN-CL, EclipseDSN-BR. Networks compared: 1-month, 3-month, 6-month, 1-year, 2-year and 4-year DSN, Facebook, Cyworld.]
Similar to GSNs, all DSNs have significant community structure
22


Community Size


23


Community Size

28%


24


Community Size

21%-36% 23%-43%


15%-30% 23%-33%

25

Q4: What are the similarities and
differences between DSNs extracted
using different social linkage indicators?
Q2: How do DSNs evolve over time?

Degree Distribution    Degree of Separation

Modularity    Community Size

26


Change of Developer Size

DSNs-BR always have more developers than DSNs-CL

27


Change of Percentage of Newcomers

DSNs-BR always have a higher percentage of
newcomers than DSNs-CL
28
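The newcomer percentage can be illustrated with a short computation. The slides do not define a newcomer, so the sketch below assumes one plausible definition: a developer active in a time window who did not appear in any earlier window. The function name and the sample windows are hypothetical.

```python
def newcomer_percentages(periods):
    """periods: list of sets of developers active in consecutive time windows.
    Returns, per window, the percentage of developers not seen in any
    earlier window (an assumed definition of 'newcomer')."""
    seen = set()
    result = []
    for devs in periods:
        new = devs - seen
        result.append(100 * len(new) / len(devs) if devs else 0.0)
        seen |= devs
    return result

# Hypothetical activity windows.
periods = [
    {"David", "Bob"},
    {"Bob", "Jack", "Bill", "David"},
    {"Bob", "Jack"},
]
pct = newcomer_percentages(periods)
```

By construction the first window always reports 100%, so comparisons between networks are more meaningful from the second window onward.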
