The Anatomy of Developer Social Networks

Qiaona Hong's MPhil defense slides


  1. The Anatomy of Developer Social Networks
     Qiaona HONG
     Supervisor: Prof. Shing-Chi Cheung
  2. Social Network: General Social Network (GSN)
     • Study the topological structure of social networks – Y. Y. Ahn @ WWW 07; A. Mislove @ IMC 07
     • Study the community structure of social networks – V. D. Blondel @ Journal of Statistical Mechanics: Theory and Experiment; Y. R. Lin @ WI 07
     • Techniques to visualize social networks – Jeffrey Heer @ InfoVis 05
     • Influential people and information diffusion – Kimura, M. @ InfoVis 07
     • Friend recommendation – Nitai B. Silva @ WCCI '10
  3. Research Questions
     • Q1: What are the similarities and differences between DSNs and GSNs?
  4. Research Questions
     • Q1: What are the similarities and differences between DSNs and GSNs?
     • Q2: How do DSNs evolve over time?
     • Q3: How do communities evolve in DSNs?
     • Q4: What are the similarities and differences between DSNs extracted using different social linkage indicators?
  5. Research Questions
     • Q1: What are the similarities and differences between DSNs and GSNs?
     • Q2: How do DSNs evolve over time?
     • Q3: How do communities evolve in DSNs?
     • Q4: What are the similarities and differences between DSNs extracted using different social linkage indicators?
     Qiaona HONG, Sunghun Kim, S.C. Cheung and Christian Bird, "Understanding a Developer Social Network and its Evolution", in Proceedings of the 27th IEEE International Conference on Software Maintenance, 2011.
  6. Subjects
     • Mozilla Bug Report: 2000-2009 – 496,692 bug reports – 3,893,025 comments
     • Mozilla CVS Log: 2000-2009 – 44,394 revisions
     • Eclipse Bug Report: 2002-2009 – 294,938 bug reports – 1,618,667 comments
     • Eclipse CVS Log: 2002-2009 – 22,493 revisions
  7. DSN Extraction Approach
     [Figure: Bug Report 1 to Bug Report 4, each listing comments by David, Bob, Jack and Bill; the four developers are drawn as nodes David, Bill, Bob and Jack]
  8. DSN Extraction Approach
     [Figure: same bug reports; the links between David, Bill, Bob and Jack are labelled with weights 1, 2, 2, 2, 2 and 4]
  9. DSN Extraction Approach
     [Figure: same bug reports; nodes David, Bill, Bob and Jack, with only the weight 4 still shown]
  10. DSN Extraction Approach
     [Figure: same bug reports; only Bob and Jack remain]
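
The extraction idea on these slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the edge weight is taken to be the number of bug reports on which two developers both commented, and `min_weight` is a hypothetical threshold for dropping weak links. The toy data reuse the developer names from the slides but do not reproduce the figure's exact weights.

```python
from collections import Counter
from itertools import combinations


def build_dsn(reports, min_weight=1):
    """Return {(dev_a, dev_b): weight} for developers who co-commented.

    Assumption: weight = number of bug reports both developers commented on.
    Edges lighter than min_weight (a hypothetical threshold) are dropped.
    """
    weights = Counter()
    for commenters in reports.values():
        # Count each developer once per report, in a stable pair order.
        for a, b in combinations(sorted(set(commenters)), 2):
            weights[(a, b)] += 1
    return {edge: w for edge, w in weights.items() if w >= min_weight}


# Toy data (hypothetical): bug report id -> authors of its comments.
reports = {
    "bug-1": ["David", "Bob", "Jack"],
    "bug-2": ["David", "Bob", "Jack"],
    "bug-3": ["Bob", "Jack", "Bill"],
    "bug-4": ["Bob", "Jack", "Bill", "Bob"],
}

dsn = build_dsn(reports)                 # every co-commenting pair
strong = build_dsn(reports, min_weight=3)  # only the heaviest links
print(dsn)
print(strong)
```

Raising `min_weight` mirrors the way the later slides show fewer links, although the slides do not state which filtering rule was actually used.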
  11. Metrics
     • Degree Distribution – the distribution of node degrees, where a node's degree is the number of edges connected to it
     • Degree of Separation – the length of the shortest path between two nodes
     • Modularity – a measure of the quality of a division of nodes into communities
     • Community Size – the number of nodes within a community
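
As a minimal sketch of the first two metrics, the code below computes a degree distribution and the average degree of separation on an unweighted, undirected graph stored as an adjacency dictionary. The graph and all function names are illustrative assumptions, not the tooling used in the study.

```python
from collections import Counter, deque


def degree_distribution(adj):
    """Map each degree value to the number of nodes having that degree."""
    return Counter(len(neighbors) for neighbors in adj.values())


def shortest_path_lengths(adj, source):
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def average_separation(adj):
    """Mean shortest-path length over all connected ordered pairs."""
    total = pairs = 0
    for source in adj:
        for target, d in shortest_path_lengths(adj, source).items():
            if target != source:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0


# Hypothetical chain of four developers: David - Bob - Jack - Bill.
adj = {
    "David": {"Bob"},
    "Bob": {"David", "Jack"},
    "Jack": {"Bob", "Bill"},
    "Bill": {"Jack"},
}
print(degree_distribution(adj))
print(average_separation(adj))
```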
  12. Modularity
     [Figure: two example networks, A with modularity 0.51 and B with modularity 0.176]
     • According to A. Clauset's work, a modularity of 0.3 is a good indicator of significant community structure in a network
     • When the modularity is 0, the community structure is no stronger than that of a randomly generated network
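
To make the two reference points concrete, here is a small sketch of the standard modularity formula for an unweighted, undirected graph: Q is the sum over communities of L_c/m − (d_c/2m)², where m is the number of edges, L_c the number of edges inside community c, and d_c the total degree of its nodes. The example graph (two triangles joined by one edge) is a made-up illustration.

```python
def modularity(edges, community_of):
    """Modularity of a partition of an unweighted, undirected graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping every node to a community label.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    intra = {}       # edges with both endpoints inside a community
    degree_sum = {}  # total degree of the nodes in a community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            intra[cu] = intra.get(cu, 0) + 1
    return sum(intra.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())


# Two triangles (a, b, c) and (d, e, f) joined by the single edge c-d.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
by_triangle = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
all_in_one = {node: 0 for node in "abcdef"}

q_split = modularity(edges, by_triangle)
q_single = modularity(edges, all_in_one)
print(q_split, q_single)
```

Splitting the graph by triangle gives Q = 5/14 (about 0.357), above the 0.3 guideline, while putting every node in a single community gives Q = 0.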
  13. Communities in DSN
     • Identified communities in DSN – Louvain algorithm (optimizes modularity) – 50 different input orderings of nodes
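
The reason for trying many node orderings is that greedy modularity optimization depends on the order in which nodes are visited. The sketch below is a simplification, not the full Louvain algorithm: it implements only the first (local moving) phase and omits the aggregation phase. Keeping the highest-modularity result over the runs is an assumption, since the slide does not say how the 50 runs were combined. All names and the example graph are hypothetical.

```python
import random


def adjacency(edges):
    """Adjacency lists for an undirected graph without self-loops."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj


def modularity(adj, community_of):
    two_m = sum(len(nbrs) for nbrs in adj.values())
    inside = 0       # counts every intra-community edge twice
    degree_sum = {}
    for u, nbrs in adj.items():
        c = community_of[u]
        degree_sum[c] = degree_sum.get(c, 0) + len(nbrs)
        inside += sum(1 for v in nbrs if community_of[v] == c)
    return inside / two_m - sum((d / two_m) ** 2 for d in degree_sum.values())


def local_moving(adj, order):
    """Greedy local moving: each node joins the neighboring community
    with the largest modularity gain, repeated until no node moves."""
    community_of = {u: u for u in adj}
    tot = {u: len(adj[u]) for u in adj}   # total degree per community
    two_m = sum(tot.values())
    improved = True
    while improved:
        improved = False
        for u in order:
            k_u = len(adj[u])
            old = community_of[u]
            tot[old] -= k_u               # take u out of its community
            links = {}                    # edges from u to each community
            for v in adj[u]:
                c = community_of[v]
                links[c] = links.get(c, 0) + 1
            best = old
            best_gain = links.get(old, 0) - tot[old] * k_u / two_m
            for c, k_in in links.items():
                gain = k_in - tot[c] * k_u / two_m
                if gain > best_gain:      # strict: moves always raise Q
                    best, best_gain = c, gain
            tot[best] += k_u
            community_of[u] = best
            if best != old:
                improved = True
    return community_of


def best_partition(adj, runs=50, seed=0):
    """Run local moving on `runs` shuffled orderings; keep the best Q."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    best_q, best_part = float("-inf"), None
    for _ in range(runs):
        order = nodes[:]
        rng.shuffle(order)
        part = local_moving(adj, order)
        q = modularity(adj, part)
        if q > best_q:
            best_q, best_part = q, part
    return best_q, best_part


# Two triangles joined by one edge (hypothetical example graph).
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
adj = adjacency(edges)
best_q, best_part = best_partition(adj)
print(best_q, best_part)
```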
  14. Q1: What are the similarities and differences between DSNs and GSNs?
     Metrics compared: Degree Distribution, Degree of Separation, Modularity, Community Size
  15. Q1: What are the similarities and differences between DSNs and GSNs? Degree Distribution
     [Figure panels: (1) MozillaDSN-BR, (2) MozillaDSN-CL, (3) EclipseDSN-BR, (4) EclipseDSN-CL]
  16. Q1: What are the similarities and differences between DSNs and GSNs? Degree Distribution
     [Figure panels: (1) MozillaDSN-BR, (2) MozillaDSN-CL, (3) EclipseDSN-BR, (4) EclipseDSN-CL]
  17. Q1: What are the similarities and differences between DSNs and GSNs? Degree Distribution
     • Quantitative power-law fit test – an approach to analyzing power-law distributed data introduced by A. Clauset et al.
     • P-value: the likelihood that the degree distribution actually follows a power law – if the p-value is less than 0.1, the power law is rejected.
     [Figure panels: (1) MozillaDSN-BR, (2) MozillaDSN-CL, (3) EclipseDSN-BR, (4) EclipseDSN-CL]
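
The goodness-of-fit idea behind this test can be sketched as follows. This is a simplified illustration, not the procedure used in the study: it uses the continuous approximation (degrees are really integers), takes `xmin` as given rather than estimating it, and all function names are hypothetical. The exponent is fitted by maximum likelihood, the fit is scored with the Kolmogorov-Smirnov distance, and the p-value is the fraction of synthetic power-law samples that fit worse than the observed data.

```python
import math
import random


def fit_alpha(xs, xmin):
    """Maximum-likelihood exponent of a continuous power law for x >= xmin."""
    tail = [x for x in xs if x >= xmin]
    log_sum = sum(math.log(x / xmin) for x in tail)
    if log_sum == 0:
        raise ValueError("all values equal xmin; exponent is undefined")
    return 1.0 + len(tail) / log_sum


def ks_distance(xs, xmin, alpha):
    """Largest gap between the empirical CDF and the fitted power-law CDF."""
    tail = sorted(x for x in xs if x >= xmin)
    n = len(tail)
    worst = 0.0
    for i, x in enumerate(tail):
        model = 1.0 - (x / xmin) ** (1.0 - alpha)
        worst = max(worst, abs(model - i / n), abs(model - (i + 1) / n))
    return worst


def power_law_p_value(xs, xmin, trials=200, rng=None):
    """Return (alpha, p). A p below 0.1 rejects the power-law hypothesis."""
    rng = rng or random.Random(0)
    tail = [x for x in xs if x >= xmin]
    alpha = fit_alpha(tail, xmin)
    observed = ks_distance(tail, xmin, alpha)
    worse = 0
    for _ in range(trials):
        # Inverse-transform sampling from the fitted power law.
        synthetic = [xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
                     for _ in tail]
        refit = fit_alpha(synthetic, xmin)
        if ks_distance(synthetic, xmin, refit) >= observed:
            worse += 1
    return alpha, worse / trials


# Hypothetical data that is deliberately NOT power-law (uniform on [1, 2)).
rng = random.Random(42)
values = [1.0 + rng.random() for _ in range(300)]
alpha_hat, p_value = power_law_p_value(values, xmin=1.0, trials=100, rng=rng)
print(f"alpha = {alpha_hat:.2f}, p = {p_value:.2f}")
```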
  18. Q1: What are the similarities and differences between DSNs and GSNs? Degree Distribution
     [Figure panels: (1) MozillaDSN-BR, (2) MozillaDSN-CL, (3) EclipseDSN-BR, (4) EclipseDSN-CL; annotations: "P-value < 0.1" and "some < 0.1, others > 0.1"]
     Different from GSNs, DSNs do not follow a power law.
  19. Q1: What are the similarities and differences between DSNs and GSNs? Degree of Separation
     [Figure: probability (0.0 to 1.0) against distance between two developers (0 to 18); panels MozillaDSN-BR, MozillaDSN-CL, EclipseDSN-BR, EclipseDSN-CL; series 1-month, 3-month, 6-month, 1-year, 2-year and 4-year DSN, Twitter (8000 sample), Cyworld (3000 sample)]
  20. Q1: What are the similarities and differences between DSNs and GSNs? Degree of Separation
     [Figure: same plots as the previous slide, with the annotations "4.12" and "90% (6)"]
  21. Q1: What are the similarities and differences between DSNs and GSNs? Degree of Separation
     [Figure: same plots as the previous slide, with the annotations 3.0, 2.1, 4.0 and 2.5]
     Developers in DSNs are much closer to each other than participants in GSNs.
  22. Q1: What are the similarities and differences between DSNs and GSNs? Modularity
     [Figure: modularity (axis from 0.3 to 0.7) by network for MozillaDSN-CL, MozillaDSN-BR, EclipseDSN-CL and EclipseDSN-BR; networks on the x-axis: Facebook, Cyworld, 1-month DSN, 3-month DSN, 6-month DSN, 1-year DSN, 2-year DSN, 4-year DSN]
     Similar to GSNs, all DSNs have significant community structure.
  23. Q1: What are the similarities and differences between DSNs and GSNs? Community Size
     [Figure panels: (1) MozillaDSN-BR, (2) MozillaDSN-CL, (3) EclipseDSN-BR, (4) EclipseDSN-CL]
  24. Q1: What are the similarities and differences between DSNs and GSNs? Community Size
     [Figure panels: (1) MozillaDSN-BR, (2) MozillaDSN-CL, (3) EclipseDSN-BR, (4) EclipseDSN-CL; annotation: 28%]
  25. Q1: What are the similarities and differences between DSNs and GSNs? Community Size
     [Figure panels: (1) MozillaDSN-BR, (2) MozillaDSN-CL, annotated 21%-36% and 23%-43%; (3) EclipseDSN-BR, (4) EclipseDSN-CL, annotated 15%-30% and 23%-33%]
  26. Q2: How do DSNs evolve over time?
     Q4: What are the similarities and differences between DSNs extracted using different social linkage indicators?
     Metrics compared: Degree Distribution, Degree of Separation, Modularity, Community Size
  27. Q2: How do DSNs evolve over time? Change of Developer Size
     DSNs-BR always have more developers than DSNs-CL.
  28. Q2: How do DSNs evolve over time? Change of Percentage of Newcomers
     DSNs-BR always have a higher percentage of newcomers than DSNs-CL.
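
The two evolution measures on the last slides, developer count and percentage of newcomers per period, can be sketched as below. The slides do not define "newcomer"; the assumption here is a developer who did not appear in any earlier period. The function name and the sample data are hypothetical.

```python
def newcomer_stats(periods):
    """For each (label, developers) period, return
    (label, number of developers, percentage of newcomers).

    Assumption: a newcomer is a developer unseen in all earlier periods.
    """
    seen = set()
    stats = []
    for label, developers in periods:
        current = set(developers)
        new = current - seen
        pct = 100.0 * len(new) / len(current) if current else 0.0
        stats.append((label, len(current), pct))
        seen |= current
    return stats


# Hypothetical yearly snapshots of active developers.
periods = [
    ("2000", ["David", "Bob"]),
    ("2001", ["David", "Bob", "Jack", "Bill"]),
    ("2002", ["Bob", "Jack"]),
]
stats = newcomer_stats(periods)
for label, size, pct in stats:
    print(label, size, f"{pct:.0f}%")
```

Running the same function on the bug-report network and the CVS-log network would give the two curves that the slides compare.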
