
On the variation and specialisation of workload: The Gnome case



  1. On the variation and specialisation of workload: The Gnome case. B. Vasilescu, A. Serebrenik, M. Goeminne, T. Mens. Tuesday 4 December 2012
  2. Gnome as an ecosystem
     • Ecosystem: set of interconnected projects
     • ~1400 projects
     • ~3000 contributors
     • 15 years of activity
     Variation and specialisation of workload, Benevol 2012, Tuesday 4 December 2012
  3. How does workload vary across contributors?
     • Who are they?
     • What do they do?
     • How do they do it?
     A partial answer by analysing the git repositories.
  4. Who are the contributors?
  5. Identity matching
     • Contributors have an account per project repository…
     • … and sometimes more than one.
     • There are no explicit links between the accounts, so the links have to be inferred.
     • The inference is based on the names and e-mail addresses found in the git repositories.
  6. Identity matching (cont.)
     • (Semi-)automatic classification techniques.
     • Must take into account variations, abbreviations, permutations, misspellings, nicknames, etc.
     • No process is perfect: even a manually post-checked result can contain false positives and false negatives.
     • Since Gnome as a whole has no strict identification rules, some matches cannot be detected without extra contextual information. Fictitious example:
       • Robbie Williams <robbiew@gnome.org>
       • Euphegenia Doubtfire <euphegenia@gmail.com>
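A minimal sketch of the kind of name and e-mail matching described on the two slides above. It is not the authors' algorithm: the normalisation rules, the function names (`normalise_name`, `email_prefix`, `match_identities`) and the "John Doe" accounts are assumptions made for illustration. Two accounts are merged when they share a normalised name or an e-mail prefix.

```python
import unicodedata
from collections import defaultdict


def normalise_name(name):
    """Lowercase, strip accents and punctuation, and sort the tokens so that
    permutations such as 'Doe, John' and 'John Doe' give the same key."""
    ascii_name = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    cleaned = "".join(ch if ch.isalnum() else " " for ch in ascii_name.lower())
    return " ".join(sorted(cleaned.split()))


def email_prefix(email):
    """Part of the e-mail address before the '@', lowercased."""
    return email.split("@", 1)[0].lower()


def match_identities(accounts):
    """Group (name, e-mail) accounts that share a normalised name or an
    e-mail prefix. Returns a list of sets of indices into `accounts`."""
    parent = list(range(len(accounts)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}
    for idx, (name, email) in enumerate(accounts):
        keys = (("name", normalise_name(name)), ("email", email_prefix(email)))
        for key in keys:
            if not key[1]:
                continue  # ignore empty names or prefixes
            if key in first_seen:
                parent[find(idx)] = find(first_seen[key])
            else:
                first_seen[key] = idx

    groups = defaultdict(set)
    for idx in range(len(accounts)):
        groups[find(idx)].add(idx)
    return sorted(groups.values(), key=min)


# Hypothetical accounts, plus the fictitious pair from the slide.
accounts = [
    ("John Doe", "jdoe@gnome.org"),
    ("Doe, John", "john.doe@example.com"),
    ("J. Doe", "jdoe@example.org"),
    ("Robbie Williams", "robbiew@gnome.org"),
    ("Euphegenia Doubtfire", "euphegenia@gmail.com"),
]
groups = match_identities(accounts)
```

With these inputs the three "Doe" accounts end up in one group, while the fictitious pair from the slide stays unmatched: nothing in the names or addresses links them, which is the slide's point about needing extra context.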
  7. What do the contributors do?
  8. 13 activity types
     • Identified by the path, name and extension of the touched files.
     • Coding: *.c, *.java, etc.
     • Translation: *.po, etc.
     • Testing: */test/*, etc.
     • ...
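The rule-based identification above could be sketched as follows. Only the three activity types and patterns shown on the slide are included; the remaining types are not listed in the slides and are left out. The first-match-wins ordering, the `unknown` fallback and the names `RULES` and `classify` are assumptions for this sketch, not the authors' rule set.

```python
from fnmatch import fnmatchcase

# Ordered rules: the first matching activity wins (an assumption of this sketch).
RULES = [
    ("testing", ["*/test/*"]),
    ("translation", ["*.po"]),
    ("coding", ["*.c", "*.java"]),
]


def classify(path):
    """Return the activity type of a touched file, based on its path."""
    for activity, patterns in RULES:
        if any(fnmatchcase(path, pattern) for pattern in patterns):
            return activity
    return "unknown"  # fallback for paths that no rule covers


labels = [classify(p) for p in ("src/main.c", "po/fr.po", "src/test/check.c", "README")]
```

Putting the path rule before the extension rules means a C file under a test directory counts as testing rather than coding. The slides do not say how such overlaps were resolved, so this ordering is only one possible choice.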
  9. How do the contributors contribute?
  10. Metrics
      • APTW(p,c,t): number of files touched by the contributor c performing an activity of type t in a project p.
      • Derived metrics, by aggregation: max, sum, etc.
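A small sketch of how APTW and its aggregated variants could be computed from a list of file touches. The input layout (one `(project, contributor, activity, file)` tuple per touch), the choice to count every touch rather than every distinct file, and the helper name `aggregate` are assumptions; the sample data is made up.

```python
from collections import Counter


def aptw(touches):
    """Count touches per (project, contributor, activity type).
    Each touch is a (project, contributor, activity, file) tuple."""
    return Counter((p, c, t) for p, c, t, _file in touches)


def aggregate(counts, contributor, how=sum):
    """Derive a per-contributor metric by aggregating APTW values
    over all projects and activity types with `how` (e.g. sum or max)."""
    values = [n for (_p, c, _t), n in counts.items() if c == contributor]
    return how(values) if values else 0


# Hypothetical touches, for illustration only.
touches = [
    ("gtk", "alice", "coding", "a.c"),
    ("gtk", "alice", "coding", "b.c"),
    ("gtk", "alice", "translation", "fr.po"),
    ("glib", "alice", "coding", "x.c"),
    ("gtk", "bob", "translation", "de.po"),
]
counts = aptw(touches)
alice_total = aggregate(counts, "alice", how=sum)
alice_peak = aggregate(counts, "alice", how=max)
```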
  11. Workload
      • 50% of contributors made < 14 changes.
      • 1 contributor made 185,874 changes.
      [Figure: number of authors (y-axis, 0 to 600) against log(AW) (x-axis, 0 to 12).]
      Université de Mons, doctoral training report 2011, Mathieu Goeminne
  12. The more things you do, the more things you can!
      • Correlations:
        • Between the number of activity types and the workload.
        • Between the number of projects and the workload.
  13. Favorite activities of contributors having ≥ 14 changes
      • Most frequent contributors specialise in coding and development documentation.
      • The other activities are not subject to specialisation.
  14. Favorite activities of contributors having < 14 changes
      • Most occasional contributors specialise in translation and coding.
      • The other activities are not subject to specialisation.
  15. How strongly do contributors focus?
      • Basic measure: RATW(c,t)
        • % of the total workload of c dedicated to t.
      • Use of Gini as inequality index:
        • Value in [0, 1)
        • 0 if the workload is equally distributed.
        • Close to 1 if the workload is concentrated in few activity types.
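The two measures above can be illustrated with a short sketch. The slides do not give the exact Gini formula used, so this assumes the standard sample Gini coefficient computed over one value per activity type, including zeros for types the contributor never touched. The function names `ratw` and `gini` and the sample workloads are illustrative.

```python
def ratw(workload_by_type):
    """Share of a contributor's total workload dedicated to each activity type."""
    total = sum(workload_by_type.values())
    if total == 0:
        raise ValueError("contributor has no workload")
    return {t: n / total for t, n in workload_by_type.items()}


def gini(values):
    """Standard Gini coefficient of non-negative values.
    With n values the result lies in [0, (n - 1) / n], hence in [0, 1)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive value")
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with i = 1..n on sorted x
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


shares = ratw({"coding": 3, "translation": 1})
even = gini([1, 1, 1, 1])        # workload spread equally over four types
focused = gini([0, 0, 0, 5])     # all workload in a single type
```

With 13 activity types, a contributor working in a single type would get 12/13 under this formula, which is consistent with the slide's "close to 1" for concentrated workloads.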
  16. Contributors' focus (cont.)
      • Occasional contributors typically participate in a single activity type.
      • Frequent contributors typically participate in few activity types.
  17. To summarise
  18. What did we learn?
      • Most contributors are occasional and are involved in only one activity type; few are very active; frequent contributors are involved in few activity types.
      • The more things you do, the more things you can.
      • Occasional contributors are translators, involved in many projects. Frequent contributors are coders and are involved in few projects.
      • And more in our paper.
  19. How did we do it?
      • Contributor matching: semi-automatic and automatic methods.
      • Activity identification based on file path/name/extension rules.
      • Advanced statistical analysis (among other things, for the partial ordering of activity types).
      • Specialisation: aggregation with inequality indices.
  20. In the future
      • Add a temporal aspect: how does the contributors’ behaviour change over time?
      • Consider subsets of Gnome: sub-ecosystems composed of projects that share stronger properties than all projects do on average (archived, by theme, etc.).
      • Combine both by studying migration trends.
      • …
  21. Thank you
      On the variation and specialisation of workload: A case study of the Gnome ecosystem community. B. Vasilescu, A. Serebrenik, M. Goeminne, T. Mens. Empirical Software Engineering (awaiting acceptance).
