
Implications Of Dual Participation Of Floss Developer



Are FLOSS Developers Committing to CVS/SVN as much as they are Talking in Mailing Lists?
Challenges for Integrating Data from Multiple Repositories



  1. Are FLOSS Developers Committing to CVS/SVN as much as they are Talking in Mailing Lists? Challenges for Integrating Data from Multiple Repositories
     Sulayman K. Sowe, I. Samoladas, I. Stamelos, L. Angelis
     Dept. of Informatics, Aristotle University, Greece.
     3rd International Workshop on Public Data about Software Development (WoPDaSD), 10th September 2008, Milan, Italy.
     This research is partially sponsored by the FLOSSMetrics Project (Ref. No. FP6-IST5-033547) and the SQO-OSS project (Ref. No. FP6-IST-5-033331).
     WoPDaSD ~.1
  2. In this presentation...
     ➲ Nomadic life of FLOSS developers
        • Motivation for this research
        • Research hypothesis
     ➲ Methodology in brief
        • Data & Source
        • Identification of developers from SVN & Lists
     ➲ Results & Discussion
     ➲ Summary & conclusion
        • Ongoing research
  3. Nomadic life of FLOSS developers
     ➲ Like the Fulani nomads of the West African plains, FLOSS developers are not bound to a single territory and are free to:
        • participate in other projects or communities,
        • use and reuse software/bits of code from other projects,
        • suggest, argue for or against requirements, specs., etc. in projects where they have the least commit rights,
        • use different identities (usernames, email), etc.
  4. Motivation for this research
     ➲ Why research FLOSS developers or nomads?
        • Understand the collaborative nature of developing FLOSS in terms of developer participation (code commits and email postings) in multiple repositories: SVN and mailing lists.
     ➲ Research hypothesis: If mailing lists are the main communication veins in most projects, then CVS/SVN is a collection of arteries. Thus,
        • FLOSS developers code and participate in list discussions:
           H0: “FLOSS developers contribute equally to code repository and mailing lists”,
           alternative H1: “FLOSS developers contribute more to code repository than mailing lists”.
  5. Methodology…Data & Source
     ➲ Retrieve data for 14 projects from the FLOSSMetrics retrieval system
        • Mailing lists data dumps (.sql file format)
        • SVN data dumps (.sql file format)
  6. Initial (Raw) Data
     ➲ How many SVN committers and mailing list posters in each project?
     [Chart: SVN Commits and ML Posts]
  7. Methodology…Identification of developers
     ➲ The main problem in studying developers' activities in multiple repositories is identification:
     ➲ Is committer A in the SVN of project X the same person as poster A in the mailing lists of project X?
  8. Results & Discussion…1
     ➲ The query result for each project gave us the developers who co-occur in both SVN and the mailing lists.
     ➲ N=486 for all 14 projects.
        • Percentage of developers in both repositories:
           - In 8 projects = 57.14%
           - In 4 projects = 90.11%
           - In 2 projects = 80.21%
     ➲ What is going on in ibatis and turbine?
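The slides do not show the query itself, so the following is only a minimal sketch of how a co-occurrence query of this kind might look. The table names (`svn_committers`, `ml_posters`), the column names, the demo rows, and the matching rule (SVN username equals the lower-cased local part of the poster's e-mail address) are all assumptions made for illustration, not the authors' schema or method.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory database with hypothetical committer/poster tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE svn_committers (username TEXT);
        CREATE TABLE ml_posters (email TEXT);
        INSERT INTO svn_committers VALUES ('alice'), ('Bob'), ('carol');
        INSERT INTO ml_posters VALUES
            ('alice@example.org'), ('bob@example.com'), ('dave@example.org');
        """
    )
    return conn


def cooccurrence(conn):
    """Return (matched, committers, percent) for identities found in both tables.

    Assumed matching rule: a committer and a poster are the same person when
    the SVN username equals the part of the e-mail address before the '@',
    compared case-insensitively. Real data would need a more careful rule.
    """
    cur = conn.cursor()
    cur.execute(
        """
        SELECT COUNT(DISTINCT c.username)
        FROM svn_committers AS c
        JOIN ml_posters AS p
          ON LOWER(c.username) =
             LOWER(SUBSTR(p.email, 1, INSTR(p.email, '@') - 1))
        """
    )
    matched = cur.fetchone()[0]
    cur.execute("SELECT COUNT(DISTINCT username) FROM svn_committers")
    total = cur.fetchone()[0]
    percent = 100.0 * matched / total if total else 0.0
    return matched, total, percent
```

A rule this simple will miss developers who use different identities in each repository, which is exactly the identification problem raised on the previous slide.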
  9. Results & Discussion...2
     ➲ Distribution of Commits & Posts
        • Domination of commits over posts
        • Mean commits per developer > mean posts per developer
        • Developers are committing more to SVN than they are posting to mailing lists, EXCEPT in ibatis and turbine.
  10. Results & Discussion...3
     ➲ Relationship between Commits and Posts
     ➲ The overall correlation between commits and posts is statistically significant (marked with * for p < 0.05).
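The slides do not say which correlation coefficient was used. As an illustration only, the sketch below computes a Pearson correlation between per-developer commit and post counts; the function name and the choice of Pearson are assumptions, and a rank-based coefficient may suit skewed count data better. It returns only the coefficient; testing it against p < 0.05 would need an additional significance test that is not shown here.

```python
import math


def pearson_r(commits, posts):
    """Pearson correlation between two equal-length lists of counts.

    Hypothetical helper for illustration; not taken from the presentation.
    """
    if len(commits) != len(posts) or len(commits) < 2:
        raise ValueError("need two equal-length samples with at least 2 values")
    n = len(commits)
    mean_c = sum(commits) / n
    mean_p = sum(posts) / n
    # Sums of cross-products and squared deviations around the means.
    s_cp = sum((c - mean_c) * (p - mean_p) for c, p in zip(commits, posts))
    s_cc = sum((c - mean_c) ** 2 for c in commits)
    s_pp = sum((p - mean_p) ** 2 for p in posts)
    if s_cc == 0 or s_pp == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return s_cp / math.sqrt(s_cc * s_pp)
```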
  11. Results & Discussion...4
     ➲ Developers' contribution in terms of commits and posts
        • A Wilcoxon signed-rank test applied to mean values shows an almost 50-50 split between projects where commits = posts (green) and projects where commits > posts (yellow). Only the turbine project shows otherwise.
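For readers unfamiliar with the test, here is a minimal sketch of a Wilcoxon signed-rank test on paired commit and post counts. It is not the authors' analysis: the function name is hypothetical, zero differences are dropped, tied absolute differences receive average ranks, and the p-value uses a plain normal approximation without tie or continuity correction, which is rough for small samples.

```python
import math


def wilcoxon_signed_rank(commits, posts):
    """Return (w_plus, w_minus, z, p_two_sided) for paired samples."""
    # Paired differences; pairs with no difference carry no sign and are dropped.
    diffs = [c - p for c, p in zip(commits, posts) if c != p]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)

    # Normal approximation to the null distribution of the rank sum.
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mean) / sd
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return w_plus, w_minus, z, p_two_sided
```

A large p-value is consistent with commits = posts (H0), while a small p-value with `w_plus` well above `w_minus` points towards commits > posts (H1).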
  12. Summary & conclusion
     ➲ FLOSS developers are coding as much as they are talking. They contribute equally to code repositories and mailing lists; H0 supported.
     ➲ However, in almost all the projects, developers made more commits than posts; H1 supported.
     ➲ Why are turbine and ibatis outliers?
        • Maybe the most prolific developer is making more posts than commits, in a ratio of 4:1.
        • Something peculiar about the composition of Apache-related projects.
     ➲ Ongoing aspects of this research
        • Automate the data collection and identification process.
        • Analyze a total of 60 or more projects from the FLOSSMetrics (FM) retrieval system.
        • Add a quality dimension to the committers variable:
           - Categorize commits: modifications, deletions, additions, code related, documentation (reports, readme, etc.)
           - Time scale/sliding frames: the evolution of commits and posts over a given period.
  13. Thank you for your attention. Questions? Comments? Suggestions for improvement?