Floss Metrics 2009

FLOSSMETRICS: The main objective of FLOSSMETRICS is to construct, publish and analyse a large scale database with information and metrics about libre software development coming from several thousands of software projects, using existing methodologies, and tools already developed. The project will also provide a public platform for validation and industrial exploitation of results.

Published in: Technology, Business
Transcript

  • 1. Quantitative data about libre software development: the FLOSSMetrics project
      Jesus M. Gonzalez-Barahona (jgb@libresoft.es), Teo Romera Otero (teo@libresoft.es)
      GSyC/LibreSoft, URJC
      FOSSa, Grenoble, November 17th 2009
  • 2. © 2006-2009 GSyC/LibreSoft. Some rights reserved. This document is distributed under the Creative Commons Attribution-ShareAlike 3.0 licence, available at http://creativecommons.org/licenses/by-sa/3.0/
  • 3. FLOSSMetrics: base ideas
      Libre software development: lots of opinions, few known facts
      Researcher-friendly: public data, reproducibility, validation of results, large samples
      Interest by volunteers and companies
      Main questions:
        • Can libre software development be improved?
        • Can software engineering learn from libre software?
        • Can projects better understand their processes and products?
      http://flossmetrics.org
  • 4. FLOSSMETRICS goals
      Retrieval of data from (thousands of) libre software projects
      Analysis of the actors, artefacts and processes involved in development
      Higher level studies: software evolution, human resources, effort estimation, productivity, quality, etc.
      Database available to other researchers and developers
      Providing tools for development follow-up
      Involvement with the libre software community
  • 5. Main results
      Huge database with factual details about libre software development (accessible to everyone)
      Higher level analysis and studies
      Sustainable platform for benchmarking and analysis
      Targeted reports (SMEs, industry, etc.)
      Focus on providing data and information that others can use for research, evaluation, follow-up
  • 6. Partners
      Universidad Rey Juan Carlos (ES)
      University of Maastricht (NL)
      Wirtschaftsuniversität Wien (AT)
      Aristotle University of Thessaloniki (GR)
      Conecta s.r.l (IT)
      ZEA Partners (BE)
      Philips Medical Systems (NL)
      Project funded by the European Commission (FP6-IST programme)
  • 7. Current status and work in progress
      Data (of various kinds) for about 2,300 projects
      Full MySQL dumps for:
        • CVS and Subversion commit records
        • Metrics (size, complexity) for source code
        • Main headers of mailing list messages
        • Issue tracking systems (bug reports, etc.)
      Focused report on SMEs (third release)
      Working on focused studies
      Web-based data repository: Melquiades
      Direct database access for some researchers
      http://melquiades.flossmetrics.org
  • 8. Current status and work in progress (cont.) [slide content not in transcript]
  • 9. Current status and work in progress (cont.) [slide content not in transcript]
  • 10. Current status and work in progress (cont.) [slide content not in transcript]
  • 11. Tools and current status
      LibreSoft Tools suite:
        • CVSAnalY already works with CVS, SVN and git (Bazaar coming soon)
        • CVSAnalY produces complexity metrics for each release of each file (C, C++, Java, Python, more to come)
        • Bicho: bug reports from SourceForge (Bugzilla coming soon)
        • MLStat: mailing lists, hiding real email addresses
      About 2,300 projects and counting
      All of this integrated in Melquiades
      http://melquiades.flossmetrics.org
      http://forge.morfeo-project.org/projects/libresoft-tools/
  • 12. Retrieving information: general problems
      Diversity:
        • Kinds of forges: difficult to automate
        • Kinds of projects: not all projects in SourceForge are relevant
        • Sources for the same project: forge(s), distributions...
      Missing information:
        • Hidden information (e.g. mail headers)
        • Lost information (e.g. transition from CVS to SVN)
        • Bugs and errors (e.g. old locks in SCM)
      Stress on the projects' infrastructure!
  • 13. Retrieving information: SCM problems
      Different systems (CVS, Subversion, git, Bazaar, Mercurial, etc.)
      Different models (file-based, commit-based, distributed)
      Bots performing commits
      Large transitions don't preserve information
      Performance issues (systems poorly designed for massive retrieval)
      But at least we have facilities for incremental retrieval
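Two of the SCM points above, bot commits and incremental retrieval, can be illustrated with a minimal sketch. It assumes commit records are already in memory; the `Commit` fields, the `BOT_AUTHORS` list and the `new_human_commits` helper are hypothetical names chosen for illustration, not part of CVSAnalY or any other LibreSoft tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    rev: int        # hypothetical, monotonically increasing revision number
    author: str
    message: str


# Hypothetical list of automated accounts; a real study would curate it per project.
BOT_AUTHORS = {"buildbot", "cvs2svn", "l10n-bot"}


def new_human_commits(commits, last_seen_rev):
    """Return commits newer than last_seen_rev that were not made by known bots."""
    fresh = [c for c in commits if c.rev > last_seen_rev]
    return [c for c in fresh if c.author.lower() not in BOT_AUTHORS]


log = [
    Commit(1, "alice", "initial import"),
    Commit(2, "cvs2svn", "automated conversion"),
    Commit(3, "bob", "fix build"),
    Commit(4, "Buildbot", "nightly tag"),
]

# Only revisions after the last one already stored are considered.
result = new_human_commits(log, last_seen_rev=1)
print([c.rev for c in result])
```

Storing the last retrieved revision per repository is what keeps each later run cheap, which matters given the stress that massive retrieval puts on project infrastructure.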
  • 14. Retrieving information: BTS problems
      Different systems (Bugzilla, SourceForge, GForge, trac, Launchpad, etc.)
      Different models (bug cycle, bug report parameters, etc.)
      Different uses (issue tracker, only bugs, scheduler, etc.)
      Bots acting on bug reports
      Lack of facilities for incremental retrieval
      Performance issues (systems not really designed for massive retrieval)
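One common way to cope with different bug-cycle models is to map each tracker's states onto a small shared vocabulary. The sketch below assumes such a mapping; the status names, the three common states and the `normalize_status` helper are illustrative assumptions, not the scheme used by Bicho.

```python
# Illustrative mapping from tracker-specific states to a shared vocabulary.
# Both the tracker status names and the common states are assumptions.
STATUS_MAP = {
    "bugzilla": {
        "NEW": "open",
        "ASSIGNED": "in_progress",
        "REOPENED": "open",
        "RESOLVED": "closed",
        "VERIFIED": "closed",
        "CLOSED": "closed",
    },
    "trac": {
        "new": "open",
        "assigned": "in_progress",
        "accepted": "in_progress",
        "reopened": "open",
        "closed": "closed",
    },
}


def normalize_status(tracker, status):
    """Map a tracker-specific status to a common one, or 'unknown' if unmapped."""
    return STATUS_MAP.get(tracker, {}).get(status, "unknown")


print(normalize_status("bugzilla", "RESOLVED"))
print(normalize_status("trac", "accepted"))
print(normalize_status("launchpad", "Triaged"))
```

Returning `"unknown"` instead of raising keeps a large retrieval run going when a tracker uses a state the mapping has not seen yet.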
  • 15. Retrieving information: Mailing lists problems
      Different systems (usually accessible only through HTML)
      Partial information (missing headers)
      Bots sending email (e.g. commit messages)
      Spam (mixed with real messages)
      But email messages are pretty uniform in format
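Because email messages are uniform in format, the main headers can be extracted with a standard parser. The sketch below uses Python's standard `email` package and replaces the sender address with a truncated SHA-256 digest. The hashing scheme, the sample message and the `pseudonymize` and `main_headers` helpers are assumptions for illustration; the slides do not say how MLStat hides addresses.

```python
import hashlib
from email import message_from_string
from email.utils import parseaddr

# Made-up sample message for illustration only.
RAW = """\
From: Alice Example <alice@example.org>
To: dev@lists.example.org
Subject: [PATCH] fix build
Date: Tue, 17 Nov 2009 10:00:00 +0100
Message-ID: <1234@example.org>

Patch attached.
"""


def pseudonymize(address):
    """Replace an email address with a stable, non-reversible identifier."""
    normalized = address.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]


def main_headers(raw_message):
    """Extract a few main headers, hiding the real sender address."""
    msg = message_from_string(raw_message)
    _name, addr = parseaddr(msg.get("From", ""))
    return {
        "sender": pseudonymize(addr) if addr else None,
        "subject": msg.get("Subject"),
        "date": msg.get("Date"),
        "message_id": msg.get("Message-ID"),
    }


record = main_headers(RAW)
print(record)
```

The same address always maps to the same identifier, so activity per sender can still be counted without storing the address itself. Missing headers simply come back as `None`.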
  • 16. Retrieving information: All together
      How to track actors and products across:
        • Different repositories of the same project
        • Different projects
      SourceForge helps a bit!
      Massive information (when dealing with thousands of projects)
      Exchange formats (for third parties and reproduction)
      Tracking information (where did this commit record come from?):
        • Repositories change
        • Retrieval tools change
        • Errors do occur
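The question "where did this commit record come from?" can be answered by attaching provenance metadata to every retrieved record. The following sketch assumes a simple JSON exchange format; the `Provenance` fields, the tool name and the repository URL are hypothetical and do not describe the actual FLOSSMetrics database schema.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Provenance:
    repository_url: str   # where the record was retrieved from
    tool: str             # which retrieval tool produced it
    tool_version: str     # tools change, so the version is recorded too
    retrieved_at: str     # ISO 8601 timestamp of the retrieval


def tag_record(record, provenance):
    """Return a copy of the record with its provenance attached."""
    return {**record, "provenance": asdict(provenance)}


prov = Provenance(
    repository_url="https://svn.example.org/project/trunk",  # hypothetical URL
    tool="hypothetical-retriever",                           # hypothetical tool name
    tool_version="0.1",
    retrieved_at=datetime(2009, 11, 17, tzinfo=timezone.utc).isoformat(),
)

tagged = tag_record({"rev": 42, "author": "alice"}, prov)
print(json.dumps(tagged, indent=2))
```

With the tool version and retrieval time stored next to the data, a record affected by a tool bug or a repository migration can be identified and retrieved again.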
  • 17. Interested?
      Detailed description of the work available from the website
      All the software used is libre software
      Keep an eye on the website
      Tell us about your pet project; we can analyze it
      We want to know how this can be useful for you: give us feedback about your interests and needs
      Willing to collaborate with projects!
      http://flossmetrics.org
      http://forge.morfeo-project.org/projects/libresoft-tools/
