
Floss Metrics 2009

FLOSSMETRICS: The main objective of FLOSSMETRICS is to construct, publish and analyse a large-scale database with information and metrics about libre software development, drawn from several thousand software projects, using existing methodologies and tools already developed. The project will also provide a public platform for validation and industrial exploitation of results.


  1. Quantitative data about libre software development: the FLOSSMetrics project
     Jesus M. Gonzalez-Barahona, Teo Romera Otero (GSyC/LibreSoft, URJC)
     jgb@libresoft.es, teo@libresoft.es
     FOSSa, Grenoble, November 17th 2009
  2. © 2006-2009 GSyC/LibreSoft. Some rights reserved. This document is distributed under the Creative Commons Attribution-ShareAlike 3.0 licence, available at http://creativecommons.org/licenses/by-sa/3.0/
  3. FLOSSMetrics: base ideas
     • Libre software development: lots of opinions, few known facts
     • Researcher-friendly: public data, reproducibility, validation of results, large samples
     • Interest by volunteers and companies
     • Main questions:
       • Can libre software development be improved?
       • Can software engineering learn from libre software?
       • Can projects better understand their processes and products?
     http://flossmetrics.org
     © GSyC/LibreSoft. Quantitative data about libre software development: the FLOSSMetrics project
  4. FLOSSMETRICS goals
     • Retrieval of data from (thousands of) libre software projects
     • Analysis of the actors, artefacts and processes involved in development
     • Higher level studies: software evolution, human resources, effort estimation, productivity, quality, etc.
     • Database available to other researchers and developers
     • Providing tools for development follow-up
     • Involvement with the libre software community
     http://flossmetrics.org
  5. Main results
     • Huge database with factual details about libre software development (accessible to everyone)
     • Higher level analysis and studies
     • Sustainable platform for benchmarking and analysis
     • Targeted reports (SMEs, industry, etc.)
     • Focus on providing data and information that others can use for research, evaluation, follow-up
     http://flossmetrics.org
  6. Partners
     • Universidad Rey Juan Carlos (ES)
     • University of Maastricht (NL)
     • Wirtschaftsuniversität Wien (AT)
     • Aristotle University of Thessaloniki (GR)
     • Conecta s.r.l (IT)
     • ZEA Partners (BE)
     • Philips Medical Systems (NL)
     Project funded by the European Commission (FP6-IST programme)
  7. Current status and work in progress
     • Data (of various kinds) for about 2,300 projects
     • Full MySQL dumps for:
       • CVS and Subversion commit records
       • Metrics (size, complexity) for source code
       • Main headers of mailing list messages
       • Issue tracking systems (bug reports, etc.)
     • Focused report on SMEs (third release)
     • Working on focused studies
     • Web-based data repository: Melquiades
     • Direct access to the database for some researchers
     http://melquiades.flossmetrics.org
  8. Current status and work in progress (cont.)
  9. Current status and work in progress (cont.)
  10. Current status and work in progress (cont.)
  11. Tools and current status
     • LibreSoft Tools suite:
       • CVSAnalY already works with CVS, SVN and git (Bazaar coming soon)
       • CVSAnalY produces complexity metric counts for each release of each file (C, C++, Java, Python, more to come)
       • Bicho: bug reports from SourceForge (Bugzilla coming soon)
       • MLStat: mailing lists, hiding real email addresses
     • About 2,300 projects and counting
     • All of this integrated in Melquiades
     http://melquiades.flossmetrics.org
     http://forge.morfeo-project.org/projects/libresoft-tools/
  12. Retrieving information: general problems
     • Diversity:
       • Kinds of forges: difficult to automate
       • Kinds of projects: not all projects in SF are relevant
       • Sources for the same project: forge(s), distributions...
     • Missing information:
       • Hidden information (e.g. mail headers)
       • Lost information (e.g. transition from CVS to SVN)
     • Bugs and errors (e.g. old locks in SCM)
     • Stress on the projects' infrastructure!
  13. Retrieving information: SCM problems
     • Different systems (CVS, Subversion, git, Bazaar, Mercurial, etc.)
     • Different models (file-based, commit-based, distributed)
     • Bots performing commits
     • Large transitions don’t preserve information
     • Performance issues (systems poorly designed for massive retrieval)
     • But at least we have facilities for incremental retrieval
  14. Retrieving information: BTS problems
     • Different systems (Bugzilla, SourceForge, GForge, trac, Launchpad, etc.)
     • Different models (bug cycle, bug report parameters, etc.)
     • Different uses (issue tracker, only bugs, scheduler, etc.)
     • Bots acting on bug reports
     • Lack of facilities for incremental retrieval
     • Performance issues (systems not really designed for massive retrieval)
  15. Retrieving information: Mailing lists problems
     • Different systems (usually accessible only through HTML)
     • Partial information (missing headers)
     • Bots sending email (e.g. commit messages)
     • Spam (mixed with real messages)
     • But email messages are pretty uniform in format
  16. Retrieving information: All together
     • How to track actors and products:
       • Different repositories of the same project
       • Different projects
     • SourceForge helps a bit!
     • Massive information (when dealing with 1,000s of projects)
     • Exchange formats (for third parties and reproduction)
     • Tracking information (where did this commit record come from?):
       • Repositories change
       • Retrieval tools change
       • Errors do occur
  17. Interested?
     • Detailed description of work available from the website
     • All the software used is libre software
     • Keep an eye on the website
     • Tell us about your pet project, we can analyze it
     • Interested in knowing how this is useful for you: provide feedback about your interests, needs
     • Willing to collaborate with projects!
     http://flossmetrics.org
     http://forge.morfeo-project.org/projects/libresoft-tools/
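
The tools slide says that MLStat hides real email addresses, but the transcript does not say how. The following is a minimal sketch of one way this could be done, using a keyed hash so that the same address always maps to the same pseudonym. The function name `pseudonymize`, the `dev-` prefix and the secret key are assumptions made for illustration; this is not MLStat's actual scheme.

```python
import hashlib
import hmac


def pseudonymize(address: str, secret: bytes) -> str:
    """Return a stable pseudonym for an email address.

    Hypothetical scheme for illustration only (not taken from MLStat):
    the address is normalized, then hashed with HMAC-SHA256 under a
    secret key, so the pseudonym cannot be reversed by simply hashing
    candidate addresses without knowing the key.
    """
    normalized = address.strip().lower()
    digest = hmac.new(secret, normalized.encode("utf-8"), hashlib.sha256)
    # Keep a short prefix of the hex digest to make pseudonyms readable.
    return "dev-" + digest.hexdigest()[:12]


if __name__ == "__main__":
    key = b"example-secret"  # assumption: a key kept private by the data publisher
    print(pseudonymize("Someone@Example.org", key))
```

Because the mapping is deterministic for a given key, messages by the same sender can still be grouped in the published database without exposing the address itself.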

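The "All together" slide asks where a given commit record came from, noting that repositories and retrieval tools both change. One possible answer, sketched here under stated assumptions, is to attach a small provenance record to every retrieved item. The class names, field names, repository URL and tool version below are all hypothetical and are not taken from the FLOSSMetrics database schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Provenance:
    """Where and how a record was retrieved (hypothetical fields)."""
    repository_url: str
    tool: str
    tool_version: str
    retrieved_at: datetime


@dataclass(frozen=True)
class CommitRecord:
    """A commit plus the provenance needed to trace or reproduce it."""
    revision: str
    author: str
    provenance: Provenance


def tag_commit(revision: str, author: str, repository_url: str,
               tool: str, tool_version: str,
               retrieved_at: Optional[datetime] = None) -> CommitRecord:
    """Build a commit record stamped with its retrieval context."""
    when = retrieved_at or datetime.now(timezone.utc)
    return CommitRecord(revision, author,
                        Provenance(repository_url, tool, tool_version, when))


if __name__ == "__main__":
    # Assumed example values; the URL and version string are placeholders.
    record = tag_commit("r1234", "dev-0a1b2c3d4e5f",
                        "svn://example.org/project/trunk",
                        "CVSAnalY", "0.0-hypothetical")
    print(record.provenance.tool, record.provenance.retrieved_at.isoformat())
```

Keeping the tool version and retrieval time next to each record means that a later change in the repository or in the retrieval tool can be detected instead of silently altering results.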