Floss Metrics 2009
 

FLOSSMETRICS: The main objective of FLOSSMETRICS is to construct, publish and analyse a large scale database with information and metrics about libre software development coming from several thousands of software projects, using existing methodologies, and tools already developed. The project will also provide a public platform for validation and industrial exploitation of results.

    Presentation Transcript

    • Quantitative data about libre software development: the FLOSSMetrics project
      Jesus M. Gonzalez-Barahona, Teo Romera Otero (GSyC/LibreSoft, URJC)
      jgb@libresoft.es, teo@libresoft.es
      FOSSa, Grenoble, November 17th 2009
    • © 2006-2009 GSyC/LibreSoft. Some rights reserved. This document is distributed under the Creative Commons Attribution-ShareAlike 3.0 licence, available at http://creativecommons.org/licenses/by-sa/3.0/
    • FLOSSMetrics: base ideas
      • Libre software development: lots of opinions, few known facts
      • Researcher-friendly: public data, reproducibility, validation of results, large samples
      • Interest from volunteers and companies
      • Main questions:
        • Can libre software development be improved?
        • Can software engineering learn from libre software?
        • Can projects better understand their processes and products?
      http://flossmetrics.org
    • FLOSSMETRICS goals
      • Retrieval of data from (thousands of) libre software projects
      • Analysis of the actors, artefacts and processes involved in development
      • Higher level studies: software evolution, human resources, effort estimation, productivity, quality, etc.
      • Database available to other researchers and developers
      • Providing tools for development follow-up
      • Involvement with the libre software community
      http://flossmetrics.org
    • Main results
      • Huge database with factual details about libre software development (accessible to everyone)
      • Higher level analysis and studies
      • Sustainable platform for benchmarking and analysis
      • Targeted reports (SMEs, industry, etc.)
      • Focus on providing data and information that others can use for research, evaluation, follow-up
      http://flossmetrics.org
    • Partners
      • Universidad Rey Juan Carlos (ES)
      • University of Maastricht (NL)
      • Wirtschaftsuniversität Wien (AT)
      • Aristotle University of Thessaloniki (GR)
      • Conecta s.r.l. (IT)
      • ZEA Partners (BE)
      • Philips Medical Systems (NL)
      Project funded by the European Commission (FP6-IST programme)
    • Current status and work in progress
      • Data (of various kinds) for about 2,300 projects
      • Full MySQL dumps for:
        • CVS and Subversion commit records
        • Metrics (size, complexity) for source code
        • Main headers of mailing list messages
        • Issue tracking systems (bug reports, etc.)
      • Focused report on SMEs (third release)
      • Working on focused studies
      • Web-based data repository: Melquiades
      • Direct database access for some researchers
      http://melquiades.flossmetrics.org
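As a rough illustration of what commit records in such a database allow, the sketch below counts commits per committer. The table and column names (`commits`, `revision`, `committer`, `date`) are hypothetical: the slides do not describe the layout of the FLOSSMetrics dumps, and SQLite is swapped in for MySQL only so that the example is self-contained.

```python
import sqlite3

# Hypothetical, simplified schema. The real FLOSSMetrics dumps are MySQL
# and their table layout is not described in the slides.
def commits_per_committer(rows):
    """Count commits per committer from (revision, committer, date) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE commits (revision TEXT, committer TEXT, date TEXT)"
    )
    conn.executemany("INSERT INTO commits VALUES (?, ?, ?)", rows)
    result = conn.execute(
        "SELECT committer, COUNT(*) FROM commits "
        "GROUP BY committer ORDER BY COUNT(*) DESC, committer"
    ).fetchall()
    conn.close()
    return result

# Made-up sample data, for illustration only.
sample = [
    ("r1", "alice", "2009-01-10"),
    ("r2", "bob", "2009-01-11"),
    ("r3", "alice", "2009-02-01"),
]
print(commits_per_committer(sample))
```

The same kind of query could be run against a restored MySQL dump once its actual schema is known.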
    • Current status and work in progress (cont.)
    • Tools and current status
      • LibreSoft Tools suite:
        • CVSAnalY already works with CVS, SVN and git (Bazaar coming soon)
        • CVSAnalY produces complexity metrics for each release of each file (C, C++, Java, Python, more to come)
        • Bicho: bug reports from SourceForge (Bugzilla coming soon)
        • MLStat: mailing lists, hiding real email addresses
      • About 2,300 projects and counting
      • All of this integrated in Melquiades
      http://melquiades.flossmetrics.org
      http://forge.morfeo-project.org/projects/libresoft-tools/
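The slides do not say how MLStat hides real email addresses. One common approach, shown here only as a sketch under that assumption, is to replace each address with a keyed hash, so that the same person still maps to the same identifier while the address itself is not published. The function name `pseudonymize` and the secret key are hypothetical.

```python
import hashlib
import hmac

def pseudonymize(address: str, secret: bytes) -> str:
    """Map an email address to a stable, non-reversible identifier.

    This is an illustrative technique, not MLStat's documented behaviour.
    """
    normalized = address.strip().lower()
    digest = hmac.new(secret, normalized.encode("utf-8"), hashlib.sha256)
    # A shortened hex digest is easier to read in reports.
    return digest.hexdigest()[:16]

key = b"keep-this-private"  # hypothetical secret, never published
first = pseudonymize("Alice@Example.org", key)
second = pseudonymize("alice@example.org ", key)
print(first == second)  # the same person yields the same identifier
```

Using a keyed hash rather than a plain one makes it harder for a third party to recover addresses by hashing a list of known candidates.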
    • Retrieving information: general problems
      • Diversity:
        • Kinds of forges: difficult to automate
        • Kinds of projects: not all projects in SourceForge are relevant
        • Sources for the same project: forge(s), distributions...
      • Missing information:
        • Hidden information (e.g. mail headers)
        • Lost information (e.g. transition from CVS to SVN)
        • Bugs and errors (e.g. old locks in SCM)
      • Stress on project infrastructure!
    • Retrieving information: SCM problems
      • Different systems (CVS, Subversion, git, Bazaar, Mercurial, etc.)
      • Different models (file-based, commit-based, distributed)
      • Bots performing commits
      • Large transitions don't preserve information
      • Performance issues (systems poorly designed for massive retrieval)
      • But at least we have facilities for incremental retrieval
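Two of the SCM points above, incremental retrieval and commits made by bots, can be sketched together. This is a minimal illustration, not how CVSAnalY works: the `Commit` record, the `BOT_AUTHORS` set and the function name are all hypothetical, and real bot detection usually needs project-specific knowledge.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Commit:
    revision: int
    author: str
    message: str

# Hypothetical account names. Each project has its own bots.
BOT_AUTHORS = {"buildbot", "l10n-bot"}

def new_human_commits(log, last_seen_revision):
    """Return commits newer than last_seen_revision that were not made by
    known bots, plus the revision to remember for the next run."""
    fresh = [c for c in log if c.revision > last_seen_revision]
    humans = [c for c in fresh if c.author.lower() not in BOT_AUTHORS]
    new_last = max((c.revision for c in fresh), default=last_seen_revision)
    return humans, new_last

log = [
    Commit(1, "alice", "initial import"),
    Commit(2, "buildbot", "nightly version bump"),
    Commit(3, "bob", "fix typo"),
]
humans, last = new_human_commits(log, last_seen_revision=1)
print([c.author for c in humans], last)
```

Storing only the last revision seen means a later run asks the repository for new commits alone, which also reduces the load on the project's infrastructure.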
    • Retrieving information: BTS problems
      • Different systems (Bugzilla, SourceForge, GForge, trac, Launchpad, etc.)
      • Different models (bug cycle, bug report parameters, etc.)
      • Different uses (issue tracker, only bugs, scheduler, etc.)
      • Bots acting on bug reports
      • Lack of facilities for incremental retrieval
      • Performance issues (systems not really designed for massive retrieval)
    • Retrieving information: Mailing lists problems
      • Different systems (usually accessible only through HTML)
      • Partial information (missing headers)
      • Bots sending email (e.g. commit messages)
      • Spam (mixed with real messages)
      • But email messages are pretty uniform in format
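Because email messages share a uniform header format, the main headers can be extracted with a standard parser. The sketch below uses Python's standard `email` package rather than MLStat. The sample message and the `main_headers` helper are made up for illustration, and missing headers simply come back as `None` or an empty string.

```python
from email import message_from_string
from email.utils import parseaddr

# Made-up sample message, for illustration only.
RAW = """From: Alice Example <alice@example.org>
To: dev@example.org
Subject: [patch] fix build
Message-ID: <1@example.org>
Date: Tue, 17 Nov 2009 10:00:00 +0100

Body text.
"""

def main_headers(raw):
    """Extract the main headers of one raw message, tolerating gaps."""
    msg = message_from_string(raw)
    name, address = parseaddr(msg.get("From", ""))
    return {
        "from_name": name,
        "from_address": address.lower(),
        "subject": msg.get("Subject"),
        "message_id": msg.get("Message-ID"),
        "date": msg.get("Date"),
    }

headers = main_headers(RAW)
print(headers["from_address"], headers["subject"])
```

Archives that are only reachable as HTML pages would first need scraping back into raw messages, which this sketch does not attempt.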
    • Retrieving information: All together
      • How to track actors and products:
        • Different repositories of the same project
        • Different projects
      • SourceForge helps a bit!
      • Massive information (when dealing with 1,000s of projects)
      • Exchange formats (for third parties and reproduction)
      • Tracking information (where did this commit record come from?):
        • Repositories change
        • Retrieval tools change
        • Errors do occur
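Tracking the same actor across repositories can be illustrated with a deliberately naive heuristic: group identities by normalized email address and remember which source each one came from. The slides do not describe the matching method FLOSSMetrics uses, so the function name, record layout and sample data below are assumptions.

```python
from collections import defaultdict

def merge_identities(records):
    """Group (source, name, email) records by lower-cased email address.

    Keeping the source alongside each name preserves where the
    identity was observed.
    """
    actors = defaultdict(set)
    for source, name, email in records:
        actors[email.strip().lower()].add((source, name))
    return dict(actors)

# Made-up sample data, for illustration only.
records = [
    ("scm", "alice", "Alice@Example.org"),
    ("mailing-list", "Alice Example", "alice@example.org"),
    ("bug-tracker", "bob", "bob@example.org"),
]
merged = merge_identities(records)
print(len(merged))
```

A heuristic like this misses people who use several addresses, and it cannot work at all where addresses are hidden, so real identity matching needs more signals than this sketch uses.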
    • Interested?
      • Detailed description of the work is available from the website
      • All the software used is libre software
      • Keep an eye on the website
      • Tell us about your pet project; we can analyze it
      • We want to know how this is useful for you: give us feedback about your interests and needs
      • Willing to collaborate with projects!
      http://flossmetrics.org
      http://forge.morfeo-project.org/projects/libresoft-tools/