Intensive Metrics for the Study of the Evolution
of Open Source Projects: Case Studies from the
ASF
Santiago Gala-Pérez (ASF), Gregorio Robles (URJC),
Jesús M. González-Barahona (URJC), Israel Herraiz (UPM)

10th Working Conference on Mining Software Repositories
SF, California, May 18th, 2013
Preprint available at http://oa.upm.es/14698/
Slides at http://slideshare.net/herraiz/intensive-metrics-software-evolution


Intensive metrics for open source evolution – http://oa.upm.es/14698/

Metrics for Software Evolution

Common metrics are extensive
Difficult to compare projects of different size
Successful projects undergo large size changes over their lifetime
Intensive metrics in natural sciences
Metrics not depending on the size of system
Scale invariant
Are there any intensive metrics for software?
Can we find intensive metrics to study software evolution?
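
To make the extensive/intensive distinction concrete, here is a minimal sketch using the textbook example from the natural sciences (mass, volume and density). The `Sample` type and the numbers are illustrative assumptions, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """A hypothetical physical sample; values are made up for illustration."""
    mass: float    # extensive: grows with the amount of material
    volume: float  # extensive: grows with the amount of material


def combine(a: Sample, b: Sample) -> Sample:
    # Joining two samples adds their extensive quantities.
    return Sample(a.mass + b.mass, a.volume + b.volume)


def density(s: Sample) -> float:
    # A ratio of two extensive quantities is intensive (scale invariant).
    return s.mass / s.volume


one = Sample(mass=3.0, volume=1.5)
two = combine(one, one)

print(one.mass, two.mass)          # extensive: the value doubles
print(density(one), density(two))  # intensive: the value is unchanged
```

The same idea motivates looking for a ratio of two size-dependent software measures.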

The case of the Apache Software Foundation
ASF members mailing list, November 29, 2008
Joe Schaeffer notes "something IMO interesting about the ASF: the fact that the number of
commits and the number of mailing list posts have grown in linear
relationship [...] over the years."

Goal of the paper
Ratio: communication flow / development activity
Hypothesis: the ratio is an intensive metric for software evolution
It varies with maturity, technology, and community composition
But not with project source code size
Case study: the ASF
Broad and diverse range of projects
Size, scope, technology, maturity

If it didn’t happen on-list, it didn’t happen
Communications between developers (decisions)
Issue trackers
Code review tools, automated builds, wiki page edits
Commits
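
As a rough sketch of how such a ratio could be computed, the snippet below counts mailing list messages and commits per calendar month and divides one by the other. The monthly window, the function names and the sample dates are assumptions made for illustration; the paper and its replication package define the actual procedure.

```python
from collections import Counter
from datetime import date


def monthly_counts(dates):
    """Count events per (year, month)."""
    return Counter((d.year, d.month) for d in dates)


def monthly_ratio(message_dates, commit_dates):
    """Messages per commit for every month that has any activity.

    A month with messages but no commits yields float('inf').
    """
    msgs = monthly_counts(message_dates)
    commits = monthly_counts(commit_dates)
    ratios = {}
    for month in sorted(set(msgs) | set(commits)):
        c = commits.get(month, 0)
        m = msgs.get(month, 0)
        ratios[month] = m / c if c else float("inf")
    return ratios


# Hypothetical data: 6 messages and 2 commits in January, 4 messages and
# no commits in February.
message_dates = [date(2013, 1, d) for d in range(1, 7)]
message_dates += [date(2013, 2, d) for d in range(1, 5)]
commit_dates = [date(2013, 1, 10), date(2013, 1, 20)]

ratios = monthly_ratio(message_dates, commit_dates)
for month, value in ratios.items():
    print(month, value)
```

In practice the dates would come from mailing list archives and the version control log; how automated messages are filtered is left out of this sketch.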
ASF projects under study

Project       kSLOC  Technology          Maturity            Scope
HTTPD           156  Web server          Active, long-lived  Users
APR              66  Library             Active, long-lived  Devs
Lucene          414  Index & search      Active, long-lived  Users
Turbine          41  Java web framework  Stagnated           Devs
Tomcat          213  Servlet API         Active, long-lived  Devs
Jackrabbit      344  JSR-170 ref. impl.  Active              Devs
Hadoop         1270  Big Data            Very active         Devs
Geronimo        370  JavaEE app. server  Active, long-lived  Devs
SpamAssassin     54  Spam filter         Mature              End users
Portals         202  Web framework       Nearly dead         Devs
Beehive          88  J2EE Struts         Attic               Devs

Ratio
How does the ratio evolve for these projects?

Apache httpd
156 kSLOC, active and long-lived web server

Apache Portable Runtime (APR)
66 kSLOC, active and long-lived library used by httpd and Subversion

Apache Hadoop
1270 kSLOC, very active development and community, higher presence of
non-human emails

Apache SpamAssassin
54 kSLOC, spam filter, intended for end users, maturing project

Apache Beehive
88 kSLOC, project in the Attic (no longer under development)

Overall comparison
Allows for comparison of projects with large differences in size, scope,
technology, maturity

Overall comparison
Lessons learned
Healthy Apache projects have smooth ratios
Projects with little activity, or small core group, are noisier
Peaks to infinity are evidence of stagnation
User-oriented projects
Evolution:
Starts with high values
Stabilizes and matures with 3 < ratio < 8
Developer-oriented projects
Evolution:
Smaller community, no peaks
Always within 3 < ratio < 8
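
The lessons above can be turned into a simple check on a ratio time series. The sketch below is only an illustration: the 3 to 8 band comes from the slides, but the `summarize` function, its output fields and the sample series are assumptions, not the authors' method.

```python
import math

# Band reported on the slides for mature and developer-oriented projects.
LOW, HIGH = 3.0, 8.0


def summarize(ratios):
    """Summarize a series of per-period messages/commits ratios.

    Infinite values stand for periods with messages but no commits,
    which the slides read as evidence of stagnation.
    """
    finite = [r for r in ratios if math.isfinite(r)]
    in_band = [r for r in finite if LOW < r < HIGH]
    return {
        "periods": len(ratios),
        "infinite_peaks": len(ratios) - len(finite),
        "share_in_band": len(in_band) / len(ratios) if ratios else 0.0,
    }


# Hypothetical series, not data from the paper.
developer_oriented = [4.0, 5.5, 6.0, 4.5]
stagnating = [5.0, 9.0, float("inf"), float("inf")]

print(summarize(developer_oriented))
print(summarize(stagnating))
```

A series that stays inside the band with no infinite peaks matches the healthy pattern described above; repeated infinite peaks would flag a project worth a closer look.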
Conclusions and further work
Metric
Intensive and expressive metric.
Not depending on size, maturity, scope or technology.

Stagnation
Can identify stagnated projects.
Can signal potential stagnation threats.

End-users
More suitable for user-oriented projects.
Ratio works better with large and active communities.

Other ratios, other cases
Devel-only messages, issues, commit complexity.
Study beyond the ASF.

Get a preprint of the paper at http://oa.upm.es/14698
Replication package
http://gsyc.es/~grex/repro/2013-apache-intensive/