This is my first in a series of 4 lectures on the topic of Evolving Software Ecosystems, presented during the NATO Marktoberdorf 2014 Summer School on Dependable Software System Engineering in Germany, August 2014.
4. July?August(2014(—(NATO(Marktoberdorf(Summer(School(—(Dependable(So#ware(Systems(Engineering
Ongoing(Research(Projects
ARC Project « Ecological Studies of Open Source Software
Ecoysystems »
- 2012-2017
- Interdisciplinary research
- Use ideas from biological ecology to understand and improve evolution/
maintenance of software ecosystems
!
FRFC Project « Data-Intensive Software System Evolution »
- 2013-2017, in collaboration with A. Cleve (University of Namur)
- Empirical study of evolution of data-intensive software systems
-Co-evolution between database and code
!
FRIA PhD Scholarship « Executable Modeling of Gestural Interaction
Applications »
- 2011-2015
- Using domain-specific modeling languages
- Based on high-level Petri nets and graph transformation
4
10. Long?term(goals
• Determine the main factors that drive
the success or failure of OSS projects
within their ecosystem
!
• Investigate new techniques and
mechanisms to predict and improve
quality and survival of OSS projects
– Inspired by research in biological ecology
10
informa7que.umons.ac.be/genlog/projects/ecos
12. Long?term(goals(
Research(Ques7ons
• Specific questions for Eclipse
• Software development environment
• Plugin framework architecture
• Which plugins pose less problems or survive
longer?
• e.g. based on their API usage
• stable and supported APIs versus unstable,
discouraged and unsupported APIs
12
Businge et al. “Survival of Eclipse third-party plug-ins”; “Co-evolution
of the Eclipse SDK Framework and Its Third-Party Plug-Ins”; “Analyzing
the Eclipse API Usage: Putting the Developer in the Loop”
Mens et al. “The Evolution of Eclipse”, ICSM 2008
13. Long?term(goals(
Research(Ques7ons
• Specific questions for CRAN
• R package archive network
• Which packages are more likely to cause, upon
update, problems in dependent packages?
13
Claes et al. “On the Maintainability of CRAN Packages”,
CSMR-WCRE 2014
German, Adams et al. “The Evolution of the R Software Ecosystem”,
CSMR 2013
16. Long?term(goals(
Research(Ques7ons
• CRAN (R package archive)
• R package description file format:
16
Package: pkgname!
Version: 0.5-1!
Date: 2004-01-01!
Title: My First Collection of Functions!
Authors@R: c(person("Joe", "Developer", role = c("aut", "cre"),!
! ! email = "Joe.Developer@some.domain.net"),!
! person("Pat", "Developer", role = "aut"),!
! person("A.", "User", role = "ctb",!
! ! email = "A.User@whereever.net"))!
Author: Joe Developer and Pat Developer, with contributions from A. User!
Maintainer: Joe Developer <Joe.Developer@some.domain.net>!
Depends: R (>= 1.8.0), nlme!
Suggests: MASS!
Description: A short (one paragraph) description of what!
the package does and why it may be useful.!
License: GPL (>= 2)!
URL: http://www.r-project.org, http://www.another.url!
BugReports: http://pkgname.bugtracker.url
17. Long?term(goals(
Research(Ques7ons
• CRAN (R package archive)
• Types of R package dependencies:
• Strong dependencies (required to install and load a
package)
• Depends: all packages the package directly depends
upon
• Imports: all packages whose namespaces are
imported from
• Weak dependencies
• Suggests: packages that are not necessarily needed.
E.g. they may be required for running some tests or
examples, but not for installing/loading
• Enhances: packages that enhance other packages,
e.g., by providing methods for classes from these
packages
17
19. Long?term(goals(
Research(Ques7ons
• Specific questions for Debian
• Open source Linux distribution
• Which packages are more likely to cause future
co-installation conflicts with other packages?
• Can I upgrade a set of installed Debian packages
without “breaking” my installation?
• Based on a formalisation and SAT solving
19
Vouillon and Di Cosmo, “Broken Sets in
Sotware Repository Evolution”, ICSE 2013
21. Long?term(goals(
Research(Ques7ons
• Specific questions for GNOME
• Linux desktop environment
• Which projects have a higher chance of
survival?
• How is workload distributed over different
projects/contributors?
• What is the “bus factor” risk? Who are the top
contributors (for a specific activity type)?
21
“Vasilescu et al. “On the variation and specialisation
of workload: A case study of the GNOME ecosystem
community”, Emp. Softw. Eng. journal 2014.
24. a
b
c
d
e
Birth First Release Maintenance Major Release
Long?term(goals(
Tooling
• Develop automated tools
– E.g. Complicity (Neu et al., University of Lugano)
24
1
2
3
Fig. 1. Main View of Complicity visualizing the GNOME projects at ecosystem level, and the details of the Gimp project
25. Long?term(goals(
Tooling
• Develop automated tools
– E.g. SECONDA for Gnome
25
VI. WORK IN PROGRESS
The current version of the tool is under active de-
velopment and changing every week. Currently, we are
working on the following issues that will be integrated
in future releases of SECONDA: integrate the identity
matching algorithm (already implemented) as a postpro-
cessing phase of the data extraction module; implement
an incremental version of the data extraction and metrics
computation to accommodate new projects’ revisions
without needing to recompute everything everytime,
thereby saving bandwith, time and memory; integrate the
community and developer analysis; integrating the sta-
tistical analysis module; implement a reporting module;
analyse other ecosystems than GNOME; support other
types of version repositories, and integrate information
from bug trackers, mailing lists and development fora.
IV. DATA EXTRACTION AND METRICS COMPUTATION
SECONDA provides two types of manipulation of the
data extracted from GIT repositories: global analysis,
which clones the GIT repository and performs a global
coarse-grained analysis; and local analysis, which anal-
yses the software projects in the local repository clone
at a fine-grained level. Although both analyses can be
used independently, it is advisable to first carry out the
global analysis, and then request a local analysis for
those projects that deserve more attention.
A. Global analysis
The data extraction module downloads and maintains
a local cached copy of the GNOME repository on which
the metrics module runs a coarse-grained analysis, using
SLOCCOUNT for obtaining the projects’ size metrics,
for the latest revision of each project, and with GIT for