Evolving Software Ecosystems
Marktoberdorf Summer School 2014

Lecture 4
Tom Mens
Software Engineering Lab
University of Mons
informatique.umons.ac.be/genlog
Ecosystem Measures
• The characteristics of a software ecosystem can be measured in different ways
– Using traditional software quality metrics
– Using ecological diversity metrics
– Using econometrics
Ecosystem Measures
Software Quality Metrics
July-August 2014, NATO Marktoberdorf Summer School, Dependable Software Systems Engineering
• Software product (code) metrics
– size metrics, e.g. LOC, NOM
– complexity metrics, e.g. cyclomatic complexity
– coupling and cohesion metrics, e.g. LCOM, CBO
– dependency metrics, e.g. fan-in, fan-out
Ecosystem Measures
Software Quality Metrics
• Side remark
• The distribution of most of these metrics is highly skewed
• Traditional aggregation measures (mean, median) are only reliable for centralised distributions
• We need other aggregation measures for skewed distributions
Mordal et al. “Software quality metrics aggregation in industry”, J. Software: Evolution and Process (2012)
Ecosystem Measures
Measuring Diversity
Many different diversity metrics:
• species richness
– the number of different species represented in an ecological community
• species evenness (entropy)
– the relative abundance of the population of each species in the ecosystem
• Shannon diversity index (relative entropy)
– how specialised a given species is in relation to the species in the other level
• Simpson index
– the degree of concentration when individuals are classified into species
Measuring Diversity
Evenness
• Quantifies the relative abundance of the population of each species in the ecosystem
– Maximum evenness if all species are equally abundant (i.e., have the same number of individuals)
– Low evenness if some species dominate the others
• Can be measured using Shannon’s notion of information entropy
Measuring Diversity
Evenness
Based on Shannon’s notion of information entropy and the 2nd law of thermodynamics:

$$H(X) = -\sum_{i=1}^{n} p(x_i) \ln p(x_i)$$

where X = set of n distinct species x_i
p(x_i) = proportion of all individuals that belong to species x_i

Quantifies the uncertainty in predicting the species identity of an individual that is taken at random from the dataset.
[Photo: Claude Shannon, 1916-2001]
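As an illustration, the entropy H(X) = -Σ p(x_i) ln p(x_i) can be sketched in a few lines of Python. This is a minimal sketch, not part of the original slides: the function name `shannon_entropy` and the two sample populations are hypothetical and only serve to contrast an even with a skewed population.

```python
import math
from collections import Counter

def shannon_entropy(species_labels):
    """Return H(X) = -sum p(x_i) ln p(x_i) for a list of species labels."""
    counts = Counter(species_labels)
    total = sum(counts.values())
    # Only species that actually occur are in `counts`, so every p is > 0.
    return -sum((c / total) * math.log(c / total) for c in counts.values())

# Hypothetical populations of 20 individuals each.
even = ["a", "b", "c", "d"] * 5          # four equally abundant species
skewed = ["a"] * 17 + ["b", "c", "d"]    # one species dominates

h_even = shannon_entropy(even)       # maximum evenness: equals ln(4)
h_skewed = shannon_entropy(skewed)   # lower evenness: strictly smaller
```

With n equally abundant species the entropy reaches its maximum ln n, which is why the even population scores higher than the skewed one.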
Measuring Diversity
Evenness
Dual views in a software ecosystem
• Based on species analogy
✦ Contributors are species that thrive in their environment of projects
✦ Projects are species that thrive in their environment of contributors (human resources)
[Figure: bipartite contributor-project graph linking contributors to project 1, project 2 and project 3]
Measuring Diversity
Evenness
Two dual measures of entropy
• Based on bipartite author (contributor) - module (project) graph
M = set of n distinct modules m_i
A = set of k distinct authors a_j
M_i = # commits to module m_i
A_j = # commits by author a_j
a_ij = (# commits to module m_i by author a_j) / A_j
m_ij = (# commits to module m_i by author a_j) / M_i
• Author diversity

$$H_{a_j} = -\sum_{i=1}^{n} a_{ij} \ln a_{ij}$$

Low diversity if one module dominates most of the author’s commit activity
• Module diversity

$$H_{m_i} = -\sum_{j=1}^{k} m_{ij} \ln m_{ij}$$

Low diversity if one author dominates most of the module’s commit activity
[Figure: bipartite graph linking authors to module 1, module 2 and module 3]
Posnett et al. Dual ecological measures of focus in software development. ICSE 2013
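The two dual entropies can be sketched from a commit matrix. This is a minimal sketch under stated assumptions: the matrix layout (rows are modules, columns are authors), the function names and the sample data are all hypothetical, and every author and module is assumed to have at least one commit.

```python
import math

def entropy(proportions):
    """Shannon entropy (natural log); zero proportions contribute nothing."""
    return -sum(p * math.log(p) for p in proportions if p > 0)

def author_diversity(commits, j):
    """H_{a_j}: entropy of a_ij = commits[i][j] / A_j over all modules i."""
    column = [row[j] for row in commits]
    a_j = sum(column)                      # A_j = # commits by author j
    return entropy([c / a_j for c in column])

def module_diversity(commits, i):
    """H_{m_i}: entropy of m_ij = commits[i][j] / M_i over all authors j."""
    m_i = sum(commits[i])                  # M_i = # commits to module i
    return entropy([c / m_i for c in commits[i]])

# Hypothetical data: rows are modules m_1..m_3, columns are authors a_1..a_2.
commits = [
    [10, 5],
    [0, 5],
    [0, 5],
]

h_author_1 = author_diversity(commits, 0)   # works on one module only
h_author_2 = author_diversity(commits, 1)   # spreads evenly over 3 modules
h_module_1 = module_diversity(commits, 0)   # shared unevenly by 2 authors
h_module_2 = module_diversity(commits, 1)   # touched by one author only
```

In this sample the first author has zero diversity because all of that author's commits go to a single module, while the second author reaches the maximum ln 3.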
Measuring Diversity
Shannon’s diversity index
Expresses how specialised a given species is in relation to the species in the other level
Using a notion of relative entropy
Taking into account the contributor-project duality
[Figure: bipartite graph linking contributors to Project 1, Project 2 and Project 3]
[Figure: bipartite graph linking the authors Thiruvalluvan, Douglas and Phillip to the Avro modules avro.genavro, avro.io.parsing, avro.io, avro.generic, avro.reflect, avro.specific, avro, avro.file, avro.tool, avro.util, avro.mapred.tether, avro.mapred, default, avro.idl, avro.ipc, avro.ipc.trace and avro.ipc.stats]
Shannon’s diversity index
Relative Entropy
Specialisation of a species relative to the species in the other level
Takes into account the interaction between authors and modules as well as the overall amount of activity per author or module.
– M_i and A_j defined as before
– m_ij and a_ij defined as before
– C = total # commits
• Author (contributor) specialisation

$$F_{a_j} = -\sum_{i=1}^{n} a_{ij} \ln \frac{a_{ij}}{M'_i}$$

• Module (project) specialisation

$$F_{m_i} = -\sum_{j=1}^{k} m_{ij} \ln \frac{m_{ij}}{A'_j}$$

(M'_i and A'_j are not defined on the slide; presumably they are the normalised totals M_i / C and A_j / C.)
Shannon’s diversity index
Relative Entropy
• Attention focus = normalisation of specialisation by the theoretical maximum and minimum possible values
• Findings by Posnett et al.
– Project leaders and top contributors tend to exhibit lower attention focus than others.
– Narrowly focused developers introduce fewer defects.
– Increased module activity focus results in a greater number of defects.
Can be computed with R package ‘bipartite’
Measuring Diversity
Simpson index
» Measures the degree of concentration when individuals are classified into species
• I.e., the probability that two individuals taken at random from the dataset belong to the same species
• Is minimal when all species are equally abundant
• For small datasets:

$$\ell = \frac{\sum_{i=1}^{R} n_i (n_i - 1)}{N (N - 1)}$$

• For large datasets:

$$\lambda = \sum_{i=1}^{R} \left(\frac{n_i}{N}\right)^2$$

• R = number of species types
• N = number of entities in the dataset
• n_i = number of entities belonging to the ith species type
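A minimal Python sketch of the Simpson index follows, using the two standard forms: sampling without replacement for small datasets and sampling with replacement for large ones. The function names and the sample counts are hypothetical; `counts` holds the n_i values, one per species type.

```python
def simpson_small(counts):
    """Sum of n_i (n_i - 1) divided by N (N - 1): draws without replacement."""
    n_total = sum(counts)
    return sum(n * (n - 1) for n in counts) / (n_total * (n_total - 1))

def simpson_large(counts):
    """Sum of (n_i / N)^2: draws with replacement."""
    n_total = sum(counts)
    return sum((n / n_total) ** 2 for n in counts)

# Hypothetical datasets with R = 4 species types and N = 20 entities.
even_counts = [5, 5, 5, 5]
skewed_counts = [17, 1, 1, 1]

even_large = simpson_large(even_counts)      # 1/R, the minimum for R types
skewed_large = simpson_large(skewed_counts)  # much closer to 1
even_small = simpson_small(even_counts)
```

Both forms estimate the probability that two randomly drawn individuals belong to the same species; they differ noticeably only when N is small.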
Ecosystem Measures
Econometrics
• Econometrics are measures used in economics
• Well-known examples
– Pareto principle
– Inequality indices
Econometrics
Pareto Principle
• A.k.a. the 80-20 rule
• Roughly 80% of the effects come from 20% of the causes.
• Often coincides with a power law distribution
Examples
• 80% of land owned by 20% of the population
• 80% of sales come from 20% of clients
• 80% of crashes come from the 20% most reported bugs
[Photo: Pareto]
Econometrics
Pareto Principle
Example in the GNOME ecosystem
• 20% of all contributors account for about 80% of the total workload in the GNOME code repository
[Figure: cumulative percentage of workload (y-axis, 0 to 1) plotted against cumulative percentage of contributors (x-axis, 0 to 1)]
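A quick way to check the 80-20 rule on activity data is to compute the share of the total workload produced by the most active 20% of contributors. The sketch below is illustrative only: the function name `top_share` and the commit counts are hypothetical, not GNOME data.

```python
def top_share(workloads, fraction=0.2):
    """Share of the total workload produced by the most active `fraction`."""
    ranked = sorted(workloads, reverse=True)
    k = max(1, round(fraction * len(ranked)))   # size of the top group
    return sum(ranked[:k]) / sum(ranked)

# Hypothetical commit counts for 10 contributors.
commits_per_contributor = [400, 250, 50, 40, 30, 20, 15, 10, 5, 5]

share = top_share(commits_per_contributor)   # top 2 of 10 contributors
```

A share close to 0.8 is consistent with the Pareto principle; a perfectly even distribution would give a share equal to the fraction itself.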
Econometrics
Pareto Principle
• Example for individual GNOME projects
– Brasero
– Evince
• Analysing different data sources
– Commits in a version control repository
– Mails in a mailing list
– Issue reports in a bug tracker
• Pareto principle is confirmed in all cases
Evidence for the Pareto principle
in Open Source Software Activity
Mathieu Goeminne and Tom Mens
Institut d’Informatique, Faculté des Sciences
Université de Mons – UMONS
Mons, Belgium
{ mathieu.goeminne | tom.mens }@umons.ac.be
Abstract—Numerous empirical studies analyse evolving open
source software (OSS) projects, and try to estimate the activity
and effort in these projects. Most of these studies, however, only
focus on a limited set of artefacts, being source code and defect
data. In our research, we extend the analysis by also taking into
account mailing list information. The main goal of this article
is to find evidence for the Pareto principle in this context, by
studying how the activity of developers and users involved in
OSS projects is distributed: it appears that most of the activity
is carried out by a small group of people. Following the GQM
paradigm, we provide evidence for this principle. We selected
a range of metrics used in economy to measure inequality in
distribution of wealth, and adapted these metrics to assess how
OSS project activity is distributed. Regardless of whether we
analyse version repositories, bug trackers, or mailing lists, and
for all three projects we studied, it turns out that the distribution
of activity is highly imbalanced.
Index Terms—software evolution, activity, software project,
data mining, empirical study, open source software, GQM, Pareto
I. INTRODUCTION
Numerous empirical studies aim to understand and model
how open source software (OSS) evolves over time [1]. In
order to gain a deeper understanding of this evolution, it
is essential to study not only the software artefacts that
evolve (e.g. source code, bug reports, and so on), but also
their interplay with the different project members (mainly
developers and users) that communicate (e.g., via mailing lists)
and collaborate in order to construct and evolve the software.
In this article, we wish to understand how activity is spread
over the different members of an OSS project, and how this
activity distribution evolves over time. Our hypothesis is that
the distribution of activity follows the Pareto principle, in the
sense that there is a small group of key persons that carry
out most of the activity, regardless of the type of considered
activity. To verify this hypothesis, we carry out an empirical
study based on the GQM paradigm [2]. We rely on concepts
borrowed from econometrics (the use of measurement in
economy), and apply them to the field of OSS evolution.
In particular, we apply indices that have been introduced
for measuring distribution (and inequality) of wealth, and
use them to measure the distribution of activity in software
development.
The remainder of this paper is structured as follows. Sec-
tion II explains the methodology we followed and defines
the metrics that we rely upon. Section III presents the ex-
perimental setup of our empirical study that we have carried
out. Section IV presents the results of our analysis of activity
distribution in three OSS projects. Section V discusses the
evidence we found for the Pareto principle. Section VI presents
related work, and Section VII concludes.
II. METHODOLOGY
A. GQM paradigm
To gain a deeper understanding of how OSS projects evolve,
we follow the well-known Goal-Question-Metric (GQM)
paradigm. Our main research Goal is to understand how ac-
tivity is distributed over the different stakeholders (developers
and users) involved in OSS projects. Once we have gained
deeper insight in this issue, we will be able to exploit it to
provide dedicated tool support to the OSS community, e.g.,
by helping newcomers to understand how the community is
structured, by improving the way in which the community
members communicate and collaborate, by trying to reduce
the potential risk of the so-called bus factor¹, and so on.
To reach the aforementioned research goal, we raise the
following research Questions:
1) Is there a core group of OSS project members (develop-
ers and/or users) that are significantly more active than
the other members?
2) How does the distribution of activity within an OSS
community evolve over time?
3) Is there an overlap between the different types of activity
(e.g., committing, mailing, submitting and changing bug
reports) the community members contribute to?
4) How does the distribution of activity vary across differ-
ent OSS projects?
As a third step, we need to select appropriate Metrics that
will enable us to provide a satisfactory answer to each of the
above research questions. For our empirical study, we will
make use of basic metrics to compute the activity of OSS
project members, and aggregate metrics that allow us to com-
pare these basic metric values across members (to understand
how activity is distributed), over time (to understand how they
¹ The bus factor refers to the total number of key persons (involved in the
project) that would, if they were to be hit by a bus, lead the project into
serious problems
SQM 2011
Econometrics
Pareto Principle
[Figure: two panels, Brasero and Evince, both axes from 0 to 1; series: commits, mails, bug report changes]
Econometrics
Lorenz curve
• A graphical representation for a cumulative distribution of values
• Example for income/wealth distribution
– A point (x,y) on the graph indicates that the poorest x% of persons have a total cumulative income of y%.
• Example for ecology/biodiversity
– The cumulative proportion of species is plotted against the cumulative proportion of individuals.
• Can be used to check the Pareto principle
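The points of a Lorenz curve can be computed by sorting the values in increasing order and accumulating them. This is a minimal sketch; the function name `lorenz_points` and the sample values are hypothetical.

```python
def lorenz_points(values):
    """Return the (x, y) points of the Lorenz curve, starting at (0, 0)."""
    ordered = sorted(values)           # poorest (least active) first
    total = sum(ordered)
    n = len(ordered)
    points = [(0.0, 0.0)]
    running = 0
    for k, v in enumerate(ordered, start=1):
        running += v
        # The poorest k/n of the population hold running/total of the sum.
        points.append((k / n, running / total))
    return points

equal = lorenz_points([1, 1, 1, 1])      # lies on the diagonal y = x
skewed = lorenz_points([0, 0, 0, 10])    # stays at 0 until the last point
```

The further the curve sags below the diagonal, the more unequal the distribution. Reading the curve at x = 0.8 shows what share the bottom 80% hold, which gives a direct check of the Pareto principle.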
Econometrics
Inequality Indices
• Are used to measure the amount of inequality in a statistical distribution
– Examples: Gini, Theil, Hoover, Kolm, Atkinson, …
• Values typically range between 0 and 1
– 0 = perfect equality
– 1 = maximal inequality
• Are useful for skewed distributions, where use of mean and median as aggregation measure is not very meaningful
• Are all correlated, in practice …
Econometrics
Inequality Indices
• Examples (and definitions): Gini, Theil, Atkinson, Hoover, Kolm
• Inequality indices have been used in empirical software engineering to study the evolution of software metrics
Inequality Indices
Gini coefficient
• Gini coefficient measures the inequality among values of a frequency distribution
– 0 = perfect equality
– 1-1/n = maximal inequality (for n values)
• Is computed based on the areas above and below the Lorenz curve:
Gini = A / (A+B)
where A is the area between the line of perfect equality and the Lorenz curve, and B is the area under the Lorenz curve
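The ratio Gini = A / (A+B) can be sketched by integrating the Lorenz curve with the trapezoid rule. This is a minimal sketch, assuming non-negative values with a positive sum; the function name `gini` and the sample inputs are hypothetical.

```python
def gini(values):
    """Gini coefficient computed as A / (A + B) from the Lorenz curve."""
    ordered = sorted(values)
    n = len(ordered)
    total = sum(ordered)
    # Area B under the Lorenz curve: trapezoid rule over n steps of width 1/n.
    area_b = 0.0
    cum_prev = 0.0
    for v in ordered:
        cum_next = cum_prev + v / total
        area_b += (cum_prev + cum_next) / (2 * n)
        cum_prev = cum_next
    # Area A between the diagonal (perfect equality) and the Lorenz curve.
    area_a = 0.5 - area_b
    return area_a / (area_a + area_b)

g_equal = gini([3, 3, 3, 3])     # perfect equality
g_max = gini([0, 0, 0, 10])      # one value holds everything, n = 4
```

For n = 4 the maximally unequal case yields 1 - 1/4 = 0.75, which matches the upper bound 1-1/n.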
ution profiles similar to the ones we observed
fortunately, the number of freely-available,
ystems developed in C# framework that met
criteria is rather limited. So, we began our
tems that were originally written in Java and
ed to the .NET platform in order to take ad-
the knowledge gained in the analysis of their
a counterparts.
ET metrics extraction, we used CLI [18], an
der library that provides access to both the
byte code. We added a small wrapper for the
f the Gini coefficients and stored the resulting
file for further processing with JSeat.
ed metrics data from four .NET systems:
NHibernate, SharpDevelop, and NAnt. The
ur 10 measures produced Gini coefficients
he ones determined for Java systems. How-
re also exceptions. We observed a shift ex-
i.e., individual Gini coefficients doubled in
most all measures in NAnt version 0.8.3-rc1.
fficients stayed high until version 0.84-rc1,
sumed “normal” values again. An inspection
per logs provided an explanation: in version
NAntContrib project was integrated into the
tion. This project defines a number of utili-
trics exhibit very uneven distribution profiles
changes do happen and may result in significant fluctuations in Gini coefficients that warrant a deeper analysis (see Figure 4 showing selected Gini profiles for 51 consecutive releases of the Spring framework). But why do we see such a remarkable stability of Gini coefficients?
Figure 4. Selected Gini profiles in Spring.
Developers accumulate system competence over time. Proven techniques to solve a given problem prevail, where untested or weak practices have little chance of survival. If a team has historically built software in a certain way, then it will continue to prefer a certain approach over others. Moreover, we can expect that most problems in a given domain are similar, hence the means taken to tackle them would be similar, too. Tversky and Kahneman coined the
Vasa et al. Comparative analysis of evolving software systems using the Gini coefficient. ICSM 2009
Inequality Indices
Gini coefficient
[Figure: evolution of the Gini coefficient (y-axis, 0 to 1) over time in two panels, Gnome/Brasero (Oct-06 to Oct-10) and Gnome/Evince (Oct-99 to Oct-10); series: commits, mails, bug tracker changes]
Goeminne et al. Evidence for the Pareto principle in open source software activity. SQM 2011
Inequality Indices
Theil index
• Is defined as
[Formula: definition of the Theil index]
and gives a value between 0 and ln N
• Corresponds to the notion of redundancy in information theory
• Normalised Theil index is obtained by dividing by ln N and gives values between 0 and 1
– 0 = equal distribution
– 1 = unequal distribution
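A minimal Python sketch of the normalised Theil index follows. It assumes the standard Theil T form, T = (1/N) Σ (x_i / μ) ln(x_i / μ) with μ the mean of the values, which is consistent with the stated range from 0 to ln N; whether the slide used exactly this form is an assumption. The function names and the sample values are hypothetical, and N is assumed to be at least 2.

```python
import math

def theil(values):
    """Theil T index; terms with x_i = 0 contribute nothing (0 ln 0 = 0)."""
    n = len(values)
    mu = sum(values) / n
    return sum((x / mu) * math.log(x / mu) for x in values if x > 0) / n

def normalised_theil(values):
    """Theil index divided by its maximum ln N, giving a value in [0, 1]."""
    return theil(values) / math.log(len(values))

t_equal = normalised_theil([2, 2, 2, 2])   # equal distribution
t_max = normalised_theil([0, 0, 0, 8])     # one individual holds everything
```

When a single individual holds the whole total, the unnormalised index reaches ln N, so the normalised index is 1.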
Inequality Indices
Theil index
[Figure panel: Evince; series: commits sent, e-mails sent, bug reports modified]
[Figure panel: Brasero]
Evolution of Theil index for 2 GNOME projects
SQM 2011
Econometrics
Inequality Indices
Example: Comparison of (evolution of) inequality indices for Evince
[Figure: Gini, Hoover and Theil (normalised) indices for Evince (y-axis, 0 to 1) from Apr-99 to Aug-08]
Software Ecosystems
Case Study: GNOME
Vasilescu et al. On the variation and specialisation of workload: A case study of the GNOME ecosystem community. Emp. Softw. Eng. 2014
Overall goal revisited
Improve support (tools/guidelines/models/…) for dealing with changes in open source software ecosystems
– Improve chance of survival of a project within its ecosystem
– Improve resilience of an ecosystem as a whole
– Allow changes to be made more effectively (e.g. higher productivity, faster reaction to/implementation of change/bug requests)
– Increase accuracy of effort/cost estimation models, defect prediction models and so on
Case Study: GNOME
Observation: existing generic support does not take the specificities of the ecosystem into account, making the support suboptimal.
Assumption: specialised ecosystem-specific change support will be more effective
Consequence: We need to understand the socio-technical specificities of the ecosystem under study in order to provide more effective change support.
This is what we will do for the GNOME ecosystem.
Case Study: GNOME
Some references
To appear in 2013 in Springer’s Empirical Software Engineering journal
On the variation and specialisation of workload – A
case study of the Gnome ecosystem community
Bogdan Vasilescu · Alexander Serebrenik ·
Mathieu Goeminne · Tom Mens
DOI: 10.1007/s10664-013-9244-1
Abstract Most empirical studies of open source software repositories focus on the analysis of isolated projects, or restrict themselves to the study of the relationships between technical artifacts. In contrast, we have carried out a case study that focuses on the actual contributors to software ecosystems, being collections of software projects that are maintained by the same community. To this aim, we defined a new series of workload and involvement metrics, as well as a novel approach, T̃-graphs, for reporting the results of comparing multiple distributions. We used these techniques to statistically study how workload and involvement of ecosystem contributors varies across projects and across activity types, and we explored to which extent projects and contributors specialise in particular activity types. Using Gnome as a case study we observed that, next to coding, the activities of localization, development documentation and building are prevalent throughout the ecosystem. We also observed notable differences between frequent and occasional contributors in terms of the activity types they are involved in and the number of projects they contribute to. Occasional contributors and contributors that are involved in many different projects tend to be more involved in the localization activity, while frequent contributors tend to be more involved in the coding activity in a limited number of projects.
Keywords open source · software ecosystem · metrics · developer community ·
case study
B. Vasilescu and A. Serebrenik
MDSE, Eindhoven University of Technology, PO Box 513, 5600 MB Eindhoven, The Nether-
UMONS
Faculté des Sciences
Département d’Informatique
Understanding the Evolution of
Socio-technical Aspects in Open Source
Ecosystems: An Empirical Analysis of
GNOME
Mathieu Goeminne
A dissertation submitted in fulfillment of the requirements of
the degree of Docteur en Sciences
Advisor Jury
Dr. TOM MENS Dr. XAVIER BLANC
Université de Mons, Belgium Université de Bordeaux 1, France
Dr. VÉRONIQUE BRUYÈRE
Université de Mons, Belgium
Dr. JESUS M. GONZALEZ-BARAHONA
Universidad Rey Juan Carlos, Spain
Dr. TOM MENS
Université de Mons, Belgium
Dr. ALEXANDER SEREBRENIK
Technische Universiteit Eindhoven, The Netherlands
Dr. JEF WIJSEN
Université de Mons, Belgium
June 2013
A historical dataset for GNOME contributors
Mathieu Goeminne, Maëlick Claes and Tom Mens
Software Engineering Lab, COMPLEXYS research institute, UMONS, Belgium
Abstract: We present a dataset of the open source software ecosystem GNOME from a social point of view. We have collected historical data about the contributors to all GNOME projects stored on git.gnome.org, taking into account the problem of identity matching, and associating different activity types to the contributors. This type of information is very useful to complement the traditional, source-code related information one can obtain by mining and analyzing the actual source code. The dataset can be obtained at https://bitbucket.org/mgoeminne/sgl-flossmetric-dbmerge.
I. INTRODUCTION
In this paper, we present the process we have used to create a dataset containing the historical information related to contributors to the GNOME ecosystem. Our database and the tools and scripts used to create it can be found on a dedicated Bitbucket repository².
In contrast to many other datasets, we do not focus on source code, since a significant amount of files committed to GNOME’s project repositories do not even contain code (e.g., image files, web pages, documentation, localization and many more). Such type of information is often ignored in MSR research while it is very relevant to understand which types of activities contributors are
MSR 2013
Case Study: GNOME
Characteristics
Open source desktop environment for Linux
• > 16 years of activity (1997 → …)
• Projects (Git repositories stored at http://git.gnome.org)
– > 1400 projects
• Contributors
– > 11000 contributor accounts
– after identity merging, > 5800 contributors
– after filtering code activity, > 4300 coders
• Commits and file touches
– > 1.3M commits (of which > 0.6M code commits)
– > 12M file touches (of which > 6M code file touches)
Case Study: GNOME
Characteristics
[Figure: Gnome use case]
C
Java
Objective C
Python
Lisp
JS
ASP.Net
C/C++ Header
C++
Perl
yacc
C#
IDL
Haskell
Objective C++
lexAssembly
Visual Basic
PHP
Ruby
Tcl/Tk
1e+05
1e+07
100 1000 10000
Files
LOC
Case(Study:(GNOME(
Programming(language
130
Rela7on(between(programming(language(used(and(code(size
Mainly6C/C++

and6Python
Case Study: GNOME
Characteristics
Dataset shared on https://bitbucket.org/mgoeminne/sgl-flossmetric-dbmerge/downloads
FLOSSMetrics compliant MySQL database
Goeminne et al. “A historical dataset for GNOME contributors”, MSR 2013
Case Study: GNOME
Characteristics
Bipartite contributor-project graph
[Figure: contributors linked to project 1, project 2, project 3, ...]
> 5800 contributors (> 4300 coders)
> 1400 projects
Case Study: GNOME
Workload Distribution
How is workload distributed over different authors and projects?
Case Study: GNOME
Workload Distribution
How is workload distributed over different authors and projects per activity type?
[Figure labels: Image, Code, Documentation, Translation]
Case Study: GNOME
Workload Distribution
How is workload distributed over different authors and projects per activity type?
Two dual views (cf. bipartite contributor-project graph)
- Distribution of workload over different projects per activity type
- Distribution of workload over different authors per activity type
Case Study: GNOME
Workload Distribution
- Extract file information for each commit in the git repository of each GNOME project
- Associate a unique activity type t to each file
- Count the number of file touches
Based on [Robles2006]
[Figure: files (e.g. /foo/bar.c) are matched against rules (e.g. .*.c -> CODE) to obtain an activity (CODE)]
Case Study: GNOME
Workload Distribution
- Extract file information for each commit in the git repository of each GNOME project
- Associate a unique activity type t to each file
- Count the number of file touches

Basic workload metric:
APTW(a,p,t) = number of file touches of an author a for a given project p and activity type t

Derived metrics: sum and Gini coefficient
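The three steps and the APTW metric above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rule list, the activity type names other than CODE, the sample file touches and the helper names (`activity_type`, `aptw`, `total_by`) are hypothetical and not taken from the original study.

```python
import re
from collections import Counter

# Hypothetical classification rules: (regular expression, activity type).
# The first matching rule wins, so each file gets exactly one activity type.
RULES = [
    (re.compile(r".*\.(c|h|py)$"), "CODE"),
    (re.compile(r".*\.po$"), "TRANSLATION"),
    (re.compile(r".*\.(png|svg)$"), "IMAGE"),
]

def activity_type(path):
    """Return the activity type of a file path, or UNKNOWN if no rule matches."""
    for pattern, activity in RULES:
        if pattern.match(path):
            return activity
    return "UNKNOWN"

def aptw(file_touches):
    """Count file touches per (author, project, activity type)."""
    counts = Counter()
    for author, project, path in file_touches:
        counts[(author, project, activity_type(path))] += 1
    return counts

def total_by(counts, index):
    """Derived 'sum' metric: aggregate APTW over one key position."""
    totals = Counter()
    for key, n in counts.items():
        totals[key[index]] += n
    return totals

# Made-up file touches: (author, project, touched file).
touches = [
    ("alice", "gtk", "/foo/bar.c"),
    ("alice", "gtk", "/foo/bar.c"),
    ("alice", "glib", "/po/fr.po"),
    ("bob", "gtk", "/po/de.po"),
    ("bob", "gtk", "/icons/app.png"),
]
counts = aptw(touches)
author_workload = total_by(counts, 0)  # workload per author
type_workload = total_by(counts, 2)    # workload per activity type
```

Summing over different key positions gives the per-author, per-project and per-activity-type views from the same base counts.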
Case Study: GNOME
Workload Metrics
How does workload vary from one GNOME project to another?
How does workload vary from one GNOME contributor to another?
Activity measure used: the number of modifications made to files.
Case Study: GNOME
Workload Metrics
Main findings
Workload is log-normally distributed over GNOME projects
Case Study: GNOME
Workload Metrics
Main findings
The majority of GNOME authors are involved in a very low number of file touches.
[Figure: histogram of the number of authors against log(AW). Annotations: 50% < 14 changes; 185,874 changes; frequent authors; occasional authors]
Case Study: GNOME
Workload Metrics
Main findings
Highest workload is represented by coding activity, followed by activities of development documentation, translation/internationalisation, and build file creation.
[Figure: TW(t)]
Case Study: GNOME
Relative importance of activity types
What are the favourite activity types for GNOME?

Two dual views
- Relative importance of each activity type per author
- Relative importance of each activity type per project
Case Study: GNOME
Relative importance of activity types
What are the favourite activity types for GNOME?

Approach
• Use statistical tests to compare distributions
• Verify if a data set corresponding to an activity type tends to have higher values than a data set corresponding to another activity type
Case Study: GNOME
Relative importance of activity types
Examples of statistical comparison tests
• (Wilcoxon-)Mann–Whitney U test
• Kruskal-Wallis test

Problems with traditional statistical tests:
• Not robust to populations of unequal sizes
• Different tests can be inconsistent with each other
• Pairwise comparison of all activity types requires 78 different combinations (12 * 13 / 2)
• Traditional tests are not transitive
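To make the transitivity problem concrete, the following sketch computes the quantity that Mann-Whitney-style comparisons rest on, the probability that a value from one group exceeds a value from another, for three made-up samples. The sample values and the helper name `prob_greater` are illustrative assumptions, not data from the study. It also counts the unordered pairs among 13 groups, which is what the expression 12 * 13 / 2 corresponds to.

```python
from itertools import combinations, product

def prob_greater(xs, ys):
    """Fraction of pairs (x, y) with x > y; ties count for one half."""
    score = sum(1.0 if x > y else 0.5 if x == y else 0.0
                for x, y in product(xs, ys))
    return score / (len(xs) * len(ys))

# Made-up samples chosen so that the pairwise comparisons form a cycle.
samples = {"A": [2, 4, 9], "B": [1, 6, 8], "C": [3, 5, 7]}

a_over_b = prob_greater(samples["A"], samples["B"])  # A tends to exceed B
b_over_c = prob_greater(samples["B"], samples["C"])  # B tends to exceed C
c_over_a = prob_greater(samples["C"], samples["A"])  # yet C tends to exceed A

# Number of unordered pairs among 13 groups: 13 * 12 / 2.
n_pairs = len(list(combinations(range(13), 2)))
```

All three probabilities are above one half, so "tends to have higher values than" cannot be read as an ordering when each pair is tested separately.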
Case Study: GNOME
Relative importance of activity types
Solution:
• Use a single test that respects transitivity
• T̃ procedure [Konietschke et al 2012]
Case Study: GNOME
Relative importance of activity types
T̃ procedure

Pair   Low    High
B-A   -0.56  -0.44
C-A   -0.50  -0.31
D-A   -0.32  -0.03
C-B   -0.01   0.24
D-B    0.24   0.47
D-C    0.09   0.40

Resulting edges: A→B, A→C, A→D, D→B, D→C
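The edges on this slide can be derived mechanically from the interval table. The sketch below assumes, based only on the values shown, that an interval lying entirely below zero for a pair X-Y yields the edge Y→X, an interval entirely above zero yields X→Y, and an interval containing zero yields no edge. The function name `dominance_edges` is hypothetical.

```python
# Interval bounds from the slide: pair "X-Y" -> (low, high).
intervals = {
    ("B", "A"): (-0.56, -0.44),
    ("C", "A"): (-0.50, -0.31),
    ("D", "A"): (-0.32, -0.03),
    ("C", "B"): (-0.01, 0.24),
    ("D", "B"): (0.24, 0.47),
    ("D", "C"): (0.09, 0.40),
}

def dominance_edges(intervals):
    """Turn intervals into directed edges (higher, lower)."""
    edges = []
    for (x, y), (low, high) in intervals.items():
        if high < 0:
            # Whole interval below zero: y tends to have higher values than x.
            edges.append((y, x))
        elif low > 0:
            # Whole interval above zero: x tends to have higher values than y.
            edges.append((x, y))
        # Interval contains zero: no significant difference, no edge.
    return edges

edges = dominance_edges(intervals)
```

Under this reading, C-B is the only pair without an edge, and the remaining edges are consistent with the ordering A above D above both B and C.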
Case Study: GNOME
Relative importance of activity types
[Figures: relative importance of activity types, by author and by project]
GNOME projects and authors are code-centric
GNOME projects and authors are mainly involved in 4 activity types
Case Study: GNOME
Heterogeneous communities
Does the relative importance of activity types differ between frequent and occasional authors?

Idea
Split the authors in two bins of more or less equal size, based on the author workload:
• about 50% of all authors were involved in < 14 file touches
[Figure: histogram of the number of authors against log(AW), annotated with "50%: < 14 changes"]
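A minimal sketch of such a split, assuming the threshold is taken as the median author workload (the slide only states that about half of the authors fall below 14 file touches). The workload values and the function name `split_authors` are made up for illustration.

```python
from statistics import median

def split_authors(workload):
    """Split authors into occasional and frequent bins around the median workload."""
    threshold = median(workload.values())
    occasional = {a for a, w in workload.items() if w < threshold}
    frequent = set(workload) - occasional
    return threshold, occasional, frequent

# Made-up number of file touches per author.
workload = {"a": 1, "b": 3, "c": 13, "d": 40, "e": 500, "f": 185874}
threshold, occasional, frequent = split_authors(workload)
```

With heavily skewed workloads, a median-based threshold keeps the two bins close to equal in size, which a mean-based threshold would not.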
Case Study: GNOME
Heterogeneous communities
[Figures: relative importance of activity types for occasional authors and for frequent authors]
Frequent authors are mostly coders, occasional authors are mostly translators.
Case Study: GNOME
Heterogeneous communities
Observations
• Coders have a higher workload and are involved in fewer projects
• Translators are less active but are involved in more projects
Can be explained in part by the use of Damned Lies, a Web application used to manage the localisation (l10n) activities of the GNOME project
Case Study: GNOME
Heterogeneous communities
Complicity is a web-based application supporting software ecosystem analysis by means of interactive visualizations.
Affectional bond view:
- size of rectangle = author’s lifetime in days
- color = number of projects
Sylvia Neu et al. “Telling stories about GNOME with Complicity”, VISSOFT 2011
Case Study: GNOME
Heterogeneous communities
Unverified assumptions:

1. Authors contributing a lot to few projects are likely to be developers (D)
2. Authors contributing less often to more projects are likely to be translators (T)
3. Authors tend to have an affectional bond to either development or translation work
Case Study: GNOME
Heterogeneous communities
Our work confirms these assumptions
Potential misclassifications in Neu et al.
Case Study: GNOME
Relative Workload
How strongly do authors focus on specific activities?
Basic measures:
• RATW(a,t) = % of the total workload of author a dedicated to activity type t
• RAWS(a) = author specialisation = Gini index of inequality of RATW(a,t) aggregated over all activity types
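The two measures can be sketched as follows. This assumes the standard Gini coefficient without small-sample correction and a fixed list of 14 activity types with placeholder names. Neither the exact Gini variant nor the type names are given in the slides, and the function names `ratw`, `gini` and `raws` are illustrative.

```python
def ratw(touches_by_type):
    """Share of an author's total workload per activity type (values sum to 1)."""
    total = sum(touches_by_type.values())
    return {t: n / total for t, n in touches_by_type.items()}

def gini(values):
    """Gini coefficient of non-negative values (0 = perfectly even)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the ascending values, ranks starting at 1.
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n

# Placeholder names for 14 activity types (assumption for illustration).
ACTIVITY_TYPES = [f"type{i}" for i in range(14)]

def raws(touches_by_type):
    """Author specialisation: Gini of the author's shares over all activity types."""
    shares = ratw(touches_by_type)
    # Activity types the author never touched count as a share of zero.
    return gini([shares.get(t, 0.0) for t in ACTIVITY_TYPES])

specialist = raws({"type0": 10})                      # one activity type only
generalist = raws({t: 1 for t in ACTIVITY_TYPES})     # evenly spread
```

Including the untouched activity types as zeros matters: without them, an author working on a single type would look perfectly even rather than fully specialised.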
Case Study: GNOME
Relative Workload
How strongly do authors focus?
[Figure annotation: max Gini for n = 14: 0.9285]
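The maximum quoted on this slide is consistent with the standard Gini coefficient without small-sample correction, under the assumption that this is the variant used. A short derivation:

```latex
% Gini coefficient of n non-negative values sorted as x_1 <= ... <= x_n
G = \frac{2 \sum_{i=1}^{n} i \, x_i}{n \sum_{i=1}^{n} x_i} - \frac{n+1}{n}

% All workload on one activity type: x_n = 1 and x_1 = ... = x_{n-1} = 0
G_{\max} = \frac{2n}{n} - \frac{n+1}{n} = \frac{n-1}{n} = 1 - \frac{1}{n}

% n = 14 activity types
G_{\max} = \frac{13}{14} \approx 0.92857
```

The slide's 0.9285 matches this value truncated to four decimals.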
Case Study: GNOME
Relative Workload
How strongly do authors focus?
Occasional authors tend to focus on a single activity type.
Frequent authors tend to focus on few activity types.
Case Study: GNOME
Workload Distribution
Main observations for GNOME ecosystem:
• Workload is unevenly distributed over projects and authors
• Clear distinction between frequent and occasional authors
• Authors form heterogeneous subcommunities (coding versus translation)
• GNOME is code-centric, i.e., most of the workload is in code-related activities (coding, build files, development documentation)
Case Study: GNOME
Next steps
Observation: existing generic support does not take the specificities of the ecosystem into account, making the support suboptimal.

Having gained a better understanding of the GNOME ecosystem specificities, we hope to come up with better change support mechanisms

- Dedicated to specific subcommunities, e.g. Damned Lies application for translation community
- Estimation (of cost or effort) and prediction models (e.g. of defects) could be improved
- Tools should be able to focus on those activities/projects a contributor is interested in (based on his historic activity profile)

More Related Content

Similar to MOD2014-Mens-Lecture4

A Systematic Mapping Study on Analysis of Code Repositories.pdf
A Systematic Mapping Study on Analysis of Code Repositories.pdfA Systematic Mapping Study on Analysis of Code Repositories.pdf
A Systematic Mapping Study on Analysis of Code Repositories.pdfAlicia Edwards
 
Replication and Benchmarking in Software Analytics
Replication and Benchmarking in Software AnalyticsReplication and Benchmarking in Software Analytics
Replication and Benchmarking in Software AnalyticsUniversity of Zurich
 
Syst biol 2012-burguiere-sysbio sys069
Syst biol 2012-burguiere-sysbio sys069Syst biol 2012-burguiere-sysbio sys069
Syst biol 2012-burguiere-sysbio sys069Thomas Burguiere
 
Control source code quality using the SonarQube platform
Control source code quality using the SonarQube platformControl source code quality using the SonarQube platform
Control source code quality using the SonarQube platformPVS-Studio
 
Measuring Technical Lag in Software Deployments (CHAOSScon 2020)
Measuring Technical Lag in Software Deployments (CHAOSScon 2020)Measuring Technical Lag in Software Deployments (CHAOSScon 2020)
Measuring Technical Lag in Software Deployments (CHAOSScon 2020)Tom Mens
 
1) How will managers of a monopolistically competitive firm de.docx
1) How will managers of a monopolistically competitive firm de.docx1) How will managers of a monopolistically competitive firm de.docx
1) How will managers of a monopolistically competitive firm de.docxLynellBull52
 
Agile maintenance
Agile maintenanceAgile maintenance
Agile maintenancearalikatte
 
Empirical research results for the evolution of a data-intensive software sys...
Empirical research results for the evolution of a data-intensive software sys...Empirical research results for the evolution of a data-intensive software sys...
Empirical research results for the evolution of a data-intensive software sys...Tom Mens
 
Towards effective bug triage with software data reduction techniques
Towards effective bug triage with software data reduction techniquesTowards effective bug triage with software data reduction techniques
Towards effective bug triage with software data reduction techniquesredpel dot com
 
Mining Software Repositories
Mining Software RepositoriesMining Software Repositories
Mining Software RepositoriesIsrael Herraiz
 
A web-based plateform for cleaner production and industrial symbiosis
A web-based plateform for cleaner production and industrial symbiosisA web-based plateform for cleaner production and industrial symbiosis
A web-based plateform for cleaner production and industrial symbiosisGuillaume Massard
 
Six Easy Pieces of Quantitatively Analyzing Open Source
Six Easy Pieces of Quantitatively Analyzing Open SourceSix Easy Pieces of Quantitatively Analyzing Open Source
Six Easy Pieces of Quantitatively Analyzing Open SourceDirk Riehle
 
AgriFood Data, Models, Standards, Tools, Use Cases
AgriFood Data, Models, Standards, Tools, Use CasesAgriFood Data, Models, Standards, Tools, Use Cases
AgriFood Data, Models, Standards, Tools, Use CasesRothamsted Research, UK
 
2014-10-10-SBC361-Reproducible research
2014-10-10-SBC361-Reproducible research2014-10-10-SBC361-Reproducible research
2014-10-10-SBC361-Reproducible researchYannick Wurm
 
Towards Effective Bug Triage with Software Data Reduction Techniques
Towards Effective Bug Triage with Software Data Reduction TechniquesTowards Effective Bug Triage with Software Data Reduction Techniques
Towards Effective Bug Triage with Software Data Reduction Techniques1crore projects
 
Software Analytics: Data Analytics for Software Engineering
Software Analytics: Data Analytics for Software EngineeringSoftware Analytics: Data Analytics for Software Engineering
Software Analytics: Data Analytics for Software EngineeringTao Xie
 
Runtime Behavior of JavaScript Programs
Runtime Behavior of JavaScript ProgramsRuntime Behavior of JavaScript Programs
Runtime Behavior of JavaScript ProgramsIRJET Journal
 

Similar to MOD2014-Mens-Lecture4 (20)

A Systematic Mapping Study on Analysis of Code Repositories.pdf
A Systematic Mapping Study on Analysis of Code Repositories.pdfA Systematic Mapping Study on Analysis of Code Repositories.pdf
A Systematic Mapping Study on Analysis of Code Repositories.pdf
 
Msr2021 tutorial-di penta
Msr2021 tutorial-di pentaMsr2021 tutorial-di penta
Msr2021 tutorial-di penta
 
Replication and Benchmarking in Software Analytics
Replication and Benchmarking in Software AnalyticsReplication and Benchmarking in Software Analytics
Replication and Benchmarking in Software Analytics
 
Syst biol 2012-burguiere-sysbio sys069
Syst biol 2012-burguiere-sysbio sys069Syst biol 2012-burguiere-sysbio sys069
Syst biol 2012-burguiere-sysbio sys069
 
Control source code quality using the SonarQube platform
Control source code quality using the SonarQube platformControl source code quality using the SonarQube platform
Control source code quality using the SonarQube platform
 
Measuring Technical Lag in Software Deployments (CHAOSScon 2020)
Measuring Technical Lag in Software Deployments (CHAOSScon 2020)Measuring Technical Lag in Software Deployments (CHAOSScon 2020)
Measuring Technical Lag in Software Deployments (CHAOSScon 2020)
 
1) How will managers of a monopolistically competitive firm de.docx
1) How will managers of a monopolistically competitive firm de.docx1) How will managers of a monopolistically competitive firm de.docx
1) How will managers of a monopolistically competitive firm de.docx
 
Agile maintenance
Agile maintenanceAgile maintenance
Agile maintenance
 
Empirical research results for the evolution of a data-intensive software sys...
Empirical research results for the evolution of a data-intensive software sys...Empirical research results for the evolution of a data-intensive software sys...
Empirical research results for the evolution of a data-intensive software sys...
 
Climbing the tree of unreachable fruits, reusing processes
Climbing the tree of unreachable fruits, reusing processesClimbing the tree of unreachable fruits, reusing processes
Climbing the tree of unreachable fruits, reusing processes
 
Towards effective bug triage with software data reduction techniques
Towards effective bug triage with software data reduction techniquesTowards effective bug triage with software data reduction techniques
Towards effective bug triage with software data reduction techniques
 
Mining Software Repositories
Mining Software RepositoriesMining Software Repositories
Mining Software Repositories
 
A web-based plateform for cleaner production and industrial symbiosis
A web-based plateform for cleaner production and industrial symbiosisA web-based plateform for cleaner production and industrial symbiosis
A web-based plateform for cleaner production and industrial symbiosis
 
Six Easy Pieces of Quantitatively Analyzing Open Source
Six Easy Pieces of Quantitatively Analyzing Open SourceSix Easy Pieces of Quantitatively Analyzing Open Source
Six Easy Pieces of Quantitatively Analyzing Open Source
 
AgriFood Data, Models, Standards, Tools, Use Cases
AgriFood Data, Models, Standards, Tools, Use CasesAgriFood Data, Models, Standards, Tools, Use Cases
AgriFood Data, Models, Standards, Tools, Use Cases
 
2014-10-10-SBC361-Reproducible research
2014-10-10-SBC361-Reproducible research2014-10-10-SBC361-Reproducible research
2014-10-10-SBC361-Reproducible research
 
Practices and Tools for Better Software Testing
Practices and Tools for  Better Software TestingPractices and Tools for  Better Software Testing
Practices and Tools for Better Software Testing
 
Towards Effective Bug Triage with Software Data Reduction Techniques
Towards Effective Bug Triage with Software Data Reduction TechniquesTowards Effective Bug Triage with Software Data Reduction Techniques
Towards Effective Bug Triage with Software Data Reduction Techniques
 
Software Analytics: Data Analytics for Software Engineering
Software Analytics: Data Analytics for Software EngineeringSoftware Analytics: Data Analytics for Software Engineering
Software Analytics: Data Analytics for Software Engineering
 
Runtime Behavior of JavaScript Programs
Runtime Behavior of JavaScript ProgramsRuntime Behavior of JavaScript Programs
Runtime Behavior of JavaScript Programs
 

More from Tom Mens

How to be(come) a successful PhD student
How to be(come) a successful PhD studentHow to be(come) a successful PhD student
How to be(come) a successful PhD studentTom Mens
 
Recognising bot activity in collaborative software development
Recognising bot activity in collaborative software developmentRecognising bot activity in collaborative software development
Recognising bot activity in collaborative software developmentTom Mens
 
A Dataset of Bot and Human Activities in GitHub
A Dataset of Bot and Human Activities in GitHubA Dataset of Bot and Human Activities in GitHub
A Dataset of Bot and Human Activities in GitHubTom Mens
 
The (r)evolution of CI/CD on GitHub
 The (r)evolution of CI/CD on GitHub The (r)evolution of CI/CD on GitHub
The (r)evolution of CI/CD on GitHubTom Mens
 
Nurturing the Software Ecosystems of the Future
Nurturing the Software Ecosystems of the FutureNurturing the Software Ecosystems of the Future
Nurturing the Software Ecosystems of the FutureTom Mens
 
Comment programmer un robot en 30 minutes?
Comment programmer un robot en 30 minutes?Comment programmer un robot en 30 minutes?
Comment programmer un robot en 30 minutes?Tom Mens
 
On the rise and fall of CI services in GitHub
On the rise and fall of CI services in GitHubOn the rise and fall of CI services in GitHub
On the rise and fall of CI services in GitHubTom Mens
 
On backporting practices in package dependency networks
On backporting practices in package dependency networksOn backporting practices in package dependency networks
On backporting practices in package dependency networksTom Mens
 
Comparing semantic versioning practices in Cargo, npm, Packagist and Rubygems
Comparing semantic versioning practices in Cargo, npm, Packagist and RubygemsComparing semantic versioning practices in Cargo, npm, Packagist and Rubygems
Comparing semantic versioning practices in Cargo, npm, Packagist and RubygemsTom Mens
 
Lost in Zero Space
Lost in Zero SpaceLost in Zero Space
Lost in Zero SpaceTom Mens
 
Evaluating a bot detection model on git commit messages
Evaluating a bot detection model on git commit messagesEvaluating a bot detection model on git commit messages
Evaluating a bot detection model on git commit messagesTom Mens
 
Is my software ecosystem healthy? It depends!
Is my software ecosystem healthy? It depends!Is my software ecosystem healthy? It depends!
Is my software ecosystem healthy? It depends!Tom Mens
 
Bot or not? Detecting bots in GitHub pull request activity based on comment s...
Bot or not? Detecting bots in GitHub pull request activity based on comment s...Bot or not? Detecting bots in GitHub pull request activity based on comment s...
Bot or not? Detecting bots in GitHub pull request activity based on comment s...Tom Mens
 
On the fragility of open source software packaging ecosystems
On the fragility of open source software packaging ecosystemsOn the fragility of open source software packaging ecosystems
On the fragility of open source software packaging ecosystemsTom Mens
 
How magic is zero? An Empirical Analysis of Initial Development Releases in S...
How magic is zero? An Empirical Analysis of Initial Development Releases in S...How magic is zero? An Empirical Analysis of Initial Development Releases in S...
How magic is zero? An Empirical Analysis of Initial Development Releases in S...Tom Mens
 
Comparing dependency issues across software package distributions (FOSDEM 2020)
Comparing dependency issues across software package distributions (FOSDEM 2020)Comparing dependency issues across software package distributions (FOSDEM 2020)
Comparing dependency issues across software package distributions (FOSDEM 2020)Tom Mens
 
SecoHealth 2019 Research Achievements
SecoHealth 2019 Research AchievementsSecoHealth 2019 Research Achievements
SecoHealth 2019 Research AchievementsTom Mens
 
SECO-Assist 2019 research seminar
SECO-Assist 2019 research seminarSECO-Assist 2019 research seminar
SECO-Assist 2019 research seminarTom Mens
 
Empirically Analysing the Socio-Technical Health of Software Package Managers
Empirically Analysing the Socio-Technical Health of Software Package ManagersEmpirically Analysing the Socio-Technical Health of Software Package Managers
Empirically Analysing the Socio-Technical Health of Software Package ManagersTom Mens
 
ConPan: Analysing Packages Installed in Docker Containers
ConPan: Analysing Packages Installed in Docker ContainersConPan: Analysing Packages Installed in Docker Containers
ConPan: Analysing Packages Installed in Docker ContainersTom Mens
 

More from Tom Mens (20)

How to be(come) a successful PhD student
How to be(come) a successful PhD studentHow to be(come) a successful PhD student
How to be(come) a successful PhD student
 
Recognising bot activity in collaborative software development
Recognising bot activity in collaborative software developmentRecognising bot activity in collaborative software development
Recognising bot activity in collaborative software development
 
A Dataset of Bot and Human Activities in GitHub
A Dataset of Bot and Human Activities in GitHubA Dataset of Bot and Human Activities in GitHub
A Dataset of Bot and Human Activities in GitHub
 
The (r)evolution of CI/CD on GitHub
 The (r)evolution of CI/CD on GitHub The (r)evolution of CI/CD on GitHub
The (r)evolution of CI/CD on GitHub
 
Nurturing the Software Ecosystems of the Future
Nurturing the Software Ecosystems of the FutureNurturing the Software Ecosystems of the Future
Nurturing the Software Ecosystems of the Future
 
Comment programmer un robot en 30 minutes?
Comment programmer un robot en 30 minutes?Comment programmer un robot en 30 minutes?
Comment programmer un robot en 30 minutes?
 
On the rise and fall of CI services in GitHub
On the rise and fall of CI services in GitHubOn the rise and fall of CI services in GitHub
On the rise and fall of CI services in GitHub
 
On backporting practices in package dependency networks
On backporting practices in package dependency networksOn backporting practices in package dependency networks
On backporting practices in package dependency networks
 
Comparing semantic versioning practices in Cargo, npm, Packagist and Rubygems
Comparing semantic versioning practices in Cargo, npm, Packagist and RubygemsComparing semantic versioning practices in Cargo, npm, Packagist and Rubygems
Comparing semantic versioning practices in Cargo, npm, Packagist and Rubygems
 
Lost in Zero Space
Lost in Zero SpaceLost in Zero Space
Lost in Zero Space
 
Evaluating a bot detection model on git commit messages
Evaluating a bot detection model on git commit messagesEvaluating a bot detection model on git commit messages
Evaluating a bot detection model on git commit messages
 
Is my software ecosystem healthy? It depends!
Is my software ecosystem healthy? It depends!Is my software ecosystem healthy? It depends!
Is my software ecosystem healthy? It depends!
 
Bot or not? Detecting bots in GitHub pull request activity based on comment s...
Bot or not? Detecting bots in GitHub pull request activity based on comment s...Bot or not? Detecting bots in GitHub pull request activity based on comment s...
Bot or not? Detecting bots in GitHub pull request activity based on comment s...
 
On the fragility of open source software packaging ecosystems
On the fragility of open source software packaging ecosystemsOn the fragility of open source software packaging ecosystems
On the fragility of open source software packaging ecosystems
 
How magic is zero? An Empirical Analysis of Initial Development Releases in S...
How magic is zero? An Empirical Analysis of Initial Development Releases in S...How magic is zero? An Empirical Analysis of Initial Development Releases in S...
How magic is zero? An Empirical Analysis of Initial Development Releases in S...
 
Comparing dependency issues across software package distributions (FOSDEM 2020)
Comparing dependency issues across software package distributions (FOSDEM 2020)Comparing dependency issues across software package distributions (FOSDEM 2020)
Comparing dependency issues across software package distributions (FOSDEM 2020)
 
SecoHealth 2019 Research Achievements
SecoHealth 2019 Research AchievementsSecoHealth 2019 Research Achievements
SecoHealth 2019 Research Achievements
 
SECO-Assist 2019 research seminar
SECO-Assist 2019 research seminarSECO-Assist 2019 research seminar
SECO-Assist 2019 research seminar
 
Empirically Analysing the Socio-Technical Health of Software Package Managers
Empirically Analysing the Socio-Technical Health of Software Package ManagersEmpirically Analysing the Socio-Technical Health of Software Package Managers
Empirically Analysing the Socio-Technical Health of Software Package Managers
 
ConPan: Analysing Packages Installed in Docker Containers
ConPan: Analysing Packages Installed in Docker ContainersConPan: Analysing Packages Installed in Docker Containers
ConPan: Analysing Packages Installed in Docker Containers
 

Recently uploaded

social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajanpragatimahajan3
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Disha Kariya
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingTechSoup
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...Pooja Nehwal
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104misteraugie
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
MOD2014-Mens-Lecture4