Social and Technical Evolution
of Software Ecosystems
A Case Study of Rails
Eleni Constantinou, Tom Mens
4th International Workshop on Software
Ecosystem Architectures (WEA 2016)
Research
Team
Introduction
Software ecosystem
• Collection of software projects that are developed and evolve together
in the same environment [1]
Ecosystem environment
• Development team ⇒ Social aspect
• Source code artefacts ⇒ Technical aspect
Modifications
• Social: Contributors joining/leaving
• Technical: New/obsolete source code files
[1] M. Lungu. Towards reverse engineering software ecosystems. Int'l Conf. Software Maintenance, pages 428-431, 2008.
Introduction
Evolution
• Longevity
• Growth
Ecosystem sustainability
Negative impact of major social changes
A sustainable software ecosystem can increase or maintain its user/developer community over longer periods of time and can survive inherent changes, such as new technologies or new products (e.g. from competitors), that can change the population (the community of users, developers, etc.) [2]
[2] D. Dhungana, I. Groher, E. Schludermann, S. Biffl. Software ecosystems vs. natural ecosystems: learning from the ingenious mind of nature. Eur. Conf. on Software Architecture: Companion Volume, pages 96-102, 2010.
Background
[Figure: Software Ecosystem Evolution. A timeline from START to END divided into Time Units 1, 2, 3, …, N-2, N-1, N, with Technical Artefacts shown along the timeline.]
Definitions
Social Metrics
Leavers(t)
Joiners(t)
Stayers(t)
TeamTurnover(t)
TeamAbandonment(t)
Technical Metrics
Obsolete(t)
New(t)
Maintained(t)
FileTurnover(t)
FileAbandonment(t)
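The slide names the metrics but does not give their formulas. The sketch below shows one plausible reading, assuming each metric compares the set of members active in time unit t-1 with the set active in t. The function name `evolution_metrics` and the two ratio definitions (joiners over the current set, leavers over the previous set) are assumptions for illustration, not definitions taken from the slides. The same function can be applied to files, where the three sets would correspond to Obsolete(t), New(t), and Maintained(t).

```python
from typing import Dict, Set


def evolution_metrics(prev: Set[str], curr: Set[str]) -> Dict[str, object]:
    """Compare the members active in two consecutive time units.

    prev: members (contributors or files) active in time unit t-1
    curr: members active in time unit t
    """
    leavers = prev - curr   # active before, not active now
    joiners = curr - prev   # active now, not active before
    stayers = prev & curr   # active in both units
    # Assumed ratios: share of newcomers in the current set, and
    # share of the previous set that is no longer active.
    turnover = len(joiners) / len(curr) if curr else 0.0
    abandonment = len(leavers) / len(prev) if prev else 0.0
    return {
        "leavers": leavers,
        "joiners": joiners,
        "stayers": stayers,
        "turnover": turnover,
        "abandonment": abandonment,
    }


# Hypothetical team in two consecutive quarters.
metrics = evolution_metrics({"ann", "bob", "cy", "dee"}, {"bob", "cy", "eve"})
print(sorted(metrics["leavers"]), sorted(metrics["joiners"]))
```

Using plain sets keeps the social and technical metrics symmetric: only the meaning of the members changes, not the computation.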
Dataset
• Ruby on Rails
• Largest/most popular Ruby project
• GHTorrent dataset [2] (2016-09-05 dump)
• Timespan: April 2008 – September 2016
• Time unit: year quarters
• Commit activity
• Base project/Forks/Ecosystem
[2] G. Gousios. The GHTorrent dataset and tool suite. Working Conf. Mining Software Repositories, pages 233-236, 2013.
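A small sketch of how a commit date could be mapped to a quarter index, assuming quarters are counted from April 2008 with a zero-based index. The zero-based numbering and the function name `quarter_index` are assumptions; the slides do not state the numbering scheme. Under this assumption, July 2011 falls in quarter 13 and October 2012 in quarter 18, which agrees with the dates the slides attach to those quarters.

```python
from datetime import date

# Start of the observation period stated on the slide: April 2008.
START_YEAR, START_MONTH = 2008, 4


def quarter_index(d: date) -> int:
    """Return the zero-based quarter index of a date, counted from April 2008."""
    months = (d.year - START_YEAR) * 12 + (d.month - START_MONTH)
    if months < 0:
        raise ValueError("date precedes the start of the observation period")
    return months // 3


print(quarter_index(date(2011, 7, 1)))   # 13
print(quarter_index(date(2012, 10, 1)))  # 18
```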
Dataset Problems - Noise
• Forks can be simple copies of the base project
• Non source code files or irrelevant files can be committed (e.g.,
temporary files)
• One-time and occasional contributors
Dataset Filters
1. Forks
Filter: Merged back to the base
2. Files
Filter: Source code files
3. Contributors
Filter: Contributors whose average activity is at least 2 quarters

Before filtering:
               Base     Forks   Ecosystem
Count             1     1,896       1,897
Contributors  1,827     2,154       3,121
Commits      43,195    25,938      69,133

After filtering:
               Base     Forks   Ecosystem
Count             1       692         693
Contributors    430       681         765
Commits      40,660    22,923      63,583
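The three filters can be sketched as a small pipeline over commit records. This is an illustration under stated assumptions, not the authors' implementation: the `Commit` record, the `merged_to_base` flag, the `SOURCE_EXTENSIONS` list, and the reading of the contributor filter as "active in at least 2 distinct quarters" are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, replace
from typing import List, Tuple

BASE = "base"                              # hypothetical name of the base repository
SOURCE_EXTENSIONS = (".rb", ".erb", ".js")  # hypothetical list of source file types


@dataclass(frozen=True)
class Commit:
    author: str
    repo: str              # BASE or a fork identifier
    quarter: int           # time unit of the commit
    files: Tuple[str, ...]
    merged_to_base: bool   # hypothetical flag: commit was merged back to the base


def apply_filters(commits: List[Commit], min_quarters: int = 2) -> List[Commit]:
    # 1. Forks: keep the base and forks with at least one commit merged back.
    merged_forks = {c.repo for c in commits if c.merged_to_base}
    kept = [c for c in commits if c.repo == BASE or c.repo in merged_forks]

    # 2. Files: keep only source code files; drop commits left with none.
    cleaned = []
    for c in kept:
        source_files = tuple(f for f in c.files if f.endswith(SOURCE_EXTENSIONS))
        if source_files:
            cleaned.append(replace(c, files=source_files))

    # 3. Contributors: keep authors active in at least min_quarters quarters.
    active_quarters = defaultdict(set)
    for c in cleaned:
        active_quarters[c.author].add(c.quarter)
    return [c for c in cleaned if len(active_quarters[c.author]) >= min_quarters]
```

The filters are applied in the order listed on the slide, so a contributor's activity is counted only over commits that survive the fork and file filters.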
Research Questions
RQ1 How does the commit activity of the ecosystem
(in base and forks) evolve over time?
RQ2 How do the development population and file activity
change over time?
RQ3 How do changes in the development team affect the file
activity of the ecosystem?
RQ1 How does the commit activity of the ecosystem
(in base and forks) evolve over time?
• Forks: increasing commit activity from quarter 13 (July 2011) onward
• Development effort depends heavily on forks after quarter 18 (October 2012)
RQ2 How do the development population and file
activity change over time?
[Figure panels: Base Project, Forks, Ecosystem]
Core contributors: Small number of people join/leave the
ecosystem
RQ2 How do the development population and file
activity change over time?
[Figure panels: Base Project, Forks, Ecosystem]
Forks: Increasing trend
Low number of obsolete files
RQ2 How do the development population and
file activity change over time?
Metric            Percentage (%)
TeamTurnover      25 ± 12
TeamAbandonment   14 ± 10
FileTurnover      15 ± 11
FileAbandonment   10 ± 7
Moderate social and technical modifications
Ecosystem growth
RQ3 How do changes in the development team affect the
file activity of the ecosystem?
25% of obsolete files were
maintained by Leavers
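A figure of this kind could be computed as sketched below. The sketch assumes a file is obsolete in a time unit if it was touched in the previous unit but not in the current one, and counts it as "maintained by Leavers" if at least one contributor who touched it in the previous unit is no longer active. Both readings, and the function name, are assumptions; the slides do not spell out the exact rule.

```python
from typing import Dict, Set


def share_of_obsolete_files_from_leavers(
    prev_touches: Dict[str, Set[str]],
    curr_touches: Dict[str, Set[str]],
) -> float:
    """Fraction of obsolete files that were touched by a leaver.

    Each argument maps a file path to the set of contributors who
    committed to that file in the given time unit.
    """
    prev_team = set().union(*prev_touches.values())
    curr_team = set().union(*curr_touches.values())
    leavers = prev_team - curr_team
    obsolete = set(prev_touches) - set(curr_touches)
    if not obsolete:
        return 0.0
    by_leavers = {f for f in obsolete if prev_touches[f] & leavers}
    return len(by_leavers) / len(obsolete)


# Hypothetical activity in two consecutive quarters.
prev = {"a.rb": {"ann"}, "b.rb": {"bob"}, "c.rb": {"ann", "cy"}, "d.rb": {"bob"}}
curr = {"b.rb": {"bob"}, "e.rb": {"cy"}}
print(share_of_obsolete_files_from_leavers(prev, curr))
```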
Findings
• Intensive use of the fork and push mechanisms of GitHub from
quarter 13 (July 2011)
• Both the development team and files showed a roughly linearly
increasing trend
• Moderate impact of Leavers on the technical part of the
ecosystem
Findings
Do Leavers engage in other ecosystems?
Ecosystem    Active in Ruby   Abandoned Ruby   Percentage
JavaScript           18,038           13,814          77%
Python               10,211            8,131          79%
Java                  7,363            5,132          70%
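The Percentage column can be checked with a short calculation, assuming it is the "Abandoned Ruby" count divided by the "Active in Ruby" count for each ecosystem. That relationship is an assumption inferred from the table; the counts themselves are copied from the slide.

```python
# Counts copied from the slide.
active_in_ruby = {"JavaScript": 18038, "Python": 10211, "Java": 7363}
abandoned_ruby = {"JavaScript": 13814, "Python": 8131, "Java": 5132}


def abandonment_share(active: dict, abandoned: dict) -> dict:
    """Percentage of contributors active in Ruby who abandoned it, per ecosystem."""
    return {eco: 100 * abandoned[eco] / active[eco] for eco in active}


shares = abandonment_share(active_in_ruby, abandoned_ruby)
for eco, pct in shares.items():
    print(f"{eco}: {pct:.1f}%")
# JavaScript: 76.6%
# Python: 79.6%
# Java: 69.7%
```

The computed values are close to the percentages on the slide. The Python value would round to 80% rather than 79%, so the slide may truncate rather than round, or may rely on slightly different counts.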
Threats to validity
• Multiple user accounts
• Less common within the same GitHub
repository
• Identity merging [3]
• Rails project
• Large/significant Ruby project
• Entire Ruby ecosystem
• Effort measurement
• Commit squashing
• LOC
[3] M. Goeminne, T. Mens. A comparison of identity merge algorithms for software repositories. Science of Computer Programming, vol. 78, no. 8, pages 971-986, 2013.
Conclusion
• Case study of the Rails evolution in GitHub
• Magnitude and effect of socio-technical changes
• Moderate impact of modifications on the ecosystem
• Sustainable ecosystem
• Socio-technical growth
• Longevity
Ongoing/Future Work
• Ruby ecosystem in GitHub (>60K projects)
• Leavers' knowledge and specialization (relative entropy)
• Ecosystem migration (Ruby -> JavaScript)
• Practices eliminating the effect of occasional contributors
Thank you!
