Evolving Software Ecosystems: Health and beyond

Software evolves over time, and challenges arise concerning both the technical artefacts produced during development and the developer community that maintains them. These challenges become even more prominent in software ecosystems (SECOs): large collections of interdependent software packages or projects that share a common technological platform and are maintained by large online communities of contributors. SECOs change at an ever-increasing pace and therefore face health and longevity issues. In this talk, we present our current research on SECO evolution and health, covering both the technical and social aspects of SECOs. On the one hand, we present our work on package dependency issues in SECOs throughout their evolution. On the other hand, we present a socio-technical analysis of SECOs, studying aspects such as contributor abandonment. We conclude by presenting our future research agenda.

  1. 1. Evolving software ecosystems: Health and beyond. Eleni Constantinou, Tom Mens. University of Mons, Belgium
  2. 2. SOCIO-TECHNICAL: A Software Ecosystem is X
  3. 3. Software ecosystem research 2012-2017 2017-2019 2018-2021
  4. 4. SECOHealth: Inter-disciplinary, inter-university research project. Towards an interdisciplinary, socio-technical methodology and analysis of the health of software ecosystems (www.secohealth.org)
  5. 5. SECO-Assist: Automated Assistance for Developing Software in Ecosystems of the Future (secoassist.github.io). Inter-university research project: Tom Mens (University of Mons), Anthony Cleve (Université de Namur), Coen De Roover (Vrije Universiteit Brussel), Serge Demeyer (University of Antwerp)
  6. 6. SECO health Sustainability Longevity Growth Success Resilience Survival Diversity Popularity
  7. 7. SECO health Sustainability Longevity Growth Success Resilience Survival Diversity Popularity
  8. 8. Health Problems. Technical: • Outdated dependencies • Security vulnerabilities • Bugs • Duplicated code • Incompatible licenses • … Social: • Abandonment of contributors • Lack of communication / interaction • Social conflicts • Insufficient diversity • …
  9. 9. Technical SECO evolution
  10. 10. Evolution of package dependency networks A Decan, T Mens (2018) An Empirical Comparison of Dependency Network Evolution in Seven Software Packaging Ecosystems. Empirical Software Engineering Seven package dependency networks extracted using open source discovery service http://libraries.io (CC BY-SA 4.0) 830K packages – 5.8M package versions – 20.5M dependencies
  11. 11. Package changes are frequent. Findings: • The number of package updates grows over time. • More than 50% of package releases are updated within 2 months. • Required and young packages are updated more frequently. Changeability index: the maximal value n such that n packages have each been updated at least n times during the last month. CRAN differs due to its rolling release policy: “Submitting updates should be done responsibly and with respect for the volunteers’ time. Once a package is established, ‘no more than every 1–2 months’ seems appropriate.”
  12. 12. Package changes are frequent Package updates may cause many maintainability issues or even failures in dependent packages. "Especially with respect to package dependencies, the risk of things breaking at some point due to the fact that a version of a dependency has changed without you knowing about it is immense. That actually cost us weeks and months in a couple of professional projects I was part of."
  13. 13. Most packages depend on other packages. Findings: • 60% to 80% of all packages are connected. • A stable minority (20%) of required packages collect over 80% of all reverse dependencies. • The number of npm dependencies grows much faster. Reusability index: the maximal value n such that n required packages each have at least n dependent packages.
  14. 14. Package changes may have important impact. March 2016: unexpected removal of left-pad caused > 2% of all packages to break (> 5,400 packages). November 2010: release 0.5.0 of i18n broke dependent package ActiveRecord. Transitively required by >5% of all packages.
  15. 15. Example: left-pad
  16. 16. Ecosystem complexity: most of the complexity is deeply hidden in the transitive dependencies. [Figure: proportion of top-level packages by depth of dependency tree] Over 50% of top-level packages have a deep dependency tree.
  17. 17. Package changes may have important impact. [Figure: evolution of the 5-Impact Index] Findings: • Dependent packages have few direct but many transitive dependencies. • The ratio of indirect over direct dependencies increases over time. P-Impact Index: the number of packages that are transitively required by at least P% of all packages.
  18. 18. Socio-technical SECO evolution
  19. 19. SECO evolution Empirical investigation of software ecosystems • Social changes • Technical impact of social changes
  20. 20. SECO impact
  21. 21. SECO health
  22. 22. SECO repositories
  23. 23. SECO repositories
  24. 24. SECO repositories
  25. 25. SECO repositories
  26. 26. SECO repositories
  27. 27. Evolution of package dependency networks. E Constantinou, T Mens (2017) Socio-Technical Evolution of the Ruby Ecosystem in GitHub. SANER 2017. 26K packages/projects, 69K forks, 76K contributors, 5M commits
  28. 28. SECO health – Social Growth
  29. 29. SECO health – Technical Growth [Figures: number of projects per year, 2008 to 2016 (obsolete, new and active projects); specialization per year, 2008 to 2015]
  30. 30. SECO health: Major social changes can strongly affect ecosystem evolution. Monitoring these changes can help in identifying such issues early.
  31. 31. SECO health – Survival
  32. 32. Evolution of package dependency networks. E Constantinou, T Mens (2017) An Empirical Comparison of Developer Retention in the RubyGems and npm Software Ecosystems. Innovations in Systems and Software Engineering. Two datasets: 70K packages/projects, 32K contributors, 3M commits, 1.5M messages; 179K packages/projects, 64K contributors, 8M commits, 4M messages
  33. 33. SECO health – Survival Socio-technical activity • Intensity • Frequency • Inactivity length Survival analysis
  34. 34. SECO health – Developer survival
  35. 35. SECO health – Developer survival. Population: all developers in an ecosystem. Event: abandonment of a developer. Developers tend to abandon the ecosystem sooner if they: do not communicate; communicate less intensively; communicate less frequently; do not communicate for a longer period. [Figures: survival probability vs. duration of commit activity (months) for npm and RubyGems, grouped by Social inactivity / Social activity / Social abandoner and by Very Short / Short / Long / Very Long]
  36. 36. SECO health – Developer survival. Developers tend to abandon the ecosystem sooner if they: commit less intensively; commit less frequently; do not commit for longer periods. [Figures: survival probability vs. duration of commit activity (months) for npm and RubyGems, grouped by Very Weak / Weak / Strong / Very Strong and by Very Short / Short / Long / Very Long]
  37. 37. SECO health – Package survival
  38. 38. SECO health – Package survival. Population: all packages in an ecosystem. Event: commit inactivity of a package. Packages tend to become inactive sooner if the developers contributing to these packages: do not communicate; communicate less intensively; communicate less frequently; do not communicate for a longer period.
  39. 39. SECO health – Package survival. Packages tend to become inactive sooner if the developers contributing to these packages: commit less intensively; commit less frequently; do not commit for longer periods.
  40. 40. SECO health – Survival Intense and frequent commit activity is not enough … Intense and frequent messaging activity is also necessary
  41. 41. Current work – Identity merging
  42. 42. Current work – Identity matching GitHub git Mailing list Gerrit BugZilla IRC
  43. 43. Current work – Forecasting inactivity
  44. 44. What next? Technical: • Outdated dependencies • Security vulnerabilities • Bugs • Duplicated code • Incompatible licenses • … Social: • Abandonment of contributors • Lack of communication / interaction • Social conflicts • Insufficient diversity • …
  45. 45. @eleni_const @tom_mens
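
The index definitions on slides 11, 13 and 17 can be illustrated with a short Python sketch. This is not the authors' implementation: the function names and the toy data are hypothetical, and the slides do not say whether the reusability index counts direct or transitive dependents, so the first function simply takes per-package counts as input. The changeability and reusability indices share the same h-index-like form; the P-Impact Index is computed here by walking the dependency graph.

```python
from collections import defaultdict


def h_style_index(counts):
    """Largest n such that at least n items have a count of at least n.

    Feed it updates-per-package for a changeability-style index, or
    dependents-per-package for a reusability-style index.
    """
    n = 0
    for rank, count in enumerate(sorted(counts, reverse=True), start=1):
        if count < rank:
            break
        n = rank
    return n


def transitive_dependents(direct_deps):
    """Map each package to the set of packages that require it, directly or not."""
    dependents = defaultdict(set)
    for package in direct_deps:
        seen = set()
        stack = list(direct_deps.get(package, ()))
        while stack:  # iterative depth-first walk over the dependencies
            dep = stack.pop()
            if dep in seen:
                continue
            seen.add(dep)
            stack.extend(direct_deps.get(dep, ()))
        seen.discard(package)  # a dependency cycle does not make a package require itself
        for dep in seen:
            dependents[dep].add(package)
    return dependents


def p_impact_index(direct_deps, p):
    """Number of packages transitively required by at least p% of all packages."""
    all_packages = set(direct_deps)
    for deps in direct_deps.values():
        all_packages.update(deps)
    dependents = transitive_dependents(direct_deps)
    total = len(all_packages)
    # Integer comparison avoids floating-point rounding at the threshold.
    return sum(
        1 for pkg in all_packages
        if len(dependents.get(pkg, ())) * 100 >= p * total
    )


# Hypothetical toy ecosystem: package -> direct dependencies.
toy_deps = {
    "app1": ["pad", "util"],
    "app2": ["util"],
    "util": ["core"],
    "core": [],
    "pad": [],
}

print(h_style_index([5, 3, 3, 1]))   # 3: three packages with at least 3 updates
print(p_impact_index(toy_deps, 40))  # 2: "util" and "core"
print(p_impact_index(toy_deps, 60))  # 1: only "core"
```

On real package networks of the size reported on slide 10, a per-package graph walk like this would be slow; it is meant only to make the definitions concrete.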

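The survival analysis on slides 33 to 39 defines a population (developers or packages) and an event (abandonment or commit inactivity), and plots survival probability against duration of commit activity. The slides do not name the estimator used; the sketch below assumes a Kaplan-Meier estimator, a standard choice for such curves, and uses hypothetical data. An observation with `observed=False` is treated as censored, meaning the developer was still active when observation ended.

```python
def kaplan_meier(durations, observed):
    """Return [(t, S(t))] for each time t at which at least one event occurred.

    durations: time until the event or until the end of observation (e.g. months)
    observed:  True if the event (e.g. abandonment) happened, False if censored
    """
    pairs = sorted(zip(durations, observed))
    at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        events = 0
        leaving = 0
        # Group all subjects that share the same duration t.
        while i < len(pairs) and pairs[i][0] == t:
            events += int(pairs[i][1])
            leaving += 1
            i += 1
        if events:
            # Kaplan-Meier product step: multiply by the share surviving past t.
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # both events and censored subjects leave the risk set
    return curve


# Hypothetical developers: months of commit activity, and whether they abandoned.
months = [3, 5, 5, 8, 12]
abandoned = [True, True, False, True, False]

for t, s in kaplan_meier(months, abandoned):
    print(f"after {t} months: survival probability {s:.2f}")
```

Comparing groups, as the slides do for socially active and inactive developers, would mean running the estimator once per group and comparing the resulting curves.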