Towards Laws of Software Ecosystem Evolution: An Empirical Comparison of Seven Software Packaging Ecosystems

Presentation by Tom Mens of joint work with Alexandre Decan (University of Mons) at the SATTOSE 2017 research seminar in Madrid (7 June 2017).
Abstract: We carry out a quantitative empirical comparison of the macro-level evolution of software packaging ecosystems for a multitude of different programming languages. We report on the most important observed differences and commonalities in the evolution of their package dependency networks. We hypothesise that the observed commonalities emerge due to the ecosystem scale and complexity. Inspired by Lehman’s laws of software evolution, we seek evidence for a series of empirically observable “laws of software ecosystem evolution”.

Published in: Science

  1. 1. Towards Laws of Software Ecosystem Evolution: An Empirical Comparison of Seven Package Dependency Networks (NuGet, npm, Cargo, CRAN, CPAN, Packagist, RubyGems)
  2. 2. Towards Laws of Software Ecosystem Evolution: An Empirical Comparison of Seven Package Dependency Networks. Tom Mens and Alexandre Decan, COMPLEXYS Research Institute, University of Mons, Belgium
  3. 3. Software Ecosystems Large and coherent collections of software components that are maintained by large and geographically distributed online communities.
  4. 4. Software Packaging Ecosystems Collections of software packages distributed by package managers
  5. 5. Software Packaging Ecosystems Collections of software packages distributed by package managers
  6. 6. Package Dependency Networks. Extracted using the open source discovery service http://libraries.io (CC BY-SA 4.0)

      Name       Created  Language    Packages  Dependencies
      Cargo      2014     Rust        9k        150k
      CPAN       1995     Perl        34k       1,078k
      CRAN       1997     R           12k       164k
      npm        2010     JavaScript  462k      1,369k
      NuGet      2010     .NET        84k       1,665k
      Packagist  2012     PHP         97k       1,863k
      RubyGems   2004     Ruby        132k      1,894k

  7. 7. Laws of Software Evolution. Empirically observed by M. Lehman for large proprietary software systems: Continuing Growth, Continuing Change, Increasing Complexity, [ … ] Do they also hold for software ecosystems? Reference: Lehman M.M. and Belady L.A., 1985. Program Evolution: Processes of Software Change. Free download from http://informatique.umons.ac.be/genlog/BeladyLehman1985-ProgramEvolution.pdf
  8. 8. Evolution of number of packages Continuing Growth
  9. 9. Evolution of number of dependencies Continuing Growth
  10. 10. Evolution of number of package updates per month (Continuing Change). Fastest growth for npm, NuGet, Packagist.
  11. 11. Package releases get updated often (Continuing Change). Survival probability of a package release: the probability that a package release is updated within 2 months is above 50%. For CRAN: within 6 months.
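
The slide does not say how the survival probability of a package release is estimated. As a minimal sketch, assuming a Kaplan-Meier estimator (a common choice for this kind of analysis, not confirmed by the slides), the curve and the time at which it drops to 50% could be computed as follows. The function names and the sample data are hypothetical.

```python
from typing import List, Optional, Tuple


def kaplan_meier(observations: List[Tuple[float, bool]]) -> List[Tuple[float, float]]:
    """Return the (time, survival probability) steps of a Kaplan-Meier curve.

    Each observation is (duration_in_months, updated). `updated` is True when
    the release was superseded after that duration, and False when it was still
    the latest release at the end of the observation period (censored).
    """
    at_risk = len(observations)
    survival = 1.0
    curve = []
    # Walk through the distinct durations in increasing order.
    for t in sorted({duration for duration, _ in observations}):
        events = sum(1 for d, updated in observations if d == t and updated)
        leaving = sum(1 for d, _ in observations if d == t)
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def median_survival(curve: List[Tuple[float, float]]) -> Optional[float]:
    """First time at which the survival probability is 0.5 or lower."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Made-up observations, for illustration only.
releases = [(1, True), (1, True), (2, True), (2, True), (3, False), (6, True)]
curve = kaplan_meier(releases)
median = median_survival(curve)
```

Censoring matters here: a release that has not been updated yet is not evidence that it never will be, so it only leaves the at-risk set without counting as an update.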
  12. 12. Younger packages get updated more often … except for older ecosystems (Continuing Change). Over 50% of updates are for packages ... [figure labels, one per ecosystem: up to 3 months old; up to 6 months old; over 2 years old]
  13. 13. Ecosystem Complexity. Complexity caused by a high proportion of dependent packages. “I had one case where my package heavily depended on another package and after a while that package was removed from CRAN and stopped being maintained. So I had to remove one of the main features of my package. Now I try to minimize dependencies on packages that are not maintained by ‘established’ maintainers or by me.”
  14. 14. Complexity caused by – high proportion of dependent packages Ecosystem Complexity
  15. 15. Most of the complexity is hidden … Ecosystem Complexity
  16. 16. Most of the complexity is hidden … … in the transitive dependencies Ecosystem Complexity
  17. 17. Increasing Complexity. Complexity increases over time for some ecosystems (npm, NuGet, Cargo). Evolution of the ratio between the number of transitive and the number of direct dependencies.
  18. 18. Ecosystem Complexity. Most of the complexity is deeply hidden … in the transitive dependencies. Proportion of top-level packages by depth of dependency tree: over 50% of top-level packages have a deep dependency tree.
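
The two measures on slides 17 and 18 can be sketched on a small dependency graph. The slides do not define them precisely, so this sketch makes three assumptions: transitive dependencies include the direct ones, a top-level package is one that no other package depends on, and the depth of a dependency tree is the number of breadth-first levels below the package. All names and the sample network are hypothetical.

```python
from typing import Dict, List, Set

Graph = Dict[str, List[str]]  # package -> packages it directly depends on


def transitive_dependencies(graph: Graph, package: str) -> Set[str]:
    """All packages reachable from `package` (direct ones included)."""
    seen: Set[str] = set()
    stack = list(graph.get(package, []))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(graph.get(dep, []))
    seen.discard(package)  # a package in a cycle does not count itself
    return seen


def transitive_to_direct_ratio(graph: Graph) -> float:
    """Total transitive dependencies divided by total direct dependencies."""
    direct = sum(len(set(deps)) for deps in graph.values())
    transitive = sum(len(transitive_dependencies(graph, p)) for p in graph)
    return transitive / direct if direct else 0.0


def top_level_packages(graph: Graph) -> List[str]:
    """Packages that no other package depends on (assumed definition)."""
    required = {d for deps in graph.values() for d in deps}
    return sorted(p for p in graph if p not in required)


def dependency_depth(graph: Graph, package: str) -> int:
    """Number of breadth-first levels below `package` (0 if no dependencies)."""
    seen = {package}
    frontier = [package]
    depth = 0
    while True:
        next_frontier = []
        for current in frontier:
            for dep in graph.get(current, []):
                if dep not in seen:
                    seen.add(dep)
                    next_frontier.append(dep)
        if not next_frontier:
            return depth
        frontier = next_frontier
        depth += 1


# Made-up network, for illustration only.
example = {
    "app": ["web", "log"],
    "web": ["http", "log"],
    "http": ["util"],
    "log": [],
    "util": [],
}
ratio = transitive_to_direct_ratio(example)
depths = {p: dependency_depth(example, p) for p in top_level_packages(example)}
```

In this toy network `app` declares two dependencies but pulls in four packages, which is the kind of hidden complexity the slides point at. The breadth-first definition of depth is chosen because it stays well defined when the graph contains cycles.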
  19. 19. Impact of transitive dependencies (Ecosystem Complexity). npm, March 2016: unexpected removal of left-pad caused > 2% of all packages to break (> 5,400 packages). “This impacted many thousands of projects. [...] We began observing hundreds of failures per minute, as dependent projects – and their dependents, and their dependents... – all failed when requesting the now-unpublished package.”
  20. 20. Impact of transitive dependencies (Ecosystem Complexity)
      • npm, March 2016: unexpected removal of left-pad caused > 2% of all packages to break (> 5,400 packages).
      • RubyGems, November 2010: release 0.5.0 of i18n broke dependent package ActiveRecord, transitively required by > 5% of all packages (930).
  21. 21. Impact of transitive dependencies (Increasing Complexity). P-Impact Index = number of packages that are transitively required by at least P% of all packages. Evolution of the 5-Impact Index.
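
The definition on slide 21 translates directly into code. The following is a minimal sketch, assuming the network is given as a mapping from each package to its direct dependencies and that "all packages" means every package appearing in that mapping, as a key or as a dependency. Function names and sample data are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set

Graph = Dict[str, List[str]]  # package -> packages it directly depends on


def reachable(graph: Graph, package: str) -> Set[str]:
    """Packages transitively required by `package`, excluding itself."""
    seen: Set[str] = set()
    stack = list(graph.get(package, []))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(graph.get(dep, []))
    seen.discard(package)
    return seen


def p_impact_index(graph: Graph, p: float) -> int:
    """Number of packages transitively required by at least p% of all packages."""
    packages = set(graph) | {d for deps in graph.values() for d in deps}
    required_by: Counter = Counter()
    for package in packages:
        # Each package contributes once to every package it transitively needs.
        required_by.update(reachable(graph, package))
    # Same test as count / len(packages) >= p / 100, without division.
    return sum(1 for count in required_by.values()
               if count * 100 >= p * len(packages))


# Made-up network, for illustration only.
example = {
    "app": ["web", "log"],
    "web": ["http", "log"],
    "http": ["util"],
    "log": [],
    "util": [],
}
index_40 = p_impact_index(example, 40)
index_60 = p_impact_index(example, 60)
```

Here `util` is transitively required by three of the five packages, so it is the only one counted at P = 60, while `log` and `http` also qualify at P = 40. Running one traversal per package is fine for a sketch but would need a more careful algorithm at the scale of npm.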
  22. 22. Summary. Observed evidence of evolution “laws” of software (packaging) ecosystems: continuing growth, continuing change, increasing complexity. (How) could we find evidence for other laws?
  23. 23. Complex Networks Emergent properties have been observed in complex networks – Small-world phenomenon – Power-law behaviour (unequal, skewed, distributions) – … Do they also hold for package dependency networks?
  24. 24. Low proportion of required packages Unequally Distributed Connectivity
  25. 25. Power Law Behaviour, Skewed Distributions. Emergent property of complex networks?
      • A low proportion of required packages concentrates a high proportion of reverse dependencies: from 6% to 17% of required packages concentrate over 80% of all reverse dependencies.
      • A high proportion of package updates is concentrated in a minority of packages.
  26. 26. Power Law Behaviour, Skewed Distributions. Skewed distributions of in- and out-degree in the package dependency graph:
      • Few packages with many dependents (resp. dependencies)
      • Many packages with very few dependents (resp. dependencies)
  27. 27. Summary. Observed evidence of complex network behaviour (power laws): unequal distribution of package dependencies, unequal distribution of package updates. Other emerging properties from complex networks?
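
The concentration figure on slide 25 (a small share of required packages holding over 80% of reverse dependencies) can be illustrated with a short computation on in-degrees. This is a sketch under assumptions: reverse dependencies are counted from direct dependencies only, and the function name and sample data are hypothetical.

```python
from collections import Counter
from typing import Dict, List

Graph = Dict[str, List[str]]  # package -> packages it directly depends on


def concentration(graph: Graph, share_percent: int = 80) -> float:
    """Smallest fraction of required packages that together account for at
    least `share_percent`% of all reverse (direct) dependencies."""
    # In-degree: how many packages directly depend on each required package.
    in_degree = Counter(d for deps in graph.values() for d in set(deps))
    counts = sorted(in_degree.values(), reverse=True)
    total = sum(counts)
    running = 0
    for k, count in enumerate(counts, start=1):
        running += count
        if running * 100 >= share_percent * total:
            return k / len(counts)
    return 0.0  # no dependencies at all


# Made-up network: one heavily used package and two rarely used ones.
example = {f"pkg{i}": ["core"] for i in range(8)}
example.update({"x": ["a"], "y": ["b"]})
fraction = concentration(example)
```

In this toy network one of the three required packages already covers 80% of the reverse dependencies. Sorting the in-degrees in descending order before accumulating is what makes the result the smallest such fraction.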
  28. 28. Open Questions
      • Many observed similarities across ecosystems … but also some differences. To what extent does the ecosystem policy influence its evolution?
      • Many tools help in supporting package maintainers (DependencyCI, Gemnasium, …). How should they be improved? E.g. to deal with transitive deps, co-installability issues, …
  29. 29. References
      • A Decan, T Mens, P Grosjean. An empirical comparison of package dependency networks in seven software ecosystems. SUBMITTED
      • E Constantinou, T Mens. Socio-technical evolution of the Ruby ecosystem in GitHub. SANER 2017
      • A Decan, T Mens, M Claes. An empirical comparison of dependency issues in OSS packaging ecosystems. SANER 2017
      • E Constantinou, T Mens. Social and technical evolution of software ecosystems: A case study of Rails. WEA 2016
      • A Decan, T Mens, M Claes. On the topology of package dependency networks: A comparison of programming language ecosystems. WEA 2016
