Prevalence and Evolution of License Violations in npm and RubyGems Dependency Networks

Ilyas Saïd Makari Ilyas.Said.Makari@vub.be
Ahmed Zerouali Ahmed.Zerouali@vub.be
Coen De Roover Coen.De.Roover@vub.be
The International Conference on Software and Systems Reuse (ICSR)
Virtual (originally Montpellier, France) - June 15-17, 2022
Background
Open source software can be distributed with varying degrees of freedom
● 1. Public domain
○ All rights are granted with no conditions whatsoever
○ For example: “The Unlicense”
● 2. Permissive licenses
○ Few restrictions imposed
○ Must include copyright notice from original author
○ For example: MIT, Apache, BSD
● 3. Restrictive licenses (copyleft)
Background
● Strong copyleft licenses
○ All derivatives of the original work should be released under the same license
○ For example: GNU General Public License (GPL)
● Weak copyleft licenses
○ Exception when work is used as independent building block
○ For example: GNU Lesser General Public License (LGPL)
Background
Not all licenses can legally be combined in one software package
- For example: MIT (permissive) is compatible with GPL (restrictive), but not vice versa.
We call a license A “one-way compatible” with a license B if software that contains packages under both licenses may legally be licensed under license B.
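The definition above can be sketched as a directed relation. This is a minimal illustration, not the authors' implementation: the set `ONE_WAY` and the function `one_way_compatible` are hypothetical names, the relation holds only the MIT/GPL example from the slide, and treating a license as compatible with itself is an assumption.

```python
# Directed ("one-way") compatibility: (A, B) in ONE_WAY means that software
# combining A-licensed and B-licensed packages may be licensed under B.
# Hypothetical toy relation holding only the MIT/GPL example from the slide.
ONE_WAY = {("MIT", "GPL")}


def one_way_compatible(a: str, b: str) -> bool:
    """Return True if license `a` is one-way compatible with license `b`."""
    # Assumption: a license is always compatible with itself.
    return a == b or (a, b) in ONE_WAY


print(one_way_compatible("MIT", "GPL"))  # True
print(one_way_compatible("GPL", "MIT"))  # False
```

The relation is deliberately not symmetric: swapping the arguments changes the answer, which is what "but not vice versa" expresses.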
Background
https://exploring-data.com/vis/npm-packages-dependencies/
~2M packages
~86M releases
~250M dependencies
Background
Figure: dependency tree showing a direct dependency (level 1) and an indirect dependency (level 4).
Generated with https://npm.anvaka.com/
Background
Problem
Q: How frequent are license violations in dependency
networks of open source package repositories?
Figure legend: studied package, dependency
Case studies
License Violations in npm and RubyGems Dependency Networks
Method
A new license compatibility matrix, based on:
1. Kapitsaki et al. [*]
[*] Georgia M. Kapitsaki, Frederik Kramer, and Nikolaos D. Tselikas. Automating the license compatibility process in open source software with SPDX. Journal of Systems and Software, 131:386–401, 2017.
Figure: compatibility graph from Kapitsaki et al. [*]
Then, we manually included information from:
1. Free Software Foundation
2. The European Commission
=> A compatibility answer for 1,681 pairs of licenses, of which 205 (12.2%) are labeled “Unknown”
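A compatibility matrix of this kind could be represented as a lookup over ordered license pairs. The sketch below is an assumption-laden toy: the three licenses, the `KNOWN` entries (only the MIT/GPL example from the background slides), and the helper names are hypothetical, and every pair without an answer defaults to "Unknown". It is not the matrix published by the authors.

```python
from itertools import product

COMPATIBLE, INCOMPATIBLE, UNKNOWN = "Compatible", "Incompatible", "Unknown"

# Hypothetical three-license excerpt; the real matrix covers far more licenses.
LICENSES = ["MIT", "LGPL", "GPL"]

# Key: (license of the included package, license of the combined work).
# Only the MIT/GPL example from the slides is filled in here.
KNOWN = {
    ("MIT", "GPL"): COMPATIBLE,
    ("GPL", "MIT"): INCOMPATIBLE,
}


def build_matrix(licenses, known):
    """Build a full matrix over ordered pairs, defaulting to UNKNOWN."""
    matrix = {}
    for a, b in product(licenses, repeat=2):
        if a == b:
            matrix[(a, b)] = COMPATIBLE  # assumption: same license is fine
        else:
            matrix[(a, b)] = known.get((a, b), UNKNOWN)
    return matrix


def unknown_share(matrix):
    """Fraction of pairs labeled UNKNOWN."""
    unknown = sum(1 for label in matrix.values() if label == UNKNOWN)
    return unknown / len(matrix)


matrix = build_matrix(LICENSES, KNOWN)
print(len(matrix), round(unknown_share(matrix), 3))  # 9 0.444

# The share reported on the slide: 205 of 1,681 pairs.
print(round(205 / 1681 * 100, 1))  # 12.2
```

Keeping "Unknown" as an explicit label, rather than treating missing answers as compatible or incompatible, lets later analysis report those pairs separately.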
Research questions
RQ1: What are the most prevalent licenses in package repositories?
○ What is the license climate of each ecosystem?
○ Permissive or restrictive climate?
○ Does it influence the number of incompatibilities?
RQ2: To what extent do packages rely on direct dependencies with incompatible licenses?
○ How prevalent are license violations on the first dependency tree level?
RQ3: How does license incompatibility spread across package dependency networks?
○ How prevalent are license violations on each dependency tree level?
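Counting violations per dependency tree level, as RQ2 and RQ3 ask, could be sketched with a breadth-first traversal. Everything here is illustrative: the graph, the license assignments, and the function names are hypothetical, and comparing each dependency's license against the license of the studied (root) package is an assumption about how a violation is determined.

```python
from collections import deque


def dependency_levels(graph, root):
    """Map each reachable dependency to its shallowest level (1 = direct)."""
    levels = {}
    queue = deque([(root, 0)])
    while queue:
        package, level = queue.popleft()
        for dep in graph.get(package, []):
            if dep != root and dep not in levels:
                levels[dep] = level + 1
                queue.append((dep, level + 1))
    return levels


def violations_per_level(graph, licenses, root, is_compatible):
    """Count dependencies whose license is incompatible with the root's."""
    counts = {}
    for dep, level in dependency_levels(graph, root).items():
        # is_compatible(dependency_license, dependent_license)
        if not is_compatible(licenses[dep], licenses[root]):
            counts[level] = counts.get(level, 0) + 1
    return counts


# Hypothetical toy network: app -> a -> c -> d, and app -> b.
graph = {"app": ["a", "b"], "a": ["c"], "c": ["d"], "b": []}
licenses = {"app": "MIT", "a": "MIT", "b": "GPL", "c": "MIT", "d": "GPL"}


def is_compatible(dep_license, dependent_license):
    # Toy rule from the background slides: MIT may be combined into GPL.
    return dep_license == dependent_license or (
        (dep_license, dependent_license) == ("MIT", "GPL")
    )


print(dependency_levels(graph, "app"))  # {'a': 1, 'b': 1, 'c': 2, 'd': 3}
print(violations_per_level(graph, licenses, "app", is_compatible))  # {1: 1, 3: 1}
```

Breadth-first order assigns each package the shallowest level at which it appears, so a package reachable both directly and indirectly is counted once, at level 1.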
Case studies
npm:
~750k packages (latest release)
~3.5M direct runtime dependencies
RubyGems:
~95k packages (latest release)
~211k direct runtime dependencies
Open Data:
- Libraries.io gathers data from 32 package managers and 3 source code
repositories.
- They monitor over 5.4M unique open source packages, and more than 500M
interdependencies between them.
Dataset
Case studies
package.json
Dependency resolution: January 12th, 2020
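Resolving a dependency constraint as of a fixed snapshot date could look roughly like the sketch below. The slide gives only the date, so the procedure is an assumption: pick the highest release that satisfies the constraint among those published on or before the snapshot. The release data is hypothetical, and only npm-style caret ranges over plain `major.minor.patch` versions are handled.

```python
from datetime import date


def parse(version):
    """Turn '1.4.1' into (1, 4, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def satisfies_caret(version, base):
    """npm-style caret range: keep the left-most non-zero component fixed."""
    v, b = parse(version), parse(base)
    if v < b:
        return False
    if b[0] > 0:
        return v[0] == b[0]
    if b[1] > 0:
        return v[:2] == b[:2]
    return v == b


def resolve(constraint, releases, snapshot):
    """Highest release matching `constraint` published on or before `snapshot`."""
    base = constraint.lstrip("^")
    candidates = [
        version
        for version, published in releases.items()
        if published <= snapshot and satisfies_caret(version, base)
    ]
    return max(candidates, key=parse) if candidates else None


# Hypothetical release history of one dependency.
releases = {
    "1.2.0": date(2019, 3, 1),
    "1.4.1": date(2019, 11, 20),
    "1.5.0": date(2020, 2, 2),   # published after the snapshot
    "2.0.0": date(2019, 12, 1),  # outside the caret range
}

print(resolve("^1.2.0", releases, date(2020, 1, 12)))  # 1.4.1
```

Fixing the snapshot date makes the resolved dependency set reproducible, since later releases cannot change the outcome.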
Case studies
npm (on January 12th, 2020):
~750k packages
~3.5M direct dependencies
~66.4M (all) runtime dependencies
~7.3% of the packages have dependencies with incompatible licenses
RubyGems (on January 12th, 2020):
~95k packages
~211k direct dependencies
~1.2M (all) runtime dependencies
~13.9% of the packages have dependencies with incompatible licenses
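As a rough worked example, the approximate figures above give an average number of dependencies per package in each ecosystem. The assignment of the larger figures to npm and the smaller ones to RubyGems is an assumption (the slide shows logos rather than names), and all inputs are the rounded values from the slide.

```python
# Rounded figures from the slide; ecosystem labels are an assumption.
FIGURES = {
    "npm": {"packages": 750_000, "direct": 3_500_000, "all": 66_400_000},
    "RubyGems": {"packages": 95_000, "direct": 211_000, "all": 1_200_000},
}


def averages(figures):
    """Average direct and total runtime dependencies per package."""
    return {
        name: (
            round(f["direct"] / f["packages"], 1),
            round(f["all"] / f["packages"], 1),
        )
        for name, f in figures.items()
    }


for name, (direct, total) in averages(FIGURES).items():
    print(f"{name}: ~{direct} direct, ~{total} total per package")
# npm: ~4.7 direct, ~88.5 total per package
# RubyGems: ~2.2 direct, ~12.6 total per package
```

Under these assumptions an npm package pulls in several times more runtime dependencies than a RubyGems package, most of them indirect.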
Research questions
RQ1: What are the most prevalent licenses in package repositories?
- MIT is the most popular license in npm and RubyGems.
Research questions
RQ1: What are the most prevalent licenses in package repositories?
- MIT has been popular in npm since its beginning.
- ISC becoming the new default license for npm packages increased its
popularity.
Research questions
RQ1: What are the most prevalent licenses in package repositories?
- MIT gradually evolved into the most popular license.
- Over the last few years, Apache has become the second most popular
license choice within the RubyGems ecosystem.
Research questions
RQ2: To what extent do packages rely on direct dependencies with incompatible licenses?
- Only 0.9% of npm dependencies and 4.3% of RubyGems dependencies have licenses that are incompatible with those of their dependents.
- The most common pair of incompatible licenses is MIT with GPL.
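Finding the most common incompatible pair among direct dependencies could be sketched as a tally over dependency edges. The edges, licenses, and helper names are hypothetical, and the compatibility rule is the toy MIT/GPL rule from the background slides rather than the full matrix.

```python
from collections import Counter


def is_compatible(dep_license, dependent_license):
    # Toy rule: identical licenses, or MIT combined into GPL, are fine.
    return dep_license == dependent_license or (
        (dep_license, dependent_license) == ("MIT", "GPL")
    )


def incompatible_pairs(edges, licenses):
    """Tally (dependent license, dependency license) for incompatible edges."""
    pairs = Counter()
    for dependent, dependency in edges:
        if not is_compatible(licenses[dependency], licenses[dependent]):
            pairs[(licenses[dependent], licenses[dependency])] += 1
    return pairs


# Hypothetical direct-dependency edges: (dependent, dependency).
edges = [("app", "x"), ("app", "y"), ("lib", "x"), ("tool", "z")]
licenses = {
    "app": "MIT", "lib": "MIT", "tool": "GPL",
    "x": "GPL", "y": "MIT", "z": "MIT",
}

pairs = incompatible_pairs(edges, licenses)
print(pairs.most_common(1))              # [(('MIT', 'GPL'), 2)]
print(sum(pairs.values()) / len(edges))  # 0.5
```

The second figure is the share of direct dependencies that are incompatible in this toy data, the same kind of ratio as the percentages reported above.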
Research questions
RQ3: How does license incompatibility spread across package dependency networks?
- npm packages have more indirect dependencies with incompatible licenses than RubyGems packages (due to the high number of dependencies that npm packages include).
- However, RubyGems has proportionally more incompatible indirect dependencies than npm.
Research questions
RQ3: How does license incompatibility spread across package dependency networks?
- The number of dependencies without a license decreases from one level to the
next as we go deeper in both package repositories.
Tooling
Screenshot of the license compatibility checking tool.
https://doi.org/10.5281/zenodo.5913761
Conclusion
● Deeper-level dependencies cause fewer incompatibilities than those at shallow levels.
● GPL dependencies are the major cause of incompatibilities, and they are more prevalent at the first level of dependency trees.
● We found that a set of packages created by a single organization can influence an ecosystem when it consistently releases useful packages under a particular license.
● Our results help in understanding the state of license incompatibilities in software package ecosystems.
Threats to validity
● Libraries.io dataset.
● Various sources of information to construct our license compatibility matrix.
● We only considered the license of the latest release of each package.
● Many packages do not have any license.