Managing Software
Dependencies and the Supply
Chain
Wrangling Software Engineering Projects
MIT EM.S20
Andrew Lamb
April 6, 2022
Goal
Give both a commercial and an open-source perspective on the benefits, costs,
and risks of taking on dependencies.
About me
MIT Course VI-2 ‘02, MEng ‘03
17 years professional development 🤔
15 commercial enterprise software (startups at various stages)
● Oracle, DataPower/IBM, Vertica/HP, Nutonian, DataRobot
Last 2 years in open source commercial software development
● InfluxData, contributor to influxdb_iox
● Maintainer of arrow-rs, arrow-datafusion, and sqlparser-rs projects
● PMC member of Apache Arrow
Software “Supply Chain” ?
[Diagram. Elements shown: code contributors; project management (e.g. PRs); CI / CD system; software distribution (e.g. Dockerhub, App Store); AWS Marketplace; Apple Pay; user (😊)]
Software Supply Chain Complexity
2005: Andrew’s First Startup (DataPower)
● C/C++, < 5 dependencies (OpenSSL)
● Single binary, distributed to customers, on CD or via FTP
2022: Andrew’s Current Startup (InfluxDB)
● IOx has … 606 dependencies (Rust alone)
● Distributed as a Docker image on GCR
Dependencies?
● Software Engineering 101 (6.001 / 6.037)
● “Don’t Reinvent the Wheel”: Use a pre-existing library of code
● The number and quality of pre-existing libraries have grown massively
● Example:
○ 2004: DataPower had a custom-written HTTP/S implementation, URL parser, and more!
○ 2022: Most languages have a library to do it (requests for Python, Node, reqwest in Rust, etc.)
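To make the contrast concrete, here is a minimal sketch of what "use a pre-existing library" looks like for the URL parser case, using Python's standard `urllib.parse`. The helper name `describe_url` and the example URL are assumptions made for illustration; they do not come from the talk.

```python
from urllib.parse import urlsplit, parse_qs


def describe_url(url: str) -> dict:
    """Split a URL into the parts a hand-written parser would have to extract."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "path": parts.path,
        # parse_qs collects repeated keys into lists
        "query": parse_qs(parts.query),
    }


# Hypothetical URL, used only to show the call
info = describe_url("https://example.com:8443/api/v1/items?tag=a&tag=b")
```

A few lines of glue replace what used to be a custom component that had to be written, tested, and maintained in-house.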
(Dramatically) Lowers Cost of Building Software
● Low Barrier to Entry: Someone else designed the API, implemented it, and (hopefully) tested it
○ E.g. you can get a cross-platform, secure web server up and running almost instantly
● Maintenance: You benefit from bugs fixed by others
● Debuggability: Source code is available; you can often even step through it
Managing Dependencies: Licensing
● Software Patent licensing is still a (huge) thing
○ IBM makes $1Bn a year on software licensing
● You need to ensure you have the legal right to use the software.
● Good news: Most organizations have figured out licensing and keep a known-good “approved” set of licenses.
○ You are fine as long as you stick to the known-good ones
● Example “Auto Approve” (permissive): MIT, BSD, Apache 2.0
● Example “Special Dispensation”: MongoDB Server Side Public License (SSPL)
● Example “Do not use”: GPL / LGPL
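The three buckets above can be sketched as a small policy lookup. This is an illustration only: the table contents, the `NEEDS_REVIEW` fallback, and the dependency names are assumptions, and a real organization's approved list comes from its legal team. License names are written as SPDX identifiers.

```python
from enum import Enum
from typing import Dict


class Decision(Enum):
    AUTO_APPROVE = "auto approve"
    SPECIAL_DISPENSATION = "special dispensation"
    DO_NOT_USE = "do not use"
    NEEDS_REVIEW = "needs review"  # assumed fallback for unknown licenses


# Hypothetical policy table mirroring the three example buckets
POLICY: Dict[str, Decision] = {
    "MIT": Decision.AUTO_APPROVE,
    "BSD-3-Clause": Decision.AUTO_APPROVE,
    "Apache-2.0": Decision.AUTO_APPROVE,
    "SSPL-1.0": Decision.SPECIAL_DISPENSATION,
    "GPL-3.0-only": Decision.DO_NOT_USE,
    "LGPL-3.0-only": Decision.DO_NOT_USE,
}


def classify(license_id: str) -> Decision:
    """Look up a license; anything not in the table goes to a human."""
    return POLICY.get(license_id, Decision.NEEDS_REVIEW)


def audit(dependencies: Dict[str, str]) -> Dict[str, Decision]:
    """Return only the dependencies that are not auto-approved."""
    flagged = {}
    for name, license_id in dependencies.items():
        decision = classify(license_id)
        if decision is not Decision.AUTO_APPROVE:
            flagged[name] = decision
    return flagged


# Hypothetical dependency -> license mapping
flagged = audit({
    "http_client": "Apache-2.0",
    "doc_store": "SSPL-1.0",
    "readline_wrapper": "GPL-3.0-only",
    "mystery_lib": "Custom-1.0",
})
```

Running a check like this in CI turns "stick to the known-good ones" from a convention into something a build can enforce.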
Managing Dependencies: Quality
Quality of many Open Source dependencies is outstanding
● Crowdsourcing means more investment into bug reporting and fixing
● In theory you can look at the code to assess the quality
● You have many options to choose from
Managing Dependencies: Quality
● Time spent reviewing / assessing open source is minimal (both commercially and in open source): think of reviewing 606 packages
● No one to cry to: maintainers have limited time to respond to your issue
● Open source maintainers are typically stretched (very) thin
● Parable: “broke my old version, sorry”: dtolnay/quote/#204
Managing Dependencies: Security
● Somewhat terrifying to read the “Backstabber's Knife Collection” paper (arXiv:2005.09535)
● Open source maintainers do not have loads of time
○ Open source is fundamentally based on trust but verify (in the maintainers + community)
○ Possible to abuse that trust and insert malicious code
● Surface Area: dependencies of dependencies
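To see how quickly the surface area grows, here is a minimal sketch that walks a dependency graph and separates direct from indirect dependencies. The graph and package names are hypothetical; real tools read this information from the package manager's metadata.

```python
from collections import deque
from typing import Dict, List, Set


def transitive_dependencies(graph: Dict[str, List[str]], root: str) -> Set[str]:
    """Breadth-first walk collecting everything reachable from root."""
    seen: Set[str] = set()
    queue = deque(graph.get(root, []))
    while queue:
        name = queue.popleft()
        if name in seen or name == root:
            continue  # already visited, or a cycle back to the root
        seen.add(name)
        queue.extend(graph.get(name, []))
    return seen


# Hypothetical dependency graph: package -> packages it depends on
GRAPH = {
    "my_app": ["http_client", "json_lib"],
    "http_client": ["url_parser", "tls"],
    "tls": ["crypto"],
    "json_lib": [],
    "url_parser": [],
    "crypto": [],
}

direct = set(GRAPH["my_app"])
everything = transitive_dependencies(GRAPH, "my_app")
indirect = everything - direct
```

In this toy graph two direct dependencies pull in three more packages that nobody on the team chose explicitly, and each of them is code you are trusting.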
Managing Dependencies: Build times / package bloat
● Dependencies add build time to compiled languages (C/C++, Rust)
● Add significant bloat to binary / distribution size (MBs!)
○ Parable: The (Python) dependency stack at one startup produced a > 1.5 GB package.
● “DLL Hell”: Version matching dependencies (of dependencies)
Managing Dependencies: Keeping up to date
● Dependencies get upgraded at unpredictable intervals
● Upgrades bring things you want/need, like security fixes, but also features you probably don't
Challenges
● Open source projects invest relatively little time in maintaining past releases.
○ p.s. Microsoft Windows: programs written 20+ years ago still run fine
● ⇒ bump dependencies a lot (daily)
● “Semantic versioning” helps auto-update dependencies 🤗
○ Releases sometimes introduce incompatibilities and break builds 😖
○ You can get different binaries depending on *when* you run your build 😱
○ “Backstabber's Knife Collection” 😓
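As a rough illustration of how semantic versioning drives automatic updates, the sketch below implements a caret-style compatibility rule of the kind tools such as Cargo apply by default: an update is accepted if it does not change the leftmost non-zero version component. The function names are assumptions, and pre-release and build-metadata suffixes are deliberately not handled.

```python
from typing import Tuple


def parse(version: str) -> Tuple[int, int, int]:
    """Parse a plain MAJOR.MINOR.PATCH string (no pre-release tags)."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def is_compatible_update(current: str, candidate: str) -> bool:
    """Caret-style rule: never downgrade, never change the leftmost non-zero part."""
    cur, cand = parse(current), parse(candidate)
    if cand < cur:
        return False  # downgrades are never picked up automatically
    if cur[0] > 0:
        return cand[0] == cur[0]  # 1.x.y may move to any later 1.*.*
    if cur[1] > 0:
        return cand[:2] == cur[:2]  # 0.x.y may move only within 0.x.*
    return cand == cur  # 0.0.z is treated as an exact pin
```

The rule only encodes a promise. Whether a release that claims to be compatible actually is compatible still depends on the maintainer, which is why such updates sometimes break builds.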
Managing Dependencies: Packaging
Packaging: Gathering your code and dependencies into an executable “package” that a user can run on their system
As the number of dependencies grows, so do the challenges of packaging / DLL Hell
● Language Runtime
● Your direct dependencies (e.g. http library)
● Indirect dependencies (e.g. URL parser)
● System dependencies (libssl, libqt, etc)
How to Manage
Think Twice about Adding New Dependencies
“A little copying is better than a little dependency.”
- Rob Pike via https://go-proverbs.github.io/
Example (copy instead): one data structure from a library of data structures
Anti-example (take the dependency): HTTP clients / crypto libraries
Best Practice: CI/CD (test, test, and test some more)
[Diagram: CI/CD pipeline]
Source code (in git) → propose change via Pull Request → CI: run tests on change branch → approve + merge to main branch → CI: run tests (on main branch) → build “artifacts” → CD: release / deploy
Two points in the pipeline are marked “likely more tests here”.
CI == Continuous Integration
CD == Continuous Deployment
Best Practice: Package Manager
❏ Use the package manager built into your ecosystem:
❏ Java: Maven
❏ Python: pip
❏ Node.js: npm
❏ Ruby: RubyGems
❏ Rust: cargo
❏ …
❏ C/C++: CMake (not quite a package manager, but closer than Makefiles)
❏ Use the “freeze”, “shrinkwrap”, or “version lock” feature to control updates
❏ Ensure you use widely used packages (wisdom of crowds)
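One cheap way to check that updates are under control is to verify that every requirement is pinned to an exact version. The sketch below scans pip-style requirement lines and reports any that lack an `==` pin. The function name and the sample file contents are assumptions, and the check is deliberately simplistic (it would also flag option lines such as `-r other.txt`).

```python
from typing import List


def unpinned_requirements(text: str) -> List[str]:
    """Return requirement lines that are not pinned with '=='."""
    loose = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue  # blank or comment-only line
        if "==" not in line:
            loose.append(line)
    return loose


# Hypothetical requirements file contents
REQUIREMENTS = """
# pinned: every build gets the same version
requests==2.31.0
numpy>=1.20   # loose: resolves to whatever is newest at build time
flask
"""

loose = unpinned_requirements(REQUIREMENTS)
```

A loose line is exactly how two builds of the same commit end up with different binaries depending on when they ran.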
Managing Dependencies: Best Practices
❏ Invest heavily in automated testing
❏ Especially end to end tests, and key features that rely on behavior of dependencies
❏ Invest in keeping dependencies up to date
❏ Update direct dependencies (tools like Dependabot can help)
❏ Help debug and fix the libraries you depend on
❏ Submit patches back upstream
❏ May need to fork / apply a fix while you wait for the maintainer to release a new version
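A test that pins down the dependency behavior a feature relies on might look like the sketch below. Python's standard `json` module stands in for a third-party library here, and `canonical_payload` is a hypothetical application function; the point is that the test fails loudly if an upgrade ever changes the output format.

```python
import json


def canonical_payload(record: dict) -> str:
    """Hypothetical app function: relies on the library's key sorting and separators."""
    return json.dumps(record, sort_keys=True, separators=(",", ":"))


def test_canonical_payload_is_stable() -> None:
    # If a dependency upgrade changes ordering or spacing, this fails in CI
    # before the change reaches users.
    assert canonical_payload({"b": 1, "a": [1, 2]}) == '{"a":[1,2],"b":1}'
```

Tests like this are cheap to write at the moment you adopt a dependency and make daily version bumps far less nerve-racking.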
Managing Dependencies: Packaging
Technology to the rescue (enabler)
● Static Linking
● yum + .rpm ; apt + .deb
● FX; Electron (for Java; nodejs / desktop apps)
● Containerization (docker, et al)
● VMs (“Virtual Appliances”)
Thank you
Questions?
Readings (tentative):
https://ieeexplore-ieee-org.libproxy.mit.edu/stamp/stamp.jsp?tp=&arnumber=242525 – software maturity
https://www.oreilly.com/library/view/understanding-open-source/0596005814/ch06.html – reasonably thorough overview of software licensing
https://arxiv.org/pdf/2005.09535.pdf – supply-chain attacks
https://blog.npmjs.org/post/141577284765/kik-left-pad-and-npm.html – specific example of how easy/common broad supply-chain breaks are today
[optional] https://blogs.sap.com/2020/06/26/attacks-on-open-source-supply-chains-how-hackers-poison-the-well/
[optional] https://www.gnu.org/licenses/license-compatibility.en.html
[optional] https://www.tandfonline.com/doi/pdf/10.1080/14783360500235819?needAccess=true – software maturity

Managing Software Dependencies and Supply Chain Risks

  • 1. Managing Software Dependencies and the Supply Chain Wrangling Software Engineering Projects MIT EM.S20 Andrew Lamb April 6, 2022
  • 2. Goal Give both a commercial and an open-source perspective on the benefits, costs, and risks of taking on dependencies.
  • 3. About me MIT Course VI-2 ‘02, MEng ‘03 17 years of professional development 🤔 15 in commercial enterprise software (startups at various stages) ● Oracle, DataPower/IBM, Vertica/HP, Nutonian, DataRobot Last 2 years in open source commercial software development ● InfluxData, contributor to influxdb_iox ● Maintainer of arrow-rs, arrow-datafusion, and sqlparser-rs projects ● PMC member of Apache Arrow
  • 4. Software “Supply Chain”? [Diagram labels] Code Contributors; Project Management (e.g. PRs); User (😊); AWS Marketplace; Apple Pay; CI / CD system; Software Distribution (e.g. Dockerhub, App Store)
  • 5. Software Supply Chain Complexity 2005: Andrew’s First Startup (DataPower) ● C/C++, < 5 dependencies (OpenSSL) ● Single binary, distributed to customers on CD or via FTP 2022: Andrew’s Current Startup (InfluxDB) ● IOx has …. 606 dependencies (Rust alone) ● Distributed as a Docker image on GCR
  • 6. Dependencies? ● Software Engineering 101 (6.001 / 6.037) ● “Don’t Reinvent the Wheel”: Use a pre-existing library of code ● The number and quality of pre-existing libraries has grown massively ● Example: ○ 2004: DataPower had a custom-written HTTP/S implementation, URL parser, and more! ○ 2022: Most languages have a library to do it (requests for Python, node, reqwest in Rust, etc.)
  • 7. (Dramatically) Lowers Cost of Building Software ● Low Barrier to Entry: Someone else designed the API, implemented and (hopefully) tested it ○ E.g. you can get a cross-platform, secure webserver up and running almost instantly ● Maintenance: You benefit from bugs fixed by others ● Debuggability: Source code is available; you can often even step through it
  • 8.
  • 9. Managing Dependencies: Licensing ● Software patent licensing is still a (huge) thing ○ IBM makes $1Bn a year on software licensing ● You need to ensure you have the legal right to use the software. ● Good news: Most organizations have figured out licensing and have a known-good “approved” set of licenses. ○ As long as you stick to known-good ones ● Example “Auto Approve” (permissive): MIT, BSD, Apache 2 ● Example “Special Dispensation”: MongoDB server side license ● Example “Do not use”: GPL / LGPL
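The three buckets on the licensing slide can be sketched as a small policy check. This is a minimal illustration, not any organization's actual policy: the `POLICY` table, the `classify` and `audit` helpers, and the dependency names are all hypothetical, and sending unknown licenses to the "special dispensation" bucket is an assumption made here.

```python
from enum import Enum
from typing import Dict, List


class Decision(Enum):
    AUTO_APPROVE = "auto approve"
    SPECIAL_DISPENSATION = "special dispensation"
    DO_NOT_USE = "do not use"


# Hypothetical policy table keyed by SPDX license identifier,
# mirroring the three example buckets on the slide.
POLICY: Dict[str, Decision] = {
    "MIT": Decision.AUTO_APPROVE,
    "BSD-3-Clause": Decision.AUTO_APPROVE,
    "Apache-2.0": Decision.AUTO_APPROVE,
    "SSPL-1.0": Decision.SPECIAL_DISPENSATION,
    "GPL-3.0-only": Decision.DO_NOT_USE,
    "LGPL-3.0-only": Decision.DO_NOT_USE,
}


def classify(license_id: str) -> Decision:
    # Assumption: a license missing from the table needs a human decision.
    return POLICY.get(license_id, Decision.SPECIAL_DISPENSATION)


def audit(dependencies: Dict[str, str]) -> Dict[Decision, List[str]]:
    """Group dependency names (sorted) by the policy decision for their license."""
    report: Dict[Decision, List[str]] = {decision: [] for decision in Decision}
    for name, license_id in sorted(dependencies.items()):
        report[classify(license_id)].append(name)
    return report


if __name__ == "__main__":
    # Hypothetical dependency -> license mapping.
    deps = {"liba": "MIT", "libb": "GPL-3.0-only", "libc": "SSPL-1.0"}
    for decision, names in audit(deps).items():
        print(f"{decision.value}: {names}")
```

In practice the license identifiers would come from package metadata, which is often incomplete or wrong, so a check like this supports a review rather than replacing it.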
  • 10. Managing Dependencies: Quality Quality of many Open Source dependencies is outstanding ● Crowdsourcing means more investment into bug reporting and fixing ● In theory you can look at the code to assess the quality ● You have many options to choose from
  • 11. Managing Dependencies: Quality ● The amount of time spent on reviewing / assessing open source is minimal (both commercially and in open source) – think reviewing 606 packages ● No one to cry to: Maintainers have limited time to respond to your issue ● Open source maintainers are typically stretched (very) thin ● Parable: “broke my old version, sorry”: dtolnay/quote/#204
  • 12. Managing Dependencies: Security ● Somewhat terrifying to read the “Backstabber's Knife Collection” paper ● Open source maintainers do not have loads of time ○ Open source is fundamentally based on trust but verify (in the maintainers + community) ○ Possible to abuse that trust and insert malicious code ● Surface Area: dependencies of dependencies
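The "dependencies of dependencies" surface area can be made concrete with a short sketch that walks a dependency graph. The graph below is entirely hypothetical (the package names are placeholders), and `transitive_dependencies` is an illustrative helper, not a function from any package manager.

```python
from collections import deque
from typing import Dict, List, Set


def transitive_dependencies(graph: Dict[str, List[str]], root: str) -> Set[str]:
    """Return every package reachable from root, excluding root itself."""
    seen: Set[str] = set()
    queue = deque(graph.get(root, []))
    while queue:
        name = queue.popleft()
        if name in seen:
            continue  # already visited; also guards against cycles
        seen.add(name)
        queue.extend(graph.get(name, []))
    seen.discard(root)
    return seen


if __name__ == "__main__":
    # Hypothetical graph: package -> direct dependencies.
    graph = {
        "app": ["http_client", "json"],
        "http_client": ["url_parser", "tls"],
        "tls": ["crypto"],
    }
    direct = graph["app"]
    everything = transitive_dependencies(graph, "app")
    print(f"direct: {len(direct)}, total to trust: {len(everything)}")
```

Even in this toy graph, two direct dependencies expand to five packages whose maintainers you are implicitly trusting.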
  • 13. Managing Dependencies: Build times / package bloat ● Dependencies add build time to compiled languages (C/C++, Rust) ● Add significant bloat to binary / distribution size (MBs!) ○ Parable: Dependency (Python) stack in one startup was a > 1.5GB package. ● “DLL Hell”: Version matching dependencies (of dependencies)
  • 14. Managing Dependencies: Keeping up to date ● Dependencies get upgraded with unpredictable regularity ● Upgrades bring things you want/need, like security fixes, and also features you probably don’t Challenges: ● Open source projects invest relatively less time in maintaining past releases. ○ p.s. Microsoft Windows: programs written 20+ years ago still run fine ● ⇒ bump dependencies a lot (daily) ● “Semantic versioning” helps auto-update dependencies 🤗 ○ Releases sometimes introduce incompatibilities and break builds 😖 ○ Can get different binaries depending on *when* you run your build 😱 ○ “Backstabber's Knife Collection” 😓
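The point that a build can pick up different versions depending on when it runs can be sketched with a simplified caret-style compatibility rule, as used by default in Cargo and available in npm. This is a rough illustration under stated assumptions: it handles only plain `MAJOR.MINOR.PATCH` versions (no pre-release or build metadata), and `parse`, `caret_compatible`, and `resolve` are hypothetical helpers, not any package manager's API.

```python
from typing import List, Optional, Tuple

Version = Tuple[int, int, int]


def parse(version: str) -> Version:
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints (no pre-release tags)."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def caret_compatible(required: str, candidate: str) -> bool:
    """True if candidate satisfies a caret requirement on required."""
    req, cand = parse(required), parse(candidate)
    if cand < req:
        return False
    if req[0] > 0:
        return cand[0] == req[0]      # ^1.2.3 allows any 1.x.y >= 1.2.3
    if req[1] > 0:
        return cand[:2] == req[:2]    # ^0.2.3 allows only 0.2.y >= 0.2.3
    return cand == req                # ^0.0.3 allows only 0.0.3


def resolve(required: str, available: List[str]) -> Optional[str]:
    """Pick the highest available version compatible with the requirement."""
    compatible = [v for v in available if caret_compatible(required, v)]
    return max(compatible, key=parse) if compatible else None


if __name__ == "__main__":
    # Same requirement, two points in time: the registry has grown in between.
    print(resolve("1.2.3", ["1.2.3", "1.3.0"]))
    print(resolve("1.2.3", ["1.2.3", "1.3.0", "1.4.1", "2.0.0"]))
```

Without a lock file, the second build silently moves to a newer release, which is fine only as long as that release really is compatible.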
  • 15. Managing Dependencies: Packaging Packaging: Gathering your code and dependencies into an executable “package” that a user can run on their system. As the number of dependencies grows, so do the challenges in packaging / DLL Hell ● Language Runtime ● Your direct dependencies (e.g. HTTP library) ● Indirect dependencies (e.g. URL parser) ● System dependencies (libssl, libqt, etc.)
  • 17. Think Twice about Adding New Dependencies “A little copying is better than a little dependency.” - Rob Pike via https://go-proverbs.github.io/ E.g. One data structure from a library of data structures Anti-example: http clients / crypto library
  • 18. Best Practice: CI/CD (test, test, and test some more) [Diagram] Source Code (in git) → Propose change via Pull Request → CI: Run Tests on change branch → approve + merge to main branch → CI: Run Tests (on main branch) → Build “Artifacts” → CD: release / deploy. Two stages in the diagram are annotated “Likely more tests here”. CI == Continuous Integration; CD == Continuous Deployment
  • 19. Best Practice: Package Manager ❏ Use the package manager built into your ecosystem: ❏ Java: Maven ❏ Python: pip ❏ Node.js: npm ❏ Ruby: RubyGems ❏ Rust: cargo ❏ … ❏ C/C++: CMake (not quite a package manager, but closer than Makefiles) ❏ Use the “freeze”, “shrinkwrap”, or “version lock” feature to control updates ❏ Ensure you use widely used packages (wisdom of crowds)
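One way to check that a "freeze" has actually been applied is to confirm that every requirement is pinned to an exact version. The sketch below does this for pip-style requirements text. It is deliberately simplified and does not implement the full requirement syntax: environment markers, URLs, and hashes are not handled, and anything it cannot recognize is conservatively reported as unpinned. The `unpinned_requirements` helper and the sample entries are illustrative assumptions.

```python
import re
from typing import List

# name, optional [extras], then an exact '==' pin with no wildcard
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?==[^=\s*]+$")


def unpinned_requirements(text: str) -> List[str]:
    """Return requirement lines that are not pinned to one exact version."""
    problems: List[str] = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):
            continue  # skip blanks and option lines such as '-r other.txt'
        if not PINNED.match(line):
            problems.append(line)
    return problems


if __name__ == "__main__":
    # Illustrative requirements text; versions are examples only.
    sample = """
    requests==2.31.0
    flask>=2.0        # range: resolves differently over time
    numpy
    """
    for line in unpinned_requirements(sample):
        print(f"not pinned: {line}")
```

A check like this could run in CI so that a loosely specified dependency is caught before it changes what gets built.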
  • 20. Managing Dependencies: Best Practices ❏ Invest heavily in automated testing ❏ Especially end to end tests, and key features that rely on behavior of dependencies ❏ Invest in keeping dependencies up to date ❏ Update direct dependencies (tools like Dependabot can help) ❏ Help debug and fix your dependent libraries ❏ Submit patches back upstream ❏ May need to fork / apply a fix while you wait for maintainer to release new version
  • 21. Managing Dependencies: Packaging Technology to the rescue (enabler) ● Static Linking ● yum + .rpm ; apt + .deb ● FX; Electron (for Java; nodejs / desktop apps) ● Containerization (docker, et al) ● VMs (“Virtual Appliances”)
  • 23. Readings (tentative): ● https://ieeexplore-ieee-org.libproxy.mit.edu/stamp/stamp.jsp?tp=&arnumber=242525 – software maturity ● https://www.oreilly.com/library/view/understanding-open-source/0596005814/ch06.html – reasonably thorough overview of software licensing ● https://arxiv.org/pdf/2005.09535.pdf – supply-chain attacks ● https://blog.npmjs.org/post/141577284765/kik-left-pad-and-npm.html – specific example of how easy/common broad supply-chain breaks are today ● [optional] https://blogs.sap.com/2020/06/26/attacks-on-open-source-supply-chains-how-hackers-poison-the-well/ ● [optional] https://www.gnu.org/licenses/license-compatibility.en.html ● [optional] https://www.tandfonline.com/doi/pdf/10.1080/14783360500235819?needAccess=true – software maturity