© 2022 nexB Inc. License: CC-BY-SA-4.0 https://www.nexb.com/ https://www.aboutcode.org/
VulnerableCode
Because a vulnerability database
should not be about Vulnerabilities
Philippe Ombredanne
▷ Project lead and maintainer for VulnerableCode, ScanCode and
AboutCode
▷ Creator of Package URL, co-founder of SPDX, ClearlyDefined
▷ FOSS veteran, long time Google Summer of Code mentor
▷ Co-founder and CTO of nexB Inc., makers of DejaCode
▷ Weird facts and claims to fame
● Signed off on the largest deletion of lines of code in the Linux
kernel (but these were only comments)
● Unrepentant code hoarder. Had 60,000+ GitHub forks,
now down to only 20K forks
▷ pombredanne@nexb.com irc:pombreda
Agenda
▷ The state of vulnerability databases (open or not)
▷ How do we search vulnerabilities? By package first!
▷ A better approach: package first
▷ Why VulnerableCode?
▷ VulnerableCode Solution
▷ How to create a package vulnerability database
○ Aggregate and correlate many data sources
○ Multi level data refinement
▷ Issues with vulnerability data
▷ Future plans
▷ Next steps: we need your help!
State of vulnerability databases (1)
▷ Databases with ghost packages
● DBs reference packages that do not exist anywhere
▷ Databases with ghost vulnerable versions
● Versions are listed as vulnerable when they are not, or the opposite
▷ Databases "Crying wolf" - improbable vulnerabilities
● DBs report a package as vulnerable if anything in the
dependency tree may be vulnerable (Log4j)
▷ Impossible, self-contradictory version ranges
● Ranges that resolve to nothing or to everything
▷ Redundant and noisy duplicated vulnerabilities
▷ Vulnerabilities mapped to hard-to-find CPEs
State of vulnerability databases (2)
The Telephone Game problem
▷ Everyone makes things up a little while trying to improve the data
▷ Each of them makes something up slightly differently
○ Too much reliance on automated tools on top of bad data
▷ Many DBs base their content on another DB’s content
○ At each step the data is transformed (and damaged) in subtle ways
▷ You can have as many vulnerable ranges as there are DB interpretations.
○ None of them is entirely faithful to the upstream data
○ Over time this turns into The Telephone game
▷ Upstream has better data
The true-true vulnerabilities are UPSTREAM!
Credit https://live.staticflickr.com/3798/10142017736_7f69d9f472_h.jpg
"Brown Bears at Brooks Camp, Katmai National Park"
by Christoph Strässler https://www.flickr.com/photos/christoph_straessler/
This work is licensed under a Creative Commons Attribution-ShareAlike 2.0 License.
https://creativecommons.org/licenses/by-sa/2.0/
https://www.youtube.com/watch?v=V4kWQSpCbvo
State of vulnerability databases (3)
▷ Databases of known FOSS software vulnerabilities are mostly
proprietary and/or privately maintained using proprietary tools,
processes and data.
▷ Why not open data? FOSS code likes open data about FOSS!
○ Some new entrants are now using open licenses:
○ GHSA (GitHub), OSV (Google), GitLab (one month delay)
▷ Emerging support for Package URL promotes interoperability
○ OSSF OSV, Sonatype OSSINDEX
▷ Emerging common format promotes interoperability
○ OSSF OSV
▷ But open formats do not mean common data identifiers
How do we search? By package first!
Questions to answer:
▷ Is package foo@1.0 known to be vulnerable?
○ What are the vulnerabilities?
○ What is the severity of the vulnerability?
○ Which version has a fix?
▷ More rarely: do I have any package vulnerable to this
vulnerability?
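The questions above can be sketched as a lookup keyed by package rather than by vulnerability. This is only an illustration under stated assumptions: the `KnownVulnerability` record, the in-memory `INDEX`, the severity labels and the `EXAMPLE-…` ids are hypothetical and are not VulnerableCode's data model or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class KnownVulnerability:
    vulnerability_id: str          # placeholder id, not a real advisory
    severity: str                  # the scale used here is an assumption
    fixed_version: Optional[str]   # None when no fix is known


# Hypothetical in-memory index keyed by a Package URL string.
INDEX = {
    "pkg:pypi/foo@1.0": [
        KnownVulnerability("EXAMPLE-0001", "high", "1.1"),
        KnownVulnerability("EXAMPLE-0002", "low", None),
    ],
}


def lookup(purl: str) -> list[KnownVulnerability]:
    """Return the known vulnerabilities for one exact package version."""
    return INDEX.get(purl, [])


def is_vulnerable(purl: str) -> bool:
    """Answer "is package foo@1.0 known to be vulnerable?"."""
    return bool(lookup(purl))
```

With this shape, "what are the vulnerabilities", "what is the severity" and "which version has a fix" are all read from the same package-keyed result.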
A better approach: package first
▷ Not this: look up vulnerabilities, then find packages
▷ But this: find packages, then look up vulnerabilities
Why VulnerableCode? Accuracy and correctness
▷ Vulnerabilities are important!
▷ Code is more important! Package first!
▷ There is no Free Software Vulnerability Database that is
○ Open!!
○ Comprehensive for most ecosystems (system+package)
○ Curated by expert humans
○ Validated: Trusted, correlated and verified data
○ Working towards correctness
VulnerableCode Solution
▷ Find packages with scanning, matching and tracing
○ Leverage all tools that report package-url (we support CPE too)
○ ScanCode.io and ScanCode, ORT, Tern, OWASP Dependency
Track and many more ... or an SBOM (SPDX, CycloneDX)
▷ Look up package vulnerabilities in an open database that aggregates
them all
▷ Query by purl!
▷ Open data and open source tools are better for open source!
▷ Eventually, review by experts to curate all the data.
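To make "query by purl" concrete, here is a minimal sketch of splitting a Package URL into the parts a lookup would use. It is deliberately incomplete: it drops qualifiers and subpath and does no per-type normalization, and `SimplePurl` and `parse_purl` are names invented for this example. A real tool would more likely rely on an existing purl library than on this sketch.

```python
from typing import NamedTuple, Optional
from urllib.parse import unquote


class SimplePurl(NamedTuple):
    type: str
    namespace: Optional[str]
    name: str
    version: Optional[str]


def parse_purl(purl: str) -> SimplePurl:
    """Split pkg:type/namespace/name@version; qualifiers and subpath are dropped."""
    scheme, colon, rest = purl.partition(":")
    if not colon or scheme.lower() != "pkg":
        raise ValueError(f"not a Package URL: {purl!r}")
    rest = rest.split("#", 1)[0]   # drop the optional subpath
    rest = rest.split("?", 1)[0]   # drop the optional qualifiers
    rest = rest.strip("/")
    package_type, slash, path = rest.partition("/")
    if not slash or not path:
        raise ValueError(f"missing type or name: {purl!r}")
    if "@" in path:
        # The version follows the last "@"; a literal "@" elsewhere is percent-encoded.
        path, _, version = path.rpartition("@")
    else:
        version = ""
    namespace, _, name = path.rpartition("/")
    if not name:
        raise ValueError(f"missing name: {purl!r}")
    return SimplePurl(
        package_type.lower(),
        unquote(namespace) or None,
        unquote(name),
        unquote(version) or None,
    )
```

For example, `parse_purl("pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1")` yields the type `maven`, the namespace `org.apache.logging.log4j`, the name `log4j-core` and the version `2.14.1`.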
How to create a package vulnerability database?
▷ Use data from upstream, at the source of the source!
○ From the package maintainers and authors themselves
▷ Employ a confidence based system: not all data are equally trusted
and of the same quality
▷ Aggregate and correlate many data sources to enrich, cross-check
and validate
▷ Discover new relations between vulnerabilities and packages by
mining the graph
▷ Curate and review for correctness with experts (AI is nice but does
not fix the problems)
Aggregate and correlate many data sources
▷ Collect and parse many sources
○ Store in a common data model
○ Cross-reference to create a graph
▷ Project-specific trackers
○ Apache, OpenSSL, nginx...
○ Bug trackers, commit logs, project CHANGELOGs.
▷ Linux distro trackers (Debian, Ubuntu, Red Hat, SUSE, Gentoo, ...)
○ Custom or standard formats (CVRF, OVAL)
▷ Application package trackers
○ NuGet, Rust, RubyGems, Pysec, RustSec, npm, ....
▷ NVD, and other aggregators: OSV, GHSA, GitLab, GSD and more.
Multi level data refinement
▷ The data is always imported in an “Advisory” staging area
▷ "Advisory" data are converted to "Vulnerability" and "Package" data and
their relationships using "Improvers"
▷ "Advisory" data that cannot be converted are kept with a log to investigate
and resolve the issue
▷ Specific improvers can mine the graph, cross check with other data sources,
resolve updated version ranges
Pipeline stages: Import, Improve, Filter
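The staged refinement described above can be sketched as follows. This is an illustration only: the `Advisory` and `Database` classes and the `default_improver` function are hypothetical names and do not reproduce VulnerableCode's actual models. The point is that imported data always lands in staging first, and anything an improver cannot convert is kept with a reason instead of being dropped.

```python
from dataclasses import dataclass, field


@dataclass
class Advisory:
    source: str
    aliases: list[str]          # vulnerability identifiers found in the advisory
    affected_purls: list[str]   # Package URLs the advisory names


@dataclass
class Database:
    advisories: list[Advisory] = field(default_factory=list)              # staging area
    relations: set[tuple[str, str]] = field(default_factory=set)          # (vulnerability, purl)
    unresolved: list[tuple[Advisory, str]] = field(default_factory=list)  # kept with a reason


def import_advisory(db: Database, advisory: Advisory) -> None:
    """Import step: store the advisory in staging without interpreting it."""
    db.advisories.append(advisory)


def default_improver(db: Database) -> None:
    """Improve step: turn staged advisories into vulnerability/package relations."""
    for advisory in db.advisories:
        if not advisory.aliases:
            db.unresolved.append((advisory, "no vulnerability identifier"))
            continue
        if not advisory.affected_purls:
            db.unresolved.append((advisory, "no affected package"))
            continue
        vulnerability_id = advisory.aliases[0]
        for purl in advisory.affected_purls:
            db.relations.add((vulnerability_id, purl))
```

Further improvers could follow the same signature and refine `relations` in later passes, for example by cross-checking other sources.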
Issue: Ghost packages
▷ Some packages do not exist anywhere
○ Including versions that may not exist: they were never released
▷ Solution: Look up upstream in the package registries and repositories
○ We can look up in the registries and repositories to validate that the Package
URLs and versions are correct and really exist upstream
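A minimal sketch of that validation, assuming a snapshot of released versions has already been fetched from a registry. The `RELEASED` mapping and the `classify` function are hypothetical, and no real registry API is called here.

```python
# Hypothetical snapshot of what a package registry reports as released versions.
RELEASED = {
    "pkg:pypi/foo": {"1.0", "1.1", "2.0"},
}


def classify(package: str, version: str, released: dict = RELEASED) -> str:
    """Tell apart real package versions from ghost packages and ghost versions."""
    versions = released.get(package)
    if versions is None:
        return "ghost package"   # the package is unknown upstream
    if version not in versions:
        return "ghost version"   # the package exists, but this version was never released
    return "exists"
```

An advisory naming `pkg:pypi/foo` at version `1.5` would be flagged as a ghost version rather than silently stored.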
Issue: Lesser data quality
▷ Some vulnerability sources cannot be trusted
○ Known to make incorrect or inaccurate assertions about packages and
versions
▷ Solution: store confidence level
○ Confidence levels ensure we keep all inferred data, even lesser quality data
○ We do not trust others: we can discount the data sources we trust less
○ And we do not trust ourselves: we can discount the automated inferences we
make if we are not 100% sure about their correctness
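One way to picture a confidence level is a score attached to every package/vulnerability assertion. In this sketch the source names, the scores and the 80% discount for automated inferences are all assumptions chosen for illustration, and `Assertion`, `confidence` and `at_least` are hypothetical names.

```python
from dataclasses import dataclass

# Assumed base confidence per data source, on a 0-100 scale.
SOURCE_CONFIDENCE = {"upstream-project": 100, "aggregator": 60}
# Assumed discount, in percent, applied to automated inferences.
INFERENCE_DISCOUNT_PERCENT = 80


@dataclass(frozen=True)
class Assertion:
    purl: str
    vulnerability_id: str
    source: str
    inferred: bool = False   # True when an automated improver produced it


def confidence(assertion: Assertion) -> int:
    """Score an assertion: start from the source, discount our own inferences."""
    base = SOURCE_CONFIDENCE.get(assertion.source, 0)
    if assertion.inferred:
        base = base * INFERENCE_DISCOUNT_PERCENT // 100
    return base


def at_least(assertions: list[Assertion], threshold: int) -> list[Assertion]:
    """Filter at query time; lower-confidence data stays stored, it is just not returned."""
    return [a for a in assertions if confidence(a) >= threshold]
```

Nothing is deleted in this design: a consumer picks the threshold, so lesser quality data remains available for later review.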
Issue: Incorrect or missing versions
▷ Some package versions are missing or incorrect
○ Affected version statements are often ambiguous
○ "All the versions of package foo are vulnerable to CVE XYZ" really means that all
versions of foo known at the time this advisory was published were vulnerable.
▷ Solution: store version range, resolve range and "time travel"
○ We store version ranges as a compact string (using the new purl "vers" spec)
○ We expand and resolve ranges with the "univers" version handling library
○ Package versions can be checked and rechecked for being in a vulnerable range as needed
■ In the past and in the future
○ Improvers can do "time travel" based on version publication dates and determine
if a package version was vulnerable in the past when published.
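The range check and the "time travel" idea can be sketched as below. This is not the univers API and it covers only a simplified subset of vers-like strings: every constraint must hold (a single interval), whereas the full vers spec can express more, and versions are compared as dotted integers only. The function names and the release dates are made up for the example.

```python
import operator
from datetime import date

OPERATORS = {">=": operator.ge, "<=": operator.le, ">": operator.gt, "<": operator.lt, "=": operator.eq}


def parse_version(text: str) -> tuple:
    """Dotted integers only, e.g. "1.10.0"; real schemes need a real version library."""
    return tuple(int(part) for part in text.split("."))


def parse_range(vers: str) -> list:
    """Parse e.g. "vers:pypi/>=1.0.0|<2.0.0" into (comparison, bound) pairs."""
    prefix, _, constraints = vers.partition("/")
    if not prefix.startswith("vers:") or not constraints:
        raise ValueError(f"not a vers-like range: {vers!r}")
    parsed = []
    for item in constraints.split("|"):
        item = item.strip()
        for symbol in (">=", "<=", ">", "<", "="):   # two-character operators first
            if item.startswith(symbol):
                parsed.append((OPERATORS[symbol], parse_version(item[len(symbol):])))
                break
        else:
            raise ValueError(f"unsupported constraint: {item!r}")
    return parsed


def in_range(version: str, vers: str) -> bool:
    """True when the version satisfies every constraint of the range."""
    candidate = parse_version(version)
    return all(compare(candidate, bound) for compare, bound in parse_range(vers))


def versions_known_at(release_dates: dict, advisory_date: date) -> list:
    """"Time travel": which versions existed when the advisory was published?"""
    known = [v for v, released in release_dates.items() if released <= advisory_date]
    return sorted(known, key=parse_version)
```

Read this way, an "all versions are vulnerable" statement resolves to `versions_known_at(...)` for the advisory date, and a version released later has to be rechecked instead of being assumed vulnerable.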
Issue: Duplicated data
▷ Some vulnerabilities are duplicates
○ Leads to many noisy relationships and lesser correlation abilities
▷ Solution: Introduce a new set of vulnerability id aliases
○ Before, we tracked by CVE id; only if there was none we created a
VULCOID id. (VulnerableCode ID for a vulnerability)
○ We now always use a VULCOID id and track many aliases (including a
CVE id when available) for each vulnerability
○ Aliases are used for data reconciliation during the second step of
"Improvers" meaning that we avoid a large number of duplicates
○ Improver jobs will further merge additional duplicates
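Alias-based reconciliation can be sketched as grouping records that share at least one alias. The union-find approach below is an assumption made for illustration, not a description of how VulnerableCode implements it, and the identifiers in the usage note are placeholders.

```python
def merge_by_aliases(records: list) -> list:
    """Group alias sets that overlap, directly or transitively, into one set each."""
    parent = {}

    def find(alias: str) -> str:
        parent.setdefault(alias, alias)
        while parent[alias] != alias:
            parent[alias] = parent[parent[alias]]   # path halving
            alias = parent[alias]
        return alias

    def union(first: str, second: str) -> None:
        root_first, root_second = find(first), find(second)
        if root_first != root_second:
            parent[root_second] = root_first

    for aliases in records:
        ordered = sorted(aliases)
        for alias in ordered:
            find(alias)                      # register every alias
        for other in ordered[1:]:
            union(ordered[0], other)         # all aliases of one record belong together

    groups = {}
    for alias in list(parent):
        groups.setdefault(find(alias), set()).add(alias)
    return sorted(groups.values(), key=sorted)
```

Given the records `{"CVE-0000-0001", "GHSA-example"}`, `{"GHSA-example", "PYSEC-example"}` and `{"CVE-0000-0002"}`, the first two collapse into one vulnerability carrying three aliases and the third stays separate.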
Other issues
▷ Many data sources - redundant, unstructured, messy,
incomplete
○ We grew to appreciate the complexity of the task and why commercial
vendors currently dominate the space
○ Solution: integrate them all (all the data sources) to cross-check them
▷ Old, obsolete, or less useful data
○ More is not always better - e.g. old vulnerabilities on Windows 95
○ Commercial-only software (Windows, etc.) or hardware is less interesting
○ Solution: let go of some of the past and ignore the legacy!
Future plans
▷ More primary data sources, going upstream
▷ More data: Actual commits fixing and introducing vulnerabilities
▷ YARA Rules: enable finer grain detection of actual vulnerable code
▷ Community peer curation system including curation UI
▷ AI/ML for data quality improvements
▷ .... and good ole heuristics
▷ VulnTotal: a tool to compare all the vulnerability DBs
○ Think VirusTotal for vulnerabilities
If you want to learn more about our projects
▷ Try out VulnerableCode with a free DejaCode account
https://enterprise.dejacode.com/account/register/
▷ Register for our upcoming webinars
https://nexb.com/webinars
▷ Read our latest blog post at
https://nexb.com/vulnerablecode-public-release
▷ Download VulnerableCode at
https://github.com/nexB/vulnerablecode/releases/latest
▷ Visit https://nexb.com/vulnerablecode for more information
If you want to help
You can contribute code, time, docs or funds
▷ Use these fine FOSS tools and specs
● https://github.com/nexb/vulnerablecode
● https://www.aboutcode.org/
● https://github.com/nexB/
● https://github.com/package-url
▷ Join the conversation at
● https://gitter.im/aboutcode-org
▷ Donate at
● https://opencollective.com/aboutcode
Credits
▷ Presentation template by SlidesCarnival licensed under CC-BY-4.0
▷ Photograph by Unsplash licensed under Unsplash License
▷ Other content licensed under CC-BY-4.0
VulnerableCode: Finding FOSS software vulnerabilities with FOSS tools

  • 1. © 2022 nexB Inc. License: CC-BY-SA-4.0 https://www.nexb.com/ https://www.aboutcode.org/ VulnerableCode Because a vulnerability database should not be about Vulnerabilities 1
  • 2. © 2022 nexB Inc. License: CC-BY-SA-4.0 https://www.nexb.com/ https://www.aboutcode.org/ ▷ Project lead and maintainer for VulnerableCode, ScanCode and AboutCode ▷ Creator of Package URL, co-founder of SPDX, ClearlyDefined ▷ FOSS veteran, long time Google Summer of Code mentor ▷ Co-founder and CTO of nexB Inc., makers of DejaCode ▷ Weird facts and claims to fame ● Signed off on the largest deletion of lines of code in the Linux kernel (but these were only comments) ● Unrepentant code hoarder. Had 60,000+ GH forks now down only to 20K forks ▷ pombredanne@nexb.com irc:pombreda Philippe Ombredanne 2
  • 3. © 2022 nexB Inc. License: CC-BY-SA-4.0 https://www.nexb.com/ https://www.aboutcode.org/ Agenda ▷ The state of vulnerability databases (open or not) ▷ How do we search vulnerabilities? By package first! ▷ A better approach: package first ▷ Why VulnerableCode? ▷ VulnerableCode Solution ▷ How to create a package vulnerability database ○ Aggregate and correlate many data sources ○ Multi level data refinement ▷ Issues with vulnerability data ▷ Future plans ▷ Next steps: we need your help! 3
  • 4. State of vulnerability databases (1) ▷ Databases with ghost packages ● DBs reference packages that do not exist anywhere ▷ Databases with ghost vulnerable versions ● Versions are listed as vulnerable even though they are not, or the opposite ▷ Databases "Crying wolf" - improbable vulnerabilities ● DBs report a package as vulnerable if anything in the dependency tree may be vulnerable (Log4j) ▷ Impossible, self-contradictory version ranges ● Resolved to nothing or everything ▷ Redundant and noisy duplicated vulnerabilities ▷ Vulnerabilities mapped to hard-to-find CPEs
  • 5. State of vulnerability databases (2): The Telephone Game problem ▷ Everyone makes something up a little while trying to improve the data ▷ Each of them makes something up slightly differently ○ Too much reliance on automated tools on top of bad data ▷ Many DBs base their content on another DB’s content ○ At each step the data is transformed (and damaged) in subtle ways ▷ You can have as many vulnerable ranges as there are DB interpretations ○ None of them is entirely faithful to the upstream data ○ Over time this turns into The Telephone Game ▷ Upstream has better data
  • 6. The true-true vulnerabilities are UPSTREAM! Credit: https://live.staticflickr.com/3798/10142017736_7f69d9f472_h.jpg "Brown Bears at Brooks Camp, Katmai National Park" by Christoph Strässler https://www.flickr.com/photos/christoph_straessler/ This work is licensed under a Creative Commons Attribution-ShareAlike 2.0 License. https://creativecommons.org/licenses/by-sa/2.0/ https://www.youtube.com/watch?v=V4kWQSpCbvo
  • 7. State of vulnerability databases (3) ▷ Databases of known FOSS software vulnerabilities are mostly proprietary and/or privately maintained using proprietary tools, processes and data. ▷ Why not open data? FOSS code likes open data about FOSS! ○ Some new entrants are now using open licenses: ○ GHSA (GitHub), OSV (Google), GitLab (one month delay) ▷ Emerging support for Package URL promotes interoperability ○ OSSF OSV, Sonatype OSSINDEX ▷ Emerging common format promotes interoperability ○ OSSF OSV ▷ But open formats do not mean common data identifiers
  • 8. How do we search? By package first! Questions to answer: ▷ Is package foo@1.0 known to be vulnerable? ○ What are the vulnerabilities? ○ What is the severity of the vulnerability? ○ Which version has a fix? ▷ More rarely: do I have any package vulnerable to this vulnerability?
  • 9. A better approach: package first ▷ Not "lookup vulnerabilities, find packages" ▷ But "find packages, lookup vulnerabilities"
  • 10. Why VulnerableCode? Accuracy and correctness ▷ Vulnerabilities are important! ▷ Code is more important! Package first! ▷ There is no Free Software Vulnerability Database that is ○ Open!! ○ Comprehensive for most ecosystems (system+package) ○ Curated by expert humans ○ Validated: Trusted, correlated and verified data ○ Working towards correctness
  • 11. VulnerableCode Solution ▷ Find packages with scanning, matching and tracing ○ Leverage all tools that report package-url (we support CPE too) ○ ScanCode.io and ScanCode, ORT, Tern, OWASP Dependency Track and many more ... or an SBOM (SPDX, CycloneDX) ▷ Lookup package vulnerabilities in an open database that aggregates them all ▷ Query by purl! ▷ Open data and open source tools are better for open source! ▷ Eventually, review by experts to curate all the data.
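As an illustration of "query by purl", here is a minimal sketch that parses a Package URL and uses it as a lookup key. It handles only the common `pkg:type/namespace/name@version` shape and ignores qualifiers and subpaths; it is not the official Package URL library. The names `SimplePurl`, `parse_purl`, `lookup` and the in-memory `KNOWN_VULNERABILITIES` index (with a made-up vulnerability id) are assumptions for this example, not VulnerableCode code.

```python
from typing import NamedTuple, Optional
from urllib.parse import unquote


class SimplePurl(NamedTuple):
    type: str
    namespace: Optional[str]
    name: str
    version: Optional[str]


def parse_purl(purl: str) -> SimplePurl:
    """Parse the common 'pkg:type/namespace/name@version' shape only."""
    scheme, _, rest = purl.partition(":")
    if scheme != "pkg" or not rest:
        raise ValueError(f"not a Package URL: {purl!r}")
    # Drop the optional '#subpath' and '?qualifiers' parts; this sketch ignores them.
    rest = rest.split("#", 1)[0].split("?", 1)[0]
    if "@" in rest:
        path, _, version = rest.rpartition("@")
    else:
        path, version = rest, ""
    # Segments are percent-encoded in a purl (e.g. '%40angular' for '@angular').
    segments = [unquote(s) for s in path.strip("/").split("/") if s]
    if len(segments) < 2:
        raise ValueError(f"a purl needs at least a type and a name: {purl!r}")
    ptype, *namespace, name = segments
    return SimplePurl(
        ptype.lower(),
        "/".join(namespace) or None,
        name,
        unquote(version) or None,
    )


# Hypothetical in-memory index standing in for a real database query.
KNOWN_VULNERABILITIES = {
    ("generic", None, "foo", "1.0"): ["EXAMPLE-VULN-1"],
}


def lookup(purl: str) -> list:
    """Return the vulnerability ids recorded for this exact package version."""
    return KNOWN_VULNERABILITIES.get(tuple(parse_purl(purl)), [])
```

Because the package identity is the key, the question "is foo@1.0 known to be vulnerable?" becomes a single lookup: `lookup("pkg:generic/foo@1.0")`.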
  • 12. How to create a package vulnerability database? ▷ Use data from upstream, at the source of the source! ○ From the package maintainers and authors themselves ▷ Employ a confidence-based system: not all data are equally trusted or of the same quality ▷ Aggregate and correlate many data sources to enrich, cross-check and validate ▷ Discover new relations between vulnerabilities and packages by mining the graph ▷ Curate and review for correctness with experts (AI is nice but does not fix the problems)
  • 13. Aggregate and correlate many data sources ▷ Collect and parse many sources ○ Store in a common data model ○ Cross-reference to create a graph ▷ Project-specific trackers ○ Apache, OpenSSL, nginx... ○ Bug trackers, commit logs, project CHANGELOGs. ▷ Linux distro trackers (Debian, Ubuntu, RedHat, SUSE, Gentoo, ...) ○ Custom or standard formats (CVRF, OVAL) ▷ Application package trackers ○ NuGet, Rust, RubyGems, Pysec, RustSec, npm, ... ▷ NVD, and other aggregators: OSV, GHSA, GitLab, GSD and more.
  • 14. Multi level data refinement ▷ The data is always imported in an “Advisory” staging area ▷ "Advisory" data are converted to "Vulnerability" and "Package" data and their relationships using "Improvers" ▷ "Advisory" data that cannot be converted are kept with a log to investigate and resolve the issue ▷ Specific improvers can mine the graph, cross-check with other data sources, and resolve updated version ranges (Diagram: Import, Improve, Filter)
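The import-then-improve flow can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the VulnerableCode data model: the `Advisory` and `Database` classes, their fields, and the `default_improver` rules are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Advisory:
    """Raw record as imported from one data source (the staging area)."""
    source: str
    aliases: list
    affected_purls: list


@dataclass
class Database:
    advisories: list = field(default_factory=list)
    # Refined data: (vulnerability id, purl) relationships.
    relations: set = field(default_factory=set)
    # Advisories that could not be converted, kept with a reason to investigate.
    unresolved: list = field(default_factory=list)


def import_advisories(db: Database, records: list) -> None:
    """Step 1: always land imported data in the staging area, untouched."""
    db.advisories.extend(records)


def default_improver(db: Database) -> None:
    """Step 2: convert staged advisories to relations, logging what fails."""
    for advisory in db.advisories:
        if not advisory.aliases:
            db.unresolved.append((advisory, "no vulnerability identifier"))
            continue
        if not advisory.affected_purls:
            db.unresolved.append((advisory, "no affected package"))
            continue
        bad = [p for p in advisory.affected_purls if not p.startswith("pkg:")]
        if bad:
            db.unresolved.append((advisory, f"not a purl: {bad[0]}"))
            continue
        for purl in advisory.affected_purls:
            db.relations.add((advisory.aliases[0], purl))
```

Keeping the staged advisories separate from the refined relations means an improver can be rerun or replaced later without importing the sources again.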
  • 15. Issue: Ghost packages ▷ Some packages do not exist anywhere ○ Including versions that may not exist: they were never released ▷ Solution: Look up upstream in the package registries and repositories ○ We can look up in the registries and repositories to validate that the Package URLs and versions are correct and really exist upstream
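A minimal sketch of the ghost package check, assuming the list of released versions has already been fetched from the upstream registry. The `registry_index` mapping and the `split_ghosts` helper are hypothetical stand-ins for real registry lookups.

```python
def split_ghosts(purls, registry_index):
    """Separate purls that exist upstream from ghost packages or versions.

    registry_index maps a versionless purl to the set of versions that
    were actually released upstream (assumed to be fetched beforehand).
    """
    real, ghosts = [], []
    for purl in purls:
        base, _, version = purl.partition("@")
        if version in registry_index.get(base, set()):
            real.append(purl)
        else:
            ghosts.append(purl)
    return real, ghosts
```

A purl lands in `ghosts` both when the package is unknown to the registry and when the package exists but that version was never released.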
  • 16. Issue: Lesser data quality ▷ Some vulnerability sources cannot be trusted ○ Known to make incorrect or inaccurate assertions about packages and versions ▷ Solution: store a confidence level ○ Confidence levels ensure we keep all inferred data, even lesser quality data ○ We do not trust others: we can discount the data sources we trust less ○ And we do not trust ourselves: we can discount our own automated inferences if we are not 100% sure about their correctness
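One way to picture the confidence idea is below: every assertion is stored, and the discounting happens at query time. The 0 to 100 scale, the `SOURCE_TRUST` weights, the source names and the multiplication rule are all assumptions made for this sketch, not the scoring VulnerableCode uses.

```python
from dataclasses import dataclass

# Hypothetical trust weights per data source, on a 0-100 scale.
SOURCE_TRUST = {"upstream-maintainer": 100, "aggregator": 60}


@dataclass(frozen=True)
class Assertion:
    vulnerability: str
    purl: str
    source: str
    inference_confidence: int  # 0-100: how sure the automated step is

    @property
    def confidence(self) -> int:
        # Discount twice: once for the source, once for our own inference.
        return SOURCE_TRUST.get(self.source, 0) * self.inference_confidence // 100


def affected_by(assertions, purl, minimum=0):
    """Nothing is deleted; callers pick how much doubt they tolerate."""
    return sorted(
        {a.vulnerability for a in assertions
         if a.purl == purl and a.confidence >= minimum}
    )
```

With `minimum=0` every stored assertion is returned, while a higher threshold hides the less trusted ones without losing them.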
  • 17. Issue: Incorrect or missing versions ▷ Some package versions are missing or incorrect ○ Affected version statements are often ambiguous ○ "All the versions of package foo are vulnerable to CVE XYZ" really means that all versions of foo known at the time this advisory was published were vulnerable. ▷ Solution: store version range, resolve range and "time travel" ○ We store version ranges as a compact string (using the new purl "vers" spec) ○ We expand and resolve ranges with the "univers" version handling library ○ A package version can be rechecked for being in a vulnerable range as needed ■ In the past and in the future ○ Improvers can do "time travel" based on version publication dates and determine if a package version was vulnerable in the past when published.
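The "time travel" idea can be sketched as follows. This is a deliberately simplified stand-in and not the "univers" library or the full "vers" semantics: it accepts only `*` or a single interval whose constraints are all required to hold, and it compares versions as dotted integers. The function names and the sample release dates are made up for illustration.

```python
import operator
from datetime import date

OPS = {">=": operator.ge, "<=": operator.le, ">": operator.gt,
       "<": operator.lt, "=": operator.eq}


def parse_version(text):
    """Assumption: versions are plain dotted integers such as '1.5'."""
    return tuple(int(part) for part in text.split("."))


def in_range(version, vers):
    """Check a version against a simplified 'vers:<scheme>/<constraints>' string."""
    _, _, constraints = vers.partition("/")
    if constraints == "*":
        return True
    candidate = parse_version(version)
    for item in constraints.split("|"):
        # Two-character operators are tried first so '>=' is not read as '>'.
        op = next((o for o in (">=", "<=", ">", "<", "=") if item.startswith(o)), "")
        bound = parse_version(item[len(op):])
        if not OPS[op or "="](candidate, bound):
            return False
    return True


def vulnerable_versions(releases, vers, advisory_date):
    """Resolve a range against the versions that existed on the advisory date.

    releases maps each version string to its publication date.
    """
    return sorted(
        (v for v, published in releases.items()
         if published <= advisory_date and in_range(v, vers)),
        key=parse_version,
    )
```

With hypothetical releases 1.0, 1.5 and 2.0, an advisory saying "all versions" (`vers:generic/*`) published before 2.0 came out resolves to 1.0 and 1.5 only, which is the reading the slide argues for.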
  • 18. Issue: Duplicated data ▷ Some vulnerabilities are duplicates ○ Leads to many noisy relationships and weaker correlation abilities ▷ Solution: Introduce a new set of vulnerability ID aliases ○ Before, we tracked by CVE id; only if there was none did we create a VULCOID id (VulnerableCode ID for a vulnerability) ○ We now always use a VULCOID id and track many aliases (including a CVE id when available) for each vulnerability ○ Aliases are used for data reconciliation during the second step of "Improvers", meaning that we avoid a large number of duplicates ○ Improver jobs will further merge additional duplicates
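Alias-based reconciliation can be sketched like this: advisories that share any alias collapse into one vulnerability with its own internal id. The `reconcile` function, the `VULCOID-<n>` id format and the alias strings used with it are illustrative assumptions, not the project's actual implementation.

```python
def reconcile(advisory_aliases):
    """Merge advisories that share at least one alias.

    advisory_aliases is an iterable of alias collections, one per advisory.
    Returns a dict mapping an internal id to the merged set of aliases.
    """
    alias_to_id = {}
    vulnerabilities = {}
    next_number = 1
    for aliases in advisory_aliases:
        aliases = set(aliases)
        known = sorted({alias_to_id[a] for a in aliases if a in alias_to_id})
        if known:
            vid = known[0]
            # This advisory bridges several existing records: fold them together.
            for other in known[1:]:
                aliases |= vulnerabilities.pop(other)
        else:
            vid = f"VULCOID-{next_number}"  # hypothetical id format
            next_number += 1
            vulnerabilities[vid] = set()
        vulnerabilities[vid] |= aliases
        for alias in vulnerabilities[vid]:
            alias_to_id[alias] = vid
    return vulnerabilities
```

An advisory that carries both a CVE-style alias and an ecosystem-specific alias acts as the bridge that lets two previously separate records be recognized as the same vulnerability.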
  • 19. Other issues ▷ Many data sources - redundant, unstructured, messy, incomplete ○ We grew to appreciate the complexity of the task and why commercial vendors currently dominate the space ○ Solution: integrate all the data sources to cross-check them ▷ Old, obsolete, or less useful data ○ More is not always better - e.g. old vulnerabilities on Windows 95 ○ Commercial-only software (Windows, etc.) or hardware is less interesting ○ Solution: let go of some of the past and ignore the legacy
  • 20. Future plans ▷ More primary data sources, going upstream ▷ More data: Actual commits fixing and introducing vulnerabilities ▷ YARA Rules: enable finer-grained detection of actual vulnerable code ▷ Community peer curation system including curation UI ▷ AI/ML for data quality improvements ▷ ... and good ole heuristics ▷ VulnTotal: Tool to compare all the vulnerability DBs ○ Think VirusTotal for vulnerabilities
  • 21. If you want to learn more about our projects ▷ Try out VulnerableCode with a free DejaCode account https://enterprise.dejacode.com/account/register/ ▷ Register for our upcoming webinars https://nexb.com/webinars ▷ Read our latest blog post at https://nexb.com/vulnerablecode-public-release ▷ Download VulnerableCode at https://github.com/nexB/vulnerablecode/releases/latest ▷ Visit https://nexb.com/vulnerablecode for more information
  • 22. If you want to help: you can contribute code, time, docs or funds ▷ Use these fine FOSS tools and specs ● https://github.com/nexb/vulnerablecode ● https://www.aboutcode.org/ ● https://github.com/nexB/ ● https://github.com/package-url ▷ Join the conversation at ● https://gitter.im/aboutcode-org ▷ Donate at ● https://opencollective.com/aboutcode
  • 23. Credits ▷ Presentation template by SlidesCarnival licensed under CC-BY-4.0 ▷ Photograph by Unsplash licensed under Unsplash License ▷ Other content licensed under CC-BY-4.0