by Marcel Böhme, László Szekeres, and Jonathan Metzman. ICSE'22 conference presentation for our paper on "Coverage-based Fuzzer Benchmarking". Extended version given at UZH IFI colloquium on 12 May 2022.
Paper: https://mboehme.github.io/paper/ICSE22.pdf
Video: https://youtu.be/LCrtSt8MBXc
From Zero to DevSecOps: How to Implement Security at the Speed of DevOps WhiteSource
Your organization has already embraced the DevOps methodology? That’s a great start. But what about security?
It’s a fact - many organizations fear that adding security to their DevOps practices will severely slow down their development processes. But this doesn’t need to be the case.
Tune in to hear Jeff Martin, Senior Director of Product at WhiteSource and Anders Wallgren, VP of Technology Strategy at Cloudbees, as they discuss:
- Why traditional DevOps has shifted, and what this will mean
- Who should own security in the age of DevOps
- Which tools and strategies are needed to implement continuous security throughout the DevOps pipeline
Cyber Threat Hunting: Identify and Hunt Down Intruders - Infosec
View webinar: "Cyber Threat Hunting: Identify and Hunt Down Intruders": https://www2.infosecinstitute.com/l/12882/2018-11-29/b9gwfd
View companion webinar:
"Red Team Operations: Attack and Think Like a Criminal": https://www2.infosecinstitute.com/l/12882/2018-11-29/b9gw5q
Are you red team, blue team — or both? Get an inside look at the offensive and defensive sides of information security in our webinar series.
Senior Security Researcher and InfoSec Instructor Jeremy Martin discusses what it takes to be a modern-day threat hunter during our webinar, Cyber Threat Hunting: Identify and Hunt Down Intruders.
The webinar covers:
- The job duties of a Cyber Threat Hunting professional
- Frameworks and strategies for Cyber Threat Hunting
- How to get started and progress your defensive security career
- And questions from live viewers!
Learn about InfoSec Institute's Cyber Threat Hunting course here: https://www.infosecinstitute.com/courses/cyber-threat-hunting/
Talk at Kaspersky Lab's CoLaboratory: Industrial Cybersecurity Meetup #5 with @HeirhabarovT on several practical ATT&CK use cases.
Video (in Russian): https://www.youtube.com/watch?v=ulUF9Sw2T7s&t=3078
Many thanks to Teymur for the great tech dive.
This document provides an introduction to bug bounty programs. It defines what a bug bounty program is, provides a brief history of major programs, and discusses reasons they are beneficial for both security researchers and companies. Key points covered include popular programs like Google and Facebook, tools used in bug hunting like Burp Suite, and lessons for researchers such as writing quality reports and following each program's rules.
The document provides an overview of Microsoft's Security Development Lifecycle (SDL) threat modeling process and tool. The SDL threat modeling process involves 4 main steps: 1) modeling the system, 2) enumerating potential threats, 3) identifying mitigations, and 4) validating the threat model. Threat modeling helps identify security risks early and guide other security activities. The Microsoft SDL Threat Modeling Tool supports collaboration on threat modeling and integrates with other SDL processes.
This document discusses key performance indicators (KPIs) for measuring the success of application security initiatives. It provides example metrics in six areas: product security quality and risk exposure, security development lifecycle (SDLC) maturity, application security testing, consulting, training, and DevSecOps. The document recommends starting by measuring a few basic metrics and improving data over time. It emphasizes clear roles and accountability, and communicating risks financially rather than through complex assessments.
This document discusses viruses and antivirus software. It defines a computer virus as a program that can infect other programs. It then discusses various sources of viruses, types of viruses, and what antivirus software is. The document outlines two main methods that antivirus uses to identify viruses: signature-based detection, which compares files to known virus signatures; and heuristic-based detection, which uses general patterns to detect unknown viruses. It provides details on how each method works and their respective advantages and limitations.
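To make the two detection methods concrete, here is a minimal illustrative sketch in Python (the signature database and the entropy threshold are invented for illustration and are not from the document):

```python
import math

# Toy signature database: byte patterns taken from known malware
# (invented for illustration; real databases hold millions of entries).
SIGNATURES = {
    b"\xe8\x00\x00\x00\x00\x5d": "Example.Dropper.A",  # hypothetical sample
}

def signature_scan(data: bytes) -> str | None:
    """Signature-based detection: exact match against known byte patterns."""
    for pattern, name in SIGNATURES.items():
        if pattern in data:
            return name
    return None

def heuristic_scan(data: bytes) -> bool:
    """Heuristic detection: flag data whose byte distribution looks
    packed or encrypted (high entropy), a common trait of unknown malware."""
    if not data:
        return False
    probs = [data.count(b) / len(data) for b in set(data)]
    entropy = -sum(p * math.log2(p) for p in probs)
    return entropy > 7.5  # close to the 8-bit maximum of 8.0

sample = b"MZ...header..." + b"\xe8\x00\x00\x00\x00\x5d" + b"...payload..."
print(signature_scan(sample) or ("heuristic hit" if heuristic_scan(sample) else "clean"))
```

The trade-off the document describes falls out directly: the signature scan only matches what is already in the database, while the heuristic also fires on unknown samples at the price of false positives.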
This document discusses DevSecOps, including what it is, why it is needed, and how to implement it. DevSecOps aims to integrate security tools and a security-focused culture into the development lifecycle. It allows security to keep pace with rapid development. The document outlines how to incorporate security checks at various stages of the development pipeline from pre-commit hooks to monitoring in production. It provides examples of tools that can be used and discusses cultural and process aspects of DevSecOps implementation.
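As a concrete taste of the earliest pipeline stage mentioned above, here is a minimal sketch of a git pre-commit hook that blocks obvious hard-coded secrets (the regex patterns are illustrative assumptions, not a tool named in the document):

```python
#!/usr/bin/env python3
# Minimal pre-commit secret scan: save as .git/hooks/pre-commit and chmod +x.
import re
import subprocess
import sys

# Illustrative patterns only; real secret scanners ship far more rules.
PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                      # AWS access key id
    re.compile(r"-----BEGIN (RSA|EC) PRIVATE KEY-----"),  # private key blocks
    re.compile(r"(?i)(password|api_key)\s*=\s*['\"][^'\"]+['\"]"),
]

def staged_files() -> list[str]:
    """Names of files added or changed in the commit being created."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.splitlines()

hits = []
for path in staged_files():
    try:
        text = open(path, encoding="utf-8", errors="ignore").read()
    except OSError:
        continue  # unreadable; nothing to scan
    for pat in PATTERNS:
        if pat.search(text):
            hits.append(f"{path}: matches {pat.pattern}")

if hits:
    print("Commit blocked, possible secrets found:")
    print("\n".join(hits))
    sys.exit(1)  # any non-zero exit status aborts the commit
```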
Making Continuous Security a Reality with OWASP’s AppSec Pipeline - Matt Tesauro
You’ve probably heard many talks about DevSecOps and continuous security testing, but how many provided the tools needed to actually start that testing? This talk does exactly that. It provides an overview of the open source AppSec Pipeline tool, which has been used in real-world companies to do real security work. Beyond a standalone tool, the OWASP AppSec Pipeline provides numerous Docker containers ready to automate, a specification to customize, with the ability to create your own implementation, and references to get you started.
The talk will also cover how to add an AppSec Pipeline to your team’s arsenal and provide example templates of how best to run the automated tools provided. Finally, we’ll briefly cover using OWASP Defect Dojo to store and curate the issues found by your AppSec Pipeline. The goal of this talk is to share the field-tested methods of two AppSec professionals with nearly 20 years of experience between them. If you want to start your DevSecOps journey by continuously testing rather than hearing about it, this talk is for you.
The Practical DevSecOps course is designed to help individuals and organisations implement DevSecOps practices to achieve massive scale in security. The course is divided into 13 chapters; each chapter has theory, followed by demos and any limitations we need to keep in mind while implementing them.
More details here - https://www.practical-devsecops.com/
Purple Team - Work it out: Organizing Effective Adversary Emulation ExercisesJorge Orchilles
Presented at the inaugural SANS Purple Team Summit & Training event, this presentation covers performing a high value adversary emulation exercise in a purple team fashion (red and blue team sitting together throughout the entire engagement).
Link to Youtube video: https://youtu.be/OJMqMWnxlT8
You can contact me at abhimanyu.bhogwan@gmail.com
My LinkedIn: https://www.linkedin.com/in/abhimanyu-bhogwan-cissp-ctprp-98978437/
Threat Modeling (system + enterprise)
What is Threat Modeling?
Why do we need Threat Modeling?
6 Most Common Threat Modeling Misconceptions
Threat Modelling Overview
6 important components of a DevSecOps approach
DevSecOps Security Best Practices
Threat Modeling Approaches
Threat Modeling Methodologies for IT Purposes
STRIDE
Threat Modelling Detailed Flow
System Characterization
Create an Architecture Overview
Decomposing your Application
Decomposing DFD’s and Threat-Element Relationship
Identify possible attack scenarios mapped to S.T.R.I.D.E. model
Identifying Security Controls
Identify possible threats
Report to Developers and Security team
DREAD Scoring (see the sketch after this outline)
My Opinion on implementing Threat Modeling at enterprise level
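As promised in the outline, a minimal sketch of DREAD scoring: a threat is rated on five factors and the ratings are averaged (the 1-10 scale and the example ratings are assumptions for illustration):

```python
from dataclasses import dataclass

@dataclass
class DreadScore:
    damage: int           # how bad is the impact?
    reproducibility: int  # how reliably can the attack be repeated?
    exploitability: int   # how little effort/skill does it take?
    affected_users: int   # how many users are hit?
    discoverability: int  # how easy is the flaw to find?

    def risk(self) -> float:
        """DREAD risk = mean of the five factors (each rated 1-10 here)."""
        return (self.damage + self.reproducibility + self.exploitability
                + self.affected_users + self.discoverability) / 5

# Hypothetical threat: SQL injection in a login form.
sqli = DreadScore(damage=9, reproducibility=8, exploitability=7,
                  affected_users=9, discoverability=6)
print(f"DREAD risk: {sqli.risk():.1f}/10")  # 7.8 -> prioritize this threat
```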
DevSecOps Fundamentals and the Scars to Prove It - Matt Tesauro
This document discusses the fundamentals and evolution of DevSecOps. It begins by introducing the author and their background. It then outlines key DevSecOps concepts like reducing complexity, managing dependencies, shared understanding, enabling default security controls, fully utilizing frameworks, embracing cloud-native principles, codifying processes, treating servers as cattle, and automating workflows. The document also discusses the importance of DefectDojo and of generating AppSec pipelines to integrate security testing into development pipelines in order to scale efforts and increase visibility, consistency, and flow. It emphasizes automating tasks that do not need a human so that security personnel are used where they matter most.
Meet the hackers powering the world's best bug bounty programs - HackerOne
Not even the strongest or most skilled organizations have the headcount and capacity to avert system vulnerabilities on their own.
There is strength in numbers.
Hackers are that army - and at HackerOne, there are 80,000+ white-hat hackers who want to make your software more secure.
Hackers ARE: problem-solvers, curious, technically skilled, diverse in background and education.
Hackers are NOT: criminals using their skills for malicious purposes.
This presentation dives into who these hackers are and what motivates them. We look at some successful hacker profiles and see what separates the best from the rest.
This document discusses building application security teams. It begins by introducing the author and their background in application security. It then discusses creating an environment where security enables business goals rather than hinders them. It suggests embedding security into culture by focusing on quality, testing, and engineering. It discusses the importance of application security policies being customized and delivered effectively. It emphasizes the need for application security activities like threat modeling and code reviews to avoid relying on "security pixie dust". It argues that even non-software companies should view themselves as software companies due to their reliance on code. Finally, it discusses building application security teams internally by training and educating developers rather than exclusively hiring specialists.
This document provides a guide for becoming a DevOps engineer. It discusses what DevOps is, the responsibilities of a DevOps engineer, and the necessary technical and non-technical skills. Foundational skills like Linux, programming, Git, networking and cloud are recommended. Technical skills like CI/CD, containers, Kubernetes, infrastructure as code and security are important. Non-technical skills include understanding DevOps culture, communication, Agile principles and Lean. The document provides certification and learning resources recommendations.
Threat modeling is about thinking through what can go wrong and what you can do about it. It can also find logical flaws and reveal problems in the architecture or in software development practices. These vulnerabilities usually cannot be found by technical testing.
Threat modeling helps you deliver better software, prioritize your preventive security measures, and focus your penetration testing on the riskiest parts of the system. The beauty of threat modeling is that you can assess security already in the design phase. In addition, it is something every team member can participate in, because it doesn't require any source code, special skills, or tools. Threat modeling is for everyone: developers, testers, product owners, and project managers.
The presentation covers various methods, such as the STRIDE model, for finding security and privacy threats. You will also learn to analyze use cases for finding business-level threats. The presentation also includes practical tips for arranging threat workshops and presenting your results.
This presentation was given at the Diana Initiative 2018 and Nixucon 2018 conferences.
This document provides an introduction to bug bounty programs. It discusses what a bug bounty is, the popular bug bounty platforms, how to choose target programs, reconnaissance methods like subdomain enumeration and content discovery, and how to attack single domains by analyzing requests, responses, and hidden endpoints, and it provides examples of the author's past bug bounty finds. The presentation ends by answering questions about bug bounty programs.
This document describes a method for enhancing complete genome sequencing of foot-and-mouth disease virus (FMDV) using probe enrichment of next generation sequencing libraries. The method involves creating a library of oligonucleotide probes to enrich sequencing libraries for FMDV sequences prior to sequencing. This target enrichment was shown to dramatically improve the depth of coverage achieved for FMDV sequencing, increasing coverage over 100-fold for good quality samples and enabling sequencing of much weaker samples and samples subjected to heat denaturing. Target enrichment also allowed obtaining a complete FMDV genome from an RNA sample extracted from a field swab in an African abattoir.
Basic Security Concepts of Computers. This presentation will cover the following topics:
- Basic security concept of computer
- Threats
- Threats to computer hardware
- Threats to computer user
- Threats to computer data
- Vulnerability and countermeasure
- Software security
The document discusses computer security threats and measures. It describes types of security like hardware security, software security and network security. It then discusses various malicious codes like viruses, trojans, worms and logic bombs. It also discusses hacking, natural threats like fires and floods, and theft. It concludes by describing various security measures that can be taken like using antivirus software, firewalls, encryption, backups and focusing on the human aspect of security.
Cyber security and demonstration of security tools - Vicky Fernandes
Presentation on Cybersecurity and demonstration of security tools, conducted by Vicky Fernandes on 10th September 2019 at Don Bosco Institute of Technology, Mumbai.
We present our implementation and our reflections on a preregistration-based publication process for the fuzzing community, with a pre-stage in the FUZZING workshop (https://fuzzingworkshop.github.io/), plus Stage 1 and Stage 2 at ACM Transactions on Software Engineering and Methodology (TOSEM; https://dl.acm.org/journal/tosem/registered-papers).
by Marcel Böhme. ICSE'22 (NIER) conference presentation for our paper on "Statistical Reasoning about Programs".
Paper: https://mboehme.github.io/paper/ICSE22.NIER.pdf
Video: https://www.youtube.com/watch?v=nOCjesMumiM
The Curious Case of Fuzzing for Automated Software Testing - mboehme
Presented @ RUB - 2. Tag der Informatik to a General Audience
Abstract: Fuzzing is an automated software testing technique and has become the first line of defense against exploitable software vulnerabilities. When you run a fuzzer on your program, hopefully it does not find any bugs. But what does it really say? Is your program perfectly correct and free of bugs? Probably not. Is your fuzzer effective at finding bugs? How do we even measure the effectiveness of a fuzzer in the absence of bugs? In this talk, we’ll go through some interesting and counter-intuitive recent results in fuzzing, and uncover fundamental limitations of existing approaches.
On the Surprising Efficiency and Exponential Cost of Fuzzing - mboehme
Fuzzing has become a tremendous success story for automated bug finding. For instance, Google has been fuzzing the 500 most popular open source projects on 100k+ machines 24/7. The average open-source project receives bug reports at a constant rate of three to four new bugs per week - many of which are security critical. So, what makes fuzzing so efficient? Under which conditions does a simple random test input generation outperform symbolic execution even if we assume that the latter could prove the absence of bugs? Which kind of correctness guarantees does a fuzzing campaign provide that finds no bugs? And how many more bugs does an adversary find that has 10^x times more machines than I do?
Invited Talk at the "Workshop on Dependable and Secure Software Systems" organized by the ETH Zürich on 26th Oct. 2021
The document discusses fuzzing, which is a technique for finding bugs in software by automatically feeding unexpected or random inputs to programs. It focuses on improving the scalability, efficiency, and effectiveness of fuzzing to discover more bugs. The author is an expert in software security and fuzzing who is interested in addressing fundamental limitations like the exponential cost of fuzzing and providing stronger assurances about software security through approaches from multiple disciplines.
Fuzzing: On the Exponential Cost of Vulnerability Discovery - mboehme
- Google has been fuzzing open source software (OSS) for 4 years using 25k machines and found 11k+ bugs in 160+ projects and 16k+ bugs in Chrome.
- The study examines over 300 OSS projects fuzzed with AFL and LibFuzzer over 4+ CPU years.
- Three empirical laws are presented: 1) Finding linearly more new bugs requires exponentially more machines. 2) Finding the same bugs exponentially faster requires exponentially more machines. 3) Increasing machines exponentially increases the probability of finding a specific bug exponentially until discovery is expected.
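The first law can be made intuitive with a toy model. A minimal sketch, assuming each bug is triggered independently with some per-input probability (the probabilities below are invented; the laws themselves come from the empirical study, not from this model):

```python
import math

# Assume bug i is triggered with probability 10^-i per generated input,
# i.e. each next bug is an order of magnitude rarer (invented numbers).
bug_probs = [10 ** -i for i in range(1, 10)]

def expected_bugs(n_inputs: float) -> float:
    """Expected number of distinct bugs found after n_inputs random inputs."""
    return sum(1 - math.exp(-p * n_inputs) for p in bug_probs)

# More machines ~ proportionally more inputs per unit time. Note the
# pattern: every *10x* in machines buys roughly *one* additional bug.
for machines in (1, 10, 100, 1000, 10000):
    n = machines * 1e6  # say, a million inputs per machine per campaign
    print(f"{machines:>6} machines -> ~{expected_bugs(n):4.1f} bugs expected")
```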
Talk @ #fuzzconeurope2020
Paper: https://ieeexplore.ieee.org/document/9166552
(M. Böhme, C. Cadar, and A. Roychoudhury)
Disclaimer: Our perspective on the discussions. Mistakes are mine.
This document summarizes a PhD thesis defense presentation on directed greybox fuzzing. It discusses:
1. Different types of fuzzing techniques including blackbox, whitebox, and greybox fuzzing.
2. How directed greybox fuzzing formulates the problem of reaching targeted locations as an optimization problem rather than using heavy symbolic execution.
3. The instrumentation process to compute distance metrics to target locations and guide input generation towards minimizing distance.
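To illustrate point 3, here is a minimal sketch of distance-guided seed scheduling in the spirit of directed greybox fuzzing (the seed distances and the annealing-style weighting are illustrative assumptions, not the exact formulas of any particular tool):

```python
import math
import random

# Average distance of each seed to the target locations, as a directed
# greybox fuzzer would precompute via instrumentation (values invented).
seed_distance = {"seed_a": 42.0, "seed_b": 7.5, "seed_c": 1.2}

def weights(temperature: float) -> dict[str, float]:
    """Exponentially favor seeds closer to the target; a high temperature
    early in the campaign keeps exploration alive (annealing-style)."""
    return {s: math.exp(-d / temperature) for s, d in seed_distance.items()}

def pick_seed(temperature: float) -> str:
    w = weights(temperature)
    return random.choices(list(w), weights=list(w.values()), k=1)[0]

# Hot phase: picks are near-uniform (exploration). Cold phase: seed_c,
# the seed closest to the target, receives almost all fuzzing energy.
for temp in (100.0, 5.0):
    picks = [pick_seed(temp) for _ in range(10_000)]
    print(temp, {s: picks.count(s) for s in seed_distance})
```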
These slides invite prospective graduate students at TU Dresden to join the School of Computing at the National University of Singapore as PhD/MComp students or interns.
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
Essentials of Automations: The Art of Triggers and Actions in FMESafe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Dr. Sean Tan, Head of Data Science, Changi Airport Group
Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
Climate Impact of Software Testing at Nordic Testing Days - Kari Kakkonen
My slides at Nordic Testing Days 6.6.2024
The talk discusses the climate impact and sustainability of software testing. ICT and testing must carry their part of the global responsibility to help with climate warming. We can minimize our carbon footprint, but we can also have a carbon handprint, a positive impact on the climate. Quality characteristics can be extended with sustainability, which can then be measured continuously. Test environments can be used less, at smaller scale, and on demand. Test techniques can be used to optimize or minimize the number of tests. Test automation can be used to speed up testing.
Pushing the limits of ePRTC: 100ns holdover for 100 days - Adtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
TrustArc Webinar - 2024 Global Privacy Survey - TrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
Unlock the Future of Search with MongoDB Atlas: Vector Search Unleashed - Malak Abu Hammad
Discover how MongoDB Atlas and vector search technology can revolutionize your application's search capabilities. This comprehensive presentation covers:
* What is Vector Search?
* Importance and benefits of vector search
* Practical use cases across various industries
* Step-by-step implementation guide
* Live demos with code snippets
* Enhancing LLM capabilities with vector search
* Best practices and optimization strategies
Perfect for developers, AI enthusiasts, and tech leaders. Learn how to leverage MongoDB Atlas to deliver highly relevant, context-aware search results, transforming your data retrieval process. Stay ahead in tech innovation and maximize the potential of your applications.
#MongoDB #VectorSearch #AI #SemanticSearch #TechInnovation #DataScience #LLM #MachineLearning #SearchTechnology
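As a taste of the implementation steps listed above, a minimal pymongo sketch of an Atlas $vectorSearch aggregation (the cluster URI, database, collection, index name, field names, and the embed_text helper are hypothetical placeholders; check the Atlas documentation for the authoritative syntax):

```python
from pymongo import MongoClient

def embed_text(text: str) -> list[float]:
    """Hypothetical helper: call whatever embedding model produced the
    vectors stored in the 'embedding' field and return the query vector."""
    raise NotImplementedError

client = MongoClient("mongodb+srv://<user>:<password>@<cluster>.mongodb.net")
coll = client["shop"]["products"]  # hypothetical database and collection

pipeline = [
    {
        "$vectorSearch": {
            "index": "product_embeddings",  # hypothetical Atlas Search index
            "path": "embedding",            # field holding the stored vectors
            "queryVector": embed_text("waterproof hiking boots"),
            "numCandidates": 200,           # breadth of the ANN candidate pool
            "limit": 10,                    # top-k documents returned
        }
    },
    # Surface the similarity score next to each hit.
    {"$project": {"name": 1, "score": {"$meta": "vectorSearchScore"}}},
]

for doc in coll.aggregate(pipeline):
    print(doc["name"], doc["score"])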
Full-RAG: A modern architecture for hyper-personalization - Zilliz
Mike Del Balso, CEO & Co-Founder at Tecton, presents "Full RAG," a novel approach to AI recommendation systems, aiming to push beyond the limitations of traditional models through a deep integration of contextual insights and real-time data, leveraging the Retrieval-Augmented Generation architecture. This talk will outline Full RAG's potential to significantly enhance personalization, address engineering challenges such as data management and model training, and introduce data enrichment with reranking as a key solution. Attendees will gain crucial insights into the importance of hyperpersonalization in AI, the capabilities of Full RAG for advanced personalization, and strategies for managing complex data integrations for deploying cutting-edge AI solutions.
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack - shyamraj55
Discover the seamless integration of RPA (Robotic Process Automation), COMPOSER, and APM with AWS IDP enhanced with Slack notifications. Explore how these technologies converge to streamline workflows, optimize performance, and ensure secure access, all while leveraging the power of AWS IDP and real-time communication via Slack notifications.
What do a Lego brick and the XZ backdoor have in common? - Speck&Tech
ABSTRACT: At first glance, what a Lego brick and the XZ backdoor have in common might be that both are building blocks, or dependencies, of creative and software projects. In reality, a Lego brick and the XZ backdoor case have much more in common than that.
Join the presentation to immerse yourself in a story of interoperability, standards, and open formats, and then discuss the important role that contributors play in a sustainable open source community.
BIO: Advocate of free software and of standard, open formats. She has been an active member of the Fedora and openSUSE projects and co-founded the LibreItalia Association, where she was involved in several LibreOffice-related events, migrations, and training. She previously worked on LibreOffice migrations and training for several public administrations and private companies. Since January 2020 she has worked at SUSE as a Software Release Engineer for Uyuni and SUSE Manager, and when she is not pursuing her passion for computers and for Geeko, she cultivates her curiosity about astronomy (hence her nickname, deneb_alpha).
Removing Uninteresting Bytes in Software Fuzzing - Aftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries: libxml's xmllint, a tool for parsing XML documents, and Binutils' readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format) files. Our preliminary results show that AFL+DIAR not only discovers new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns - and DIAR helps you find such seeds.
- These are the slides of the talk given at the IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW) 2022.
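The deck's own tooling is not reproduced here, but the core idea of locating uninteresting seed bytes can be sketched. A minimal illustration, assuming a hypothetical get_coverage oracle that runs the instrumented target and returns a coverage fingerprint (this is the intuition only, not DIAR's actual algorithm):

```python
import random

def get_coverage(data: bytes) -> frozenset:
    """Hypothetical oracle: run the instrumented target on `data` and
    return a fingerprint of the coverage it achieves."""
    raise NotImplementedError

def uninteresting_bytes(seed: bytes, trials: int = 8) -> set[int]:
    """Positions whose mutation never changes coverage: candidates for
    removal, so the fuzzer stops wasting mutations on them."""
    base = get_coverage(seed)
    boring = set()
    for pos in range(len(seed)):
        changed = False
        for _ in range(trials):
            mutated = bytearray(seed)
            mutated[pos] = random.randrange(256)  # random byte substitution
            if get_coverage(bytes(mutated)) != base:
                changed = True
                break
        if not changed:
            boring.add(pos)
    return boring
```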
Communications Mining Series - Zero to Hero - Session 1 - DianaGray10
This session provides an introduction to UiPath Communications Mining, its importance, and a platform overview. You will acquire a good understanding of the phases in Communications Mining as we go over the platform with you. Topics covered:
• Communication Mining Overview
• Why is it important?
• How can it help today’s business and the benefits
• Phases in Communication Mining
• Demo on Platform overview
• Q/A
On the Reliability of Coverage-based Fuzzer Benchmarking
1. On the Reliability of Coverage-Based Fuzzer Benchmarking
Marcel Böhme (MPI-SP & Monash), László Szekeres (Google), Jonathan Metzman (Google)
2. whoami
Marcel Böhme · Foundations of Software Security @ MPI-SP (Max Planck Institute for Security and Privacy)
10 yrs Singapore · 3 yrs Melbourne · since Aug’22 Bochum
Looking for PhDs & PostDocs at the Max Planck Institute, Bochum, Germany
• Fuzzing for Automatic Vulnerability Discovery
• Making machines attack other machines.
• Focus on scalability, efficiency, and effectiveness.
• Foundations of Software Security
• Assurances in Software Security
• Fundamental limitations of existing approaches
• Drawing from multiple disciplines (information theory, biostatistics)
3. Motivation
Suppose none of our fuzzers finds any bugs in our program.
How do we know which fuzzer is better?
4. Motivation
Suppose none of our fuzzers finds any bugs in our program.
How do we know which fuzzer is better?
We measure code coverage!
5. Motivation: Coverage
We measure code coverage!
• Key Idea: You cannot find bugs in code that is not covered.
• Question: How strong is the relationship between coverage and bug finding?
6. Motivation: Coverage
• Key Idea: You cannot find bugs in code that is not covered.
• Question: How strong is the relationship between coverage and bug finding?
ICSE’14
7. Motivation: Coverage
• Key Idea: You cannot find bugs in code that is not covered.
• Question: How strong is the relationship between coverage and bug finding?
ICSE’14
This is called “correlation”.
8. Motivation: Coverage
• Key Idea: You cannot find bugs in code that is not covered.
• Question: How strong is the relationship between coverage and bug finding?
ICSE’14
This is called “correlation”.
• Observation: Test suites with more coverage find more bugs only because they are bigger.
9. Motivation: Coverage
• Key Idea: You cannot find bugs in code that is not covered.
• Question: How strong is the relationship between coverage and bug finding?
ICSE’14
This is called “correlation”.
10. Motivation: Coverage
• Key Idea: You cannot find bugs in code that is not covered.
• Question: How strong is the relationship between coverage and bug finding?
ICSE’14
• Observation: Test suites with more coverage find more bugs irrespective of whether they are bigger.
This is called “correlation”.
11. Motivation: Coverage
• Key Idea: You cannot find bugs in code that is not covered.
• Question: How strong is the relationship between coverage and bug finding?
ICSE’14
• Observation: Test suites with more coverage find more bugs irrespective of whether they are bigger.
This is called “correlation”.
This is called “contradiction”.
12. Motivation: Coverage
• Key Idea: You cannot find bugs in code that is not covered.
• Question: How strong is the relationship between coverage and bug finding?
ASE’20
This is called “correlation”.
13. Motivation: Coverage
• Key Idea: You cannot find bugs in code that is not covered.
• Question: How strong is the relationship?
ASE’20
This is called “correlation”.
14. Correlation: Very strong
• Our experiments confirm a very strong correlation for fuzzer-generated test suites!
• As a fuzzer covers more code, it also finds more bugs.
15. Correlation: Very strong
• Our experiments confirm a very strong correlation for fuzzer-generated test suites!
• As a fuzzer covers more code, it also finds more bugs.
16. Correlation: Very strong
• Our experiments confirm a very strong correlation for fuzzer-generated test suites!
• As a fuzzer covers more code, it also finds more bugs.
17. Correlation: Very strong
• Our experiments confirm a very strong correlation for fuzzer-generated test suites!
• As a fuzzer covers more code, it also finds more bugs.
18. Correlation: Very strong
• Problem: Fuzzing folks are not convinced.
“It does not make sense 🤔” (paraphrasing Klees et al., CCS’18)
19. Correlation: Very strong
• Problem: Fuzzing folks are not convinced.
“It does not make sense 🤔” (paraphrasing Klees et al., CCS’18)
We cannot compare two or more fuzzers in terms of coverage in order to establish one as the best fuzzer in terms of bug finding.
20. Correlation: Very strong
• Problem: Fuzzing folks are not convinced.
“It does not make sense 🤔” (paraphrasing Klees et al., CCS’18)
CCS’18
21. Correlation: Very strong
• Problem: Fuzzing folks are not convinced.
CCS’18
22. Correlation: Very strong
• Problem: Fuzzing folks are not convinced.
Why?
23. Agreement: That’s why.
24. Agreement: That’s why.
25. Agreement: That’s why.
26. Agreement: That’s why.
• Suppose we have two instruments to measure acidity.
• Strong correlation: more acidity = both indicate higher pH values.
27. Agreement: That’s why.
• Two instruments to measure acidity.
• Strong correlation: more acidity = both indicate higher pH values.
28. Agreement: That’s why.
• Two instruments to measure acidity.
• Strong correlation: more acidity = both indicate higher pH values.
• Weak agreement: both instruments might rank 2+ tubes differently.
29. Agreement: That’s why.
• Two instruments to measure acidity.
• Strong correlation: more acidity = both indicate higher pH values.
• Weak agreement: both instruments might rank 2+ tubes differently.
Moderate agreement means we cannot reliably substitute one instrument for the other.
30. Agreement: That’s why.
Moderate agreement means we cannot reliably substitute one instrument for the other.
Ranking 10 fuzzers in terms of code coverage and in terms of #bugs found.
31. Agreement: That’s why.
Ranking 10 fuzzers in terms of code coverage and in terms of #bugs found.
The worst fuzzer in terms of coverage is the best fuzzer in terms of bug finding.
Moderate agreement means we cannot reliably substitute one instrument for the other.
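To make the ranking disagreement on this slide concrete, here is a minimal sketch of comparing a coverage-based with a bug-based ranking of fuzzers (the numbers are invented; the paper's actual analysis is in the ICSE'22 text):

```python
from scipy.stats import spearmanr

# Invented benchmark results for 10 fuzzers on one program:
fuzzers  = [f"fuzzer_{i}" for i in range(10)]
coverage = [58, 61, 64, 66, 67, 69, 71, 73, 74, 78]  # branches covered (%)
bugs     = [ 9,  3,  4,  6,  5,  7,  6,  8,  7,  2]  # unique bugs found

rho, pval = spearmanr(coverage, bugs)
print(f"Spearman's rho = {rho:.2f} (p = {pval:.3f})")

# The fuzzer with the LEAST coverage finds the MOST bugs in this toy data,
# exactly the kind of ranking disagreement the slide warns about.
print("best by coverage:", fuzzers[coverage.index(max(coverage))])
print("best by bugs:    ", fuzzers[bugs.index(max(bugs))])
```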
32. Experimental Setup
• Experimental design (post hoc ground truth)
• To minimize threats to validity, we use post hoc bug identification instead of a pre-determined ground-truth benchmark (more on that later).
• Automatic and manual deduplication of bugs found during fuzzing.
• 341,595 generated bug reports across all campaigns.
• 409 unique bugs after automatic deduplication (via a variant of ClusterFuzz).
• 235 unique bugs after manual deduplication (via two professional software engineers).
33. Experimental Setup
• Fuzzers and programs
• FuzzBench infrastructure
• 10 fuzzers + 24 programs
• Benchmark selection (see the sketch below)
• Randomly selected from OSS-Fuzz (500+ programs).
• Higher selection probability for programs with historically more bugs (for economic reasons).
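A minimal sketch of that selection scheme, with a hypothetical pool and bug counts standing in for the 500+ OSS-Fuzz programs:

```python
# Sample programs from a large pool with probability weighted by
# historical bug counts, so that campaigns are more likely to surface
# bugs. Pool and counts below are hypothetical.
import random

pool = {"libpng": 12, "libxml2": 30, "zlib": 4, "sqlite3": 25, "libjpeg": 9}

def weighted_sample(pool, k, seed=42):
    """Sample k distinct programs, weight proportional to bug count."""
    rng = random.Random(seed)
    chosen, remaining = [], dict(pool)
    for _ in range(k):
        programs, weights = zip(*remaining.items())
        pick = rng.choices(programs, weights=weights, k=1)[0]
        chosen.append(pick)
        del remaining[pick]  # sample without replacement
    return chosen

print(weighted_sample(pool, k=3))
```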
36. Experimental Setup
• Reproducibility
• 10 fuzzers x 24 programs x 20 campaigns x 23 hours: >13 CPU years of compute.
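As a back-of-the-envelope check (not from the paper): the core grid alone multiplies out to roughly 12.6 CPU years; presumably the >13 figure also counts the additional runs behind the validity analyses below.

```python
# Back-of-the-envelope check of the stated compute budget.
fuzzers, programs, campaigns, hours = 10, 24, 20, 23
cpu_hours = fuzzers * programs * campaigns * hours    # 110,400 CPU hours
print(cpu_hours, "CPU hours")
print(round(cpu_hours / (24 * 365), 1), "CPU years")  # ~12.6 CPU years
```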
38. Agreement: Coverage vs Bug Finding
Agreement on superiority when comparing 2 fuzzers in terms of code coverage and #bugs found, when the difference is statistically significant at level p:
strong agreement for p <= 0.0001 for both coverage and bug finding.
42. Agreement: Coverage vs Bug Finding
However, if we only require the difference in terms of coverage (which we can observe) to be statistically significant: weak agreement.
43. Agreement: Coverage vs Bug Finding
We also provide two other measures of agreement on superiority: the disagreement proportion d and Spearman's ρ (+1 superior, 0 not significant, -1 inferior).
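A minimal sketch of the superiority-agreement idea on synthetic trial data (the paper's analysis runs over all fuzzer pairs and programs): decide superiority separately per metric with a Mann-Whitney U test at level alpha, then compare the two verdicts.

```python
# Decide superiority per metric with Mann-Whitney U at level alpha,
# then check whether the coverage and bug-finding verdicts agree.
# Trial data is synthetic.
import numpy as np
from scipy.stats import mannwhitneyu

def superiority(a, b, alpha):
    """+1 if a beats b, -1 if b beats a, 0 if not significant at alpha."""
    _, p = mannwhitneyu(a, b, alternative="two-sided")
    return 0 if p >= alpha else (1 if np.median(a) > np.median(b) else -1)

rng = np.random.default_rng(1)
cov_a, cov_b = rng.normal(50, 5, 20), rng.normal(46, 5, 20)  # coverage / trial
bug_a, bug_b = rng.poisson(3, 20), rng.poisson(5, 20)        # bugs / trial

for alpha in (0.05, 0.0001):
    s_cov = superiority(cov_a, cov_b, alpha)
    s_bug = superiority(bug_a, bug_b, alpha)
    verdict = "agree" if s_cov == s_bug else "disagree"
    print(f"alpha={alpha}: coverage {s_cov:+d}, bugs {s_bug:+d} -> {verdict}")
```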
44. Threats to Validity
45. Threats to Validity: Campaign Length
Maybe 23 hours are too short (or too long) to expect agreement.
Does agreement increase as campaign length increases? Not really.
48. Threats to Validity: Number of Trials
Maybe 20 trials are too few to expect agreement.
Does agreement increase as the number of trials increases?
Looking good: 20 trials are fine.
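One way to probe the number-of-trials threat is to recompute the verdict on random subsamples of the trials and watch it stabilize. A self-contained sketch on synthetic data, illustrating the idea rather than the paper's exact procedure:

```python
# Subsample k of 20 trials and check how often the superiority verdict
# matches the full-data verdict. Synthetic coverage data.
import numpy as np
from scipy.stats import mannwhitneyu

def superiority(a, b, alpha=0.05):
    """+1 if a beats b, -1 if b beats a, 0 if not significant."""
    _, p = mannwhitneyu(a, b, alternative="two-sided")
    return 0 if p >= alpha else (1 if np.median(a) > np.median(b) else -1)

rng = np.random.default_rng(2)
cov_a, cov_b = rng.normal(50, 5, 20), rng.normal(45, 5, 20)
full_verdict = superiority(cov_a, cov_b)

for k in (5, 10, 15, 20):
    same = sum(
        superiority(cov_a[rng.choice(20, k, replace=False)],
                    cov_b[rng.choice(20, k, replace=False)]) == full_verdict
        for _ in range(200))  # 200 random subsamples of k trials each
    print(f"k={k:2d} trials: verdict reproduced in {same / 200:.0%} of subsamples")
```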
51. Threats to Validity: Generality (#Subjects)
53. Result Summary
• We observe a moderate agreement on superiority or ranking.
• Only if we require differences in coverage *and* bug finding to be highly statistically significant do we observe a strong agreement.
You can substitute coverage for bug finding only with moderate reliability.
55. Benchmarking: Challenges of Bug-Based Benchmarking
Reviewers B & C: Why not use a ground truth for your bug-based benchmarking?
56. Benchmarking: Challenges of Bug-Based Benchmarking
• Golden evaluation
• Choose a random, representative sample of programs and fuzz them.
• Problem: (Un)fortunately, bugs are very sparse. No statistical power.
• Mutation-based evaluation (see the sketch below)
• Inject synthetic bugs into a random, representative sample of programs.
• More economical. We know many bugs can be found.
• Problem: Are synthetic bugs representative of real bugs?
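For concreteness, a minimal sketch of the bug-injection idea behind mutation-based evaluation. Real tools operate on C/C++ source or IR; here a Python AST mutation stands in as an illustration (a flipped comparison yields a classic off-by-one mutant):

```python
# Plant a synthetic bug by flipping a comparison operator in the AST.
import ast

class FlipLt(ast.NodeTransformer):
    def visit_Compare(self, node):
        self.generic_visit(node)
        # Replace '<' with '<=' (a classic off-by-one mutant).
        node.ops = [ast.LtE() if isinstance(op, ast.Lt) else op
                    for op in node.ops]
        return node

src = "def in_bounds(i, n):\n    return i < n\n"
tree = FlipLt().visit(ast.parse(src))
ast.fix_missing_locations(tree)
print(ast.unparse(tree))  # def in_bounds(i, n): return i <= n
```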
58. Benchmarking: Challenges of Bug-Based Benchmarking
• Ground-truth-based evaluation
• Curate real bugs in a random, representative sample of programs.
• Economical, realistic bugs, objective ground truth.
• Problem: Many potential sources of bias.
1. Survivorship bias
• Fuzzers that are better at finding previously undiscovered bugs appear worse.
• Fuzzers that contributed to the original discovery appear better.
2. Experimenter bias
• Porting bugs to one version makes the evaluation more economical, but also potentially introduces bug masking and interaction effects.
3. Observer-expectancy bias (see the sketch below)
• Manual translation to an if-statement representing the bug-trigger condition simplifies bug counting and provides the same bug oracle to all fuzzers, but it enforces a relationship between coverage (of the if-body) and bug finding.
4. Confirmation bias
• Given a ground-truth benchmark, researchers might be enticed to iteratively and unknowingly tune their fuzzer to the benchmark.
63. Benchmarking: Challenges of Bug-Based Benchmarking
• Post hoc bug-based evaluation
• Maximize bug probability in a random, representative sample of programs.
• Identify and deduplicate bugs *after* the fuzzing campaign. Minimizes bias.
• Problem: Less economical (we did not find bugs in 7/24 [30%] programs).
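To illustrate the observer-expectancy concern, a minimal sketch of the if-statement oracle pattern (a hypothetical parser, with Python standing in for the usual C target): once the curated bug is rewritten as an explicit trigger condition, covering the if-body and finding the bug become the same event.

```python
# Hypothetical ground-truth oracle: the curated bug is replaced by an
# explicit if-statement over its trigger condition.
def parse_chunk(data: bytes) -> int:
    if len(data) >= 4 and data[:2] == b"IH" and data[2] > 0x7F:
        # Any input reaching this branch "finds" bug #17. Coverage of
        # this if-body is bug detection by construction, which couples
        # the coverage and bug-finding metrics.
        raise AssertionError("BUG-17 triggered")
    return len(data)

print(parse_chunk(b"ABCD"))         # benign input
try:
    parse_chunk(b"IH\xffZ")         # crafted input reaches the oracle
except AssertionError as e:
    print(e)
```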
64. Discussion Summary
Bug-based benchmarking is not easy to get right, either!
There are many pitfalls in sound fuzzer benchmarking.
66. Benchmarking: Recommendations for Fuzzer Benchmarking
Reviewer C (meta-review, paraphrased): In this role of informing the experimental design of future fuzzing research, it is important to describe other efforts that call for a more holistic fuzzer evaluation.
So, we synthesized a set of recommendations from previous work, our results, and our own experience.
68. Benchmarking: Recommendations for Fuzzer Benchmarking
• Select ≥10 representative programs. Repeat each experiment ≥10x.
Increasing these values improves generality and statistical power.
• Select "real-world programs" that are typically fuzzed in practice.
Increasing representativeness improves generality.
If experiment costs are a concern, prioritize programs that are likely to be buggier.
• Select a baseline that was extended to implement the technique.
Ensure equivalent conditions (CLI parameters, initial seeds, ...).
Improves construct validity and allows improvements to be attributed precisely.
(Optional) Compare to the state of the art; note improvements due to engineering differences.
71. Benchmarking: Recommendations for Fuzzer Benchmarking
• Consider using a "training set" during fuzzer development and a "validation set" (e.g., a benchmarking platform) for evaluation.
Reduces overfitting and mitigates confirmation bias.
• Statistical analysis of effect size and significance (see the sketch below).
Allows assessing the magnitude of the difference and the degree to which the differences are explained by randomness.
• Measure & report coverage and bug-based metrics.
Use the same measurement tooling & procedure across all fuzzers.
Improves construct validity.
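A minimal sketch of the effect-size-plus-significance recommendation, pairing Vargha-Delaney A12 with a Mann-Whitney U test on synthetic per-trial coverage (the specific choice of statistics here is ours, though both are common in fuzzing evaluations):

```python
# Report both significance (Mann-Whitney U) and effect size
# (Vargha-Delaney A12) when comparing two fuzzers across trials.
import numpy as np
from scipy.stats import mannwhitneyu

def a12(a, b):
    """Vargha-Delaney A12: P(a > b) + 0.5 * P(a == b)."""
    a, b = np.asarray(a), np.asarray(b)
    greater = (a[:, None] > b[None, :]).mean()
    equal = (a[:, None] == b[None, :]).mean()
    return greater + 0.5 * equal

rng = np.random.default_rng(3)
fuzzer_x = rng.normal(52, 4, 20)  # branch coverage over 20 trials
fuzzer_y = rng.normal(48, 4, 20)

_, p = mannwhitneyu(fuzzer_x, fuzzer_y, alternative="two-sided")
print(f"A12 = {a12(fuzzer_x, fuzzer_y):.2f}, p = {p:.4f}")
# A12 ~ 0.5 means no effect; values near 1.0 (or 0.0) mean fuzzer_x is
# almost always better (or worse); p quantifies chance explanation.
```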
74. Benchmarking: Recommendations for Fuzzer Benchmarking
• Discuss potential threats to validity and your strategy to mitigate them.
Helps the reader assess the validity of the key claims and results.
• Ensure experiments are reproducible.
Publish tool, benchmark, data, and analysis. Report specific experiment parameters (see the sketch below).
Reproducibility is the foundation of sound scientific progress.
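As one concrete way to operationalize that last point, a minimal sketch of the parameters worth pinning and publishing; every field name and value here is an illustrative placeholder, not something prescribed by the paper:

```python
# Illustrative record of experiment parameters to publish alongside
# results; all names and values are hypothetical placeholders.
experiment = {
    "fuzzers": {"afl++": "v4.00c", "libfuzzer": "llvmorg-14.0.0"},
    "programs": ["libpng-1.6.38", "libxml2-2.9.14"],   # exact versions
    "trials_per_pair": 20,
    "campaign_hours": 23,
    "seed_corpus": "sha256 of the archived seed corpus",
    "cli_flags": {"afl++": ["-t", "1000"]},            # equivalent conditions
    "hardware": "1 CPU core, 4 GB RAM per campaign",
}
print(experiment["trials_per_pair"], "trials per fuzzer/program pair")
```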
76. On the Reliability of Coverage-Based Fuzzer Benchmarking
Marcel Böhme (MPI-SP & Monash) · László Szekeres (Google) · Jonathan Metzman (Google)
77. Motivation
Suppose none of our fuzzers finds any bugs in our program.
How do we know which fuzzer is better? We measure code coverage!
80. Big Picture Conclusion:
• CS graduates need better training in statistical and empirical methods.
• Learn about different statistical instruments to investigate empirical questions, different sources of bias and threats to validity (what can go wrong), and sound experiment design (how to do it right).
• In research, we focus on a paper’s claim, and not enough on the claim’s validation.
• In practice, we also make claims about our system that need validation.
• CS research community needs more focus on evaluation standards.
• Publication bias & author bias: too much focus on the results.
• Investigate the soundness of our experimental designs.
82. Benchmarking: Recommendations for Fuzzer Benchmarking
1. Select ≥10 representative programs. Repeat each experiment ≥ 10x.
2. Select “real-world programs”.
3. Select a fair baseline.
4. Use a training and a validation set.
5. Use a statistical analysis of effect size and significance.
6. Measure & report coverage and bug-based metrics.
7. Discuss potential threats to validity.
8. Ensure reproducibility.