Foundations of Software Testing
ECOOP/ISSTA’21 Summer School
Marcel Böhme (Monash University, Australia)
Soon @ Max Planck Institute for Security and Privacy, Germany.
Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing
• Fuzzing for Automatic Vulnerability Discovery

• Making machines attack other machines.

• Focus on scalability, efficiency, and effectiveness.

• Foundations of Software Security

• Assurances in Software Security

• Fundamental limitations of existing approaches

• Drawing from multiple disciplines (information theory, biostatistics)
whoami
Marcel Böhme
ARC DECRA Fellow

Senior Lecturer (A/Prof)

Monash University, Australia
Looking for PhD students & postdocs at the Max Planck Institute, Bochum, Germany
software testing
Input → Process → Output
software testing
Test Case → Program → Pass or Fail
software testing
Problem: Generate at least one failing test case for each bug in the program.
• You’ve been generating test cases for your program. 

• No bugs found! 👍 

• Is your program free of bugs?

• Probably not. 😆

• Is your test case generation technique effective?

• Maybe? 😅

• How do you even measure effectiveness if there are no bugs? 🤔
questions for today
• How does the test case know if something is a bug or a feature?

• What is the difference between effectiveness and efficiency?

• When is the most effective technique (whitebox fuzzing) more efficient than random test generation (blackbox fuzzing)?

• How does greybox fuzzing work? Why is it so successful?

• What is the relationship between #bugs found and #machines available?
• Let’s start at the beginning.

• A test case consists of a test input and at least one test oracle.

• A test case passes if no test oracle detects a bug for the test input.
test case = test input + test oracle
outline: test case · test oracle · test input · effectiveness · efficiency · scalability
Test Input → Program → Expected Output?
test case » system testing
$ ./gifbuild -d crashing.PoC.gif
#
# GIF information from ./crashing.PoC.gif
screen width 0
screen height 0
screen colors 2
screen background 0
pixel aspect byte 232
image # 1
image left 0
image top 0
ASAN:DEADLYSIGNAL
=================================================================
==18392==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x000000403d84 bp 0x7fc122903708 sp 0x7ffcac6ff150 T0)
#0 0x403d83 in Gif2Icon /home/root/giflib-asan/gifbuild.c:877
#1 0x401c3c in main /home/root/giflib-asan/gifbuild.c:100
#2 0x7fc12255e82f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
#3 0x4020b8 in _start (/home/root/giflib-asan/gifbuild+0x4020b8)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /home/root/giflib/gifbuild.c:877 in Gif2Icon
==18392==ABORTING
Test Input: the crashing GIF file. Test Oracle: AddressSanitizer.
Unit inside a Test Harness: Input → Expected?
test case » unit testing
Test Case = Test Input + Test Oracle (example: http://scala-ide.org/docs/2.0.x/testingframeworks.html)
How does the test case know if the test input exposes a bug?
The test oracle flags it as a bug.
test case » test oracle
test oracle
Question: What kind of test oracles do you know?
Test Input → Assertion
test oracle » assertion-based testing
Test input → Satisfies Precondition? (if not, ignore) → Satisfies Postcondition?
test oracle » property-based testing
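The precondition/postcondition flow above can be sketched in a few lines of Python (a toy example around sorting; real property-based tools such as QuickCheck or Hypothesis automate the input generation):

```python
import random

def precondition(xs):
    # only non-empty integer lists are valid inputs for this property
    return len(xs) > 0

def postcondition(xs, ys):
    # oracle: the output must be a sorted permutation of the input
    return ys == sorted(xs) and sorted(ys) == sorted(xs)

def check_property(func, trials=1000):
    for _ in range(trials):
        xs = [random.randint(-100, 100) for _ in range(random.randint(0, 10))]
        if not precondition(xs):
            continue  # input violates the precondition: ignore it
        if not postcondition(xs, func(xs)):
            return False  # the oracle flags a bug
    return True

assert check_property(sorted)                    # correct implementation passes
assert not check_property(lambda xs: list(xs))   # identity function is caught
```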
Question: Can we have a test oracle that tells, for every input, what the expected output is?
No. This is called the oracle problem.
test oracle
• We may know how outputs relate to each other for all inputs.
• Mathematical laws:
• sin(π - x) = sin(x)
• x + y = y + x
• Round-trip properties:
• x = unzip(zip(x))
• x = uncompress(compress(x))
• x = decrypt(encrypt(x))
• x = pickle.loads(pickle.dumps(x)) # serialization / parsing
test oracle » metamorphic testing
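Two of these relations can be checked mechanically; a minimal Python sketch (using zlib as the compress/uncompress pair):

```python
import math
import random
import zlib

def holds_sin_law(trials=1000):
    # metamorphic relation: sin(pi - x) = sin(x) for every x
    for _ in range(trials):
        x = random.uniform(-100, 100)
        if not math.isclose(math.sin(math.pi - x), math.sin(x), abs_tol=1e-9):
            return False
    return True

def holds_roundtrip(trials=100):
    # round-trip relation: uncompress(compress(x)) = x for every x
    for _ in range(trials):
        data = bytes(random.randrange(256) for _ in range(random.randrange(1000)))
        if zlib.decompress(zlib.compress(data)) != data:
            return False
    return True

assert holds_sin_law() and holds_roundtrip()
```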
• We may know how to change inputs, expecting the same output.
• Compiler / interpreter testing: if you add unreachable code, the compiled binary should give the same output.
• Constraint solver testing: if you change a constraint and guarantee that the resulting constraint is logically equivalent, then the solver should produce the same result.
• Fairness testing: if you change a sensitive field (gender or race), then the classifier should produce the same result.
test oracle » EMI testing
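The compiler-testing idea can be sketched in miniature, treating tiny Python snippets as the "programs" and the Python interpreter as the system under test (a toy sketch, not a real EMI setup):

```python
import random

def run(src):
    # execute a tiny "program" and return its result variable
    env = {}
    exec(src, env)
    return env["result"]

def add_dead_code(src):
    # EMI-style variant: inject code that can never execute;
    # the output must not change
    return src + "\nif False:\n    result = 'corrupted'"

random.seed(0)
prog = "result = sum(i * i for i in range(%d))" % random.randrange(1, 100)
assert run(prog) == run(add_dead_code(prog))
```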
• We may know how to change the program, expecting the same output.
• Regression testing ensures that future program versions are at least as correct as the current version.
• When a regression test case fails, the bug is either in the program or in the regression test oracle.
test oracle » regression testing
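A minimal sketch of a regression oracle in Python: record "golden" outputs from the current version, then check a future version against them (the word_count functions are hypothetical stand-ins):

```python
def word_count_v1(text):
    # current version: its outputs become the oracle
    return len(text.split())

def word_count_v2(text):
    # "future", refactored version under test
    return len([w for w in text.split() if w])

inputs = ["hello world", "", "  a  b  ", "one"]
golden = {t: word_count_v1(t) for t in inputs}  # recorded expected outputs

def regression_passes(new_version):
    return all(new_version(t) == expected for t, expected in golden.items())

assert regression_passes(word_count_v2)
```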
• We may have many implementations that can cast their vote.
• For instance, Guido Vranken’s cryptofuzz @ OSS-Fuzz continuously tests cryptographic protocol implementations:
• OpenSSL, BoringSSL, LibreSSL, BearSSL, MBedTLS, EverCrypt, Crypto++, cppcrypto, crypto-js, libgcrypt, libtomcrypt, symcrypt, wolfcrypt, veracrypt, libtommath + 40 more
test oracle » differential testing
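The voting idea can be sketched with two independent implementations of one specification (here: sorting, with heapq as the second "implementation"):

```python
import heapq
import random

def sort_builtin(xs):
    return sorted(xs)

def sort_heap(xs):
    # independent second implementation acting as a cross-check
    h = list(xs)
    heapq.heapify(h)
    return [heapq.heappop(h) for _ in range(len(h))]

def differential_test(trials=1000):
    for _ in range(trials):
        xs = [random.randint(-50, 50) for _ in range(random.randrange(20))]
        if sort_builtin(xs) != sort_heap(xs):
            return xs  # disagreement: at least one implementation is buggy
    return None  # no disagreement observed

assert differential_test() is None
```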
• Implicit test oracles detect “non-semantic” bugs.
• Examples: buffer overflows, memory leaks, data races, integer overflows, null pointer dereferences, type confusion, ...
• Test oracles: crashes, exceptions, kernel panics, runtime monitors, instrumentation from code sanitizers, ...
test oracle » implicit oracles
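An implicit oracle can be sketched as: feed random strings to a parser and treat anything other than the documented rejection (ValueError) as a bug (json.loads is just a stand-in target here):

```python
import json
import random
import string

def fuzz_with_implicit_oracle(trials=2000):
    bugs = []
    for _ in range(trials):
        s = "".join(random.choice(string.printable)
                    for _ in range(random.randrange(30)))
        try:
            json.loads(s)
        except ValueError:
            pass  # documented rejection of malformed input: not a bug
        except Exception as exc:
            bugs.append((s, exc))  # implicit oracle: unexpected exception
    return bugs

assert fuzz_with_implicit_oracle() == []
```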
• Examples of non-functional requirements:
• Higher performance, low energy consumption, good user ratings
• Often checked via A/B testing or canary testing:
• Deploy the new feature to a small user base first.
• Deploy the old version to your remaining user base.
• Compare your non-functional measures.
• Have the values changed for the worse?
test oracle » non-functional testing
• Quantify side-channel leakage
• High accuracy and robustness of floating-point arithmetic code
software testing
Problem: Generate at least one failing test input* for each bug in the program.
*assuming the perfect test oracle.
test input
Question: What are ways to generate test inputs?
• Manually construct test inputs
• Assertion (e.g., JUnit):
• Set up a concrete program state.
• Assert a property of that state.
• Record & Replay (e.g., Selenium):
• Record a user interaction.
• Replay the recorded interaction.
• Assert the same behaviour.
test input » manual generation
• Automatically construct test inputs

• Blackbox: Random test input generation. No program information.

• Greybox: Guided random test input generation. Program feedback.

• Whitebox: Systematic test input generation. Analyze program code.

• Structure-aware (guided) random test input generation
• Software testing as
• an optimization problem (Search-based Software Testing)
• a constraint satisfaction problem (Symbolic Execution)
test input » automatic generation
test input » no bugs found 🤔
After generating a few test inputs, what does it mean if no bugs have been found?
• Is your program free of bugs? Probably not 😆
“Program testing can be used to show the presence of bugs, but never to show their absence!” (Dijkstra, https://www.cs.utexas.edu/users/EWD/ewd02xx/EWD249.PDF)
test input » no bugs found 🤔
• However, we can estimate the residual risk for
• whitebox fuzzing (Filieri, Păsăreanu, and Visser, “Reliability Analysis in Symbolic PathFinder”, ICSE’13)
• blackbox fuzzing (Böhme, “STADS: Software Testing as Species Discovery”, TOSEM’18)
• greybox fuzzing (Böhme, Liyanage, and Wüstholz, “Estimating Residual Risk in Greybox Fuzzing”, ESEC/FSE’21)
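In the species-discovery framing, the residual risk after n generated inputs can be sketched with the Good–Turing estimator: the probability that the next input discovers a new "species" (e.g., a new path) is estimated by the fraction of species observed exactly once. A simplified sketch:

```python
from collections import Counter

def residual_risk(species_of_each_input):
    # Good-Turing estimate: P(next input discovers a new species)
    # ~= (#species observed exactly once) / (#inputs generated so far)
    counts = Counter(species_of_each_input)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(species_of_each_input)

# e.g., 8 inputs exercised path A five times, B twice, and C once
observed = ["A", "A", "B", "A", "C", "B", "A", "A"]
print(residual_risk(observed))  # 1 singleton / 8 inputs = 0.125
```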
• Is your program free of bugs? Probably not 😆
• Is your test input generator effective? Maybe? 😅
test input » no bugs found 🤔
Recall: we call a test input generator effective if it generates at least one test input for each bug in the program.
How do we measure effectiveness if there are no bugs? 🤔
effectiveness
How do we know if we are the best at fishing if we never catch any fish in our lake?
• Catch fish at representative lakes (FuzzBench).
• If we are the best at fishing in many lakes, then we might be the best at fishing in our lake.
• Problem: Too many lakes, which may also have no fish.
effectiveness » benchmarking
• Catch fish predominantly at lakes we know have catchable fish.
• Problem 1: We don’t learn how good we are at catching fish that others don’t know how to catch.
• Problem 2: We still need to find and “curate” those fish.
• Add artificial fish to representative lakes (LAVA-M, Rode0day). Looks like fish, swims like fish.
• Problem: Are artificial fish “realistic”? That is, is our performance at catching artificial fish indicative of our performance at catching real fish?
• Catch more realistic artificial fish in representative lakes (SemSeed).
Problem: The best fuzzer for most programs may not be the best fuzzer for my program. 😣
• Hypothesis: We can’t catch fish living in parts of the lake we don’t cover.
• Coverage-based evaluation: The more we cover, the better we are.
• Types of coverage:
• Code coverage, e.g., statement, branch, def-use pair, MC/DC, or path coverage.
• Input coverage, e.g., grammar or protocol coverage; pairwise testing.
• Requirements coverage, e.g., specification (pre-/post-condition) coverage.
effectiveness » coverage
• Hypothesis: We can’t catch real fish if we can’t catch artificial fish.
• Mutation-based evaluation: The more artificial fish (mutants) we catch, the better we are.
effectiveness » artificial faults
software testing problem
Generate at least one failing test input* for each coverage element** in the program.
*assuming the perfect test oracle.
**assuming more coverage is correlated with better bug finding.
So, you are saying: Achieve 100% code coverage! That should be easy, right?
wrong.
effectiveness » cover all the things
• Problem: We don’t know how much coverage *can* be achieved 🤔
• We cannot compute the asymptote.
• Determining whether an element can be covered is as hard as determining whether an assertion can be violated:

if (unexpected_behavior) {
  fail(); /* Can this be covered? */
}

effectiveness » cover all the things
• However, we can estimate the asymptote during testing!
• Consider test input generation as sampling.
• Cast it as a species discovery problem (ACM TOSEM’18).
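In the species-discovery analogy, the asymptote (the total number of discoverable species, here coverage elements) can be estimated with the Chao1 estimator. A sketch, assuming each test input is labeled with the coverage element it exercises:

```python
from collections import Counter

def chao1(species_of_each_input):
    # Chao1 lower-bound estimate of the total number of species:
    #   S_hat = S_obs + f1^2 / (2 * f2)
    # where f1/f2 = number of species seen exactly once/twice
    counts = Counter(species_of_each_input)
    s_obs = len(counts)
    f1 = sum(1 for c in counts.values() if c == 1)
    f2 = sum(1 for c in counts.values() if c == 2)
    if f2 == 0:
        return s_obs + f1 * (f1 - 1) / 2  # bias-corrected variant
    return s_obs + f1 * f1 / (2 * f2)

observed = ["A"] * 5 + ["B", "B", "C", "D"]
print(chao1(observed))  # 4 observed + 2^2/(2*1) = 6.0
```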
software testing problem
Generate test inputs to cover as many coverage elements as possible.*
*assuming the perfect test oracle, and that more coverage is correlated with better bug finding.
effectiveness
So, it should be better to systematically generate test inputs that cover most of the coverage elements rather than to randomly generate inputs, right?
wrong.
efficiency
When is the most effective technique (whitebox fuzzing) more efficient than random test generation (blackbox fuzzing)?
Consider time!*
*[TSE’15] “A Probabilistic Analysis of the Efficiency of Automated Software Testing”, Böhme and Paul
• Our whitebox fuzzer generates one test input per path.
• Most effective! Covers all statements, branches, paths, and bugs!
• Discovers the bug after 5 inputs.
efficiency » example

void crashme(char s[4]) {
  if (s[0] == 'b')
    if (s[1] == 'a')
      if (s[2] == 'd')
        if (s[3] == '!')
          abort();
}

This program has five paths:
1. ****: false
2. b***: true, false
3. ba**: true, true, false
4. bad*: true, true, true, false
5. bad!: true, true, true, true, abort()
• Our generational blackbox fuzzer generates a random input of length 4.
• Discovers the bug after ((2^-8)^4)^-1 = 2^32 ≈ 4 billion inputs, in expectation.
• On my machine, this takes 6.3 seconds. On 100 machines, it takes 63 milliseconds.
If our whitebox fuzzer takes too long per input, our blackbox fuzzer outperforms it!
» There is a maximum time per test input!
efficiency » example
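The expectation above is one line of arithmetic; checked in Python:

```python
# each of the 4 byte positions matches with probability 2^-8, so the
# expected number of random inputs until "bad!" is ((2^-8)^4)^-1 = 2^32
expected_trials = ((2 ** -8) ** 4) ** -1
print(int(expected_trials))  # 4294967296, i.e., ~4 billion
```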
• Our mutational blackbox fuzzer mutates a random character in a seed.
• Started with the seed bad?
• Discovers the bug after (4^-1 ✕ 2^-8)^-1 = 1024 inputs, in expectation.
Where do we get that seed? Discover it!
efficiency » example
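The 1024-input expectation can be checked by simulation (a small Monte Carlo sketch of the mutational fuzzer):

```python
import random

def crashme(s):
    # Python stand-in for the C example: "crashes" on exactly "bad!"
    return s == "bad!"

def trials_until_crash(seed="bad?"):
    # mutational blackbox fuzzing: replace one random character
    # of the seed by a random byte, over and over
    n = 0
    while True:
        n += 1
        pos = random.randrange(len(seed))
        mutated = seed[:pos] + chr(random.randrange(256)) + seed[pos + 1:]
        if crashme(mutated):
            return n

random.seed(42)
runs = [trials_until_crash() for _ in range(200)]
mean = sum(runs) / len(runs)
# expectation is (4^-1 * 2^-8)^-1 = 1024 trials; the sample mean is close
assert 500 < mean < 2000
```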
• Our greybox fuzzer is mutational but adds inputs that increase coverage.

• Started with an random test input ****

• Discovers the bug after generating 10k inputs.

• On my machine, 150 milliseconds.

efficiency » example
**** b*** (1✕ 4-1 ✕ 2-8)-1

= 1024
****
b***
ba** (1/2 ✕ 4-1 ✕ 2-8)-1

= 2048
****
b***
ba**
bad* (1/3 ✕ 4-1 ✕ 2-8)-1

= 3072
****
b***
ba**
bad*
bad! (1/4 ✕ 4-1 ✕ 2-8)-1

= 4096
Total: 10240
[CCS’16] “Coverage-based Greybox Fuzzing as Markov Chain”
Böhme Pham, and Roychoudhury
void crashme(char s[4]) {
if (s[0] == 'b')
if (s[1] == 'a')
if (s[2] == 'd')
if (s[3] == '!')
abort();
}
test case
test oracle
test input
effectiveness
efficiency
scalability
^
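The schedule above can be simulated directly. The following Python sketch (not from the slides; `crashme` is ported to Python and all names are hypothetical) implements the minimal greybox loop: pick a random seed, mutate one of the 4 positions to one of 256 bytes, and add the mutant to the corpus whenever it covers a new branch, i.e., matches a longer prefix of "bad!".

```python
import random

TARGET = "bad!"  # the input that reaches abort() in crashme

def coverage(s):
    """Number of crashme branches taken: length of the matching prefix."""
    n = 0
    for got, want in zip(s, TARGET):
        if got != want:
            break
        n += 1
    return n

def greybox_fuzz(seed="****", max_trials=2_000_000, rng=None):
    """Return the number of generated inputs until the crash is found."""
    rng = rng or random.Random(0)          # fixed seed for a reproducible run
    corpus, best = [seed], coverage(seed)
    for trial in range(1, max_trials + 1):
        parent = rng.choice(corpus)        # uniform seed selection
        pos = rng.randrange(len(TARGET))   # one of 4 positions (prob. 4^-1) ...
        mutant = parent[:pos] + chr(rng.randrange(256)) + parent[pos + 1:]  # ... one of 256 bytes (2^-8)
        cov = coverage(mutant)
        if cov == len(TARGET):             # all four branches taken: abort() fires
            return trial
        if cov > best:                     # "interesting": increases coverage
            best = cov
            corpus.append(mutant)
    return None
```

In expectation such a run needs about 10k generated inputs, which matches the calculation on the slide.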
• If we prefer seeds on low-probability paths, it only takes 4k inputs (55 ms):
  with the deepest seed always selected, every stage costs (1 × 4⁻¹ × 2⁻⁸)⁻¹ = 1024 expected inputs; Total: 4 × 1024 = 4096.
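The expected counts on these two slides follow from treating each new branch as a geometric trial: a mutation flips the next branch if the fuzzer picks the right seed, the right position (4⁻¹), and the right byte (2⁻⁸). A small sketch of that arithmetic (function name is an illustration):

```python
P_POS, P_BYTE = 1 / 4, 2 ** -8   # right position, right byte

def expected_inputs(seed_selection_probs):
    """Mean #inputs = sum over branches of 1 / (p_seed * p_pos * p_byte)."""
    return sum(1 / (p * P_POS * P_BYTE) for p in seed_selection_probs)

# Uniform selection from a corpus that grows from 1 to 4 seeds:
uniform = expected_inputs([1, 1/2, 1/3, 1/4])   # 1024 + 2048 + 3072 + 4096
# Always prefer the seed on the lowest-probability (deepest) path:
boosted = expected_inputs([1, 1, 1, 1])         # 4 * 1024
```

This reproduces the totals on the slides: 10240 generated inputs for uniform seed selection versus 4096 when low-probability paths are preferred.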
efficiency

• Limitation of “smarter” testing: if your test input generation is not fast enough, even a simple (guided) random test input generation will find more bugs in your limited time budget.
scalability

Okay. Let’s take the most popular technique and distribute it across many machines.
How does bug finding scale with #machines?

[ESEC/FSE’20] “Fuzzing: On the Exponential Cost of Vulnerability Discovery”
Böhme and Falk

• Google has been fuzzing OSS for about 4 years:
  25k machines; 11k+ bugs in 160+ OSS projects; 16k+ bugs in the Chrome browser.
• The discovery rate reduces over time: as more bugs are fixed, fewer new bugs are found.
• Suppose Google now employs 100x more machines: in 1 month on 2.5 million machines, they find 100 more vulns.

How long do you expect it would take to find all of these known vulnerabilities on *250 million* machines?
Given the same non-deterministic fuzzer, finding the same bugs linearly faster requires linearly more machines. Do you agree?

Now, how many undiscovered vulns do you expect to find in 1 month on 250 million machines?
Given the same non-deterministic fuzzer, finding linearly more new bugs in a constant time budget requires exponentially more machines.

[Chart: new bugs found within 24 hrs vs. #machines — 1, 2, 4, 8, 16, 32, 64, 128, 256, 512 machines yield 1, 2, 3, 4, 5, 6, 7, 8, 9 new bugs.]
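The chart encodes an empirical law from the paper: within a fixed time budget, each additional new bug costs roughly twice the machines. A minimal sketch of that relationship (the doubling base is illustrative, not a measured constant; the function name is hypothetical):

```python
def machines_for(new_bugs):
    """Machines needed to find the k-th new bug within 24 hrs,
    assuming each additional new bug doubles the cost (illustrative)."""
    return 2 ** (new_bugs - 1)

# 1 bug -> 1 machine, 2 -> 2, 3 -> 4, ..., 9 -> 256 machines:
costs = [machines_for(k) for k in range(1, 10)]
```

So moving from 1 to 9 new bugs per day is not 9x but ~256x the machines: linear gains in outcome, exponential growth in cost.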
Summary
Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing

software testing problem

Generate at least one failing test case for each bug in the program.

Relaxed in practice:
Generate as many as possible test inputs*, one for each coverage element** in the program.
* assuming the perfect test oracle.
** assuming more coverage is correlated with better bug finding.
test input » automatic generation

• Automatically construct test inputs:
  • Blackbox: Random test input generation. No program information.
  • Greybox: Guided random test input generation. Program feedback.
  • Whitebox: Systematic test input generation. Analyze program code.
• Software testing as:
  • Optimization problem (Search-based Software Testing)
  • Constraint satisfaction problem (Symbolic Execution)
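For contrast with the greybox example earlier, a blackbox fuzzer gets no program feedback: on crashme it must guess all four bytes at once, so the expected cost is 256⁴ ≈ 4.3 billion inputs rather than ~10k. A minimal sketch (names hypothetical):

```python
import random

def blackbox_input(rng, length=4):
    """Random test input generation: no program information used."""
    return "".join(chr(rng.randrange(256)) for _ in range(length))

# Expected #inputs until all 4 bytes equal "bad!" simultaneously:
blackbox_expected = 256 ** 4   # ~4.3e9, versus ~10240 for the greybox fuzzer
```

This is why even a little feedback (greybox) can turn an intractable search into a sub-second one.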
test oracle » metamorphic testing

• We may know how outputs relate to each other for all inputs.
• Mathematical laws:
  • sin(π − x) = sin(x)
  • x + y = y + x
• Round-trip properties:
  • x = unzip(zip(x))
  • x = uncompress(compress(x))
  • x = decrypt(encrypt(x))
  • x = pickle.loads(pickle.dumps(x))  # serialization / parsing
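Several of these relations can be checked directly with Python’s standard library (using zlib for compress/uncompress and the pickle round-trip; the concrete test values are illustrative):

```python
import math
import pickle
import zlib

# Mathematical law: sin(pi - x) = sin(x), up to floating-point error
x = 0.3
assert math.isclose(math.sin(math.pi - x), math.sin(x))

# Round-trip property: x = uncompress(compress(x))
data = b"foundations of software testing" * 10
assert zlib.decompress(zlib.compress(data)) == data

# Round-trip property: x = pickle.loads(pickle.dumps(x))
obj = {"inputs": ["****", "b***", "ba**", "bad*", "bad!"]}
assert pickle.loads(pickle.dumps(obj)) == obj
```

The point of such oracles: we never need to know the expected output for any single input, only how two outputs must relate.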
test case » system testing

Test Input → Program → Expected Output?
test case » unit testing

Input → Test Harness (Unit) → Expected?
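The harness diagram in miniature: each test case fixes an input for the unit and an oracle for the expected result. A sketch with Python’s unittest (the unit under test, `clamp`, is a hypothetical example):

```python
import unittest

def clamp(x, lo, hi):
    """Hypothetical unit under test: restrict x to the range [lo, hi]."""
    return max(lo, min(x, hi))

class TestClamp(unittest.TestCase):            # the test harness
    def test_inside_range(self):
        self.assertEqual(clamp(5, 0, 10), 5)   # input -> expected?
    def test_above_range(self):
        self.assertEqual(clamp(42, 0, 10), 10)
    def test_below_range(self):
        self.assertEqual(clamp(-1, 0, 10), 0)
```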
effectiveness » cover all the things
If you want to take a deeper dive:

• Attend workshops and talks @ ECOOP/ISSTA’21 this week!
• Read our interactive text book: The Fuzzing Book
• Read our IEEE Software article: “Fuzzing: Challenges and Reflections”
• Apply for a PhD / PostDoc in my group at MPI-SP, Bochum, Germany.

Web: https://mboehme.github.com · Twitter: @mboehme_
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
 
Heredity: Inheritance and Variation of Traits
Heredity: Inheritance and Variation of TraitsHeredity: Inheritance and Variation of Traits
Heredity: Inheritance and Variation of Traits
 
Neurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trNeurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 tr
 
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxAnalytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
 
Solution chemistry, Moral and Normal solutions
Solution chemistry, Moral and Normal solutionsSolution chemistry, Moral and Normal solutions
Solution chemistry, Moral and Normal solutions
 
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxMicrophone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
 

Foundations Of Software Testing

  • 1. Foundations of 
 Software Testing ECOOP/ISSTA’21 Summer School Marcel Böhme (Monash University, Australia) Soon @ Max Planck Institute for Security and Privacy, Germany.
  • 2. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Fuzzing for Automatic Vulnerability Discovery • Making machines attack other machines. • Focus on scalability, efficiency, and effectiveness. • Foundations of Software Security • Assurances in Software Security • Fundamental limitations of existing approaches • Drawing from multiple disciplines (information theory, biostatistics) whoami Marcel Böhme ARC DECRA Fellow Senior Lecturer (A/Prof) Monash University, Australia Looking for PhD & PostDocs
 at Max Planck Institute
 Bochum, Germany 1
  • 3. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing software testing Input Process Output 2
  • 4. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing software testing Test Case Program Pass
 or Fail 2
  • 5. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing software testing Problem: Generate at least one failing test case for each bug in the program.
  • 6. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • You’ve been generating test cases for your program. • No bugs found! 👍 • Is your program free of bugs? • Probably not. 😆 • Is your test case generation technique effective? • Maybe? 😅 • 🤔 • How do you even measure effectiveness if there are no bugs? questions for today 3
  • 7. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • How does the test case know if something is a bug or a feature? • What is the difference between effectiveness and efficiency? • When is the most effective technique (whitebox fuzzing) 
 more efficient than random test generation (blackbox fuzzing)? • How does greybox fuzzing work? Why is it so successful? • What is the relationship between #bugs found and #machines available? questions for today 3
  • 8. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Let’s start at the beginning. • A test case consists of a test input and at least one test oracle. • A test case passes if no test oracle detects a bug for the test input. test case = test input + test oracle 4 test case test oracle test input effectiveness efficiency scalability ^
  • 9. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing Test Input Expected
 Output? Program test case » system testing 5 test case test oracle test input effectiveness efficiency scalability ^
  • 10. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing $ ./gifbuild -d crashing.PoC.gif # # GIF information from ./crashing.PoC.gif screen width 0 screen height 0 screen colors 2 screen background 0 pixel aspect byte 232 image # 1 image left 0 image top 0 ASAN:DEADLYSIGNAL ================================================================= ==18392==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 
 (pc 0x000000403d84 bp 0x7fc122903708 sp 0x7ffcac6ff150 T0) #0 0x403d83 in Gif2Icon /home/root/giflib-asan/gifbuild.c:877 #1 0x401c3c in main /home/root/giflib-asan/gifbuild.c:100 #2 0x7fc12255e82f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f) #3 0x4020b8 in _start (/home/root/giflib-asan/gifbuild+0x4020b8) AddressSanitizer can not provide additional info. SUMMARY: AddressSanitizer: SEGV /home/root/giflib/gifbuild.c:877 in Gif2Icon ==18392==ABORTING Test Input Test Oracle
  • 11. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing Unit test case » unit testing 6 test case test oracle test input effectiveness efficiency scalability ^
  • 12. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing Unit Test Harness Input Expected? test case » unit testing 6 test case test oracle test input effectiveness efficiency scalability ^
  • 13. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing Test Input Test Oracle Test Case http://scala-ide.org/docs/2.0.x/testingframeworks.html 6 test case test oracle test input effectiveness efficiency scalability ^
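The slide's picture of a unit test, a test input fed to a harness and checked against an expectation, can be sketched in a few lines of Python; the `add` function and its expected value are illustrative assumptions, not from the slides:

```python
import unittest

def add(a, b):
    # Hypothetical unit under test.
    return a + b

class TestAdd(unittest.TestCase):
    def test_add(self):
        # Test input: (2, 3). Test oracle: the assertion below.
        self.assertEqual(add(2, 3), 5)

# Run the single test case programmatically.
result = unittest.TextTestRunner().run(
    unittest.defaultTestLoader.loadTestsFromTestCase(TestAdd))
```

The test case passes exactly when its oracle, the assertion, does not flag a bug for the given test input.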
  • 14. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing How does the test case know if the test input exposes a bug? test case » test oracle 7 test case test oracle test input effectiveness efficiency scalability ^
  • 15. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing How does the test case know if the test input exposes a bug? The test oracle flags it as a bug. test case » test oracle test oracle 7 test case test oracle test input effectiveness efficiency scalability ^
  • 16. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing test oracle Question: What kind of test oracles do you know? 7 test case test oracle test input effectiveness efficiency scalability ^
  • 17. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing Assertion Test Input test oracle » assertion-based testing 8 test case test oracle test input effectiveness efficiency scalability ^
  • 18. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing Test input Satisfies
 Postcondition? Satisfies
 Precondition? If not, ignore. test oracle » property-based testing 9 test case test oracle test input effectiveness efficiency scalability ^
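The precondition/postcondition loop on this slide can be sketched as follows; the property of `sorted` (output is ordered and a permutation of the input) is an illustrative assumption:

```python
import random
from collections import Counter

def check_sorted_property(trials=1000):
    for _ in range(trials):
        # Random test input; a precondition check would `continue` on
        # inputs it rejects (here, every list of ints is admissible).
        xs = [random.randint(-100, 100) for _ in range(random.randint(0, 20))]
        ys = sorted(xs)
        # Postcondition (oracle): output is ordered ...
        assert all(a <= b for a, b in zip(ys, ys[1:]))
        # ... and a permutation of the input.
        assert Counter(ys) == Counter(xs)
    return True
```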
  • 19. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing Question: Can we have a test oracle that tells for every input what is the expected output? test oracle 10 test case test oracle test input effectiveness efficiency scalability ^
  • 20. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing Question: Can we have a test oracle that tells for every input what is the expected output? test oracle No. This is called the oracle problem. 10 test case test oracle test input effectiveness efficiency scalability ^
  • 21. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • We may know how outputs relate to each other for all inputs. • Mathematical laws: • sin(π - x) = sin(x) • x + y = y + x • Round-trip properties: • x = unzip(zip(x)) • x = uncompress(compress(x)) • x = decrypt(encrypt(x)) • x = pickle.dumps(pickle.loads(x)) # serialization / parsing test oracle » metamorphic testing 11 test case test oracle test input effectiveness efficiency scalability ^
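Two of the listed relations are directly checkable in Python; `zlib` stands in here for the generic compress/uncompress pair:

```python
import math
import random
import zlib

def check_metamorphic_relations(trials=200):
    for _ in range(trials):
        # Round-trip property: x == uncompress(compress(x)).
        data = bytes(random.getrandbits(8) for _ in range(random.randint(0, 64)))
        assert zlib.decompress(zlib.compress(data)) == data
        # Mathematical law: sin(pi - x) == sin(x), up to rounding error.
        x = random.uniform(-10.0, 10.0)
        assert math.isclose(math.sin(math.pi - x), math.sin(x), abs_tol=1e-9)
    return True
```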
  • 22. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • We may know how to change inputs, expecting the same output. • Compiler / interpreter testing • If you add unreachable code, the compiled binary should give the same output. • Constraint solver testing • If you change a constraint and guarantee that the resulting constraint
 is logically equivalent, then the solver should produce the same result. • Fairness testing • If you change a sensitive field (gender or race), then the classifier 
 should produce the same result. test oracle » EMI testing 12 test case test oracle test input effectiveness efficiency scalability ^
  • 23. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • We may know how to change the program, expecting the same output. • Regression testing ensures that future program versions are 
 at least as correct as the current version. • When a regression test case fails, the bug is either 
 in the program or in the regression test oracle. test oracle » regression testing 13 test case test oracle test input effectiveness efficiency scalability ^
  • 24. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • We may have many implementations that can cast their vote. • For instance, Guido Vranken’s cryptofuzz @ OSS-Fuzz continuously tests 
 cryptographic protocol implementations: • OpenSSL, BoringSSL, LibreSSL, BearSSL, MBedTLS, 
 EverCrypt, Crypto++, cppcrypto, crypto-js, libgcrypt, libtomcrypt, 
 symcrypt, wolfcrypt, veracrypt, libtomath + 40 more test oracle » differential testing 14 test case test oracle test input effectiveness efficiency scalability ^
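A toy version of differential testing compares independent implementations of the same function on random inputs and flags any disagreement as a candidate bug in at least one of them; the two integer-square-root variants below are illustrative assumptions, not cryptofuzz code:

```python
import math
import random

def isqrt_newton(n):
    # Implementation 1: integer square root via Newton's method.
    if n < 2:
        return n
    x = n
    y = (x + 1) // 2
    while y < x:
        x, y = y, (y + n // y) // 2
    return x

def isqrt_float(n):
    # Implementation 2: a naive floating-point variant.
    return int(math.sqrt(n))

def differential_test(trials=10000):
    # The oracle is agreement: a disagreement is a candidate bug.
    disagreements = []
    for _ in range(trials):
        n = random.randrange(10**6)
        if isqrt_newton(n) != isqrt_float(n):
            disagreements.append(n)
    return disagreements
```

For inputs this small the two variants agree; on much larger inputs the floating-point variant would start to disagree, which is exactly the kind of discrepancy differential testing surfaces.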
  • 25. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Implicit test oracles detect “non-semantic” bugs that need no functional specification. • Examples: buffer overflows, memory leaks, data races, integer overflows, 
 null pointer dereferences, type confusion, etc. • Test oracles: crashes, exceptions, kernel panics, runtime monitors,
 instrumentation from code sanitizers, etc. test oracle » implicit oracles 15 test case test oracle test input effectiveness efficiency scalability ^
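An implicit oracle needs no expected output: any crash or unexpected exception flags a bug. A minimal sketch, where the `parse` function is a hypothetical unit under test:

```python
import random

def parse(data: bytes):
    # Hypothetical unit under test: "crashes" on one malformed header byte.
    if data[:1] == b"\xff":
        raise IndexError("boom")  # stands in for a segfault
    return len(data)

def fuzz_with_implicit_oracle(trials=5000):
    crashes = []
    for _ in range(trials):
        data = bytes(random.getrandbits(8) for _ in range(4))
        try:
            parse(data)
        except Exception:
            # Implicit oracle: any crash/exception flags a bug,
            # no expected output required.
            crashes.append(data)
    return crashes
```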
  • 26. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Examples of non-functional requirements • Higher performance, low energy consumption, good user ratings • Often checked via A/B testing or canary testing • Deploy a new feature to a small user base first. • Deploy old version to your remaining user base. • Compare your non-functional measures. • Have the values changed for the worse? test oracle » non-functional testing 16 test case test oracle test input effectiveness efficiency scalability ^
  • 27. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Examples of non-functional requirements • Higher performance, low energy consumption, good user ratings • Quantify side channel leakage • High accuracy and robustness of floating-point arithmetic code 16 test oracle » non-functional testing test case test oracle test input effectiveness efficiency scalability ^
  • 28. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing assuming the perfect test oracle. software testing Problem: Generate at least one failing test case for each bug in the program. test input * *
  • 29. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing Question: What are ways to generate test inputs? test input 17 test case test oracle test input effectiveness efficiency scalability ^
  • 30. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Manually construct test inputs • Assertion (e.g., JUnit) • Setup concrete program state. • Assert a property of that state. • Record & Replay (e.g., Selenium) • Record a user interaction. • Replay the recorded interaction. • Assert the same behaviour. test input » manual generation 18 test case test oracle test input effectiveness efficiency scalability ^
  • 31. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Automatically construct test inputs • Blackbox: Random test input generation. No program information. • Greybox: Guided random test input generation. Program feedback. • Whitebox: Systematic test input generation. Analyze program code. • Structure-aware (guided) random test input generation test input » automatic generation test case test oracle test input effectiveness efficiency scalability ^
  • 32. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Automatically construct test inputs • Blackbox: Random test input generation. No program information. • Greybox: Guided random test input generation. Program feedback. • Whitebox: Systematic test input generation. Analyze program code. • Software testing as • Optimization problem (Search-based Software Testing) • Constraint satisfaction problem (Symbolic Execution) test input » automatic generation 19 test case test oracle test input effectiveness efficiency scalability ^
  • 33. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing test input » no bugs found 🤔 After generating a few test inputs, what does it mean if no bugs have been found? 20 test case test oracle test input effectiveness efficiency scalability ^
  • 34. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Is your program free of bugs? Probably not 😆 test input » no bugs found 🤔 https://www.cs.utexas.edu/users/EWD/ewd02xx/EWD249.PDF 21 test case test oracle test input effectiveness efficiency scalability ^
  • 35. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Is your program free of bugs? Probably not 😆 • However, we can estimate the residual risk for • whitebox fuzzing (Filieri, Pāsāreanu, and Wisser, “Reliability Analysis in Symbolic Pathfinder”, ICSE’13) • blackbox fuzzing (Böhme; “STADS: Software Testing as Species Discovery”; TOSEM’18) • greybox fuzzing (Böhme, Liyanage, and Wüstholz; “Estimating Residual Risk in Greybox Fuzzing”; ESEC/FSE’21) test input » no bugs found 🤔 21 test case test oracle test input effectiveness efficiency scalability ^
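The flavour of such residual-risk estimates can be illustrated with the Good-Turing estimator from biostatistics: the probability that the next generated input discovers a new “species” (e.g., a new path or bug) is roughly the number of species seen exactly once divided by the number of inputs so far. A sketch of the idea, not the cited papers' full machinery:

```python
from collections import Counter

def good_turing_discovery_probability(species_per_input):
    # species_per_input: one label per generated test input,
    # e.g. the path (or bug) each input exercised.
    counts = Counter(species_per_input)
    f1 = sum(1 for c in counts.values() if c == 1)  # singleton species
    n = len(species_per_input)
    # Good-Turing: P(next input discovers a new species) ~ f1 / n;
    # a low value indicates low residual risk of missed behaviour.
    return f1 / n if n else 1.0
```

For example, after four inputs exercising paths p1, p1, p2, p3, two species are singletons and the estimate is 2/4 = 0.5.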
  • 36. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Is your program free of bugs? Probably not 😆 • Is your test input generator effective? Maybe? 😅 test input » no bugs found 🤔 Recall, we call a test input generator effective if it generates at least one test input for each bug in the program. 21 test case test oracle test input effectiveness efficiency scalability ^
  • 37. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Is your program free of bugs? Probably not 😆 • Is your test input generator effective? Maybe? 😅 test input » no bugs found 🤔 Recall, we call a test input generator effective if it generates at least one test input for each bug in the program. How do we measure effectiveness if there are no bugs? 🤔 21 test case test oracle test input effectiveness efficiency scalability ^
  • 38. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing effectiveness How do we know if we are the best at fishing if we never catch any fish in our lake?
  • 39. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Catch fish at representative lakes (Fuzzbench) • If we are the best at fishing in many lakes,
 then we might be the best in fishing in our lake. effectiveness » benchmarking 22 test case test oracle test input effectiveness efficiency scalability ^
  • 40. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Catch fish at representative lakes (Fuzzbench) • If we are the best at fishing in many lakes,
 then we might be the best in fishing in our lake. • Problem: Too many lakes which may also have no fishes. effectiveness » benchmarking 22 test case test oracle test input effectiveness efficiency scalability ^
  • 41. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Catch fish at representative lakes (Fuzzbench) • Catch fish predominantly at lakes we know have catchable fish effectiveness » benchmarking 22 test case test oracle test input effectiveness efficiency scalability ^
  • 42. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Catch fish at representative lakes (Fuzzbench) • Catch fish predominantly at lakes we know have catchable fish. • Problem 1: We don’t learn how good we are at catching fish
 that others don’t know how to catch. • Problem 2: We still need to find and “curate” those fishes. effectiveness » benchmarking 22 test case test oracle test input effectiveness efficiency scalability ^
  • 43. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Catch fish at representative lakes (Fuzzbench). • Catch fish predominantly at lakes we know have catchable fish. • Add artificial fish to representative lakes (Lava-M, rode0day). effectiveness » benchmarking Looks like fish. Swims like fish. 22 test case test oracle test input effectiveness efficiency scalability ^
  • 44. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Catch fish at representative lakes (Fuzzbench). • Catch fish predominantly at lakes we know have catchable fish. • Add artificial fish to representative lakes (Lava-M, rode0day). • Problem: Are artificial fish “realistic”? That is, is our performance of 
 catching artificial fish indicative of our performance of catching real fish? effectiveness » benchmarking Looks like fish. Swims like fish. 22 test case test oracle test input effectiveness efficiency scalability ^
  • 45. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Catch fish at representative lakes (Fuzzbench). • Catch fish predominantly at lakes we know have catchable fish. • Add artificial fish to representative lakes (Lava-M, rode0day). • Catch more realistic artificial fish in representative lakes (SemSeed). effectiveness » benchmarking 22 test case test oracle test input effectiveness efficiency scalability ^
  • 46. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Catch fish at representative lakes (Fuzzbench). • Catch fish predominantly at lakes we know have catchable fish. • Add artificial fish to representative lakes (Lava-M, rode0day). • Catch more realistic artificial fish in representative lakes (SemSeed). effectiveness » benchmarking Problem: The best fuzzer for most programs may not be the best fuzzer for my program. 😣 22 test case test oracle test input effectiveness efficiency scalability ^
  • 47. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing effectiveness How do we know if we are the best at fishing if we never catch any fish in our lake?
  • 48. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Hypothesis: Can’t catch fish living in parts of the lake we don’t cover. • Coverage-based evaluation: The more we covered, the better we are. effectiveness » coverage 23 test case test oracle test input effectiveness efficiency scalability ^
  • 49. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Hypothesis: Can’t catch fish living in parts of the lake we don’t cover. • Coverage-based evaluation: The more we covered, the better we are. • Types of coverage: • Code coverage, e.g., statement, branch, def-use pair, MCDC, path coverage. effectiveness » coverage 23 test case test oracle test input effectiveness efficiency scalability ^
  • 50. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Hypothesis: Can’t catch fish living in parts of the lake we don’t cover. • Coverage-based evaluation: The more we covered, the better we are. • Types of coverage: • Code coverage, e.g., statement, branch, def-use pair, MCDC, path coverage. • Input coverage, e.g., grammar or protocol coverage; pairwise testing effectiveness » coverage 23 test case test oracle test input effectiveness efficiency scalability ^
  • 51. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Hypothesis: Can’t catch fish living in parts of the lake we don’t cover. • Coverage-based evaluation: The more we covered, the better we are. • Types of coverage: • Code coverage, e.g., statement, branch, def-use pair, MCDC, path coverage. • Input coverage, e.g., grammar or protocol coverage; pairwise testing • Requirements coverage, e.g., specification (pre-/post-condition) coverage. effectiveness » coverage 23 test case test oracle test input effectiveness efficiency scalability ^
  • 52. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Hypothesis: Can’t catch real fish if we can’t catch artificial fish. • Mutation-based evaluation: The more artificial fish we catch, 
 the better we are. effectiveness » artificial faults 24 test case test oracle test input effectiveness efficiency scalability ^
  • 53. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing assuming the perfect test oracle. software testing problem Generate at least one failing test case for each bug in the program. test input * * assuming more coverage is correlated with better bug finding. coverage element * * v
  • 54. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing So, you are saying: Achieve 100% code coverage! That should be easy, right? effectiveness » cover all the things 25 test case test oracle test input effectiveness efficiency scalability ^
  • 55. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing So, you are saying: Achieve 100% code coverage! That should be easy, right? wrong. 25 test case test oracle test input effectiveness efficiency scalability effectiveness » cover all the things ^
  • 56. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing 25 test case test oracle test input effectiveness efficiency scalability effectiveness » cover all the things ^
  • 57. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Problem: We don’t know how much coverage *can* be achieved 🤔 26 test case test oracle test input effectiveness efficiency scalability effectiveness » cover all the things ^
  • 58. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Problem: We don’t know how much coverage *can* be achieved 🤔 • We cannot compute the asymptote. ? 26 test case test oracle test input effectiveness efficiency scalability effectiveness » cover all the things ^
  • 59. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Problem: We don’t know how much coverage *can* be achieved 🤔 • We cannot compute the asymptote. • Determining whether an element can
 be covered is as hard as determining
 whether an assertion can be violated. if (unexpected_behavior) { fail(); } Can this be covered? 26 test case test oracle test input effectiveness efficiency scalability effectiveness » cover all the things ^
  • 60. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Problem: We don’t know how much coverage *can* be achieved 🤔 • We cannot compute the asymptote. • However, we can estimate the 
 asymptote during testing! 26 test case test oracle test input effectiveness efficiency scalability effectiveness » cover all the things ^
  • 61. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Problem: We don’t know how much coverage *can* be achieved 🤔 • We cannot compute the asymptote. • However, we can estimate the 
 asymptote during testing! • Consider test input gen. as sampling. • Cast as species discovery problem. ACM TOSEM’18 26 test case test oracle test input effectiveness efficiency scalability effectiveness » cover all the things ^
  • 62. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing assuming the perfect test oracle. software testing problem Generate at least one failing test case for each bug in the program. test input * * assuming more coverage is correlated with better bug finding. coverage element * * v as many as possible v
  • 63. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing So, it should be better to systematically generate test inputs that cover most of the coverage elements rather than to randomly generate inputs, right? effectiveness 27 test case test oracle test input effectiveness efficiency scalability ^
  • 64. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing So, it should be better to systematically generate test inputs that cover most of the coverage elements rather than to randomly generate inputs, right? effectiveness wrong. 27 test case test oracle test input effectiveness efficiency scalability ^
  • 65. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing effectiveness 27 test case test oracle test input effectiveness efficiency scalability ^
  • 66. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing efficiency Consider time!* *[TSE’15] “A Probabilistic Analysis of the Efficiency of Automated Software Testing” Böhme and Paul 28 test case test oracle test input effectiveness efficiency scalability ^
  • 67. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing efficiency When is the most effective technique (whitebox fuzzing) more efficient than random test generation (blackbox fuzzing)? Consider time!* *[TSE’15] “A Probabilistic Analysis of the Efficiency of Automated Software Testing” Böhme and Paul 28 test case test oracle test input effectiveness efficiency scalability ^
  • 68. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Our whitebox fuzzer generates one test input per path. • Most effective! Covers all statements, branches, paths, and bugs! efficiency » example void crashme(char s[4]) { if (s[0] == 'b') if (s[1] == 'a') if (s[2] == 'd') if (s[3] == '!') abort(); } 29 test case test oracle test input effectiveness efficiency scalability ^
  • 69. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Our whitebox fuzzer generates one test input per path. • Most effective! Covers all statements, branches, paths, and bugs! • Discovers the bug after 5 inputs. efficiency » example void crashme(char s[4]) { if (s[0] == 'b') if (s[1] == 'a') if (s[2] == 'd') if (s[3] == '!') abort(); } This program has five paths: 1. ****: false 2. b***: true, false 3. ba**: true, true, false 4. bad*: true, true, true, false 5. bad!: true, true, true, true, abort(); 29 test case test oracle test input effectiveness efficiency scalability ^
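For this toy program, the whitebox strategy (one input per path) can be mimicked by hand: for each explored path, falsify its last branch and construct an input satisfying the resulting path constraint. A real whitebox fuzzer would use symbolic execution and a constraint solver; this sketch hard-codes the solutions:

```python
TARGET = b"bad!"

def crashme(s: bytes) -> bool:
    # Python port of the slide's crashme(); True models abort().
    return s[:4] == TARGET

def whitebox_inputs():
    # One input per path: for path i, the first i branches are taken and
    # branch i is falsified (any byte != TARGET[i] works; '*' here).
    inputs = [TARGET[:i] + b"*" * (4 - i) for i in range(4)]
    inputs.append(TARGET)  # fifth path: all branches true -> abort()
    return inputs
```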
  • 70. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Our whitebox fuzzer generates one test input per path. • Most effective! Covers all statements, branches, paths, and bugs! • Discovers the bug after 5 inputs. efficiency » example • Our generational blackbox fuzzer generates a random input of length 4. • Discovers the bug after ((2⁻⁸)⁴)⁻¹ = 2³² ≈ 4 billion inputs, in expectation. void crashme(char s[4]) { if (s[0] == 'b') if (s[1] == 'a') if (s[2] == 'd') if (s[3] == '!') abort(); } 29 test case test oracle test input effectiveness efficiency scalability ^
  • 71. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Our whitebox fuzzer generates one test input per path. • Most effective! Covers all statements, branches, paths, and bugs! • Discovers the bug after 5 inputs. efficiency » example • Our generational blackbox fuzzer generates a random input of length 4. • Discovers the bug after ((2⁻⁸)⁴)⁻¹ = 2³² ≈ 4 billion inputs, in expectation. • On my machine, this takes 6.3 seconds. On 100 machines, it takes 63 milliseconds. void crashme(char s[4]) { if (s[0] == 'b') if (s[1] == 'a') if (s[2] == 'd') if (s[3] == '!') abort(); } 29 test case test oracle test input effectiveness efficiency scalability ^
  • 72. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Our whitebox fuzzer generates one test input per path. • Most effective! Covers all statements, branches, paths, and bugs! • Discovers the bug after 5 inputs. efficiency » example • Our generational blackbox fuzzer generates a random input of length 4. • Discovers the bug after ((2⁻⁸)⁴)⁻¹ = 2³² ≈ 4 billion inputs, in expectation. • On my machine, this takes 6.3 seconds. On 100 machines, it takes 63 milliseconds. If our whitebox fuzzer takes too long per input, our blackbox fuzzer outperforms! » There is a maximum time per test input! void crashme(char s[4]) { if (s[0] == 'b') if (s[1] == 'a') if (s[2] == 'd') if (s[3] == '!') abort(); } 29 test case test oracle test input effectiveness efficiency scalability ^
• 73-74. efficiency » example
• Our generational blackbox fuzzer generates a random input of length 4. • Discovers the bug after ((2⁻⁸)⁴)⁻¹ = 2³² ≈ 4 billion inputs, in expectation. • On my machine, this takes 6.3 seconds. On 100 machines, it takes 63 milliseconds.
• Our mutational blackbox fuzzer mutates a random character in a seed. • Started with the seed bad? • Discovers the bug after (4⁻¹ ✕ 2⁻⁸)⁻¹ = 1024 inputs, in expectation.
• Where do we get that seed? Discover it!
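Unlike the generational fuzzer, the mutational fuzzer is cheap enough to actually run. A self-contained sketch (again a hypothetical Python port of `crashme`, with an illustrative one-byte mutator):

```python
import random

def crashme(s: bytes) -> bool:
    """Python port of the C crashme(): True iff the input would trigger abort()."""
    return s[:4] == b"bad!"

def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Replace one randomly chosen byte of the seed with a random byte value."""
    i = rng.randrange(len(seed))
    return seed[:i] + bytes([rng.randrange(256)]) + seed[i + 1:]

def mutational_fuzz(seed: bytes, rng: random.Random, max_trials: int = 1_000_000):
    """Return the number of inputs generated before the first crash."""
    for n in range(1, max_trials + 1):
        if crashme(mutate(seed, rng)):
            return n
    return None

# P(crash per input) = 1/4 (right position) * 1/256 (right byte) = 1/1024,
# so starting from the seed b"bad?" we expect ~1024 inputs until the bug.
print(mutational_fuzz(b"bad?", random.Random(0)))
```

With the seed `bad?`, only the last byte must be mutated to `!`, which is why the expectation drops from ~4 billion to 1024.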
• 75. efficiency » example
• Our greybox fuzzer is mutational but adds inputs that increase coverage to the seed corpus.
Seed corpus          | “Interesting” input | Expected #inputs
****                 | b***                | (1 ✕ 4⁻¹ ✕ 2⁻⁸)⁻¹ = 1024
**** b***            | ba**                | (1/2 ✕ 4⁻¹ ✕ 2⁻⁸)⁻¹ = 2048
**** b*** ba**       | bad*                | (1/3 ✕ 4⁻¹ ✕ 2⁻⁸)⁻¹ = 3072
**** b*** ba** bad*  | bad!                | (1/4 ✕ 4⁻¹ ✕ 2⁻⁸)⁻¹ = 4096
                     |                     | Total: 10240
[CCS’16] “Coverage-based Greybox Fuzzing as Markov Chain”, Böhme, Pham, and Roychoudhury
  • 76. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Our greybox fuzzer is mutational but adds inputs that increase coverage. • Started with an random test input **** efficiency » example **** b*** (1✕ 4-1 ✕ 2-8)-1
 = 1024 **** b*** ba** (1/2 ✕ 4-1 ✕ 2-8)-1
 = 2048 **** b*** ba** bad* (1/3 ✕ 4-1 ✕ 2-8)-1
 = 3072 **** b*** ba** bad* bad! (1/4 ✕ 4-1 ✕ 2-8)-1
 = 4096 Total: 10240 [CCS’16] “Coverage-based Greybox Fuzzing as Markov Chain” Böhme Pham, and Roychoudhury void crashme(char s[4]) { if (s[0] == 'b') if (s[1] == 'a') if (s[2] == 'd') if (s[3] == '!') abort(); } test case test oracle test input effectiveness efficiency scalability ^
  • 77. Marcel Böhme, Monash University · ECOOP/ISSTA’21 Summer School · Foundations of Software Testing • Our greybox fuzzer is mutational but adds inputs that increase coverage. • Started with an random test input **** • Discovers the bug after generating 10k inputs. • On my machine, 150 milliseconds. efficiency » example **** b*** (1✕ 4-1 ✕ 2-8)-1
 = 1024 **** b*** ba** (1/2 ✕ 4-1 ✕ 2-8)-1
 = 2048 **** b*** ba** bad* (1/3 ✕ 4-1 ✕ 2-8)-1
 = 3072 **** b*** ba** bad* bad! (1/4 ✕ 4-1 ✕ 2-8)-1
 = 4096 Total: 10240 [CCS’16] “Coverage-based Greybox Fuzzing as Markov Chain” Böhme Pham, and Roychoudhury void crashme(char s[4]) { if (s[0] == 'b') if (s[1] == 'a') if (s[2] == 'd') if (s[3] == '!') abort(); } test case test oracle test input effectiveness efficiency scalability ^
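A minimal coverage-guided loop in this spirit might look as follows. This is a sketch, not the algorithm from the paper: `run` is a hypothetical Python port of `crashme` where “coverage” is simply the number of nested if-conditions passed, and a mutant is “interesting” when it passes more of them than any seed so far:

```python
import random

def run(s: bytes):
    """Execute the crashme logic; return (crashed, #nested conditions passed)."""
    target = b"bad!"
    cov = 0
    for i in range(4):
        if i < len(s) and s[i] == target[i]:
            cov += 1
        else:
            return False, cov
    return True, cov

def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Replace one randomly chosen byte of the seed with a random byte value."""
    i = rng.randrange(len(seed))
    return seed[:i] + bytes([rng.randrange(256)]) + seed[i + 1:]

def greybox_fuzz(seed: bytes, rng: random.Random, max_trials: int = 1_000_000):
    """Mutational fuzzing with a growing corpus of coverage-increasing inputs."""
    corpus, best_cov = [seed], 0
    for n in range(1, max_trials + 1):
        candidate = mutate(rng.choice(corpus), rng)  # uniform seed selection
        crashed, cov = run(candidate)
        if crashed:
            return n
        if cov > best_cov:          # "interesting": increases coverage
            best_cov = cov
            corpus.append(candidate)
    return None

print(greybox_fuzz(b"****", random.Random(1)))
```

Because each discovered prefix (`b***`, `ba**`, `bad*`) is kept as a seed, the fuzzer climbs the nested conditions one at a time instead of guessing all four bytes at once.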
• 78. efficiency » example
• If we prefer seeds on low-probability paths, it only takes 4k inputs (55 ms): each of b***, ba**, bad*, and bad! is now expected after (1 ✕ 4⁻¹ ✕ 2⁻⁸)⁻¹ = 1024 inputs; Total: 4096.
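The two seed-selection schedules differ only in the probability of picking the frontier seed, so the expected totals from the two tables can be checked with a few lines of exact arithmetic (an illustrative calculation, not code from the paper):

```python
from fractions import Fraction

# P(next "interesting" input per trial) =
#   P(pick frontier seed) * P(right position) * P(right byte)
POS, BYTE = Fraction(1, 4), Fraction(1, 256)

# Uniform selection from a corpus of k seeds: frontier picked with prob 1/k.
uniform = sum(1 / (Fraction(1, k) * POS * BYTE) for k in range(1, 5))

# Schedule preferring seeds on low-probability paths: frontier picked with prob 1.
focused = sum(1 / (1 * POS * BYTE) for _ in range(4))

print(uniform, focused)  # 10240 4096
```

The 2.5x gap (10240 vs 4096) is exactly the speedup the slide attributes to the smarter power schedule.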
• 79. efficiency
• Limitation of “smarter” testing: if your test input generation is not fast enough, even simple (guided) random test input generation will find more bugs in your limited time budget.
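For the running example, this break-even point can be made explicit (illustrative arithmetic using the numbers quoted earlier: the whitebox fuzzer needs 5 inputs, the blackbox fuzzer finds the bug in 6.3 seconds on one machine):

```python
# Blackbox: ~2**32 inputs in 6.3 seconds on one machine.
blackbox_total_seconds = 6.3
# Whitebox: one input per path, 5 inputs in total.
whitebox_inputs = 5

# Maximum time the whitebox fuzzer may spend per generated input before
# the blackbox fuzzer finds the bug first on the same single machine:
max_seconds_per_input = blackbox_total_seconds / whitebox_inputs
print(round(max_seconds_per_input, 2))  # 1.26
```

If constraint solving takes longer than ~1.26 s per input here, blind random generation wins; on 100 machines the budget shrinks by another factor of 100.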
• 80. scalability
Okay. Let’s take the most popular technique and distribute it across many machines. How does bug finding scale with the number of machines?
[ESEC/FSE’20] “Fuzzing: On the Exponential Cost of Vulnerability Discovery”, Böhme and Falk
• 81. • Google has been fuzzing OSS for about 4 years: 25k machines; 11k+ bugs in 160+ OSS projects; 16k+ bugs in the Chrome browser. • The discovery rate decreases: as more bugs are fixed, fewer new bugs are found.
• 82. • Google has been fuzzing OSS for about 4 years. • Suppose Google now employs 100x more machines. • In 1 month on 2.5 million machines, they find 100 more vulns.
• 83. How long do you expect it would take to find all of these known vulnerabilities on *250 million* machines?
• 84. Given the same non-deterministic fuzzer, finding the same bugs linearly faster requires linearly more machines. Do you agree?
• 85. Now, how many undiscovered vulns do you expect to find in 1 month on 250 million machines?
• 86. Given the same non-deterministic fuzzer, finding linearly more new bugs in c months requires exponentially more machines.
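The contrast between the two claims can be captured in a toy model (purely illustrative numbers; the empirical observation on the next slide is that each additional new bug in the same time budget costs roughly a doubling of machines):

```python
def machines_needed(base_machines: int, extra_new_bugs: int) -> int:
    """Machines needed to find `extra_new_bugs` MORE new bugs in the same
    time budget, if each additional new bug costs a doubling of machines.
    (Toy model of the exponential-cost observation, not a fitted law.)"""
    return base_machines * 2 ** extra_new_bugs

# Finding the SAME bugs twice as fast is linear: 2x machines suffice.
# Finding MORE new bugs in the same time is exponential:
print(machines_needed(25_000, 1))   # 50000
print(machines_needed(25_000, 10))  # 25600000
```

Ten more new bugs per month already demands a thousandfold fleet, which is why simply adding machines does not keep the discovery rate up.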
• 87. scalability [Chart: number of new bugs found within 24 hrs (1–9) against the number of machines (1, 2, 4, 8, ..., 512): each doubling of the number of machines yields roughly one additional new bug.]
• 89. software testing problem: Generate at least one failing* test case for as many coverage elements** as possible in the program. (* assuming the perfect test oracle. ** assuming more coverage is correlated with better bug finding.)
• 90-93. Recap:
• test case » system testing: Test Input → Program → Expected Output?
• test case » unit testing: Unit Test Harness → Input → Expected?
• test oracle » metamorphic testing: we may know how outputs relate to each other for all inputs. Mathematical laws: sin(π − x) = sin(x); x + y = y + x. Round-trip properties: x = unzip(zip(x)); x = uncompress(compress(x)); x = decrypt(encrypt(x)); x = pickle.loads(pickle.dumps(x)) # serialization / parsing
• test input » automatic generation: Blackbox: random test input generation, no program information. Greybox: guided random test input generation, program feedback. Whitebox: systematic test input generation, analyzes program code. Software testing as an optimization problem (search-based software testing) or a constraint satisfaction problem (symbolic execution).
• effectiveness » cover all the things
• efficiency: if your test input generation is not fast enough, even simple (guided) random test input generation will find more bugs in your limited time budget.
• scalability: given the same non-deterministic fuzzer, finding linearly more new bugs requires exponentially more machines.
• 94. If you want to take a deeper dive: * Attend workshops and talks @ ECOOP/ISSTA’21 this week! * Read our interactive textbook: The Fuzzing Book * Read our IEEE Software article: “Fuzzing: Challenges and Reflections” * Apply for a PhD / PostDoc in my group at MPI-SP, Bochum, Germany. Web: https://mboehme.github.com Twitter: @mboehme_