SlideShare a Scribd company logo
1 of 57
Download to read offline
PhD Thesis Defense
Presented by Marcel Boehme
Directed

Greybox Fuzzing
Marcel Böhme Thuan Pham Abhik RoychoudhuryM.-D. Nguyen
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Motivation
• Automated vulnerability detection techniques

1. Blackbox Fuzzing (no program analysis, no feedback)

2. Whitebox Fuzzing (mostly program analysis)

3. Greybox Fuzzing (no program analysis, but coverage feedback)
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Motivation
• Automated vulnerability detection techniques

1. Blackbox Fuzzing (no program analysis, no feedback)
📄 Model-Based
Blackbox
Fuzzing
Peach, Spike …
Seed Input
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Motivation
• Automated vulnerability detection techniques

1. Blackbox Fuzzing (no program analysis, no feedback)
📄 Model-Based
Blackbox
Fuzzing
Peach, Spike …
Seed Input
📄
📄
📄
Valid input
Semi-valid input
Semi-valid input
Mutated Inputs
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Motivation
• Automated vulnerability detection techniques

1. Blackbox Fuzzing (no program analysis, no feedback)
📄 Model-Based
Blackbox
Fuzzing
Input model
Peach, Spike …
Seed Input
📄
📄
📄
Valid input
Semi-valid input
Semi-valid input
Mutated Inputs
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Motivation
• Automated vulnerability detection techniques

1. Blackbox Fuzzing (no program analysis, no feedback)

2. Whitebox Fuzzing (mostly program analysis)
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Motivation
• Automated vulnerability detection techniques

1. Blackbox Fuzzing (no program analysis, no feedback)

2. Whitebox Fuzzing (mostly program analysis)

3. Greybox Fuzzing (no program analysis, but coverage feedback)
📄 📄📄 📄Greybox
Fuzzing
…
Seed Input
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Motivation
• Automated vulnerability detection techniques

1. Blackbox Fuzzing (no program analysis, no feedback)

2. Whitebox Fuzzing (mostly program analysis)

3. Greybox Fuzzing (no program analysis, but coverage feedback)
📄 📄📄 📄Greybox
Fuzzing
…
📄📄 Enqueue
Seed Input Mutated Inputs
Queue of 

“interesting” seeds
Retain inputs

that increase

coverage!
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Motivation
• Automated vulnerability detection techniques

1. Blackbox Fuzzing (no program analysis, no feedback)

2. Whitebox Fuzzing (mostly program analysis)

3. Greybox Fuzzing (no program analysis, but coverage feedback)
📄 📄📄 📄Greybox
Fuzzing
…
📄📄 EnqueueDequeue
Seed Input Mutated Inputs
Queue of 

“interesting” seeds
Retain inputs

that increase

coverage!
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Motivation
• Greybox Fuzzing is frequently used
• State-of-the-art in automated vulnerability detection

• Extremely efficient coverage-based input generation

• All program analysis before/at instrumentation time.

• Start with a seed corpus, choose a seed file, fuzz it.

• Add to corpus only if new input increases coverage.
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Motivation
• Greybox Fuzzing is frequently used
• State-of-the-art in automated vulnerability detection

• Extremely efficient coverage-based input generation

• All program analysis before/at instrumentation time.

• Start with a seed corpus, choose a seed file, fuzz it.

• Add to corpus only if new input increases coverage.

• Cannot be directed!
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Motivation
• Directed Fuzzing has many applications
• Patch Testing: reach changed statements
• Crash Reproduction: exercise stack trace

• SA Report Verification: reach “dangerous” location

• Information Flow Detection: exercise source-sink pairs
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Motivation
• Directed Fuzzing: classical constraint satisfaction prob.

• Program analysis to identify program paths 

that reach given program locations.

• Symbolic Execution to derive path conditions 

for any of the identified paths.

• Constraint Solving to find an input that

• satisfies the path condition and thus

• reaches a program location that was given.
φ1 = (x>y)∧(x+y>10)

φ2 = ¬(x>y)∧(x+y>10)
x > y
a = x a = y
x+y>10
b = a
return b
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Motivation
• Directed Fuzzing: classical constraint satisfaction prob.

• Program analysis to identify program paths 

that reach given program locations.

• Symbolic Execution to derive path conditions 

for any of the identified paths.

• Constraint Solving to find an input that

• satisfies the path condition and thus

• reaches a program location that was given.
φ1 = (x>y)∧(x+y>10)

φ2 = ¬(x>y)∧(x+y>10)
x > y
a = x a = y
x+y>10
b = a
return b
Requires

heavy-weight

machinery!
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Overview
• Directed Fuzzing as optimisation problem!

1. Instrumentation Time:

1. Extract call graph (CG) and control-flow graphs (CFGs).

2. For each BB, compute distance to target locations.

3. Instrument program to aggregate distance values.
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Overview
• Directed Fuzzing as optimisation problem!

1. Instrumentation Time:

1. Extract call graph (CG) and control-flow graphs (CFGs).

2. For each BB, compute distance to target locations.

3. Instrument program to aggregate distance values.

2. Runtime, for each input

1. collect coverage and distance information, and

2. decide how long to be fuzzed based on distance.

• If input is closer to the targets, it is fuzzed for longer.

• If input is further away from the targets, it is fuzzed for shorter.
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Instrumentation
• Function-level target distance using call graph (CG)
main
a b
cd e
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Instrumentation
• Function-level target distance using call graph (CG)

1. Identify target functions in CG
main
a b
cd e
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
main
a b
cd e
Instrumentation
• Function-level target distance using call graph (CG)

1. Identify target functions in CG

2. For each function, compute the

harmonic mean of the length 

of the shortest path to targets
1
2
0
2
3N/A
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Instrumentation
• Function-level target distance using call graph (CG)

• BB-level target distance using CFG
CFG for function b
main
a b
cd e
1
2
0
2
3N/A
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Instrumentation
• Function-level target distance using call graph (CG)

• BB-level target distance using control-flow graph (CFG)

1. Identify target BBs and

assign distance 0

(none in function b)
CFG for function b
main
a b
cd e
1
2
0
2
3N/A
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Instrumentation
• Function-level target distance using call graph (CG)

• BB-level target distance using control-flow graph (CFG)

1. Identify target BBs and

assign distance 0

2. Identify BBs that

call functions
CFG for function b
main
a b
cd e
1
2
0
2
3N/A
c
a
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Instrumentation
• Function-level target distance using call graph (CG)

• BB-level target distance using control-flow graph (CFG)

1. Identify target BBs and

assign distance 0

2. Identify BBs that

call functions and

assign 10*FLTD
CFG for function b
main
a b
cd e
1
2
0
2
3N/A
c
a 10
30
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Instrumentation
• Function-level target distance using call graph (CG)

• BB-level target distance using control-flow graph (CFG)

1. Identify target BBs and

assign distance 0

2. Identify BBs that

call functions and

assign 10*FLTD

3. For each BB, compute harmonic

mean of (length of shortest path to

any function-calling BB + 10*FLTD).
CFG for function b
c
a 10
30
[(1+30)-1+(2+10)-1]-1
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Instrumentation
• Function-level target distance using call graph (CG)

• BB-level target distance using control-flow graph (CFG)

1. Identify target BBs and

assign distance 0

2. Identify BBs that

call functions and

assign 10*FLTD

3. For each BB, compute harmonic

mean of (length of shortest path to

any function-calling BB + 10*FLTD).
CFG for function b
8.7
11
10
30
13
12
N/A
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Runtime
• Function-level target distance using call graph (CG)

• BB-level target distance using control-flow graph (CFG)

• Seed distance from instrumented binary
CFG for function b
8.7
11
10
30
13
12
N/A
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Runtime
• Function-level target distance using call graph (CG)

• BB-level target distance using control-flow graph (CFG)

• Seed distance from instrumented binary

• Two 64-bit shared memory entries

• Aggregated BB-level distance values

• Number of executed BBs
Seed Distance: 19.4 

= (8.7+30)/2
8.7
11
10
30
13
12
N/A
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Runtime
• Function-level target distance using call graph (CG)

• BB-level target distance using control-flow graph (CFG)

• Seed distance from instrumented binary

• Two 64-bit shared memory entries

• Aggregated BB-level distance values

• Number of executed BBs
8.7
11
10
30
13
12
N/A
Seed Distance: 10.4 

= (8.7+11+10+12)/4
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Runtime
• Function-level target distance using call graph (CG)

• BB-level target distance using control-flow graph (CFG)

• Seed distance from instrumented binary

• Two 64-bit shared memory entries

• Aggregated BB-level distance values

• Number of executed BBs
8.7
11
10
30
13
12
N/A
Seed Distance: 10.4 

= (8.7+11+10+12)/4
Now that we know how to

compute seed distance,
— let’s minimise it! —
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
• Background: Coverage-based Greybox Fuzzing
Directed Fuzzing as 

Optimisation Problem
Seed File
📄
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
•
Mutation Operators:
• Bitflips
• BoundaryValues 

(0,1,-1,INT_MAX,INT_MIN)
• Simple arithmetics

(add/subtract 1)
• Block deletion
• Block insertion
• Background: Coverage-based Greybox Fuzzing
Directed Fuzzing as 

Optimisation Problem
Seed File
📄
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
• Background: Coverage-based Greybox Fuzzing
Directed Fuzzing as 

Optimisation Problem
📄 📄📄 📄Greybox
Fuzzing
…
Seed Input
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
• Background: Coverage-based Greybox Fuzzing
Directed Fuzzing as 

Optimisation Problem
📄 📄📄 📄Greybox
Fuzzing
…
📄📄 Enqueue
Seed Input Mutated Inputs
Queue of 

“interesting” seeds
Retain inputs

that increase

coverage!
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
• Background: Coverage-based Greybox Fuzzing
Directed Fuzzing as 

Optimisation Problem
📄 📄📄 📄Greybox
Fuzzing
…
📄📄 EnqueueDequeue
Seed Input Mutated Inputs
Queue of 

“interesting” seeds
Retain inputs

that increase

coverage!
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
• Background: Coverage-based Greybox Fuzzing
Directed Fuzzing as 

Optimisation Problem
📄📄
Queue of 

“interesting” seeds
📄📄 📄
energy
low energy

- generated less

test inputs!
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
• Background: Coverage-based Greybox Fuzzing
Directed Fuzzing as 

Optimisation Problem
📄📄
Queue of 

“interesting” seeds
📄📄 📄
high energy

- generated more

test inputs!📄
energy
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Directed Fuzzing as 

Optimisation Problem
• Background: Coverage-based Greybox Fuzzing

• Seed’s energy:

• Number of inputs generated when chosen for fuzzing

• Local property: each seed has its own energy

• Power schedule: 

• Assigns energy to seeds according to a pre-defined formula

★ Boosted Greybox Fuzzing (AFLFast CCS’16)

• Assign more energy to seeds exercising low-frequency paths.
★ Directed Greybox Fuzzing (AFLGo CCS’17)
• Assign more energy to seeds that a closer to the given targets!
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Directed Fuzzing as 

Optimisation Problem
• Background: Coverage-based Greybox Fuzzing

• Seed’s energy:

• Number of inputs generated when chosen for fuzzing

• Local property: each seed has its own energy

• Power schedule: 

• Assigns energy to seeds according to a pre-defined formula

★ Boosted Greybox Fuzzing (AFLFast CCS’16)

• Assign more energy to seeds exercising low-frequency paths.
★ Directed Greybox Fuzzing (AFLGo CCS’17)
• Assign more energy to seeds that a closer to the given targets!
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
• Directed Greybox Fuzzing

• Assign more energy to seeds that a closer to the given targets!

• Problem (Stochastic Gradient Descent)
• If we always assign more energy to closer seeds, 

we typically reach only a local minimum,

but never a global minimum distance!

• Solution (Simulated Annealing)

• Sometimes assign more energy to further-away seeds!

• Approaches global minimum distance.
Directed Fuzzing as 

Optimisation Problem
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
• Directed Greybox Fuzzing

• Assign more energy to seeds that a closer to the given targets!

• Problem (Stochastic Gradient Descent)
• If we always assign more energy to closer seeds, 

we typically reach only a local minimum,

but never a global minimum distance!

• Solution (Simulated Annealing)

• Sometimes assign more energy to further-away seeds!

• Approaches global minimum distance.
Directed Fuzzing as 

Optimisation Problem
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Directed Fuzzing as 

Optimisation Problem
• Simulated Annealing (SA)

• Exploration phase:

• Energy of closer seeds similar to energy

of further-away seeds

• Exploitation phase:

• Energy of closer seeds is assigned 

to be higher and higher

• Energy of further-away seeds is 

assigned to be lower and lower

• We are increasing the “importance” 

of seed distance over time.
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Directed Fuzzing as 

Optimisation Problem
• Simulated Annealing (SA)

• Annealing from metallurgy: control the cooling of material

to reduce defects (e.g., cracks or bubbles) in the material.

• Temperature T ∈ [0,1] specifies “importance” of distance.

• At T=1, exploration (normal AFL)

• At T=0, exploitation (gradient descent)

• Cooling schedule controls (global) temperature

• Classically, exponential cooling.
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
• Integrating Simulated Annealing as power schedule

• In the beginning (t = 0min), 

assign the same energy

to all seeds.
Directed Fuzzing as 

Optimisation Problem
dene the normalized seed
set of target locations Tb .
s. This trace contains the
ed distance d(s,Tb ) as
(m,Tb )
|
(3)
set S of seeds to fuzz. We
,Tb ) as the dierence be-
he minimum seed distance
by the dierence between
0.00
0.25
0.50
0.75
1.00
0.00 0.25 0.50 0.75 1.00
Distance d(s,Tb)
Energyp(s,Tb)
t = 0min t =10min t = 80min
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
• Integrating Simulated Annealing as power schedule

• In the beginning (t = 0min), 

assign the same energy

to all seeds.

• Later (t=10min), assign

a bit more energy to

seeds that are closer.

Directed Fuzzing as 

Optimisation Problem
dene the normalized seed
set of target locations Tb .
s. This trace contains the
ed distance d(s,Tb ) as
(m,Tb )
|
(3)
set S of seeds to fuzz. We
,Tb ) as the dierence be-
he minimum seed distance
by the dierence between
0.00
0.25
0.50
0.75
1.00
0.00 0.25 0.50 0.75 1.00
Distance d(s,Tb)
Energyp(s,Tb)
t = 0min t =10min t = 80min
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
• Integrating Simulated Annealing as power schedule

• In the beginning (t = 0min), 

assign the same energy

to all seeds.

• Later (t=10min), assign

a bit more energy to

seeds that are closer.

• At exploitation (t=80min),

assign maximal energy to

seeds that are closest.
Directed Fuzzing as 

Optimisation Problem
dene the normalized seed
set of target locations Tb .
s. This trace contains the
ed distance d(s,Tb ) as
(m,Tb )
|
(3)
set S of seeds to fuzz. We
,Tb ) as the dierence be-
he minimum seed distance
by the dierence between
0.00
0.25
0.50
0.75
1.00
0.00 0.25 0.50 0.75 1.00
Distance d(s,Tb)
Energyp(s,Tb)
t = 0min t =10min t = 80min
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
• Integrating Simulated Annealing as power schedule

• In the beginning (t = 0min), 

assign the same energy

to all seeds.

• Later (t=10min), assign

a bit more energy to

seeds that are closer.

• At exploitation (t=80min),

assign maximal energy to

seeds that are closest.
Directed Fuzzing as 

Optimisation Problem
0.00
0.25
0.50
0.75
1.00
0.00 0.25 0.50 0.75 1.00
Distance d(s,Tb)
Energyp(s,Tb)
t = 0min t =10min t = 80min
0.00
0.25
0.50
0.75
1.00
0 20 40 60 80
Current time t (in min)
Energyp(s,Tb)
d = 1 d = 0.5 d = 0
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Results
• Patch Testing: Reach changed statements

• State-of-the-art in patch testing

• KATCH (based on Klee symbolic exec. tool)

• Experimental Setup

• Reuse original KATCH-benchmark

• Measure patch coverage (#changed BBs reached)

• Measure vuln. detection (#errors discovered)
KATCH: High-Coverage Testing of Software Patches
Paul Dan Marinescu
Department of Computing
Imperial College London, UK
p.marinescu@imperial.ac.uk
Cristian Cadar
Department of Computing
Imperial College London, UK
c.cadar@imperial.ac.uk
ABSTRACT
One of the distinguishing characteristics of software systems
is that they evolve: new patches are committed to software
repositories and new versions are released to users on a
continuous basis. Unfortunately, many of these changes
bring unexpected bugs that break the stability of the system
or a↵ect its security. In this paper, we address this problem
using a technique for automatically testing code patches.
Our technique combines symbolic execution with several
novel heuristics based on static and dynamic program anal-
!#$%'$!(
Figure 1: KATCH is integrated in the software
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Results
• Patch Testing: Reach changed statements

• State-of-the-art in patch testing

• KATCH (based on Klee symbolic exec. tool)

• Patch Coverage (#changed BBs reached)
is unrelated to the concept [32]. We make a conscious eort to
explain the observed phenomena and distinguish conceptual from
technical origins. Moreover, we encourage the reader to consider
the perspective of a security researcher who is actually handling
these tools to establish whether there exists a vulnerability.
5.1 Patch Coverage
We begin by analyzing the patch coverage achieved by both K
and AFLG as measured by the number of previously uncovered
basic blocks that were changed in the respective patch.
Table 1: Patch coverage results showing the number of previ-
ously uncovered targets that K and AFLG could cover
in the stipulated time budget, respectively.
#Changed #Uncovered
Basic Blocks Changed BBs K AFLG
Binutils 852 702 135 159
Diutils 166 108 63 64
Sum 1018 810 198 223
complex inp
executed onl
for certain a
dened stru
higher-quali
approach [2
To unders
we investiga
we can see
cannot cove
cover. We at
vidual streng
dicult con
be dicult t
quickly exp
stuck in a pa
AFLG and
282 targets
cover indi
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Results
• Patch Testing: Reach changed statements

• State-of-the-art in patch testing

• KATCH (based on Klee symbolic exec. tool)

• Patch Coverage (#changed BBs reached)

• While we would expect Klee to take a substantial lead, 

AFLGo outperforms KATCH in terms of patch coverage.
is unrelated to the concept [32]. We make a conscious eort to
explain the observed phenomena and distinguish conceptual from
technical origins. Moreover, we encourage the reader to consider
the perspective of a security researcher who is actually handling
these tools to establish whether there exists a vulnerability.
5.1 Patch Coverage
We begin by analyzing the patch coverage achieved by both K
and AFLG as measured by the number of previously uncovered
basic blocks that were changed in the respective patch.
Table 1: Patch coverage results showing the number of previ-
ously uncovered targets that K and AFLG could cover
in the stipulated time budget, respectively.
#Changed #Uncovered
Basic Blocks Changed BBs K AFLG
Binutils 852 702 135 159
Diutils 166 108 63 64
Sum 1018 810 198 223
complex inp
executed onl
for certain a
dened stru
higher-quali
approach [2
To unders
we investiga
we can see
cannot cove
cover. We at
vidual streng
dicult con
be dicult t
quickly exp
stuck in a pa
AFLG and
282 targets
cover indi
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Results
• Patch Testing: Reach changed statements

• State-of-the-art in patch testing

• KATCH (based on Klee symbolic exec. tool)

• Patch Coverage (#changed BBs reached)

• While we would expect Klee to take a substantial lead, 

AFLGo outperforms KATCH in terms of patch coverage.

• BUT: Together they cover 42% and 26% 

more than AFLGo and KATCH individually. 

They complement each other!

oject Tools diff, sdiff, diff3, cmp
ogram Size 42,930 LoC
n Commits 175 commits from Nov’09–May’12
GNU Diutils
oject Tools
addr2line, ar, cxxfilt, elfedit, nm,
objcopy, objdump, ranlib, readelf
size, strings, strip
ogram Size 68,830 LoC + 800kLoC from libraries
n Commits 181 commits from Apr’11–Aug’12
GNU Binutils
K — 59 139 84 — AFLG
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Results
• Patch Testing: Reach changed statements

• State-of-the-art in patch testing

• KATCH (based on Klee symbolic exec. tool)

• Patch Coverage (#changed BBs reached)

• Vulnerability Detection (#errors discovered)
As future work, we are planning to integrate symbolic-execution-
based directed whitebox fuzzing and directed greybox fuzzing to
achieve a directed fuzzing technique that is both very eective and
very ecient in terms of reaching pre-specied target locations.
We believe that such an integration would be superior to each
technique individually. An integrated directed fuzzing technique
that leverages both symbolic execution and search as optmization
problem would be able to draw on their combined strengths to
mitigate their indiviual weaknesses. Driller [38] is an example of
an (undirected) fuzzer that integrates the AFL greybox fuzzer and
the K whitebox fuzzer.
5.2 Vulnerability Detection
Table 2: Showing the number of previously unreported bugs
found by AFLG (reported Apr’2017) in addition to the num-
ber of bug reports for K (reported Feb’2013).
K AFLG
#Reports #Reports14 #New Reports #CVEs
Binutils 7 4 12 7
Diutils 0 N/A 1 0
Sum 7 4 13 7
AFLG found 13 previously unreported bugs in addition to 4 of
Table 3: Bug rep
Report-ID
Binutils
21408
21409
21412
21414
21415
21417
Diutils http://lists.g
We attribute m
to the eciency of
program analysis a
magnitute more in
is the runtime che
error detection mec
the program when
we instrumented o
Sanitizer (ASAN). I
e.g., by reading be
writing memory th
a SEGFAULT even
fuzzer uses this sig
for a generated in
(i.e., interprets) th
and uses constrain
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Results
• Patch Testing: Reach changed statements

• State-of-the-art in patch testing

• KATCH (based on Klee symbolic exec. tool)

• Patch Coverage (#changed BBs reached)

• Vulnerability Detection (#errors discovered)

• AFLGo found 13 previously unreported bugs (7 CVEs) 

in addition to 4 of the 7 bugs that were found by KATCH.
As future work, we are planning to integrate symbolic-execution-
based directed whitebox fuzzing and directed greybox fuzzing to
achieve a directed fuzzing technique that is both very eective and
very ecient in terms of reaching pre-specied target locations.
We believe that such an integration would be superior to each
technique individually. An integrated directed fuzzing technique
that leverages both symbolic execution and search as optmization
problem would be able to draw on their combined strengths to
mitigate their indiviual weaknesses. Driller [38] is an example of
an (undirected) fuzzer that integrates the AFL greybox fuzzer and
the K whitebox fuzzer.
5.2 Vulnerability Detection
Table 2: Showing the number of previously unreported bugs
found by AFLG (reported Apr’2017) in addition to the num-
ber of bug reports for K (reported Feb’2013).
K AFLG
#Reports #Reports14 #New Reports #CVEs
Binutils 7 4 12 7
Diutils 0 N/A 1 0
Sum 7 4 13 7
AFLG found 13 previously unreported bugs in addition to 4 of
Table 3: Bug rep
Report-ID
Binutils
21408
21409
21412
21414
21415
21417
Diutils http://lists.g
We attribute m
to the eciency of
program analysis a
magnitute more in
is the runtime che
error detection mec
the program when
we instrumented o
Sanitizer (ASAN). I
e.g., by reading be
writing memory th
a SEGFAULT even
fuzzer uses this sig
for a generated in
(i.e., interprets) th
and uses constrain
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Results
• Crash Reproduction: Exercise stack trace

• State-of-the-art in crash reproduction

• BugRedux (based on Klee symbolic exec. tool)

• Experimental Setup

• Reuse original BugRedux-benchmark

• Determine whether or not crash can be reproduced
Software
developer
Instrumenter
Application Instrumented
application
Crash report
(execution data)
Analyzer
BugRedux
Test input
Debugging
tool
Software
tester
Figure 4. Intuitive high-level view of BUGREDUX.
Test input
Oracle
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Results
• Crash Reproduction: Exercise stack trace

• State-of-the-art in crash reproduction

• BugRedux (based on Klee symbolic exec. tool)

• Experimental Setup

• Reuse original BugRedux-benchmark

• Determine whether or not crash can be reproduced
experimental setup, starting conguration, and timeouts. BugRedux
[18] is a directed whitebox fuzzer based on K, takes as input a
sequence of program statements, and generates as output a test case
that exercises that sequence and crashes the program. It was shown
that BugRedux works best of the complete method-call sequence is
provided that lead to the crash. However, as discussed earlier often
only the stack-trace is available, which does not contain methods
that have already “returned”, i.e., nished execution. Hence, for our
comparison, we set the method-calls in the stack trace as targets.
Despite our request for all subjects from the original dataset, only
a subset of nine subjects could be located for us. For two subjects
(exim, xmail), we could not obtain the stack-trace that would specify
the target locations. Specically, the crash in exim can only be
reproduced on 32bit architectures while the crash in xmail overows
the stack such that the stack-trace is overridden. The results for the
remaining seven subjects are shown in Table 6.
Table 6: Bugs reproduced for the original BugRedux subjects
Subjects BugRedux AFLG Comments
sed.fault1 7 7 Takes two les as input
sed.fault2 7 3
grep 7 3
gzip.fault1 7 3
gzip.fault2 7 3
ncompress 3 3
polymorph 3 3
same benchmarks that th
in the original papers [18
The second concern is
a study minimizes system
nal validity for fuzzer exp
However, for our experi
was readily available, suc
Binutils and the K ex
OSS-Fuzz experiments, a
classically provides for th
when comparing two fuzz
seed corpus such that bot
Second, like implementati
faithfully implement the
shown in the comparison
The third concern is co
a test measures what it c
note that results of tool co
grain of salt. An empirical
implementations of two c
selves. Improving the ec
fuzzer may only be a que
lated to the concept [32].
explain the observed phen
technical origins. Moreov
the perspective of a secur
these tools to establish w
AFLGo reproduces 

3 times more crashes!
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Result Summary
• Our directed greybox fuzzer (AFLGo) outperforms 

symbolic execution-based directed fuzzers (KATCH  BugRedux)

• in terms of reaching more target locations and

• in terms of detecting more vulnerabilities,
• on their own, original benchmark sets.
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Conclusion
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Motivation
• Directed Fuzzing: classical constraint satisfaction prob.

• Program analysis to identify program paths that reach given
program locations.

• Symbolic Execution to derive path conditions for any of the
identified paths.

• Constraint Solving to find an input that

• satisfies the path condition and thus

• reaches a program location that was given.
Requires

heavy-weight

machinery!
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Overview
• Directed Fuzzing as optimisation problem!

1. Instrumentation Time:

1. Extract call graph (CG) and control-flow graphs (CFGs).

2. For each BB, compute distance to target locations.

3. Instrument program to aggregate distance values.

2. Runtime, for each input

1. collect coverage and distance information, and

2. decide how long to be fuzzed based on distance.

• If input is closer to the targets, it is fuzzed for longer.

• If input is further away from the targets, it is fuzzed for shorter.
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
• Integrating Simulated Annealing as power schedule

• In the beginning (t = 0min), 

assign the same energy

to all seeds.

• Later (t=10min), assign

a bit more energy to

seeds that are closer.

• At exploitation (t=80min),

assign maximal energy to

seeds that are closest.
Directed Fuzzing as 

Optimisation Problem
0.00
0.25
0.50
0.75
1.00
0.00 0.25 0.50 0.75 1.00
Distance d(s,Tb)
Energyp(s,Tb)
t = 0min t =10min t = 80min
0.00
0.25
0.50
0.75
1.00
0 20 40 60 80
Current time t (in min)
Energyp(s,Tb)
d = 1 d = 0.5 d = 0
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Result Summary
• Our directed greybox fuzzer (AFLGo) outperforms 

symbolic execution-based directed fuzzers (KATCH  BugRedux)

• in terms of reaching more target locations and

• in terms of detecting more vulnerabilities,
• on their own, original benchmark sets.
*Tool comparisons should always be taken with a grain of salt.
*
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Conclusion
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Motivation
• Directed Fuzzing: classical constraint satisfaction prob.

• Program analysis to identify program paths that reach given
program locations.

• Symbolic Execution to derive path conditions for any of the
identified paths.

• Constraint Solving to find an input that

• satisfies the path condition and thus

• reaches a program location that was given.
Requires

heavy-weight

machinery!
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Overview
• Directed Fuzzing as optimisation problem!

1. Instrumentation Time:

1. Extract call graph (CG) and control-flow graphs (CFGs).

2. For each BB, compute distance to target locations.

3. Instrument program to aggregate distance values.

2. Runtime, for each input

1. collect coverage and distance information, and

2. decide how long to be fuzzed based on distance.

• If input is closer to the targets, it is fuzzed for longer.

• If input is further away from the targets, it is fuzzed for shorter.
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
• Integrating Simulated Annealing as power schedule

• In the beginning (t = 0min), 

assign the same energy

to all seeds.

• Later (t=10min), assign

a bit more energy to

seeds that are closer.

• At exploitation (t=80min),

assign maximal energy to

seeds that are closest.
Directed Fuzzing as 

Optimisation Problem
0.00
0.25
0.50
0.75
1.00
0.00 0.25 0.50 0.75 1.00
Distance d(s,Tb)
Energyp(s,Tb)
t = 0min t =10min t = 80min
0.00
0.25
0.50
0.75
1.00
0 20 40 60 80
Current time t (in min)
Energyp(s,Tb)
d = 1 d = 0.5 d = 0
Presented by Abhik Roychoudhury
Directed Greybox Fuzzing
Result Summary
• Our directed greybox fuzzer (AFLGo) outperforms 

symbolic execution-based directed fuzzers (KATCH  BugRedux)

• in terms of reaching more target locations and

• in terms of detecting more vulnerabilities,
• on their own, original benchmark sets.
*Tool comparisons should always be taken with a grain of salt.
*
Questions?

More Related Content

What's hot

Return to dlresolve
Return to dlresolveReturn to dlresolve
Return to dlresolveAngel Boy
 
Heap exploitation
Heap exploitationHeap exploitation
Heap exploitationAngel Boy
 
Pwning mobile apps without root or jailbreak
Pwning mobile apps without root or jailbreakPwning mobile apps without root or jailbreak
Pwning mobile apps without root or jailbreakAbraham Aranguren
 
台科逆向簡報
台科逆向簡報台科逆向簡報
台科逆向簡報耀德 蔡
 
関数プログラマから見たPythonと機械学習
関数プログラマから見たPythonと機械学習関数プログラマから見たPythonと機械学習
関数プログラマから見たPythonと機械学習Masahiro Sakai
 
2600Hz - Tuning Kazoo to 10,000 Handsets - KazooCon 2015
2600Hz - Tuning Kazoo to 10,000 Handsets - KazooCon 20152600Hz - Tuning Kazoo to 10,000 Handsets - KazooCon 2015
2600Hz - Tuning Kazoo to 10,000 Handsets - KazooCon 20152600Hz
 
CPU / GPU高速化セミナー!性能モデルの理論と実践:理論編
CPU / GPU高速化セミナー!性能モデルの理論と実践:理論編CPU / GPU高速化セミナー!性能モデルの理論と実践:理論編
CPU / GPU高速化セミナー!性能モデルの理論と実践:理論編Fixstars Corporation
 
ZabbixのAPIを使って運用を楽しくする話
ZabbixのAPIを使って運用を楽しくする話ZabbixのAPIを使って運用を楽しくする話
ZabbixのAPIを使って運用を楽しくする話Masahito Zembutsu
 
ROP 輕鬆談
ROP 輕鬆談ROP 輕鬆談
ROP 輕鬆談hackstuff
 
Pwning in c++ (basic)
Pwning in c++ (basic)Pwning in c++ (basic)
Pwning in c++ (basic)Angel Boy
 
ARM Trusted FirmwareのBL31を単体で使う!
ARM Trusted FirmwareのBL31を単体で使う!ARM Trusted FirmwareのBL31を単体で使う!
ARM Trusted FirmwareのBL31を単体で使う!Mr. Vengineer
 
A whirlwind tour of the LLVM optimizer
A whirlwind tour of the LLVM optimizerA whirlwind tour of the LLVM optimizer
A whirlwind tour of the LLVM optimizerNikita Popov
 
統計的学習手法による人検出
統計的学習手法による人検出統計的学習手法による人検出
統計的学習手法による人検出MPRG_Chubu_University
 
TDOH x 台科 pwn課程
TDOH x 台科 pwn課程TDOH x 台科 pwn課程
TDOH x 台科 pwn課程Weber Tsai
 
Virtual machine and javascript engine
Virtual machine and javascript engineVirtual machine and javascript engine
Virtual machine and javascript engineDuoyi Wu
 
Launch the First Process in Linux System
Launch the First Process in Linux SystemLaunch the First Process in Linux System
Launch the First Process in Linux SystemJian-Hong Pan
 
Sigreturn Oriented Programming
Sigreturn Oriented ProgrammingSigreturn Oriented Programming
Sigreturn Oriented ProgrammingAngel Boy
 
C#言語機能の作り方
C#言語機能の作り方C#言語機能の作り方
C#言語機能の作り方信之 岩永
 

What's hot (20)

Return to dlresolve
Return to dlresolveReturn to dlresolve
Return to dlresolve
 
Heap exploitation
Heap exploitationHeap exploitation
Heap exploitation
 
Pwning mobile apps without root or jailbreak
Pwning mobile apps without root or jailbreakPwning mobile apps without root or jailbreak
Pwning mobile apps without root or jailbreak
 
台科逆向簡報
台科逆向簡報台科逆向簡報
台科逆向簡報
 
関数プログラマから見たPythonと機械学習
関数プログラマから見たPythonと機械学習関数プログラマから見たPythonと機械学習
関数プログラマから見たPythonと機械学習
 
2600Hz - Tuning Kazoo to 10,000 Handsets - KazooCon 2015
2600Hz - Tuning Kazoo to 10,000 Handsets - KazooCon 20152600Hz - Tuning Kazoo to 10,000 Handsets - KazooCon 2015
2600Hz - Tuning Kazoo to 10,000 Handsets - KazooCon 2015
 
CPU / GPU高速化セミナー!性能モデルの理論と実践:理論編
CPU / GPU高速化セミナー!性能モデルの理論と実践:理論編CPU / GPU高速化セミナー!性能モデルの理論と実践:理論編
CPU / GPU高速化セミナー!性能モデルの理論と実践:理論編
 
ZabbixのAPIを使って運用を楽しくする話
ZabbixのAPIを使って運用を楽しくする話ZabbixのAPIを使って運用を楽しくする話
ZabbixのAPIを使って運用を楽しくする話
 
Astricon 10 (October 2013) - SIP over WebSocket on Kamailio
Astricon 10 (October 2013) - SIP over WebSocket on KamailioAstricon 10 (October 2013) - SIP over WebSocket on Kamailio
Astricon 10 (October 2013) - SIP over WebSocket on Kamailio
 
ROP 輕鬆談
ROP 輕鬆談ROP 輕鬆談
ROP 輕鬆談
 
Pwning in c++ (basic)
Pwning in c++ (basic)Pwning in c++ (basic)
Pwning in c++ (basic)
 
暗号技術入門
暗号技術入門暗号技術入門
暗号技術入門
 
ARM Trusted FirmwareのBL31を単体で使う!
ARM Trusted FirmwareのBL31を単体で使う!ARM Trusted FirmwareのBL31を単体で使う!
ARM Trusted FirmwareのBL31を単体で使う!
 
A whirlwind tour of the LLVM optimizer
A whirlwind tour of the LLVM optimizerA whirlwind tour of the LLVM optimizer
A whirlwind tour of the LLVM optimizer
 
統計的学習手法による人検出
統計的学習手法による人検出統計的学習手法による人検出
統計的学習手法による人検出
 
TDOH x 台科 pwn課程
TDOH x 台科 pwn課程TDOH x 台科 pwn課程
TDOH x 台科 pwn課程
 
Virtual machine and javascript engine
Virtual machine and javascript engineVirtual machine and javascript engine
Virtual machine and javascript engine
 
Launch the First Process in Linux System
Launch the First Process in Linux SystemLaunch the First Process in Linux System
Launch the First Process in Linux System
 
Sigreturn Oriented Programming
Sigreturn Oriented ProgrammingSigreturn Oriented Programming
Sigreturn Oriented Programming
 
C#言語機能の作り方
C#言語機能の作り方C#言語機能の作り方
C#言語機能の作り方
 

Similar to AFLGo: Directed Greybox Fuzzing

EuroMPI 2019: Multilevel Checkpointing for MPI Applications
EuroMPI 2019: Multilevel Checkpointing for MPI ApplicationsEuroMPI 2019: Multilevel Checkpointing for MPI Applications
EuroMPI 2019: Multilevel Checkpointing for MPI ApplicationsLEGATO project
 
Open CV - 電腦怎麼看世界
Open CV - 電腦怎麼看世界Open CV - 電腦怎麼看世界
Open CV - 電腦怎麼看世界Tech Podcast Night
 
Csw2016 d antoine_automatic_exploitgeneration
Csw2016 d antoine_automatic_exploitgenerationCsw2016 d antoine_automatic_exploitgeneration
Csw2016 d antoine_automatic_exploitgenerationCanSecWest
 
Practical Core Bluetooth in IoT & Wearable projects @ UIKonf 2016
Practical Core Bluetooth in IoT & Wearable projects @ UIKonf 2016Practical Core Bluetooth in IoT & Wearable projects @ UIKonf 2016
Practical Core Bluetooth in IoT & Wearable projects @ UIKonf 2016Shuichi Tsutsumi
 
Dynamic Binary Analysis and Obfuscated Codes
Dynamic Binary Analysis and Obfuscated Codes Dynamic Binary Analysis and Obfuscated Codes
Dynamic Binary Analysis and Obfuscated Codes Jonathan Salwan
 
Tech Day 2015: A Gentle Introduction to GPS and GNATbench
Tech Day 2015: A Gentle Introduction to GPS and GNATbenchTech Day 2015: A Gentle Introduction to GPS and GNATbench
Tech Day 2015: A Gentle Introduction to GPS and GNATbenchAdaCore
 
Pytorch kr devcon
Pytorch kr devconPytorch kr devcon
Pytorch kr devconjaewon lee
 
Navigational BCI Using Acoustic Stimulation
Navigational BCI Using Acoustic StimulationNavigational BCI Using Acoustic Stimulation
Navigational BCI Using Acoustic Stimulationwacax
 
EXTENT-2016: Industry Practices of Advanced Program Analysis
EXTENT-2016: Industry Practices of Advanced Program AnalysisEXTENT-2016: Industry Practices of Advanced Program Analysis
EXTENT-2016: Industry Practices of Advanced Program AnalysisIosif Itkin
 
Avihu Efrat's Viola and Jones face detection slides
Avihu Efrat's Viola and Jones face detection slidesAvihu Efrat's Viola and Jones face detection slides
Avihu Efrat's Viola and Jones face detection slideswolf
 
Practical Core Bluetooth in IoT & Wearable projects @ AltConf 2016
Practical Core Bluetooth in IoT & Wearable projects @ AltConf 2016Practical Core Bluetooth in IoT & Wearable projects @ AltConf 2016
Practical Core Bluetooth in IoT & Wearable projects @ AltConf 2016Shuichi Tsutsumi
 
How to Reverse Engineer Web Applications
How to Reverse Engineer Web ApplicationsHow to Reverse Engineer Web Applications
How to Reverse Engineer Web ApplicationsJarrod Overson
 
White box testing-200709
White box testing-200709White box testing-200709
White box testing-200709pragati3009
 
Machine Vision On Embedded Platform
Machine Vision On Embedded Platform Machine Vision On Embedded Platform
Machine Vision On Embedded Platform Omkar Rane
 
Machine vision Application
Machine vision ApplicationMachine vision Application
Machine vision ApplicationAbhishek Sainkar
 
Lesson 2 introduction to grasshopper3D
Lesson 2   introduction to grasshopper3DLesson 2   introduction to grasshopper3D
Lesson 2 introduction to grasshopper3DItai Cohen
 
D1T3-Anto-Joseph-Droid-FF
D1T3-Anto-Joseph-Droid-FFD1T3-Anto-Joseph-Droid-FF
D1T3-Anto-Joseph-Droid-FFAnthony Jose
 
Code quality par Simone Civetta
Code quality par Simone CivettaCode quality par Simone Civetta
Code quality par Simone CivettaCocoaHeads France
 
BinProxy: New Paradigm of Binary Analysis With Your Favorite Web Proxy
BinProxy: New Paradigm of Binary Analysis With Your Favorite Web ProxyBinProxy: New Paradigm of Binary Analysis With Your Favorite Web Proxy
BinProxy: New Paradigm of Binary Analysis With Your Favorite Web ProxyDONGJOO HA
 

Similar to AFLGo: Directed Greybox Fuzzing (20)

EuroMPI 2019: Multilevel Checkpointing for MPI Applications
EuroMPI 2019: Multilevel Checkpointing for MPI ApplicationsEuroMPI 2019: Multilevel Checkpointing for MPI Applications
EuroMPI 2019: Multilevel Checkpointing for MPI Applications
 
Open CV - 電腦怎麼看世界
Open CV - 電腦怎麼看世界Open CV - 電腦怎麼看世界
Open CV - 電腦怎麼看世界
 
Csw2016 d antoine_automatic_exploitgeneration
Csw2016 d antoine_automatic_exploitgenerationCsw2016 d antoine_automatic_exploitgeneration
Csw2016 d antoine_automatic_exploitgeneration
 
Practical Core Bluetooth in IoT & Wearable projects @ UIKonf 2016
Practical Core Bluetooth in IoT & Wearable projects @ UIKonf 2016Practical Core Bluetooth in IoT & Wearable projects @ UIKonf 2016
Practical Core Bluetooth in IoT & Wearable projects @ UIKonf 2016
 
Dynamic Binary Analysis and Obfuscated Codes
Dynamic Binary Analysis and Obfuscated Codes Dynamic Binary Analysis and Obfuscated Codes
Dynamic Binary Analysis and Obfuscated Codes
 
Tech Day 2015: A Gentle Introduction to GPS and GNATbench
Tech Day 2015: A Gentle Introduction to GPS and GNATbenchTech Day 2015: A Gentle Introduction to GPS and GNATbench
Tech Day 2015: A Gentle Introduction to GPS and GNATbench
 
Pytorch kr devcon
Pytorch kr devconPytorch kr devcon
Pytorch kr devcon
 
Navigational BCI Using Acoustic Stimulation
Navigational BCI Using Acoustic StimulationNavigational BCI Using Acoustic Stimulation
Navigational BCI Using Acoustic Stimulation
 
EXTENT-2016: Industry Practices of Advanced Program Analysis
EXTENT-2016: Industry Practices of Advanced Program AnalysisEXTENT-2016: Industry Practices of Advanced Program Analysis
EXTENT-2016: Industry Practices of Advanced Program Analysis
 
Profile Guided Optimization
Profile Guided OptimizationProfile Guided Optimization
Profile Guided Optimization
 
Avihu Efrat's Viola and Jones face detection slides
Avihu Efrat's Viola and Jones face detection slidesAvihu Efrat's Viola and Jones face detection slides
Avihu Efrat's Viola and Jones face detection slides
 
Practical Core Bluetooth in IoT & Wearable projects @ AltConf 2016
Practical Core Bluetooth in IoT & Wearable projects @ AltConf 2016Practical Core Bluetooth in IoT & Wearable projects @ AltConf 2016
Practical Core Bluetooth in IoT & Wearable projects @ AltConf 2016
 
How to Reverse Engineer Web Applications
How to Reverse Engineer Web ApplicationsHow to Reverse Engineer Web Applications
How to Reverse Engineer Web Applications
 
White box testing-200709
White box testing-200709White box testing-200709
White box testing-200709
 
Machine Vision On Embedded Platform
Machine Vision On Embedded Platform Machine Vision On Embedded Platform
Machine Vision On Embedded Platform
 
Machine vision Application
Machine vision ApplicationMachine vision Application
Machine vision Application
 
Lesson 2 introduction to grasshopper3D
Lesson 2   introduction to grasshopper3DLesson 2   introduction to grasshopper3D
Lesson 2 introduction to grasshopper3D
 
D1T3-Anto-Joseph-Droid-FF
D1T3-Anto-Joseph-Droid-FFD1T3-Anto-Joseph-Droid-FF
D1T3-Anto-Joseph-Droid-FF
 
Code quality par Simone Civetta
Code quality par Simone CivettaCode quality par Simone Civetta
Code quality par Simone Civetta
 
BinProxy: New Paradigm of Binary Analysis With Your Favorite Web Proxy
BinProxy: New Paradigm of Binary Analysis With Your Favorite Web ProxyBinProxy: New Paradigm of Binary Analysis With Your Favorite Web Proxy
BinProxy: New Paradigm of Binary Analysis With Your Favorite Web Proxy
 

More from mboehme

An Implementation of Preregistration
An Implementation of PreregistrationAn Implementation of Preregistration
An Implementation of Preregistrationmboehme
 
On the Reliability of Coverage-based Fuzzer Benchmarking
On the Reliability of Coverage-based Fuzzer BenchmarkingOn the Reliability of Coverage-based Fuzzer Benchmarking
On the Reliability of Coverage-based Fuzzer Benchmarkingmboehme
 
Statistical Reasoning About Programs
Statistical Reasoning About ProgramsStatistical Reasoning About Programs
Statistical Reasoning About Programsmboehme
 
The Curious Case of Fuzzing for Automated Software Testing
The Curious Case of Fuzzing for Automated Software TestingThe Curious Case of Fuzzing for Automated Software Testing
The Curious Case of Fuzzing for Automated Software Testingmboehme
 
On the Surprising Efficiency and Exponential Cost of Fuzzing
On the Surprising Efficiency and Exponential Cost of FuzzingOn the Surprising Efficiency and Exponential Cost of Fuzzing
On the Surprising Efficiency and Exponential Cost of Fuzzingmboehme
 
Foundations Of Software Testing
Foundations Of Software TestingFoundations Of Software Testing
Foundations Of Software Testingmboehme
 
DS3 Fuzzing Panel (M. Boehme)
DS3 Fuzzing Panel (M. Boehme)DS3 Fuzzing Panel (M. Boehme)
DS3 Fuzzing Panel (M. Boehme)mboehme
 
Fuzzing: On the Exponential Cost of Vulnerability Discovery
Fuzzing: On the Exponential Cost of Vulnerability DiscoveryFuzzing: On the Exponential Cost of Vulnerability Discovery
Fuzzing: On the Exponential Cost of Vulnerability Discoverymboehme
 
Boosting Fuzzer Efficiency: An Information Theoretic Perspective
Boosting Fuzzer Efficiency: An Information Theoretic PerspectiveBoosting Fuzzer Efficiency: An Information Theoretic Perspective
Boosting Fuzzer Efficiency: An Information Theoretic Perspectivemboehme
 
Fuzzing: Challenges and Reflections
Fuzzing: Challenges and ReflectionsFuzzing: Challenges and Reflections
Fuzzing: Challenges and Reflectionsmboehme
 
NUS SoC Graduate Outreach @ TU Dresden
NUS SoC Graduate Outreach @ TU DresdenNUS SoC Graduate Outreach @ TU Dresden
NUS SoC Graduate Outreach @ TU Dresdenmboehme
 

More from mboehme (11)

An Implementation of Preregistration
An Implementation of PreregistrationAn Implementation of Preregistration
An Implementation of Preregistration
 
On the Reliability of Coverage-based Fuzzer Benchmarking
On the Reliability of Coverage-based Fuzzer BenchmarkingOn the Reliability of Coverage-based Fuzzer Benchmarking
On the Reliability of Coverage-based Fuzzer Benchmarking
 
Statistical Reasoning About Programs
Statistical Reasoning About ProgramsStatistical Reasoning About Programs
Statistical Reasoning About Programs
 
The Curious Case of Fuzzing for Automated Software Testing
The Curious Case of Fuzzing for Automated Software TestingThe Curious Case of Fuzzing for Automated Software Testing
The Curious Case of Fuzzing for Automated Software Testing
 
On the Surprising Efficiency and Exponential Cost of Fuzzing
On the Surprising Efficiency and Exponential Cost of FuzzingOn the Surprising Efficiency and Exponential Cost of Fuzzing
On the Surprising Efficiency and Exponential Cost of Fuzzing
 
Foundations Of Software Testing
Foundations Of Software TestingFoundations Of Software Testing
Foundations Of Software Testing
 
DS3 Fuzzing Panel (M. Boehme)
DS3 Fuzzing Panel (M. Boehme)DS3 Fuzzing Panel (M. Boehme)
DS3 Fuzzing Panel (M. Boehme)
 
Fuzzing: On the Exponential Cost of Vulnerability Discovery
Fuzzing: On the Exponential Cost of Vulnerability DiscoveryFuzzing: On the Exponential Cost of Vulnerability Discovery
Fuzzing: On the Exponential Cost of Vulnerability Discovery
 
Boosting Fuzzer Efficiency: An Information Theoretic Perspective
Boosting Fuzzer Efficiency: An Information Theoretic PerspectiveBoosting Fuzzer Efficiency: An Information Theoretic Perspective
Boosting Fuzzer Efficiency: An Information Theoretic Perspective
 
Fuzzing: Challenges and Reflections
Fuzzing: Challenges and ReflectionsFuzzing: Challenges and Reflections
Fuzzing: Challenges and Reflections
 
NUS SoC Graduate Outreach @ TU Dresden
NUS SoC Graduate Outreach @ TU DresdenNUS SoC Graduate Outreach @ TU Dresden
NUS SoC Graduate Outreach @ TU Dresden
 

Recently uploaded

Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 

Recently uploaded (20)

Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 

AFLGo: Directed Greybox Fuzzing

  • 1. PhD Thesis Defense Presented by Marcel Boehme Directed
 Greybox Fuzzing Marcel Böhme Thuan Pham Abhik RoychoudhuryM.-D. Nguyen
  • 2. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Motivation • Automated vulnerability detection techniques 1. Blackbox Fuzzing (no program analysis, no feedback) 2. Whitebox Fuzzing (mostly program analysis) 3. Greybox Fuzzing (no program analysis, but coverage feedback)
  • 3. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Motivation • Automated vulnerability detection techniques 1. Blackbox Fuzzing (no program analysis, no feedback) 📄 Model-Based Blackbox Fuzzing Peach, Spike … Seed Input
  • 4. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Motivation • Automated vulnerability detection techniques 1. Blackbox Fuzzing (no program analysis, no feedback) 📄 Model-Based Blackbox Fuzzing Peach, Spike … Seed Input 📄 📄 📄 Valid input Semi-valid input Semi-valid input Mutated Inputs
  • 5. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Motivation • Automated vulnerability detection techniques 1. Blackbox Fuzzing (no program analysis, no feedback) 📄 Model-Based Blackbox Fuzzing Input model Peach, Spike … Seed Input 📄 📄 📄 Valid input Semi-valid input Semi-valid input Mutated Inputs
  • 6. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Motivation • Automated vulnerability detection techniques 1. Blackbox Fuzzing (no program analysis, no feedback) 2. Whitebox Fuzzing (mostly program analysis)
  • 7. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Motivation • Automated vulnerability detection techniques 1. Blackbox Fuzzing (no program analysis, no feedback) 2. Whitebox Fuzzing (mostly program analysis) 3. Greybox Fuzzing (no program analysis, but coverage feedback) 📄 📄📄 📄Greybox Fuzzing … Seed Input
  • 8. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Motivation • Automated vulnerability detection techniques 1. Blackbox Fuzzing (no program analysis, no feedback) 2. Whitebox Fuzzing (mostly program analysis) 3. Greybox Fuzzing (no program analysis, but coverage feedback) 📄 📄📄 📄Greybox Fuzzing … 📄📄 Enqueue Seed Input Mutated Inputs Queue of 
 “interesting” seeds Retain inputs
 that increase
 coverage!
  • 9. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Motivation • Automated vulnerability detection techniques 1. Blackbox Fuzzing (no program analysis, no feedback) 2. Whitebox Fuzzing (mostly program analysis) 3. Greybox Fuzzing (no program analysis, but coverage feedback) 📄 📄📄 📄Greybox Fuzzing … 📄📄 EnqueueDequeue Seed Input Mutated Inputs Queue of 
 “interesting” seeds Retain inputs
 that increase
 coverage!
  • 10. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Motivation • Greybox Fuzzing is frequently used • State-of-the-art in automated vulnerability detection • Extremely efficient coverage-based input generation • All program analysis before/at instrumentation time. • Start with a seed corpus, choose a seed file, fuzz it. • Add to corpus only if new input increases coverage.
  • 11. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Motivation • Greybox Fuzzing is frequently used • State-of-the-art in automated vulnerability detection • Extremely efficient coverage-based input generation • All program analysis before/at instrumentation time. • Start with a seed corpus, choose a seed file, fuzz it. • Add to corpus only if new input increases coverage. • Cannot be directed!
  • 12. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Motivation • Directed Fuzzing has many applications • Patch Testing: reach changed statements • Crash Reproduction: exercise stack trace • SA Report Verification: reach “dangerous” location • Information Flow Detection: exercise source-sink pairs
  • 13. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Motivation • Directed Fuzzing: classical constraint satisfaction prob. • Program analysis to identify program paths 
 that reach given program locations. • Symbolic Execution to derive path conditions 
 for any of the identified paths. • Constraint Solving to find an input that • satisfies the path condition and thus • reaches a program location that was given. φ1 = (x>y)∧(x+y>10)
 φ2 = ¬(x>y)∧(x+y>10) x > y a = x a = y x+y>10 b = a return b
  • 14. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Motivation • Directed Fuzzing: classical constraint satisfaction prob. • Program analysis to identify program paths 
 that reach given program locations. • Symbolic Execution to derive path conditions 
 for any of the identified paths. • Constraint Solving to find an input that • satisfies the path condition and thus • reaches a program location that was given. φ1 = (x>y)∧(x+y>10)
 φ2 = ¬(x>y)∧(x+y>10) x > y a = x a = y x+y>10 b = a return b Requires
 heavy-weight
 machinery!
  • 15. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Overview • Directed Fuzzing as optimisation problem! 1. Instrumentation Time: 1. Extract call graph (CG) and control-flow graphs (CFGs). 2. For each BB, compute distance to target locations. 3. Instrument program to aggregate distance values.
  • 16. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Overview • Directed Fuzzing as optimisation problem! 1. Instrumentation Time: 1. Extract call graph (CG) and control-flow graphs (CFGs). 2. For each BB, compute distance to target locations. 3. Instrument program to aggregate distance values. 2. Runtime, for each input 1. collect coverage and distance information, and 2. decide how long to be fuzzed based on distance. • If input is closer to the targets, it is fuzzed for longer. • If input is further away from the targets, it is fuzzed for shorter.
  • 17. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Instrumentation • Function-level target distance using call graph (CG) main a b cd e
  • 18. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Instrumentation • Function-level target distance using call graph (CG) 1. Identify target functions in CG main a b cd e
  • 19. Presented by Abhik Roychoudhury Directed Greybox Fuzzing main a b cd e Instrumentation • Function-level target distance using call graph (CG) 1. Identify target functions in CG 2. For each function, compute the
 harmonic mean of the length 
 of the shortest path to targets 1 2 0 2 3N/A
  • 20. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Instrumentation • Function-level target distance using call graph (CG) • BB-level target distance using CFG CFG for function b main a b cd e 1 2 0 2 3N/A
  • 21. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Instrumentation • Function-level target distance using call graph (CG) • BB-level target distance using control-flow graph (CFG) 1. Identify target BBs and
 assign distance 0
 (none in function b) CFG for function b main a b cd e 1 2 0 2 3N/A
  • 22. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Instrumentation • Function-level target distance using call graph (CG) • BB-level target distance using control-flow graph (CFG) 1. Identify target BBs and
 assign distance 0 2. Identify BBs that
 call functions CFG for function b main a b cd e 1 2 0 2 3N/A c a
  • 23. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Instrumentation • Function-level target distance using call graph (CG) • BB-level target distance using control-flow graph (CFG) 1. Identify target BBs and
 assign distance 0 2. Identify BBs that
 call functions and
 assign 10*FLTD CFG for function b main a b cd e 1 2 0 2 3N/A c a 10 30
  • 24. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Instrumentation • Function-level target distance using call graph (CG) • BB-level target distance using control-flow graph (CFG) 1. Identify target BBs and
 assign distance 0 2. Identify BBs that
 call functions and
 assign 10*FLTD 3. For each BB, compute harmonic
 mean of (length of shortest path to
 any function-calling BB + 10*FLTD). CFG for function b c a 10 30 [(1+30)-1+(2+10)-1]-1
  • 25. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Instrumentation • Function-level target distance using call graph (CG) • BB-level target distance using control-flow graph (CFG) 1. Identify target BBs and
 assign distance 0 2. Identify BBs that
 call functions and
 assign 10*FLTD 3. For each BB, compute harmonic
 mean of (length of shortest path to
 any function-calling BB + 10*FLTD). CFG for function b 8.7 11 10 30 13 12 N/A
  • 26. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Runtime • Function-level target distance using call graph (CG) • BB-level target distance using control-flow graph (CFG) • Seed distance from instrumented binary CFG for function b 8.7 11 10 30 13 12 N/A
  • 27. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Runtime • Function-level target distance using call graph (CG) • BB-level target distance using control-flow graph (CFG) • Seed distance from instrumented binary • Two 64-bit shared memory entries • Aggregated BB-level distance values • Number of executed BBs Seed Distance: 19.4 
 = (8.7+30)/2 8.7 11 10 30 13 12 N/A
  • 28. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Runtime • Function-level target distance using call graph (CG) • BB-level target distance using control-flow graph (CFG) • Seed distance from instrumented binary • Two 64-bit shared memory entries • Aggregated BB-level distance values • Number of executed BBs 8.7 11 10 30 13 12 N/A Seed Distance: 10.4 
 = (8.7+11+10+12)/4
  • 29. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Runtime • Function-level target distance using call graph (CG) • BB-level target distance using control-flow graph (CFG) • Seed distance from instrumented binary • Two 64-bit shared memory entries • Aggregated BB-level distance values • Number of executed BBs 8.7 11 10 30 13 12 N/A Seed Distance: 10.4 
 = (8.7+11+10+12)/4 Now that we know how to
 compute seed distance, — let’s minimise it! —
  • 30. Presented by Abhik Roychoudhury Directed Greybox Fuzzing • Background: Coverage-based Greybox Fuzzing Directed Fuzzing as 
 Optimisation Problem Seed File 📄
  • 31. Presented by Abhik Roychoudhury Directed Greybox Fuzzing • Mutation Operators: • Bitflips • BoundaryValues 
 (0,1,-1,INT_MAX,INT_MIN) • Simple arithmetics
 (add/subtract 1) • Block deletion • Block insertion • Background: Coverage-based Greybox Fuzzing Directed Fuzzing as 
 Optimisation Problem Seed File 📄
  • 32. Presented by Abhik Roychoudhury Directed Greybox Fuzzing • Background: Coverage-based Greybox Fuzzing Directed Fuzzing as 
 Optimisation Problem 📄 📄📄 📄Greybox Fuzzing … Seed Input
  • 33. Presented by Abhik Roychoudhury Directed Greybox Fuzzing • Background: Coverage-based Greybox Fuzzing Directed Fuzzing as 
 Optimisation Problem 📄 📄📄 📄Greybox Fuzzing … 📄📄 Enqueue Seed Input Mutated Inputs Queue of 
 “interesting” seeds Retain inputs
 that increase
 coverage!
  • 34. Presented by Abhik Roychoudhury Directed Greybox Fuzzing • Background: Coverage-based Greybox Fuzzing Directed Fuzzing as 
 Optimisation Problem 📄 📄📄 📄Greybox Fuzzing … 📄📄 EnqueueDequeue Seed Input Mutated Inputs Queue of 
 “interesting” seeds Retain inputs
 that increase
 coverage!
  • 35. Presented by Abhik Roychoudhury Directed Greybox Fuzzing • Background: Coverage-based Greybox Fuzzing Directed Fuzzing as 
 Optimisation Problem 📄📄 Queue of 
 “interesting” seeds 📄📄 📄 energy low energy
 - generated less
 test inputs!
  • 36. Presented by Abhik Roychoudhury Directed Greybox Fuzzing • Background: Coverage-based Greybox Fuzzing Directed Fuzzing as 
 Optimisation Problem 📄📄 Queue of 
 “interesting” seeds 📄📄 📄 high energy
 - generated more
 test inputs!📄 energy
  • 37. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Directed Fuzzing as 
 Optimisation Problem • Background: Coverage-based Greybox Fuzzing • Seed’s energy: • Number of inputs generated when chosen for fuzzing • Local property: each seed has its own energy • Power schedule: • Assigns energy to seeds according to a pre-defined formula ★ Boosted Greybox Fuzzing (AFLFast CCS’16) • Assign more energy to seeds exercising low-frequency paths. ★ Directed Greybox Fuzzing (AFLGo CCS’17) • Assign more energy to seeds that a closer to the given targets!
  • 38. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Directed Fuzzing as 
 Optimisation Problem • Background: Coverage-based Greybox Fuzzing • Seed’s energy: • Number of inputs generated when chosen for fuzzing • Local property: each seed has its own energy • Power schedule: • Assigns energy to seeds according to a pre-defined formula ★ Boosted Greybox Fuzzing (AFLFast CCS’16) • Assign more energy to seeds exercising low-frequency paths. ★ Directed Greybox Fuzzing (AFLGo CCS’17) • Assign more energy to seeds that a closer to the given targets!
  • 39. Presented by Abhik Roychoudhury Directed Greybox Fuzzing • Directed Greybox Fuzzing • Assign more energy to seeds that a closer to the given targets! • Problem (Stochastic Gradient Descent) • If we always assign more energy to closer seeds, 
 we typically reach only a local minimum,
 but never a global minimum distance! • Solution (Simulated Annealing) • Sometimes assign more energy to further-away seeds! • Approaches global minimum distance. Directed Fuzzing as 
 Optimisation Problem
  • 40. Presented by Abhik Roychoudhury Directed Greybox Fuzzing • Directed Greybox Fuzzing • Assign more energy to seeds that a closer to the given targets! • Problem (Stochastic Gradient Descent) • If we always assign more energy to closer seeds, 
 we typically reach only a local minimum,
 but never a global minimum distance! • Solution (Simulated Annealing) • Sometimes assign more energy to further-away seeds! • Approaches global minimum distance. Directed Fuzzing as 
 Optimisation Problem
  • 41. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Directed Fuzzing as 
 Optimisation Problem • Simulated Annealing (SA) • Exploration phase: • Energy of closer seeds similar to energy
 of further-away seeds • Exploitation phase: • Energy of closer seeds is assigned 
 to be higher and higher • Energy of further-away seeds is 
 assigned to be lower and lower • We are increasing the “importance” 
 of seed distance over time.
  • 42. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Directed Fuzzing as 
 Optimisation Problem • Simulated Annealing (SA) • Annealing from metallurgy: control the cooling of material
 to reduce defects (e.g., cracks or bubbles) in the material. • Temperature T ∈ [0,1] specifies “importance” of distance. • At T=1, exploration (normal AFL) • At T=0, exploitation (gradient descent) • Cooling schedule controls (global) temperature • Classically, exponential cooling.
  • 43. Presented by Abhik Roychoudhury Directed Greybox Fuzzing • Integrating Simulated Annealing as power schedule • In the beginning (t = 0min), 
 assign the same energy
 to all seeds. Directed Fuzzing as 
 Optimisation Problem dene the normalized seed set of target locations Tb . s. This trace contains the ed distance d(s,Tb ) as (m,Tb ) | (3) set S of seeds to fuzz. We ,Tb ) as the dierence be- he minimum seed distance by the dierence between 0.00 0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00 Distance d(s,Tb) Energyp(s,Tb) t = 0min t =10min t = 80min
  • 44. Presented by Abhik Roychoudhury Directed Greybox Fuzzing • Integrating Simulated Annealing as power schedule • In the beginning (t = 0min), 
 assign the same energy
 to all seeds. • Later (t=10min), assign
 a bit more energy to
 seeds that are closer.
 Directed Fuzzing as 
 Optimisation Problem dene the normalized seed set of target locations Tb . s. This trace contains the ed distance d(s,Tb ) as (m,Tb ) | (3) set S of seeds to fuzz. We ,Tb ) as the dierence be- he minimum seed distance by the dierence between 0.00 0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00 Distance d(s,Tb) Energyp(s,Tb) t = 0min t =10min t = 80min
  • 45. Presented by Abhik Roychoudhury Directed Greybox Fuzzing • Integrating Simulated Annealing as power schedule • In the beginning (t = 0min), 
 assign the same energy
 to all seeds. • Later (t=10min), assign
 a bit more energy to
 seeds that are closer. • At exploitation (t=80min),
 assign maximal energy to
 seeds that are closest. Directed Fuzzing as 
 Optimisation Problem dene the normalized seed set of target locations Tb . s. This trace contains the ed distance d(s,Tb ) as (m,Tb ) | (3) set S of seeds to fuzz. We ,Tb ) as the dierence be- he minimum seed distance by the dierence between 0.00 0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00 Distance d(s,Tb) Energyp(s,Tb) t = 0min t =10min t = 80min
  • 46. Presented by Abhik Roychoudhury Directed Greybox Fuzzing • Integrating Simulated Annealing as power schedule • In the beginning (t = 0min), 
 assign the same energy
 to all seeds. • Later (t=10min), assign
 a bit more energy to
 seeds that are closer. • At exploitation (t=80min),
 assign maximal energy to
 seeds that are closest. Directed Fuzzing as 
 Optimisation Problem 0.00 0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00 Distance d(s,Tb) Energyp(s,Tb) t = 0min t =10min t = 80min 0.00 0.25 0.50 0.75 1.00 0 20 40 60 80 Current time t (in min) Energyp(s,Tb) d = 1 d = 0.5 d = 0
  • 47. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Results • Patch Testing: Reach changed statements • State-of-the-art in patch testing • KATCH (based on Klee symbolic exec. tool) • Experimental Setup • Reuse original KATCH-benchmark • Measure patch coverage (#changed BBs reached) • Measure vuln. detection (#errors discovered) KATCH: High-Coverage Testing of Software Patches Paul Dan Marinescu Department of Computing Imperial College London, UK p.marinescu@imperial.ac.uk Cristian Cadar Department of Computing Imperial College London, UK c.cadar@imperial.ac.uk ABSTRACT One of the distinguishing characteristics of software systems is that they evolve: new patches are committed to software repositories and new versions are released to users on a continuous basis. Unfortunately, many of these changes bring unexpected bugs that break the stability of the system or a↵ect its security. In this paper, we address this problem using a technique for automatically testing code patches. Our technique combines symbolic execution with several novel heuristics based on static and dynamic program anal- !#$%'$!( Figure 1: KATCH is integrated in the software
  • 48. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Results • Patch Testing: Reach changed statements • State-of-the-art in patch testing • KATCH (based on Klee symbolic exec. tool) • Patch Coverage (#changed BBs reached) is unrelated to the concept [32]. We make a conscious eort to explain the observed phenomena and distinguish conceptual from technical origins. Moreover, we encourage the reader to consider the perspective of a security researcher who is actually handling these tools to establish whether there exists a vulnerability. 5.1 Patch Coverage We begin by analyzing the patch coverage achieved by both K and AFLG as measured by the number of previously uncovered basic blocks that were changed in the respective patch. Table 1: Patch coverage results showing the number of previ- ously uncovered targets that K and AFLG could cover in the stipulated time budget, respectively. #Changed #Uncovered Basic Blocks Changed BBs K AFLG Binutils 852 702 135 159 Diutils 166 108 63 64 Sum 1018 810 198 223 complex inp executed onl for certain a dened stru higher-quali approach [2 To unders we investiga we can see cannot cove cover. We at vidual streng dicult con be dicult t quickly exp stuck in a pa AFLG and 282 targets cover indi
  • 49. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Results • Patch Testing: Reach changed statements • State-of-the-art in patch testing • KATCH (based on Klee symbolic exec. tool) • Patch Coverage (#changed BBs reached) • While we would expect Klee to take a substantial lead, 
 AFLGo outperforms KATCH in terms of patch coverage. is unrelated to the concept [32]. We make a conscious eort to explain the observed phenomena and distinguish conceptual from technical origins. Moreover, we encourage the reader to consider the perspective of a security researcher who is actually handling these tools to establish whether there exists a vulnerability. 5.1 Patch Coverage We begin by analyzing the patch coverage achieved by both K and AFLG as measured by the number of previously uncovered basic blocks that were changed in the respective patch. Table 1: Patch coverage results showing the number of previ- ously uncovered targets that K and AFLG could cover in the stipulated time budget, respectively. #Changed #Uncovered Basic Blocks Changed BBs K AFLG Binutils 852 702 135 159 Diutils 166 108 63 64 Sum 1018 810 198 223 complex inp executed onl for certain a dened stru higher-quali approach [2 To unders we investiga we can see cannot cove cover. We at vidual streng dicult con be dicult t quickly exp stuck in a pa AFLG and 282 targets cover indi
  • 50. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Results • Patch Testing: Reach changed statements • State-of-the-art in patch testing • KATCH (based on Klee symbolic exec. tool) • Patch Coverage (#changed BBs reached) • While we would expect Klee to take a substantial lead, 
 AFLGo outperforms KATCH in terms of patch coverage. • BUT: Together they cover 42% and 26% 
 more than AFLGo and KATCH individually. 
 They complement each other!
 oject Tools diff, sdiff, diff3, cmp ogram Size 42,930 LoC n Commits 175 commits from Nov’09–May’12 GNU Diutils oject Tools addr2line, ar, cxxfilt, elfedit, nm, objcopy, objdump, ranlib, readelf size, strings, strip ogram Size 68,830 LoC + 800kLoC from libraries n Commits 181 commits from Apr’11–Aug’12 GNU Binutils K — 59 139 84 — AFLG
  • 51. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Results • Patch Testing: Reach changed statements • State-of-the-art in patch testing • KATCH (based on Klee symbolic exec. tool) • Patch Coverage (#changed BBs reached) • Vulnerability Detection (#errors discovered) As future work, we are planning to integrate symbolic-execution- based directed whitebox fuzzing and directed greybox fuzzing to achieve a directed fuzzing technique that is both very eective and very ecient in terms of reaching pre-specied target locations. We believe that such an integration would be superior to each technique individually. An integrated directed fuzzing technique that leverages both symbolic execution and search as optmization problem would be able to draw on their combined strengths to mitigate their indiviual weaknesses. Driller [38] is an example of an (undirected) fuzzer that integrates the AFL greybox fuzzer and the K whitebox fuzzer. 5.2 Vulnerability Detection Table 2: Showing the number of previously unreported bugs found by AFLG (reported Apr’2017) in addition to the num- ber of bug reports for K (reported Feb’2013). K AFLG #Reports #Reports14 #New Reports #CVEs Binutils 7 4 12 7 Diutils 0 N/A 1 0 Sum 7 4 13 7 AFLG found 13 previously unreported bugs in addition to 4 of Table 3: Bug rep Report-ID Binutils 21408 21409 21412 21414 21415 21417 Diutils http://lists.g We attribute m to the eciency of program analysis a magnitute more in is the runtime che error detection mec the program when we instrumented o Sanitizer (ASAN). I e.g., by reading be writing memory th a SEGFAULT even fuzzer uses this sig for a generated in (i.e., interprets) th and uses constrain
  • 52. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Results • Patch Testing: Reach changed statements • State-of-the-art in patch testing • KATCH (based on Klee symbolic exec. tool) • Patch Coverage (#changed BBs reached) • Vulnerability Detection (#errors discovered) • AFLGo found 13 previously unreported bugs (7 CVEs) 
 in addition to 4 of the 7 bugs that were found by KATCH. As future work, we are planning to integrate symbolic-execution- based directed whitebox fuzzing and directed greybox fuzzing to achieve a directed fuzzing technique that is both very eective and very ecient in terms of reaching pre-specied target locations. We believe that such an integration would be superior to each technique individually. An integrated directed fuzzing technique that leverages both symbolic execution and search as optmization problem would be able to draw on their combined strengths to mitigate their indiviual weaknesses. Driller [38] is an example of an (undirected) fuzzer that integrates the AFL greybox fuzzer and the K whitebox fuzzer. 5.2 Vulnerability Detection Table 2: Showing the number of previously unreported bugs found by AFLG (reported Apr’2017) in addition to the num- ber of bug reports for K (reported Feb’2013). K AFLG #Reports #Reports14 #New Reports #CVEs Binutils 7 4 12 7 Diutils 0 N/A 1 0 Sum 7 4 13 7 AFLG found 13 previously unreported bugs in addition to 4 of Table 3: Bug rep Report-ID Binutils 21408 21409 21412 21414 21415 21417 Diutils http://lists.g We attribute m to the eciency of program analysis a magnitute more in is the runtime che error detection mec the program when we instrumented o Sanitizer (ASAN). I e.g., by reading be writing memory th a SEGFAULT even fuzzer uses this sig for a generated in (i.e., interprets) th and uses constrain
  • 53. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Results • Crash Reproduction: Exercise stack trace • State-of-the-art in crash reproduction • BugRedux (based on Klee symbolic exec. tool) • Experimental Setup • Reuse original BugRedux-benchmark • Determine whether or not crash can be reproduced Software developer Instrumenter Application Instrumented application Crash report (execution data) Analyzer BugRedux Test input Debugging tool Software tester Figure 4. Intuitive high-level view of BUGREDUX. Test input Oracle
  • 54. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Results • Crash Reproduction: Exercise stack trace • State-of-the-art in crash reproduction • BugRedux (based on Klee symbolic exec. tool) • Experimental Setup • Reuse original BugRedux-benchmark • Determine whether or not crash can be reproduced experimental setup, starting conguration, and timeouts. BugRedux [18] is a directed whitebox fuzzer based on K, takes as input a sequence of program statements, and generates as output a test case that exercises that sequence and crashes the program. It was shown that BugRedux works best of the complete method-call sequence is provided that lead to the crash. However, as discussed earlier often only the stack-trace is available, which does not contain methods that have already “returned”, i.e., nished execution. Hence, for our comparison, we set the method-calls in the stack trace as targets. Despite our request for all subjects from the original dataset, only a subset of nine subjects could be located for us. For two subjects (exim, xmail), we could not obtain the stack-trace that would specify the target locations. Specically, the crash in exim can only be reproduced on 32bit architectures while the crash in xmail overows the stack such that the stack-trace is overridden. The results for the remaining seven subjects are shown in Table 6. Table 6: Bugs reproduced for the original BugRedux subjects Subjects BugRedux AFLG Comments sed.fault1 7 7 Takes two les as input sed.fault2 7 3 grep 7 3 gzip.fault1 7 3 gzip.fault2 7 3 ncompress 3 3 polymorph 3 3 same benchmarks that th in the original papers [18 The second concern is a study minimizes system nal validity for fuzzer exp However, for our experi was readily available, suc Binutils and the K ex OSS-Fuzz experiments, a classically provides for th when comparing two fuzz seed corpus such that bot Second, like implementati faithfully implement the shown in the comparison The third concern is co a test measures what it c note that results of tool co grain of salt. An empirical implementations of two c selves. Improving the ec fuzzer may only be a que lated to the concept [32]. explain the observed phen technical origins. Moreov the perspective of a secur these tools to establish w AFLGo reproduces 
 3 times more crashes!
  • 55. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Result Summary • Our directed greybox fuzzer (AFLGo) outperforms 
 symbolic execution-based directed fuzzers (KATCH BugRedux) • in terms of reaching more target locations and • in terms of detecting more vulnerabilities, • on their own, original benchmark sets.
  • 56. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Conclusion Presented by Abhik Roychoudhury Directed Greybox Fuzzing Motivation • Directed Fuzzing: classical constraint satisfaction prob. • Program analysis to identify program paths that reach given program locations. • Symbolic Execution to derive path conditions for any of the identified paths. • Constraint Solving to find an input that • satisfies the path condition and thus • reaches a program location that was given. Requires
 heavy-weight
 machinery! Presented by Abhik Roychoudhury Directed Greybox Fuzzing Overview • Directed Fuzzing as optimisation problem! 1. Instrumentation Time: 1. Extract call graph (CG) and control-flow graphs (CFGs). 2. For each BB, compute distance to target locations. 3. Instrument program to aggregate distance values. 2. Runtime, for each input 1. collect coverage and distance information, and 2. decide how long to be fuzzed based on distance. • If input is closer to the targets, it is fuzzed for longer. • If input is further away from the targets, it is fuzzed for shorter. Presented by Abhik Roychoudhury Directed Greybox Fuzzing • Integrating Simulated Annealing as power schedule • In the beginning (t = 0min), 
 assign the same energy
 to all seeds. • Later (t=10min), assign
 a bit more energy to
 seeds that are closer. • At exploitation (t=80min),
 assign maximal energy to
 seeds that are closest. Directed Fuzzing as 
 Optimisation Problem 0.00 0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00 Distance d(s,Tb) Energyp(s,Tb) t = 0min t =10min t = 80min 0.00 0.25 0.50 0.75 1.00 0 20 40 60 80 Current time t (in min) Energyp(s,Tb) d = 1 d = 0.5 d = 0 Presented by Abhik Roychoudhury Directed Greybox Fuzzing Result Summary • Our directed greybox fuzzer (AFLGo) outperforms 
 symbolic execution-based directed fuzzers (KATCH BugRedux) • in terms of reaching more target locations and • in terms of detecting more vulnerabilities, • on their own, original benchmark sets. *Tool comparisons should always be taken with a grain of salt. *
  • 57. Presented by Abhik Roychoudhury Directed Greybox Fuzzing Conclusion Presented by Abhik Roychoudhury Directed Greybox Fuzzing Motivation • Directed Fuzzing: classical constraint satisfaction prob. • Program analysis to identify program paths that reach given program locations. • Symbolic Execution to derive path conditions for any of the identified paths. • Constraint Solving to find an input that • satisfies the path condition and thus • reaches a program location that was given. Requires
 heavy-weight
 machinery! Presented by Abhik Roychoudhury Directed Greybox Fuzzing Overview • Directed Fuzzing as optimisation problem! 1. Instrumentation Time: 1. Extract call graph (CG) and control-flow graphs (CFGs). 2. For each BB, compute distance to target locations. 3. Instrument program to aggregate distance values. 2. Runtime, for each input 1. collect coverage and distance information, and 2. decide how long to be fuzzed based on distance. • If input is closer to the targets, it is fuzzed for longer. • If input is further away from the targets, it is fuzzed for shorter. Presented by Abhik Roychoudhury Directed Greybox Fuzzing • Integrating Simulated Annealing as power schedule • In the beginning (t = 0min), 
 assign the same energy
 to all seeds. • Later (t=10min), assign
 a bit more energy to
 seeds that are closer. • At exploitation (t=80min),
 assign maximal energy to
 seeds that are closest. Directed Fuzzing as 
 Optimisation Problem 0.00 0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00 Distance d(s,Tb) Energyp(s,Tb) t = 0min t =10min t = 80min 0.00 0.25 0.50 0.75 1.00 0 20 40 60 80 Current time t (in min) Energyp(s,Tb) d = 1 d = 0.5 d = 0 Presented by Abhik Roychoudhury Directed Greybox Fuzzing Result Summary • Our directed greybox fuzzer (AFLGo) outperforms 
 symbolic execution-based directed fuzzers (KATCH BugRedux) • in terms of reaching more target locations and • in terms of detecting more vulnerabilities, • on their own, original benchmark sets. *Tool comparisons should always be taken with a grain of salt. * Questions?