TaintScope: A Checksum-Aware Directed Fuzzing Tool for Automatic Software Vulnerability Detection

Tielei Wang (1), Tao Wei (1), Guofei Gu (2), Wei Zou (1)
(1) Peking University, China
(2) Texas A&M University, US

Checksum: a way to check the integrity of data. Used in network protocols and files.
[Figure: data -> checksum function -> data + checksum field]

Fuzzing: generating malformed inputs and feeding them to the application.

Dynamic taint analysis: running a program and observing which computations are affected by predefined taint sources (e.g. the input).


- The input mutation space is enormous.
- Most malformed inputs are dropped at an early stage if the program employs a checksum mechanism.
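
To make the second point concrete, here is a minimal sketch of an input format protected by a checksum. The format (payload followed by a one-byte additive checksum) and the names `sum8` and `input_is_accepted` are assumptions made for illustration; they are not taken from any program discussed in the talk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical format: payload bytes followed by a 1-byte additive checksum. */
static uint8_t sum8(const uint8_t *data, size_t len) {
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = (uint8_t)(sum + data[i]);   /* wraps modulo 256 */
    return sum;
}

/* Returns 1 if the stored checksum matches the recomputed one, 0 otherwise. */
int input_is_accepted(const uint8_t *input, size_t len) {
    assert(len >= 1);                     /* need at least the checksum byte */
    return sum8(input, len - 1) == input[len - 1];
}
```

A fuzzer that changes a payload byte without updating the last byte will almost always produce an input for which `input_is_accepted` returns 0, so the parsing code behind the check is never exercised.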

 1  void decode_image(FILE* fd){
 2      ...
 3      int length = get_length(fd);
 4      int recomputed_chksum = checksum(fd, length);
 5      int chksum_in_file = get_checksum(fd);
        // line 6 is used to check the integrity of inputs
 6      if(chksum_in_file != recomputed_chksum)
 7          error();
 8      int Width = get_width(fd);
 9      int Height = get_height(fd);
10      int size = Width*Height*sizeof(int);   // Width and Height come from the input, so this can overflow
11      int* p = malloc(size);                 // an overflowed size yields a buffer that is too small
12      ...
13      for(i=0; i<Height; i++){ // read ith row to p
14          read_row(p+Width*i, i, fd);
        }
    }



- Infers whether and where a program checks the integrity of its input.
- Identifies which input bytes can flow into sensitive points, using byte-level taint analysis that monitors how the application uses the input data.
- Creates malformed inputs that focus on the "hot bytes".
- Repairs checksum fields in the input, to expose vulnerabilities.
- Fully automatic.
- Found 27 new vulnerabilities, in Acrobat Reader, Google Picasa and more.

1. Dynamic taint tracing
2. Detecting checksum
3. Directed fuzzing
4. Repairing crashed samples

[Figure: TaintScope architecture. Components: Execution Monitor, Checksum Locator, Directed Fuzzer, Checksum Repairer. Other labels: Modified Program, Crashed Samples, Instruction Profile, Hot Bytes Info, Reports.]

- Runs the program with well-formed input.
- The execution monitor records:
    - which input bytes are related to arguments of API functions (e.g. malloc, strcpy): the "hot bytes" report;
    - which input bytes each conditional jump instruction (e.g. JZ, JE, JB) depends on: the checksum report.
- Considers only data flow (no control flow).

- Instruments instructions: data movement (e.g. MOV, PUSH), arithmetic (e.g. SUB, ADD), logic (e.g. AND, XOR).
- Taints all values written by an instruction with the union of all taint labels associated with the values used by that instruction.
- Also considers the eflags register.

Before:  eax {0x6, 0x7}, ebx {0x8, 0x9}
         add eax, ebx
After:   eax {0x6, 0x7, 0x8, 0x9}, eflags {0x6, 0x7, 0x8, 0x9}
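
The union rule can be sketched in a few lines of C. This is an illustrative model only: representing a taint set as a 64-bit mask (one bit per input offset, so only offsets 0 to 63) and the names `shadow_state`, `taint_of_byte` and `propagate_add` are assumptions, not TaintScope's actual data structures.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: bit i set means "this value depends on input byte i". */
typedef uint64_t taint_set;

enum { EAX, EBX, EFLAGS, NUM_REGS };

typedef struct {
    taint_set reg[NUM_REGS];   /* one taint set per tracked register */
} shadow_state;

taint_set taint_of_byte(unsigned offset) {
    assert(offset < 64);       /* limit of this toy representation */
    return (taint_set)1 << offset;
}

/* Shadow operation for "add dst, src": the instruction reads dst and src
 * and writes dst and eflags, so both written values get the union. */
void propagate_add(shadow_state *s, int dst, int src) {
    taint_set used = s->reg[dst] | s->reg[src];
    s->reg[dst] = used;
    s->reg[EFLAGS] = used;
}
```

With eax labelled {0x6, 0x7} and ebx labelled {0x8, 0x9}, calling `propagate_add(&s, EAX, EBX)` leaves eax and eflags labelled with all four offsets, while ebx is unchanged.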

Input size is 1024 bytes.
"Hot bytes" report for code lines 8-11:
…
0x8048d5b: invoking malloc: [0x8,0xf]
…

Input size is 1024 bytes.
Checksum report for code lines 6-7 (the check that calls error()):
…
0x8048d4f: JZ: 1024: [0x0,0x3ff]
…

Checksum detector:
- Identifies potential checksum check points.
    - Observation: the recomputed checksum value depends on many input bytes.
    - Instruments conditional jumps. Before each one executes, checks whether the number of taint marks associated with the eflags register exceeds a threshold.
    - Problem: decompressed bytes can also depend on many input bytes.
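
A minimal sketch of the threshold test, under stated assumptions: the taint marks of eflags are modelled as one flag per input byte, and the names `taint_marks`, `count_marks` and `is_potential_check_point` are hypothetical. The input size of 1024 matches the example report in this deck.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INPUT_SIZE 1024   /* example input size used in the reports */

/* Assumption: marks[i] != 0 means eflags currently depends on input byte i. */
typedef struct {
    uint8_t marks[INPUT_SIZE];
} taint_marks;

size_t count_marks(const taint_marks *t) {
    assert(t != NULL);
    size_t n = 0;
    for (size_t i = 0; i < INPUT_SIZE; i++)
        n += (t->marks[i] != 0);
    return n;
}

/* Meant to be called just before a conditional jump executes. */
int is_potential_check_point(const taint_marks *eflags_taint, size_t threshold) {
    return count_marks(eflags_taint) > threshold;
}
```

A jump whose flags depend on all 1024 input bytes, like the `JZ` at 0x8048d4f in the example report, passes this test for any small threshold; a jump that depends on two bytes does not.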

Refinement:
- Well-formed inputs can pass the checksum test, but most malformed inputs cannot.
- Run well-formed inputs and identify the always-taken and always-not-taken conditional jump instructions.
- Run malformed inputs and identify the always-taken and always-not-taken instructions again.
- Identify the conditional jump instructions that behave completely differently when processing well-formed and malformed inputs.
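
The refinement can be pictured as simple bookkeeping per conditional jump. The sketch below is an assumption about how such bookkeeping might look, with hypothetical names (`branch_counts`, `record`, `classify`, `behaves_oppositely`); it is not the tool's implementation.

```c
#include <assert.h>

typedef enum { NEVER_SEEN, ALWAYS_TAKEN, ALWAYS_NOT_TAKEN, MIXED } behavior;

/* Outcome counts for one conditional jump over one set of runs. */
typedef struct {
    unsigned taken;
    unsigned not_taken;
} branch_counts;

void record(branch_counts *c, int was_taken) {
    if (was_taken)
        c->taken++;
    else
        c->not_taken++;
}

behavior classify(const branch_counts *c) {
    if (c->taken == 0 && c->not_taken == 0) return NEVER_SEEN;
    if (c->not_taken == 0) return ALWAYS_TAKEN;
    if (c->taken == 0) return ALWAYS_NOT_TAKEN;
    return MIXED;
}

/* A jump is kept as a checksum check candidate only if it is always taken
 * on one input set and always not taken on the other. */
int behaves_oppositely(const branch_counts *well_formed,
                       const branch_counts *malformed) {
    behavior w = classify(well_formed);
    behavior m = classify(malformed);
    return (w == ALWAYS_TAKEN && m == ALWAYS_NOT_TAKEN) ||
           (w == ALWAYS_NOT_TAKEN && m == ALWAYS_TAKEN);
}
```

One `branch_counts` pair would be kept per jump address; jumps that pass the taint-mark threshold but show mixed behaviour on either input set are filtered out.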

Checksum detector:
- Creates bypass rules (always-taken, always-not-taken).

Checksum report for code lines 6-7:
…
0x8048d4f: JZ: 1024: [0x0,0x3ff]
…

Bypass rule:
0x8048d4f: JZ: always-taken

Checksum detector:
- Checksum field identification: the input bytes that affect chksum_in_file (code lines 6-7) are the checksum field.

- Generates malformed test cases and feeds them to the original or the instrumented program.
- According to the bypass rules, alters the execution trace at check points by setting the eflags register.
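
For a `JZ` check point, applying a bypass rule amounts to forcing the zero flag. The sketch below models this on a plain integer holding an EFLAGS value; the function name `apply_jz_bypass` and the enum are hypothetical, and a real tool would rewrite the flag inside the instrumented process rather than on a copy. The only hardware facts relied on are that ZF is bit 6 of EFLAGS and that `JZ` is taken when ZF is 1.

```c
#include <assert.h>
#include <stdint.h>

#define ZF_MASK 0x40u   /* zero flag: bit 6 of the x86 EFLAGS register */

typedef enum { RULE_ALWAYS_TAKEN, RULE_ALWAYS_NOT_TAKEN } bypass_rule;

/* Returns the EFLAGS value to install just before a JZ check point. */
uint32_t apply_jz_bypass(uint32_t eflags, bypass_rule rule) {
    if (rule == RULE_ALWAYS_TAKEN)
        return eflags | ZF_MASK;    /* ZF = 1, so JZ jumps */
    return eflags & ~ZF_MASK;       /* ZF = 0, so JZ falls through */
}
```

With the rule `0x8048d4f: JZ: always-taken`, the fuzzer would install `apply_jz_bypass(eflags, RULE_ALWAYS_TAKEN)` before that instruction, so the program follows the same path as for a well-formed input even though the checksum no longer matches.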

- All malformed test cases are constructed based on the "hot bytes" information, using attack heuristics:
    - bytes that influence memory allocation are set to small, large or negative values;
    - bytes that flow into string functions are replaced by characters such as %n, %p.
- Output: test cases that cause the program to crash or to consume 100% CPU.
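
The memory-allocation heuristic might be sketched as follows. The specific candidate values, the 4-byte little-endian field layout and the name `mutate_hot_field` are assumptions for illustration; the talk only states that such bytes are set to small, large or negative values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Candidate values: small, large, and negative when read as a signed 32-bit int. */
static const uint32_t ATTACK_VALUES[] = {
    0x00000000u, 0x00000001u, 0x7FFFFFFFu, 0x80000000u, 0xFFFFFFFFu
};
#define NUM_ATTACK_VALUES (sizeof ATTACK_VALUES / sizeof ATTACK_VALUES[0])

/* Overwrite the 4-byte field at `offset` with candidate number `which`.
 * Little-endian encoding is assumed. */
void mutate_hot_field(uint8_t *input, size_t input_len, size_t offset, size_t which) {
    assert(offset + 4 <= input_len);
    assert(which < NUM_ATTACK_VALUES);
    uint32_t v = ATTACK_VALUES[which];
    for (size_t i = 0; i < 4; i++)
        input[offset + i] = (uint8_t)(v >> (8 * i));
}
```

For a report such as `invoking malloc: [0x8,0xf]`, and assuming that range holds two 4-byte fields, a fuzzer would call `mutate_hot_field` at offsets 0x8 and 0xc with each candidate and leave every other input byte untouched, which is what keeps the mutation space small.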

[Figure: code lines 6-11 of decode_image, annotated with the three reports below]

Checksum report:
…
0x8048d4f: JZ: 1024: [0x0,0x3ff]
…
Bypass info:
…
"Hot bytes" report:
…

Before executing 0x8048d4f, the fuzzer sets the ZF flag in eflags to the opposite value.

- Fixing is expensive, so checksum fields are fixed only in test cases that caused a crash.
- How?
    Cr: raw data in the checksum field
    D: input data protected by the checksum field
    Checksum(): the complete checksum algorithm
    T: transformation
  We want to satisfy the constraint:
    Checksum(D) == T(Cr)

Using symbolic execution to solve:
    Checksum(D) == T(Cr)
- Checksum(D) is a runtime-determinable constant c, so the constraint becomes:
    c == T(Cr)
  Only Cr is a symbolic value.
- Common transformations (e.g. converting from hex/oct to decimal) can be solved by existing solvers (STP).
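
As a worked instance of c == T(Cr), suppose T parses an 8-digit hexadecimal text field into an integer. That choice of T, and the names `transform` and `repair_checksum_field`, are assumptions for illustration. The sketch inverts T by hand instead of calling a constraint solver, which is only possible because this particular T is trivially invertible.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* T: what the program does with the raw field (here: parse 8 hex digits). */
uint32_t transform(const char raw_field[9]) {
    return (uint32_t)strtoul(raw_field, NULL, 16);
}

/* Repair: choose Cr so that transform(Cr) == c, where c is the value of
 * Checksum(D) observed at run time. For hex text this is printing c. */
void repair_checksum_field(uint32_t c, char raw_field[9]) {
    int written = snprintf(raw_field, 9, "%08" PRIx32, c);
    assert(written == 8);   /* exactly 8 digits plus the terminator */
}
```

After `repair_checksum_field(c, field)`, writing `field` back into the crashing test case makes the original, unmodified program accept the input at the check point.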

If the new test case causes the original program to crash, a potential vulnerability is detected!

An incomplete list of applications: [table not recovered]

"Hot bytes" identification results for memory allocation: [table not recovered]

Checksum identification results (threshold = 16): [table not recovered]

27 previously unknown vulnerabilities, in:
MS Paint, Google Picasa, IrfanView, GStreamer, Amaya, Dillo, Adobe Acrobat, ImageMagick, Winamp, XEmacs, wxWidgets, PDFlib

Vulnerabilities detected by TaintScope: [table not recovered]

- TaintScope cannot deal with secure integrity check schemes (e.g. cryptographic hash algorithms, digital signatures): it is impossible to generate valid test cases.
- Limited effectiveness when all input data are encrypted (tracking decrypted data).
- Checksum check point identification can be affected by the quality of the inputs.
- Does not track control-flow propagation.
- Not all x86 instructions are instrumented by the execution monitor.

TaintScope can perform:
- Directed fuzzing
    - Identifies which bytes flow into system/library calls.
    - Dramatically reduces the mutation space.
- Checksum-aware fuzzing
    - Disables checksum checks by control-flow alteration.
    - Generates correct checksum fields in invalid inputs.
