Field Failure Reproduction Using Symbolic Execution and Genetic Programming

FIELD FAILURE REPRODUCTION
USING SYMBOLIC EXECUTION AND
GENETIC PROGRAMMING
Alessandro (Alex) Orso

School of Computer Science – College of Computing
Georgia Institute of Technology

Partially supported by: NSF, IBM, and MSR

Field failures are unavoidable!



TYPICAL DEBUGGING PROCESS

Bug Repository

Very hard to
(1) reproduce
(2) debug

Recent survey of
Apache, Eclipse, and Mozilla developers:

Information on how to reproduce field failures is the most
valuable, and difficult to obtain, piece of information for
investigating such failures.
[Zimmermann10]

OVERARCHING GOAL: help developers
(1) investigate field failures,
(2) understand their causes, and
(3) eliminate such causes.

OUR WORK SO FAR
• Recording and replaying executions [ICSM 2007, ICSE 2007]
• Input minimization [WODA 2006, ICSE 2007]
• Input anonymization [ICSE 2011]
• Mimicking field failures [ICSE 2012, ICST 2014]
• Explaining field failures [ISSTA 2013, TR]

MIMICKING FIELD FAILURES
User run (R), in the field, ends in failure F
Mimicked run (R’), in house, ends in failure F’

• F’ is analogous to F
• R’ is an actual execution
• Relevant events (breadcrumbs) from R are used to obtain R’

OVERALL VISION
[Diagram. In the field: Application + Instrumentation -> Crash report (execution data); the slide shows sample entries from sed.c. In house (software developer): Crash report -> Field Failure Reproduction -> Synthesized Executions -> Field Failure Debugging -> Likely faults.]

BUGREDUX/SBFR
[Diagram: Crash report (execution data) -> Field Failure Reproduction (DSE, SBST) -> Synthesized Executions]

BUGREDUX
Joint work with Wei Jin

[Diagram: Crash report (execution data) -> Input generator -> Candidate input -> Oracle -> Test Input (Synthesized Executions)]

• Execution data
  • Point of failure (POF)
  • Failure call stack
  • Call sequence
  • Complete trace
• Input generation technique
  • Guided symbolic execution
ALGORITHM (SIMPLIFIED)
Input
    icfg for P
    goals (list of code locations)
Output
    If (candidate input)

statesSet = {<cl, pc, ss, goal>}

Main algorithm
    init; currGoal = first(goals)
    repeat
        currState = SelNextState()
        if (!currState) backtrack or fail
        if (currState.cl == currGoal)
            if (currGoal == last(goals))
                return solve(currState.pc)
            else
                currGoal = next(goals)
                currState.goal = currGoal
        symbolicallyExec(currState)

SelNextState
    minDis = ∞
    retState = null
    foreach state in statesSet
        if (state.goal == currGoal)
            if (state.cl can reach currGoal)
                d = |shortest path state.cl, currGoal|
                if d < minDis
                    minDis = d
                    retState = state
    return retState

ALGORITHM (SIMPLIFIED): OPTIMIZATIONS/HEURISTICS
• Dynamic tainting to reduce the symbolic input space
• Program analysis information to prune the search space
• Some randomness in the shortest path computation
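
The state-selection loop can be tried out on a toy graph. The sketch below is a stand-in, not BugRedux itself: the `ICFG` dictionary and its branch labels are invented, "symbolic execution" is reduced to appending a branch label to the path condition, the symbolic state `ss` is omitted, and `solve` is replaced by returning the path condition. What it does follow from the slide is the goal-by-goal search that always expands the state closest to the current goal.

```python
from collections import deque
from dataclasses import dataclass
import math

# Hypothetical toy ICFG: node -> list of (branch_label, successor).
ICFG = {
    "entry": [("x>0", "a"), ("x<=0", "b")],
    "a": [("", "c")],
    "b": [("", "exit")],
    "c": [("y==1", "fail"), ("y!=1", "exit")],
    "fail": [],
    "exit": [],
}

@dataclass(frozen=True)
class State:
    cl: str    # current code location
    pc: tuple  # path condition: branch labels taken so far
    goal: str  # goal this state is aiming for

def distance(src, dst):
    """BFS shortest-path length from src to dst; math.inf if unreachable."""
    seen, queue = {src}, deque([(src, 0)])
    while queue:
        node, d = queue.popleft()
        if node == dst:
            return d
        for _, nxt in ICFG[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, d + 1))
    return math.inf

def sel_next_state(states, curr_goal):
    """Pick the state with the current goal that is closest to that goal."""
    best, min_dis = None, math.inf
    for s in states:
        if s.goal == curr_goal:
            d = distance(s.cl, curr_goal)
            if d < min_dis:  # unreachable states (d == inf) are never picked
                best, min_dis = s, d
    return best

def guided_search(goals):
    idx = 0
    states = {State("entry", (), goals[0])}
    while True:
        curr = sel_next_state(states, goals[idx])
        if curr is None:
            return None  # the slide's version would backtrack or fail here
        states.discard(curr)
        if curr.cl == goals[idx]:
            if idx == len(goals) - 1:
                return curr.pc  # stand-in for solve(currState.pc)
            idx += 1
            curr = State(curr.cl, curr.pc, goals[idx])
        # Stand-in for symbolicallyExec: fork one state per outgoing branch.
        for label, nxt in ICFG[curr.cl]:
            pc = curr.pc + ((label,) if label else ())
            states.add(State(nxt, pc, curr.goal))

print(guided_search(["a", "fail"]))
```

On this acyclic toy graph the loop always terminates; a real implementation would also need the backtracking step and a bound on exploration.
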

BUGREDUX EVALUATION – FAILURES CONSIDERED
Name        Repository   Size (KLOC)   # Faults
sed         SIR          14            2
grep        SIR          10            1
gzip        SIR          5             2
ncompress   BugBench     2             1
polymorph   BugBench     1             1
aeon        exploit-db   3             1
glftpd      exploit-db   6             1
htget       exploit-db   3             1
socat       exploit-db   35            1
tipxd       exploit-db   7             1
aspell      exploit-db   0.5           1
exim        exploit-db   241           1
rsync       exploit-db   67            1
xmail       exploit-db   1             1

None of the faults can be discovered by vanilla KLEE with a timeout of 72 hours.

BUGREDUX EVALUATION – RESULTS
One of three outcomes:
✘: fail
∼: synthesize
✔: (synthesize and) mimic

Name        POF    Call Stack   Call Seq.   Compl. Trace
sed #1      ✘      ✘            ✔           ✘
sed #2      ✘      ✘            ✔           ✘
grep        ✘      ∼            ✔           ✘
gzip #1     ✔      ✔            ✔           ✘
gzip #2     ∼      ∼            ✔           ✘
ncompress   ✔      ✔            ✔           ✘
polymorph   ✔      ✔            ✔           ✘
aeon        ✔      ✔            ✔           ✔
rsync       ✘      ✘            ✔           ✘
glftpd      ✔      ✔            ✔           ✘
htget       ∼      ∼            ✔           ✘
socat       ✘      ✘            ✔           ✘
tipxd       ✔      ✔            ✔           ✘
aspell      ∼      ∼            ✔           ✘
xmail       ✘      ✘            ✔           ✘
exim        ✘      ✘            ✔           ✔

Synth.:     9/16   10/16        16/16       2/16
Mimic:      6/16   6/16         16/16       2/16

Observations:
• Faults can be distant from the failure points => POFs and call stacks unlikely to help
• More information is not always better
• Symbolic execution can be a limiting factor

Symbolic execution can be less effective for:
• programs with highly structured inputs
• programs that interact with external libraries
• large complex programs in general

SBFR
Joint work with Kifetew, Jin, Tiella, Tonella

[Diagram: Crash report (execution data) + Grammar (<a> ::= <b> | λ) -> Derivation Tree -> Genetic Programming -> Test Input]

• Execution data
  • Call sequence
• Input generation technique
  • Genetic Programming

Sentence derivation from the grammar:
Random application of grammar rules
• Uniform
• 80/20
• Stochastic (from a corpus)
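
Random sentence derivation can be sketched in a few lines. The toy grammar, the `derive` function, and the policy names below are illustrative and not taken from SBFR. In particular, the sketch assumes that "80/20" means a non-recursive production is chosen with probability 0.8 and a recursive one with probability 0.2, which keeps generated sentences short; the corpus-based stochastic variant is not shown.

```python
import random

# Hypothetical toy grammar: only <expr> has (directly) recursive productions.
GRAMMAR = {
    "<expr>": [["<expr>", "+", "<num>"], ["(", "<expr>", ")"], ["<num>"]],
    "<num>": [["0"], ["1"], ["2"]],
}

def derive(symbol, rng, policy="80/20"):
    """Expand `symbol` into a sentence by randomly applying grammar rules."""
    if symbol not in GRAMMAR:
        return symbol  # terminal symbol
    productions = GRAMMAR[symbol]
    recursive = [p for p in productions if symbol in p]
    non_recursive = [p for p in productions if symbol not in p]
    if policy == "80/20" and recursive and non_recursive:
        # Assumed reading of 80/20: prefer non-recursive productions.
        pool = non_recursive if rng.random() < 0.8 else recursive
    else:
        pool = productions  # "uniform": every production equally likely
    production = rng.choice(pool)
    return "".join(derive(s, rng, policy) for s in production)

rng = random.Random(0)  # seeded only to make runs repeatable
for policy in ("uniform", "80/20"):
    print(policy, [derive("<expr>", rng, policy) for _ in range(3)])
```

Under the uniform policy the recursive productions of `<expr>` are picked two times out of three, so sentences tend to be longer than under 80/20.
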

Evolution:
• Fitness function: distance between execution traces (candidate vs. actual failure)
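
One way to measure the distance between two execution traces is the edit distance between their call sequences. This is an assumption made for illustration: the slide only says "distance between execution traces", and the function names in the sample traces are made up.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute, free if equal
            ))
        prev = curr
    return prev[-1]

def fitness(candidate_calls, failure_calls):
    """Lower is better; 0 means the candidate follows the failing call sequence."""
    return edit_distance(candidate_calls, failure_calls)

failure = ["main", "parse", "eval", "alloc_array"]
print(fitness(["main", "parse"], failure))
print(fitness(failure, failure))
```

A candidate whose call sequence already shares a prefix with the failing run scores better than an unrelated one, which gives the search a gradient to follow.
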

Stopping criterion:
• Success
  • Ic reaches the point of failure
  • The program fails “in the same way”
• Search budget exhausted
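
The two stopping conditions can be made concrete with a generic search loop. This is a simplified sketch, not SBFR's genetic programming: it uses mutation and truncation selection only, candidates are plain integers rather than derivation trees, and `is_success` stands in for "the candidate reaches the point of failure and the program fails in the same way". All names are hypothetical. The budget counts unique fitness evaluations.

```python
import random

def search(population, mutate, fitness, is_success, budget, rng):
    """Return (candidate, evaluations) on success, (None, evaluations) otherwise."""
    cache = {}  # candidate -> fitness, so repeated candidates cost nothing

    def evaluate(candidate):
        if candidate not in cache:
            cache[candidate] = fitness(candidate)
        return cache[candidate]

    while True:
        for candidate in population:
            if candidate not in cache and len(cache) >= budget:
                return None, len(cache)  # stop: search budget exhausted
            evaluate(candidate)
            if is_success(candidate):
                return candidate, len(cache)  # stop: success
        # Keep the better half (lower fitness) and refill with mutants.
        ranked = sorted(population, key=evaluate)
        survivors = ranked[: max(1, len(ranked) // 2)]
        population = survivors + [mutate(rng.choice(survivors), rng) for _ in survivors]

# Toy usage: find the integer 10 starting from 0.
step = lambda c, r: c + r.choice([-1, 1])
found, used = search([0], step, lambda c: abs(c - 10), lambda c: c == 10,
                     budget=1000, rng=random.Random(1))
print(found, used)
```

Caching by candidate is what makes the budget a count of unique evaluations; without it, re-evaluating survivors in every generation would consume the budget much faster.
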


SBFR EVALUATION – FAILURES CONSIDERED
Name    Language   Size (KLOC)   # Productions   # Faults
calc    Java       2             38              2
bc      C          12            80              1
MSDL    Java       13            140             5
PicoC   C          11            194             1
Lua     C          17            106             2

BugRedux was unable to reproduce any of these failures with a timeout of 72 hours.

SBFR EVALUATION – RESULTS
• Parameters:
  • Population: 500
  • Budget: 10,000 unique fitness evaluations
• Performed 10 runs
• Measured failure reproduction probability (FRP)
• Used both 80/20 and stochastic derivations

Name         FRP (SBFR)   FRP (Random)
calc bug 1   0.6          0.0
calc bug 2   0.8          0.0
bc           1.0          0.0
MSDL bug 1   1.0          0.0
MSDL bug 2   1.0          0.0
MSDL bug 3   1.0          1.0
MSDL bug 4   1.0          0.0
MSDL bug 5   1.0          0.0
PicoC        0.8          0.1
Lua bug 1    0.0          0.0
Lua bug 2    0.5          0.0
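
The failure reproduction probability can be read as the fraction of independent runs that reproduce the failure. That definition is an assumption here, since the slides name the measure without defining it, and the helper below is hypothetical.

```python
def failure_reproduction_probability(outcomes):
    """Fraction of runs that reproduced the failure (outcomes: iterable of bool)."""
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("at least one run is required")
    return sum(1 for reproduced in outcomes if reproduced) / len(outcomes)

# Under this reading, 6 successful runs out of 10 give an FRP of 0.6.
runs = [True] * 6 + [False] * 4
print(failure_reproduction_probability(runs))
```

With only 10 runs per failure, FRP values can only take multiples of 0.1, so small differences between entries should be interpreted with care.
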

Example: failure in bc
Segmentation fault triggered by an instruction sequence that allocates at least 32 arrays and declares a number of variables higher than the number of allocated arrays

Observations:
• Search-based approaches can be effective in cases that symbolic execution cannot handle
• Such approaches are scalable, but less directed […]
=> SBST and DSE are complementary, rather than alternative, techniques

FUTURE WORK / FOOD FOR THOUGHT
• Relevant execution data identification
  • Which types?
  • Which specific ones?
• Failure explanation
  • Reproduction is not enough
  • Can DSE and SBST help?
• Use of different input generation techniques
  • Grammar-based symbolic execution
  • Backward symbolic analysis?
  • Other SBST approaches?
  • SBST targeted at different kinds of programs?
  • Combination of techniques
