Penumbra:
Automatically Identifying
Failure-Relevant Inputs
James Clause and Alessandro Orso
College of Computing
Georgia Institute of Technology
Supported in part by:
NSF awards CCF-0725202 and CCF-0541080
to Georgia Tech
Automated Debugging
Code-centric techniques:
• Gupta and colleagues ’05
• Jones and colleagues ’02
• Korel and Laski ’88
• Liblit and colleagues ’05
• Nainar and colleagues ’07
• Renieris and Reiss ’03
• Seward and Nethercote ’05
• Tucek and colleagues ’07
• Weiser ’81
• Zhang and colleagues ’05
• Zhang and colleagues ’06
• ...
What about the inputs that cause the failure?
Data-centric Techniques
• Chan and Lakhotia ’98
• Zeller and Hildebrandt ’02 (Delta Debugging)
• Misherghi and Su ’06
Delta Debugging requires:
1. Multiple executions
2. Large amounts of manual effort (oracle creation, setup)
Penumbra requires:
1. A single execution
2. Reduced manual effort
with comparable performance.
Intuition and Terminology
Failure-revealing input vector
Failure-relevant subset
(inputs that are useful for investigating the failure)
Approximate failure-relevant subsets by identifying inputs that reach the failure along program dependencies.
Motivating Example: fileinfo

int main(int argc, char **argv) {
 1.   int verbose, i, total_size = 0;
 2.   struct stat buf;
 3.   verbose = atoi(argv[1]);
 4.   for(i = 2; i < argc; i++) {
 5.     int fd = open(argv[i], O_RDONLY);
 6.     fstat(fd, &buf);
 7.     char *out = malloc(60);
 8.     sprintf(out, "%d", buf.st_size);
 9.     if(verbose) {
10.       char *pview = malloc(51);
11.       read(fd, pview, 50);
12.       pview[50] = '\0';
13.       strcat(out, pview);
14.     }
15.     printf("%s: %s\n", argv[i], out);
16.     total_size += buf.st_size;
17.   }
18.   printf("total: %d\n", total_size);
}

The input vector consists of:
• Command-line arguments (flag, list of file names)
• File statistics, for each file (size, last modified date, ...)
• File contents, for each file (first 50 characters)

The failure: out is overflowed at line 13 when buf.st_size ≥ 1GB, verbose is true, and 50 characters are read.

Three observations:
1. There are many more inputs than lines of code.
2. Understanding the failure requires tracing interactions between inputs from multiple sources.
3. Only a small percentage of all inputs are relevant for the failure.
Penumbra Overview

Example: fileinfo is run on three files, Foo (512B), Bar (1KB), and Baz (1.5GB), and produces the output
foo: 512 ... bar: 1024 ... baz: 150... total: 150...

The relevant context specifies:
1. When the failure occurs.
2. Which data are involved in the failure.
Here the relevant context is line 13 (strcat(out, pview)); in general, it is chosen using traditional debugging methods.

Penumbra then proceeds in three steps:
1. Taint inputs: each input is assigned a taint mark (ten marks in the example).
2. Propagate taint marks during execution.
3. Identify relevant inputs: the taint marks that reach the relevant context (marks 0, 8, and 9 in the example) map back to the failure-relevant inputs: verbose is true, 50 characters are read, and buf.st_size ≥ 1GB.
Outline
• Penumbra approach
1. Tainting inputs
2. Propagating taint marks
3. Identifying relevant inputs
• Evaluation
• Conclusions and future work
1: Tainting Inputs

Assign a taint mark to each input as it enters the application. When a taint mark is assigned to an input, log the input’s value and where the input was read from.

Taint marks can be assigned at three granularities:
• Per-byte (e.g., data read from files): assign a unique taint mark to each byte. Provides precise identification, but is unnecessarily expensive.
• Per-entity (e.g., argv, argc, fstat, ...): assign the same taint mark to related bytes. Maintains per-byte precision while increasing scalability.
• Domain-specific: assign taint marks based on user-provided information. Maintains per-byte precision and further increases scalability.
2: Propagating Taint Marks

Data-flow propagation (DF): taint marks flow along only data dependencies. For example, in
C = A + B;
if A carries mark 1 and B carries mark 2, then C receives marks {1, 2}.

Data- and control-flow propagation (DF + CF): taint marks flow along data and control dependencies. In
if(X) {
  C = A + B;
}
if X also carries mark 3, then C receives marks {1, 2, 3}.

The effectiveness of each option depends on the particular failure.
3: Identifying Relevant Inputs
1. The relevant context indicates which data are involved in the considered failure.
2. Identify which taint marks are associated with the data indicated by the relevant context.
3. Use the recorded logs to reconstruct the inputs identified by those taint marks.
Prototype Implementation

The prototype consists of two components:
• Trace generator: takes the executable and an input vector and produces an execution trace. It is implemented using Dytan, a generic x86 tainting framework developed in previous work [Clause and Orso 2007].
• Trace processor: takes the trace and a relevant context and produces the failure-relevant input subsets (DF and DF + CF).
Evaluation

Study 1: Effectiveness for debugging real failures
Study 2: Comparison with Delta Debugging

Subjects:

Application   Version   KLoC    Fault location
bc            1.06      10.5    more_arrays:177
gzip          1.24      6.3     get_istat:828
ncompress     4.24      1.4     comprexx:896
pine          4.44      239.1   rfc822_cat:260
squid         2.3       69.9    ftpBuildTitleUrl:1024

We selected a failure-revealing input vector for each subject.
Data Generation

Penumbra:
• Setup (manual): choose a relevant context.
  - Location: the statement where the failure occurs.
  - Data: any data read by that statement.
• Execution (automated): use the prototype tool to identify failure-relevant inputs (DF and DF + CF).

Delta Debugging:
• Setup (manual): create an automated oracle.
  - Use gdb to inspect the stack trace and program data.
  - One-second timeout to prevent incorrect results.
• Execution (automated): use the standard Delta Debugging implementation to minimize the inputs.
Study 1: Effectiveness
Is the information that
Penumbra provides helpful for
debugging real failures?
Study 1 Results: gzip & ncompress

Crash when a file name is longer than 1,024 characters.

Input vector: the command line (./gzip, foo, bar, and a long file name) plus the contents and attributes of each file.

# Inputs: 10,000,056
# Relevant (DF): 1
# Relevant (DF + CF): 3
Study 1 Results: pine

Crash when a “from” field contains 22 or more double-quote characters.

...
From clause@boar Tue Feb 20 11:49:53 2007
Return-Path: <clause@boar>
X-Original-To: clause
Delivered-To: clause@boar
Received: by boar (Postfix, from userid 1000)
 id 88EDD1724523; Tue, 20 Feb 2007 11:49:53 -0500 (EST)
To: clause@boar
Subject: test
Message-Id: <20070220164953.88EDD1724523@boar>
Date: Tue, 20 Feb 2007 11:49:53 -0500 (EST)
From: """"""""""""""""""""""""""""""""@host.fubar
X-IMAPbase: 1172160370 390
Status: O
X-Status:
X-Keywords:
X-UID: 5
...

The inputs identified as relevant (DF) are the double-quote characters in the From field.

# Inputs: 15,103,766
# Relevant (DF): 26
# Relevant (DF + CF): 15,100,344
Study 1: Conclusions
1. Data-flow propagation is always effective; data- and control-flow propagation is sometimes effective.
➡ Use data-flow first, then, if necessary, use control-flow.
2. Inputs identified by Penumbra correspond to the failure conditions.
➡ Our technique is effective in assisting the debugging of real failures.
Study 2: Comparison with Delta Debugging
RQ1: How much manual effort
does each technique require?
RQ2: How long does it take to
fix a considered failure given
the information provided by
each technique?
RQ1: Manual effort

Use setup time as a proxy for manual (developer) effort.

[Bar chart: setup time in seconds for Penumbra and Delta Debugging on gzip, ncompress, bc, pine, and squid; Delta Debugging’s setup times range from 1,800s to 12,600s.]

Penumbra requires considerably less setup time than Delta Debugging (although more time overall for gzip and ncompress).
RQ2: Debugging Effort

Use the number of relevant inputs as a proxy for debugging effort.

Subject      Penumbra (DF)   Penumbra (DF + CF)   Delta Debugging
bc           209             743                  285
gzip         1               3                    1
ncompress    1               3                    1
pine         26              15,100,344           90
squid        89              2,056                —

• Penumbra (DF) is comparable to (slightly better than) Delta Debugging.
• Penumbra (DF + CF) is likely less effective for bc, pine, and squid.
Conclusions & Future Work
• Novel technique for identifying failure-relevant
inputs.
• Overcomes limitations of existing approaches
• Single execution
• Minimal manual effort
• Comparable effectiveness
• Combine Penumbra with existing code-centric
techniques.

Penumbra: Automatically Identifying Failure-Relevant Inputs (ISSTA 2009)

  • 1. Penumbra: Automatically Identifying Failure-Relevant Inputs James Clause and Alessandro Orso College of Computing Georgia Institute of Technology Supported in part by: NSF awards CCF-0725202 and CCF-0541080 to Georgia Tech
  • 2. Automated Debugging • Gupta and colleagues ’05 • Jones and colleagues ’02 • Korel and Laski ’88 • Liblit and colleagues ’05 • Nainar and colleagues ’07 • Renieris and Reiss ’03 • Seward and Nethercote ’05 • Tucek and colleagues ’07 • Weiser ’81 • Zhang and colleagues ’05 • Zhang and colleagues ’06 • ...
  • 3. Automated Debugging Code-centric • Gupta and colleagues ’05 • Jones and colleagues ’02 • Korel and Laski ’88 • Liblit and colleagues ’05 • Nainar and colleagues ’07 • Renieris and Reiss ’03 • Seward and Nethercote ’05 • Tucek and colleagues ’07 • Weiser ’81 • Zhang and colleagues ’05 • Zhang and colleagues ’06 • ...
  • 4. Automated Debugging Code-centric • Gupta and colleagues ’05 • Jones and colleagues ’02 • Korel and Laski ’88 • Liblit and colleagues ’05 • Nainar and colleagues ’07 • Renieris and Reiss ’03 • Seward and Nethercote ’05 • Tucek and colleagues ’07 • Weiser ’81 • Zhang and colleagues ’05 • Zhang and colleagues ’06 • ... What about inputs which cause the failure?
  • 5. • Chan and Lakhotia ’98 • Zeller and Hildebrandt ’02 • Misherghi and Su ’06 Data-centric Techniques
  • 6. • Chan and Lakhotia ’98 • Zeller and Hildebrandt ’02 • Misherghi and Su ’06 Delta Debugging Data-centric Techniques
  • 7. • Chan and Lakhotia ’98 • Zeller and Hildebrandt ’02 • Misherghi and Su ’06 Delta Debugging Data-centric Techniques Requires: 1. Multiple executions 2. Large amounts of manual effort (oracle creation, setup)
  • 8. • Chan and Lakhotia ’98 • Zeller and Hildebrandt ’02 • Misherghi and Su ’06 Delta Debugging Data-centric Techniques Requires: 1. Multiple executions 2. Large amounts of manual effort (oracle creation, setup) Penumbra
  • 9. • Chan and Lakhotia ’98 • Zeller and Hildebrandt ’02 • Misherghi and Su ’06 Delta Debugging Data-centric Techniques Requires: 1. Multiple executions 2. Large amounts of manual effort (oracle creation, setup) Penumbra Comparable performance
  • 10. • Chan and Lakhotia ’98 • Zeller and Hildebrandt ’02 • Misherghi and Su ’06 Delta Debugging Data-centric Techniques Requires: 1. Multiple executions 2. Large amounts of manual effort (oracle creation, setup) Requires: 1. Single execution 2. Reduced manual effort Penumbra Comparable performance
  • 12. Intuition and Terminology Failure-revealing input vector Failure-relevant subset (inputs which are useful for investigating the failure)
  • 13. Intuition and Terminology Failure-revealing input vector Failure-relevant subset (inputs which are useful for investigating the failure) Approximate failure-relevant subsets by identifying inputs that reach the failure along program dependencies.
  • 14. Motivating Example int main(int argc, char **argv) { 1. int verbose, i, total_size = 0; 2. struct stat buf; 3. verbose = atoi(argv[1]); 4. for(i = 2; i < argc; i++) { 5. int fd = open(argv[i], O_RDONLY); 6. fstat(fd, &buf); 7. char *out = malloc(60); 8. sprintf(out, "%d", buf.st_size); 9. if(verbose) { 10. char *pview = malloc(51); 11. read(fd, pview, 50); 12. pview[50] = '0'; 13. strcat(out, pview); 14. } 15. printf("%s: %sn", argv[i], out); 16. total_size += buf.st_size; 17. } 18. printf("total: %dn", total_size); } fileinfo
• 15–18. Motivating Example (code as on slide 14): the input vector comprises the command-line arguments (verbose flag, list of file names), the file statistics for each file (size, last-modified date, ...), and the file contents for each file (first 50 characters).
• 19–24. The failure: strcat overflows out when buf.st_size ≥ 1GB, verbose is true, and 50 characters are read into pview.
• 25–27. Three observations: 1. There are many more inputs than lines of code. 2. Understanding the failure requires tracing interactions between inputs from multiple sources. 3. Only a small percentage of all inputs are relevant for the failure.
• 28–39. Penumbra Overview: running fileinfo on Foo (512B), Bar (1KB), and Baz (1.5GB) prints "foo: 512 ...", "bar: 1024 ...", "baz: 150..." and fails at "total: 150...".
• The relevant context specifies 1. when the failure occurs and 2. which data are involved in the failure — here, statement 13, strcat(out, pview). In general, it is chosen using traditional debugging methods.
• The technique then proceeds in three steps: 1. taint the inputs (each input receives a taint mark as it enters the application), 2. propagate the taint marks during execution, and 3. identify the relevant inputs from the marks that reach the relevant context — here, verbose is true, 50 characters were read, and buf.st_size ≥ 1GB.
  • 40. Outline • Penumbra approach 1. Tainting inputs 2. Propagating taint marks 3. Identifying relevant inputs • Evaluation • Conclusions and future work
• 41–52. 1: Tainting Inputs — assign a taint mark to each input as it enters the application, at one of three granularities:
• Per-byte: assign a unique taint mark to each byte (read from files). Precise identification, but unnecessarily expensive.
• Per-entity: assign the same taint mark to related bytes (argv, argc, fstat, ...). Maintains per-byte precision and increases scalability.
• Domain-specific: assign taint marks based on user-provided information. Maintains per-byte precision and further increases scalability.
When a taint mark is assigned to an input, log the input’s value and where the input was read from.
• 54–63. 2: Propagating Taint Marks
• Data-flow propagation (DF): taint marks flow along only data dependencies. For C = A + B;, if A carries mark 1 and B carries mark 2, then C receives marks 1 and 2.
• Data- and control-flow propagation (DF + CF): taint marks also flow along control dependencies. For if(X) { C = A + B; }, with X carrying mark 3, C receives marks 1, 2, and 3.
The effectiveness of each option depends on the particular failure.
• 64. 3: Identifying Relevant Inputs
1. The relevant context indicates which data are involved in the considered failure.
2. Identify which taint marks are associated with the data indicated by the relevant context.
3. Use the recorded logs to reconstruct the inputs identified by those taint marks (here, the 1.5GB file Baz).
• 66–69. Prototype Implementation: a trace generator takes the input vector and the executable and produces an execution trace; a trace processor takes the trace and the relevant context and produces the input subsets (DF and DF + CF). Implemented using Dytan, a generic x86 tainting framework developed in previous work [Clause and Orso 2007].
• 70–72. Evaluation
Study 1: Effectiveness for debugging real failures
Study 2: Comparison with Delta Debugging
Subjects (we selected a failure-revealing input vector for each):
Application     KLoC    Fault location
bc 1.06         10.5    more_arrays:177
gzip 1.24       6.3     get_istat:828
ncompress 4.24  1.4     comprexx:896
pine 4.44       239.1   rfc822_cat:260
squid 2.3       69.9    ftpBuildTitleUrl:1024
• 73–79. Data Generation
Setup (manual):
• Penumbra: choose a relevant context — Location: the statement where the failure occurs; Data: any data read by that statement.
• Delta Debugging: create an automated oracle — use gdb to inspect the stack trace and program data, with a one-second timeout to prevent incorrect results.
Execution (automated):
• Penumbra: use the prototype tool to identify failure-relevant inputs (DF and DF + CF).
• Delta Debugging: use the standard Delta Debugging implementation to minimize inputs.
  • 80. Study 1: Effectiveness Is the information that Penumbra provides helpful for debugging real failures?
• 81–84. Study 1 Results: gzip & ncompress — crash when a file name is longer than 1,024 characters. The input vector (e.g., ./gzip foo bar <long filename>, plus the contents and attributes of each file) contains 10,000,056 inputs; Penumbra identifies 1 relevant input with DF and 3 with DF + CF.
• 85–89. Study 1 Results: pine — crash when a “from” field contains 22 or more double-quote characters. Failing message (excerpt):
From clause@boar Tue Feb 20 11:49:53 2007
Return-Path: <clause@boar>
X-Original-To: clause
Delivered-To: clause@boar
Received: by boar (Postfix, from userid 1000) id 88EDD1724523; Tue, 20 Feb 2007 11:49:53 -0500 (EST)
To: clause@boar
Subject: test
Message-Id: <20070220164953.88EDD1724523@boar>
Date: Tue, 20 Feb 2007 11:49:53 -0500 (EST)
From: """"""""""""""""""""""""""""""""@host.fubar
X-IMAPbase: 1172160370 390
Status: O
X-Status:
X-Keywords:
X-UID: 5
Of 15,103,766 inputs, Penumbra identifies 26 as relevant with DF but 15,100,344 with DF + CF.
• 91–92. Study 1: Conclusions
1. Data-flow propagation is always effective; data- and control-flow propagation is sometimes effective. ➡ Use data-flow first; then, if necessary, use control-flow.
2. The inputs identified by Penumbra correspond to the failure conditions. ➡ Our technique is effective in assisting the debugging of real failures.
  • 93. Study 2: Comparison with Delta Debugging RQ1: How much manual effort does each technique require? RQ2: How long does it take to fix a considered failure given the information provided by each technique?
• 94–98. RQ1: Manual effort — use setup-time as a proxy for manual (developer) effort.
[Bar chart: setup time in seconds for Penumbra vs. Delta Debugging on bc, gzip, ncompress, pine, and squid; Delta Debugging's setup times range from roughly 1,800 to 12,600 seconds, Penumbra's are far lower.]
Penumbra requires considerably less setup time than Delta Debugging (although more time overall for gzip and ncompress).
• 99–102. RQ2: Debugging Effort — use the number of relevant inputs as a proxy for debugging effort.
Subject     Penumbra (DF)  Penumbra (DF + CF)  Delta Debugging
bc          209            743                 285
gzip        1              3                   1
ncompress   1              3                   1
pine        26             15,100,344          90
squid       89             2,056               —
• Penumbra (DF) is comparable to (slightly better than) Delta Debugging.
• Penumbra (DF + CF) is likely less effective for bc, pine, and squid.
  • 103. Conclusions & Future Work • Novel technique for identifying failure-relevant inputs. • Overcomes limitations of existing approaches • Single execution • Minimal manual effort • Comparable effectiveness • Combine Penumbra with existing code-centric techniques.