We present our implementation of, and our reflections on, a preregistration-based publication process for the fuzzing community, with a pre-stage at the FUZZING workshop (https://fuzzingworkshop.github.io/), plus Stage 1 and Stage 2 at ACM Transactions on Software Engineering and Methodology (TOSEM; https://dl.acm.org/journal/tosem/registered-papers).
An Implementation of Preregistration
1. We think that the incentive structure for fuzzing research is broken;
so we would like to introduce preregistration to fix this.
Preregistration
Stage 1 Stage 2
Stage 1
• Establish significance.
• Motivate the problem.
• Establish novelty.
• Discuss hypothesis for solution.
• Discuss related work.
• Establish soundness.
• Experimental design.
• Research questions & claims.
• Benchmarks & baselines.
Outcomes of Stage 1:
In-principle Accepted! Go to Stage 2.
Major / Minor Revision. Back to Stage 1.
Rejected.
Stage 2
• Establish conformity.
• Execute agreed experimental protocol.
• Explain small deviations from protocol.
• Investigate unexpected results.
• Establish reproducibility.
• Submit evidence towards the key claims in the paper.
Outcomes of Stage 2:
Accept.
Major / Minor Revision: Explain deviations / unexpected results. Improve artifact / reproducibility.
Reject: Severe deviations from experimental protocol.
8. Why Preregistration
• To get your fuzzing paper published, you need strong positive results.
• We believe this unhealthy focus is a substantial inhibitor of scientific progress.
• Duplicated Efforts: Important investigations are never published.
• A hypothesis or approach may be perfectly reasonable and scientifically appealing.
If the hypothesis proves to be invalid or the approach ineffective, other groups will never know.
• Overclaims: Incentive to overclaim the benefits of an approach.
• Results are difficult to reproduce, which misinforms future investigations by the community.
• Authors are uncomfortable sharing their research prototypes.
In 2020, only 35 of 60 fuzzing papers we surveyed published code with the paper.
14. Why Preregistration
• Sound fuzzer evaluation imposes a high barrier to entry for newcomers.
1. Well-designed experiment methodology.
2. Substantial computation resources.
• Huge variance due to randomness
• Repeat 20x, 24hrs, X fuzzers, Y programs
• Statistical significance, effect size
• CPU centuries.
Many pitfalls of experimental design! Newcomers find out
only when receiving the reviews and after conducting
costly experiments following a flawed methodology.
Symptomatic plus-one comments.
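To make the resource argument concrete, here is a minimal sketch of the arithmetic behind "Repeat 20x, 24hrs, X fuzzers, Y programs", together with one possible effect size. The function names, the values chosen for X and Y, and the choice of the Vargha-Delaney A12 statistic are assumptions for illustration; the slides do not name a specific effect size measure or fix X and Y.

```python
from itertools import product

HOURS_PER_YEAR = 24 * 365


def campaign_cpu_years(trials, hours_per_trial, n_fuzzers, n_programs,
                       cores_per_trial=1):
    """CPU-years for a full-factorial campaign (every fuzzer on every program).

    Hypothetical helper: assumes each trial occupies `cores_per_trial` cores
    for the whole duration.
    """
    cpu_hours = (trials * hours_per_trial * n_fuzzers * n_programs
                 * cores_per_trial)
    return cpu_hours / HOURS_PER_YEAR


def vargha_delaney_a12(xs, ys):
    """Vargha-Delaney A12: chance that a value drawn from xs beats one from ys.

    Ties count as half a win. 0.5 means no difference between the samples;
    values near 1.0 (or 0.0) mean xs is almost always larger (or smaller).
    """
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    wins = sum(1.0 if x > y else 0.5 if x == y else 0.0
               for x, y in product(xs, ys))
    return wins / (len(xs) * len(ys))


if __name__ == "__main__":
    # Placeholder values for X and Y; substitute the real campaign size.
    years = campaign_cpu_years(trials=20, hours_per_trial=24,
                               n_fuzzers=10, n_programs=20)
    print(f"CPU-years for one campaign: {years:.2f}")

    # Made-up coverage numbers for two fuzzers, for illustration only.
    new_fuzzer = [3, 4, 5]
    baseline = [1, 2, 3]
    print(f"A12 effect size: {vargha_delaney_a12(new_fuzzer, baseline):.3f}")
```

The cost grows multiplicatively in every factor, so rerunning a campaign after a methodological flaw is found in review repeats the whole bill. This is the expense that early Stage 1 feedback is meant to avoid.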
15. Why Preregistration
• Address both issues by switching to a 2-stage publication process that
separates the review of (i) the methodology & ideas and (ii) the evidence.
• If a Registered Report is in-principle accepted and the proposed experimental design is
followed without unexplained deviations, results will be accepted as they are.
• Minimizes incentive to overclaim (while not reducing quality of evaluation).
• Allows publication of interesting ideas and investigations irrespective of results.
• Early feedback for newcomers.
• On significance and novelty of the problem/approach/hypothesis.
• On soundness and reproducibility of experimental methodology.
• To further lower the barrier, Google pledges help with fuzzer evaluation via FuzzBench.
• We hope our initiative will turn the focus of the peer-reviewing process
back to the innovation and key claims in a paper, while leaving the burden of
evidence until after the in-principle acceptance.
• Reviewers go from gate-keeping to productive feedback.
Authors and reviewers work together to ensure the best possible study design.
23. Why Preregistration
• What do you see as the main strengths of the model?
• More reproducibility.
• Fewer overclaims, mitigates publication bias, less unhealthy focus on positive results.
• Publications are more sound. Publication process is more fair.
• Allows interesting negative results, no forced positive result, less duplicated effort.
• Ideas and methodology above positive results.
“The main draws for me are the removal of the unhealthy focus on positive results
(bad for students, bad for reproducibility, bad for impact) as well as the fact that
the furthering of the field is pushed forward with negative results regarding newly
attempted studies that have already been performed by others. Lastly, it removes
the questionable aspect of changing the approach until something working
appears, with no regard for a validation step. In ML lingo, we only have a test set,
no validation set, and are implicitly overfitting to it with our early stopping.”
26. Why Preregistration
• What do you see as the main weaknesses of the model?
• Time to publish is too long. Increased author / reviewing load.
“At first hand maybe longer publication process because of the pre-registration,
but overall it could be even faster, when someone also includes the time for
rejection and re-work etc.”
• Sound experimental designs may be hard to create and vet / review.
• For the first time, preregistration enables conversations about the soundness of
experimental design. It naturally creates and communicates community standards.
• Previously, experimental design was either accepted as is
or criticized with a high cost to authors.
• Is the model flexible enough to accommodate changes in experimental design?
• Yes. Deviations from the agreed protocol are allowed but must be explained.
• Ideas that look bad theoretically may work well in practice.
• Without performing the experiment, we can't say if it could be useful or not.
• The model is not meant to replace the traditional publication model, but to augment it.
• This model might not work very well for exploratory research (hypothesis generation).
• This model might work better for confirmatory research (hypothesis testing).
33. Why Preregistration
• In your opinion, how could this publication model be improved?
• Stage 2 publication in conference, instead of a journal.
• We see a conference as a forum for discussion (which happens in this workshop).
• Maybe Stage 1 in conference, Stage 2 in journal (+ conference presentation)?
• Fast-track through Stage 1 and Stage 2 when results exist.
• Sounds like a more traditional publication, not preregistration :)
• Flexible author-list within reason, to incentivize post-announcement collaboration.
• Preregistration (where Stage 1 is published) would also allow early deconflicting or
lead to increased collaboration between people with similar ideas and goals.
Your thoughts or experience?