This document summarizes a study on proof productivity in formal verification projects. It analyzes data from 9 projects (specifications and proofs) from the L4.verified work on the seL4 microkernel. The study found that total project effort is strongly correlated with final proof size, and that individual contributors' effort is likewise strongly correlated with the size of their proof output. Schedule pressure, overall difficulty, and team size may also affect effort, but the sample was too small for these effects to reach statistical significance. The study aims to improve understanding of cost effectiveness in formal proof engineering.
167 - Productivity for proof engineering
1. NICTA Copyright 2014
Productivity for Proof Engineering
M. Staples, R. Jeffery, J. Andronick, T. Murray, G. Klein, R. Kolanski
2. In the beginning
• Empirical software engineering
• Formal methods/verification
• Operating systems
• seL4 and L4.verified projects at UNSW/NICTA
• Goal – “An implementation correctness proof for seL4 with the kernel running on a mainstream embedded processor within 10% of the performance of L4.” Klein 2009.
3. History
• seL4 concluded successfully by end 2007
• 10,000 lines of C code
• 2.2 person years of effort
• L4.verified > 20 person years
• For cost-effective proof engineering, a key consideration is proof productivity.
5. This study - Specs
• Retrospective study of 9 projects from L4.verified.
• All used the Isabelle theorem prover.
• Three formal specifications of seL4 –
– Exec – models an executable representation of seL4’s design
– Abstract – complete functional specification
– CapDL – capabilities (access rights) between components
6. This study - Proofs
• Six proofs in total.
• Three refinement proofs –
– Code-to-exec
– Exec-to-abstract
– Abstract-to-CapDL
• Two security proofs –
– Info.flow
– Integrity
• One CapDL policy proof.
7. Measures
• Effort – in person weeks
• Output – lines of proof
• Other variables – maximum team size, schedule pressure, overall difficulty, and years of experience with Isabelle, with formal methods or theorem proving, and with the domain (operating systems).
8. The data
Project                       Final Size     Total Effort     Sched.     Overall   Max Team
                              (kilo lines    (person weeks)   Pressure   Diffic.   (headcount)
                              of proof)
CapDL Spec                    2.14           27.5             AV         LO        5
CapDL-policy proof            0.85           11.3             LO         AV        1
Abstract-to-CapDL Refinement  20.4           66               AV         AV        5
Integrity                     7.05           28.5             V.HI       HI        4
Info.Flow                     27.1           75.9             V.HI       V.HI      8
Exec-to-Abstract Refinement   96.6           368              HI         V.HI      6
Code-to-Exec Refinement       53.34          138              V.HI       HI        6
Exec Spec Haskell             6.01           92               AV         HI        1
Abstract Spec                 4.9            15.3             AV         AV        3
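As a rough illustration of how the table can be read, the sketch below divides total effort by final size for each project. This ratio (person weeks per kilo line) is a derived figure that the slides do not report, and the names PROJECTS and effort_per_kilo_line are hypothetical, introduced only for this example.

```python
# Rows transcribed from the data table:
# (project, final size in kilo lines, total effort in person weeks).
PROJECTS = [
    ("CapDL Spec", 2.14, 27.5),
    ("CapDL-policy proof", 0.85, 11.3),
    ("Abstract-to-CapDL Refinement", 20.4, 66),
    ("Integrity", 7.05, 28.5),
    ("Info.Flow", 27.1, 75.9),
    ("Exec-to-Abstract Refinement", 96.6, 368),
    ("Code-to-Exec Refinement", 53.34, 138),
    ("Exec Spec Haskell", 6.01, 92),
    ("Abstract Spec", 4.9, 15.3),
]


def effort_per_kilo_line(rows):
    """Return {project: person weeks per kilo line}.

    This is a derived ratio for illustration, not a measure from the study.
    """
    return {name: effort / size for name, size, effort in rows}


if __name__ == "__main__":
    ratios = effort_per_kilo_line(PROJECTS)
    # List projects from the lowest to the highest effort per kilo line.
    for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
        print(f"{name:<30} {ratio:6.2f} person weeks per kilo line")
```

On the table values as transcribed, the ratio runs from about 2.6 (Code-to-Exec Refinement) to about 15.3 (Exec Spec Haskell).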
10. Project relationships
• Total Project Effort (person weeks) = 9.98 + 3.35 × Final Size (kilo lines of proof); R² = 0.914, p < 0.001
• Possible outliers – the large Exec-to-Abstract refinement and the executable spec.
• Weak evidence that schedule pressure is associated with decreased effort, and that overall difficulty and maximum team size are associated with increased effort. But the sample is small and these effects are not significant at the 0.05 level. Experience was not significant.
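The linear relationship can be checked approximately against the data slide. The sketch below is a plain ordinary least squares fit using only the Python standard library; it is not the authors' analysis script, and the names SIZE, EFFORT and fit_line are hypothetical. It assumes the rounded values on the data slide are the regression inputs.

```python
# Final size (kilo lines) and total effort (person weeks) for the 9 projects,
# transcribed from the data slide in the same row order.
SIZE = [2.14, 0.85, 20.4, 7.05, 27.1, 96.6, 53.34, 6.01, 4.9]
EFFORT = [27.5, 11.3, 66, 28.5, 75.9, 368, 138, 92, 15.3]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (intercept, slope, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Centered sums of squares and cross-products.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a one-variable fit, R^2 equals the squared correlation coefficient.
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared


if __name__ == "__main__":
    intercept, slope, r_squared = fit_line(SIZE, EFFORT)
    print(f"effort = {intercept:.2f} + {slope:.2f} * size, R^2 = {r_squared:.3f}")
```

On the rounded slide values this gives a slope close to 3.35 and R² close to 0.914. The intercept comes out near 10.2 rather than 9.98, a small difference that may reflect rounding in the published table.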
12. Individual relationships
• 24 individual contributions to five projects
• Individual effort against individual proof size: R² = 0.93, p < 0.001
13. Threats
• Construct validity
– Limitations of lines of proof as a size measure (?)
– Subjective measures carefully defined
• External validity
– seL4 only, therefore limited, but this aids internal validity
– Generalization not known
• Internal validity
– Wherever possible, measures were carefully defined and reviewed by multiple persons
– Factors not measured?
14. Conclusions
• Proof engineering can bring the benefits of formal verification to more software engineering projects, but understanding cost effectiveness is an issue.
• We find that proof size and effort are strongly related for projects and individuals in L4.verified.
• Significant opportunity for the empirical community to help understand rework, tools and techniques, proof patterns, reuse and so on in proof engineering.