167 - Productivity for proof engineering

NICTA Copyright 2014
Productivity for Proof Engineering
M. Staples, R. Jeffery, J. Andronick, T. Murray, G. Klein, R. Kolanski

In the beginning
• Empirical software engineering
• Formal methods/verification
• Operating systems
• seL4 and L4.verified projects at UNSW/NICTA
• Goal: “An implementation correctness proof for
seL4 with the kernel running on a mainstream
embedded processor within 10% of the
performance of L4.” (Klein 2009)

History
• seL4 concluded successfully by end 2007
• 10,000 lines of C code
• 2.2 person years of effort
• L4.verified > 20 person years
• For cost-effective proof engineering, a key
consideration is proof productivity.

This study - Specs
• Retrospective study of 9 projects from L4.verified.
• All used Isabelle theorem prover.
• Three formal specifications of seL4:
– Exec: models an executable representation of
seL4’s design
– Abstract: complete functional specification
– CapDL: capabilities (access rights) between
components

This study - Proofs
• Six proofs in total.
• Three refinement proofs:
– Code-to-Exec
– Exec-to-Abstract
– Abstract-to-CapDL
• Two security proofs:
– Info. flow
– Integrity
• One CapDL policy proof.

Measures
• Effort – in person weeks
• Output – Lines of proof
• Other variables – maximum team size, schedule
pressure, overall difficulty, and years of experience
with Isabelle, with formal methods or theorem
proving, and with the domain (operating systems).

The data
Project                        Final size         Total effort     Schedule   Overall      Max team
                               (kilo lines of     (person weeks)   pressure   difficulty   (headcount)
                               proof)
CapDL Spec                     2.14               27.5             AV         LO           5
CapDL-policy proof             0.85               11.3             LO         AV           1
Abstract-to-CapDL Refinement   20.4               66               AV         AV           5
Integrity                      7.05               28.5             V.HI       HI           4
Info.Flow                      27.1               75.9             V.HI       V.HI         8
Exec-to-Abstract Refinement    96.6               368              HI         V.HI         6
Code-to-Exec Refinement        53.34              138              V.HI       HI           6
Exec Spec Haskell              6.01               92               AV         HI           1
Abstract Spec                  4.9                15.3             AV         AV           3
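
As a rough illustration, the size and effort columns above can be turned into a per-project productivity figure. The sketch below assumes productivity means output divided by effort (lines per person week), which the slides do not define explicitly; the names `PROJECTS` and `lines_per_person_week` are hypothetical.

```python
# Size (kilo lines) and effort (person weeks), copied from the data table.
PROJECTS = {
    "CapDL Spec": (2.14, 27.5),
    "CapDL-policy proof": (0.85, 11.3),
    "Abstract-to-CapDL Refinement": (20.4, 66),
    "Integrity": (7.05, 28.5),
    "Info.Flow": (27.1, 75.9),
    "Exec-to-Abstract Refinement": (96.6, 368),
    "Code-to-Exec Refinement": (53.34, 138),
    "Exec Spec Haskell": (6.01, 92),
    "Abstract Spec": (4.9, 15.3),
}


def lines_per_person_week(size_kilo_lines, effort_weeks):
    """Assumed productivity measure: output (lines) over effort (person weeks)."""
    return size_kilo_lines * 1000 / effort_weeks


productivity = {
    name: lines_per_person_week(size, effort)
    for name, (size, effort) in PROJECTS.items()
}

# List projects from most to least productive under this assumed measure.
for name, value in sorted(productivity.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<30} {value:7.1f} lines/person-week")
```

A ratio like this ignores the difficulty and schedule-pressure columns, so it is only a first-pass comparison between projects.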

Effort – Size Plot for projects
[Figure: effort against size for the nine projects; image not preserved]

Project relationships
• Total Project Effort = 9.98 + 3.35 × Final Size
(effort in person weeks, size in kilo lines of proof)
R² = 0.914, p < 0.001
• Possible outliers: the large abstract refinement and
the executable spec.
• Weak evidence that schedule pressure is
associated with decreased effort, and that overall
difficulty and maximum team size are associated
with increased effort. However, the sample is small
and none of these effects is significant at the 0.05
level. Experience is not significant.
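
The linear relationship can be checked against the table with an ordinary least-squares fit. The sketch below is not the authors' analysis: it uses only the rounded size and effort values shown on the data slide, and the helper name `fit_line` is hypothetical.

```python
from statistics import mean

# Final size (kilo lines of proof) and total effort (person weeks),
# in the row order of the data table.
SIZE = [2.14, 0.85, 20.4, 7.05, 27.1, 96.6, 53.34, 6.01, 4.9]
EFFORT = [27.5, 11.3, 66, 28.5, 75.9, 368, 138, 92, 15.3]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x, plus R squared."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r_squared = sxy ** 2 / (sxx * syy)  # squared correlation coefficient
    return intercept, slope, r_squared


intercept, slope, r_squared = fit_line(SIZE, EFFORT)
print(f"Effort = {intercept:.2f} + {slope:.2f} * Size, R^2 = {r_squared:.3f}")
```

The fitted coefficients should land close to the reported ones; small differences are to be expected because the slide values are rounded.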

Effort – Size plot for individuals
[Figure: effort against size for individual contributions; image not preserved]

Individual relationships
• 24 individual contributions to five projects
• R² = 0.93, p < 0.001

Threats
• Construct validity
– Limitations of lines of proof as a size measure (?)
– Subjective measures were carefully defined
• External validity
– seL4 only, so limited, but this aids internal validity
– Generalization not known
• Internal validity
– Wherever possible, measures were carefully defined
and reviewed by multiple persons
– Factors not measured?

Conclusions
• Proof engineering can bring the benefits of
formal verification to more software engineering
projects, but understanding cost effectiveness is
an issue.
• We find proof size and effort are strongly related
for projects and individuals in L4.verified.
• Significant opportunity for the empirical
community to help understand rework, tools and
techniques, proof patterns, reuse and so on in
proof engineering.
