Deductive verification of unmodified Linux kernel library functions

Denis Efremov NRU HSE
defremov@hse.ru
ISoLA 2018, Limassol, Cyprus, 6 November 2018

Motivation
Tool evaluation (Frama-C + AstraVer + Why3) on real code:
• Is our specification language expressive enough to describe this code?
• How far can we go without posing restrictions on the C syntax?
Tests for our tools:
• When we change our memory/arithmetic models, will a fully proved function still be easy to reprove?

Linux kernel
Library functions
Linux kernel
• Doesn’t rely on any other piece of software (no stdlib, only gcc builtins)
• Userspace/Kernelspace pointers
• Contains no floating point operations (well, almost)
• Wide use of gcc extensions
• Heavy-weight casting operations: container_of/offsetof, unions, pointers-to-integers, void * to struct *, pointers to functions, bitwise operations
• You can’t rewrite the code to be more “suitable” for verification in all cases
Contains implementations of many «standard» string and memory functions from stdlib
• Generic versions in C
• Architecture-optimized versions in assembler

What can we say about this function?
• This is a pure C function;
• It computes the average value of two int values.
int average(int a, int b) {
    return (a + b) / 2;
}
What is deductive verification?
What does it look like in practice?

What can we say about this function?
• This is a pure C function;
• It computes the average value of two int values;
• There is a signed integer overflow under certain conditions.
int average(int a, int b) {
    return (a + b) / 2;
}

Context of a function call
• Calling context for average: binary search function;
• The indexes l and h are non-negative, l is not greater than h;
• Integer overflow in m may lead to out-of-bounds access (base[m]).
int average(int a, int b) {
    return (a + b) / 2;
}
int *binsearch(int *base, int n, int key)
{
    int l = 0, h = n - 1;
    while (l <= h) {
        int m = average(l, h);
        int val = base[m];
        if (val < key) {
            l = m + 1;
        } else if (val > key) {
            h = m - 1;
        } else {
            return base + m;
        }
    }
    return NULL;
}
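To make the overflow condition concrete, here is a minimal sketch that checks in a wider type whether l + h would leave the int range. The helper name sum_fits_in_int is hypothetical and does not appear in the slides; the sketch assumes that long long is strictly wider than int, which holds on common platforms.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: true iff a + b is representable as int.
 * Assumes long long is strictly wider than int. */
static bool sum_fits_in_int(int a, int b)
{
    long long sum = (long long)a + (long long)b;
    return INT_MIN <= sum && sum <= INT_MAX;
}
```

In binsearch, any pair of indexes l and h for which this check fails makes average(l, h) overflow before base[m] is read.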

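Formal specification for a C function
Contract of a function
Describe call context (pre-conditions):
$\phi : \mathbb{Z} \times \mathbb{Z} \to \{\top, \bot\}$
$\phi(a, b) \equiv a \ge 0 \land b \ge 0 \land a \le b$
Describe functional requirements on results (post-conditions):
$\psi : \mathbb{Z} \times \mathbb{Z} \times \mathbb{Z} \to \{\top, \bot\}$
$\psi(a, b, \mathit{result}) \equiv \mathit{result} = \frac{a + b}{2}$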

Error model and code representation
• Define an error (an integer overflow):
$\mathit{in\_bounds} : \mathbb{Z} \to \{\top, \bot\}$
$\mathit{in\_bounds}(n) \equiv \mathit{MIN\_INT} \le n \le \mathit{MAX\_INT}$
• Formalize the program code: the function $M_{avr}$ returns the result $M_{avr}(a, b)$ according to its program code iff it terminates and terminates without an error; otherwise the special value $\omega$ is returned.
• Prove the total correctness:
$\forall a, b.\ \phi(a, b) \Rightarrow M_{avr}(a, b) \ne \omega \land \psi(a, b, M_{avr}(a, b))$
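As a worked instance of why this proof obligation fails for the original code, take the largest admissible inputs. The sketch below uses only the definitions of the pre-condition, in_bounds and M_avr given above:

```latex
\begin{align*}
a = b = \mathit{MAX\_INT}
  &\;\Rightarrow\; \phi(a, b)
  && (a \ge 0,\ b \ge 0,\ a \le b) \\
a + b = 2 \cdot \mathit{MAX\_INT} > \mathit{MAX\_INT}
  &\;\Rightarrow\; \lnot\,\mathit{in\_bounds}(a + b) \\
  &\;\Rightarrow\; M_{avr}(a, b) = \omega
\end{align*}
```

The pre-condition holds while the conclusion does not, so total correctness cannot be proved unless the pre-condition is strengthened or the code is changed.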

Code should comply with specification
/*@ requires 0 <= a && 0 <= b;
    requires a <= b;
    ensures \result == (a + b) / 2;
 */
int average(int a, int b) {
    return (a + b) / 2;
}

Verification Condition
/*@ requires 0 <= a && 0 <= b;
    requires a <= b;
 */
int average(int a, int b) {
    return (a + b) / 2;
}
predicate in_bounds (n:int) = min <= n /\ n <= max
constant a : t17
constant b : t17
axiom H : of_int 0 <= a /\ of_int 0 <= b /\ a <= b
axiom H1: in_bounds 2
constant o : t17
axiom H2 : to_int o = 2
goal WP_parameter_average:
  in_bounds (to_int a + to_int b)

Pre-condition update
Before:
/*@ requires 0 <= a && 0 <= b;
    requires a <= b;
 */
int average(int a, int b) {
    return (a + b) / 2;
}
After:
/*@ requires 0 <= a <= INT_MAX / 2;
    requires 0 <= b <= INT_MAX / 2;
    requires a <= b;
 */
int average(int a, int b) {
    return (a + b) / 2;
}

Code fix
 int average(int a, int b) {
-    return (a + b) / 2;
+    return a + (b - a) / 2;
 }

Related Work
• M. Torlakcik «Contracts in OpenBSD» (2010)
• Frama-C + Jessie deductive verification plugin
• 12 functions (7 fully-proved functions)
• Solvers: Simplify (1.5.4), Alt-Ergo (0.7.3), Z3 (2.0)
• N. Carvalho, C. da Silva Sousa, J. Sousa Pinto, A. Tomb «Formal Verification of kLIBC with the WP Frama-C Plug-in» (2014)
• Frama-C + WP deductive verification plugin
• 26 functions (14 fully-proved functions)
• Solvers: Alt-Ergo (0.95.1), CVC3 (2.4.1), Z3 (4.3.1)
• D. R. Cok, I. Blissard, J. Robbins «C Library annotations
in ACSL for Frama-C: experience report» (2017)

Known problems (1)
And how to handle them (ACSL extensions)
char *strnchr(const char *s, size_t count, int c)
{
    for (; count-- && *s != '\0'; ++s)
        if (*s == (char)c)
            return (char *)s;
    return NULL;
}

Known problems (1)
char *strnchr(const char *s, size_t count, int c)
{
    for (; count-- && *s != '\0'; ++s)
        if (*s == (char)c)
            return (char *)s;
    return NULL;
}
• The underflow of an unsigned loop iterator at the last iteration step due to the postfix decrement;
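The wrap-around can be observed in isolation. The following minimal sketch uses a hypothetical helper name and reproduces only the final, failing evaluation of the loop condition, when count is already 0:

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the value left in count after the condition `count--`
 * has been evaluated on count == 0 (the last, failing test). */
static size_t count_after_last_test(void)
{
    size_t count = 0;

    if (count--) {
        /* not reached: the condition sees the old value 0 */
    }
    return count; /* the decrement wrapped around to SIZE_MAX */
}
```

The wrap is well defined in C for unsigned types, but a verification tool whose arithmetic model treats every overflow as an error will presumably report it unless it is told that the wrap is intended.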

Known problems (1)
char *strnchr(const char *s, size_t count, int c)
{
    for (; count-- /*@%*/ && *s != '\0'; ++s)
        if (*s == (char)c)
            return (char *)s;
    return NULL;
}

Known problems (2)
char *strnchr(const char *s, size_t count, int c)
{
    for (; count-- /*@%*/ && *s != '\0'; ++s)
        if (*s == (char)c)
            return (char *)s;
    return NULL;
}
• The intended cast to a smaller integer type;
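What such a narrowing cast discards can be shown with a small sketch. The helper name low_byte is hypothetical; unsigned char is used instead of char so that the result is fully defined by the C standard, and 8-bit bytes are assumed.

```c
/* Keeps only the low 8 bits of c: conversion to an unsigned type
 * is performed modulo 2^8 on platforms with 8-bit bytes (assumed). */
static unsigned char low_byte(int c)
{
    return (unsigned char)c;
}
```

For example, low_byte('a' + 256) yields 'a': the upper bits of the int argument are silently dropped, which is the intended behavior in strnchr but looks like a loss of information to a verifier.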

Known problems (2)
char *strnchr(const char *s, size_t count, int c)
{
    for (; count-- /*@%*/ && *s != '\0'; ++s)
        if (*s == (char) /*@%*/ c)
            return (char *)s;
    return NULL;
}

Known problems (3)
char *strnchr(const char *s, size_t count, int c)
{
    for (; count-- /*@%*/ && *s != '\0'; ++s)
        if (*s == (char) /*@%*/ c)
            return (char *)s;
    return NULL;
}
• Pointer casts. Example: unsigned char * to char *.

VerKer Results (1)
• 26 library functions from Linux (Frama-C+AstraVer+Why3)
• 17 str* functions
• 6 mem* functions
• 3 others
• 25 fully-proved functions
• in memmove we were not able to discharge one verification
condition
• In 9 functions there was an intended integer overflow
• In 7 functions there was an intended integer cast to a
smaller type
• In 2 functions we «slightly» changed the code to prove them
• Solvers: Alt-Ergo (2.0), CVC4 (1.4), CVC4 (1.6)

VerKer Results (2)
• VC transformation strategy for solvers benchmarking
• Total number of verification conditions is 2781
• Number of lemmas: 69 (37 proved automatically)
• Integration with CI system (TravisCI)
• About 900 lines of C code; on average ~2.6 specification lines per line of C
• Open specifications and verification artifacts (proofs)
http://forge.ispras.ru/projects/verker
https://github.com/evdenis/verker

Memmove
• The memory model implemented in the AstraVer (Jessie) plugin allows arithmetic operations on pointers only when the pointers belong to the same allocated memory block;
• For memmove, this is not necessarily the case;
• If we state in the specification contract that src and dest may belong to different allocated memory blocks, then it is impossible to prove the VC stating that they should belong to the same memory block;
• Comparison of pointers to different memory blocks is undefined behavior in ANSI C.
void *memmove(void *dest,
const void *src,
size_t count) {
if (dest <= src)
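For orientation, here is a minimal sketch of how a generic memmove typically uses this comparison to choose the copy direction. It is an illustration written for this purpose under the hypothetical name sketch_memmove, not the kernel source quoted on the slide.

```c
#include <stddef.h>

static void *sketch_memmove(void *dest, const void *src, size_t count)
{
    char *d = dest;
    const char *s = src;

    if (d <= s) {            /* the pointer comparison in question */
        while (count--)
            *d++ = *s++;     /* copy forwards */
    } else {
        d += count;
        s += count;
        while (count--)
            *--d = *--s;     /* copy backwards */
    }
    return dest;
}
```

The comparison is only meaningful when both pointers refer to the same memory block, which is exactly the assumption the contract cannot make for every caller.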

The modified functions
An implicit cast in memset, strcmp
void *memset(void *s, int c, size_t count) {
    char *xs = s;
    while (count--)
        *xs++ = c;
    return s;
}
int strcmp(const char *cs, const char *ct) {
    unsigned char c1, c2;
    while (1) {
        c1 = *cs++;
        c2 = *ct++;
        if (c1 != c2)
            return c1 < c2 ? -1 : 1;
        if (!c1)
            break;
    }
    return 0;
}

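With the casts made explicit:
void *memset(void *s, int c, size_t count) {
    char *xs = s;
    while (count--)
        *xs++ = (char) c;
    return s;
}
int strcmp(const char *cs, const char *ct) {
    unsigned char c1, c2;
    while (1) {
        c1 = (unsigned char) *cs++;
        c2 = (unsigned char) *ct++;
        if (c1 != c2)
            return c1 < c2 ? -1 : 1;
        if (!c1)
            break;
    }
    return 0;
}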

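With the casts annotated as intended (/*@%*/):
void *memset(void *s, int c, size_t count) {
    char *xs = s;
    while (count--)
        *xs++ = (char) /*@%*/ c;
    return s;
}
int strcmp(const char *cs, const char *ct) {
    unsigned char c1, c2;
    while (1) {
        c1 = (unsigned char) /*@%*/ *cs++;
        c2 = (unsigned char) /*@%*/ *ct++;
        if (c1 != c2)
            return c1 < c2 ? -1 : 1;
        if (!c1)
            break;
    }
    return 0;
}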

What’s next (1)
• «Lemma Functions for Frama-C: C Programs as Proofs»
G. Volkov, M. Mandrykin, D. Efremov
https://arxiv.org/abs/1811.05879
• Auto-active verification technique for the Frama-C
framework
• Lemma-functions ACSL extension
• Interactive proving (Coq) vs auto-active verification
technique
• 31 lemma-functions
• Source code:
https://github.com/evdenis/verker/tree/lemma_functions

What’s next (2)
Arch-optimized implementations of functions
Function   Implementations on architectures
memmove    powerpc (2), s390, mips, x86_64, alpha, sparc (2)
memcpy     ia64 (2), powerpc (2), s390, mips (2), x86_64 (3), alpha, sparc
memset     ia64, powerpc (2), s390, mips, x86_64, alpha (2), sparc (2)
memchr     powerpc, alpha (2)
memcmp     powerpc (2), sparc
memscan    sparc (4)
strcat     alpha (2)
strchr     alpha (2)
strncmp    powerpc, sparc (2)
strcpy     alpha
strlen     ia64, powerpc, alpha (2), sparc
strrchr    alpha, arm64

How to verify all these implementations?
• Runtime verification
• Translate a contract for a generic function into assertions (Frama-C + E-ACSL)
• Extract a particular implementation
• Integrate a fuzzer (e.g., libFuzzer) with assertions
• Run the testing in QEMU (user mode)
• Catch the violations of postconditions
Translate specifications to assertions
Frama-C E-ACSL
#include <assert.h>
#include <stdbool.h>
/*@ requires 0 <= a && 0 <= b;
    requires a <= b;
    ensures \result == (a + b) / 2;
 */
int average(int a, int b)
{
    return (a + b) / 2;
}
/* Pre-condition translated to a runtime check. */
bool average_precondition(int a, int b) {
    if (0 <= a && 0 <= b)
        if (a <= b)
            return true;
    return false;
}
/* Post-condition; the expected value is computed in a wider type. */
bool average_postcondition(int a, int b, int ret_value)
{
    long result = ((long) a + (long) b) / 2;
    if (result == ret_value)
        return true;
    return false;
}
/* Wrapper that asserts the contract around every call. */
int _average(int a, int b) {
    assert(average_precondition(a, b));
    int _tmp = average(a, b);
    assert(average_postcondition(a, b, _tmp));
    return _tmp;
}
/* Fuzzing variant: inputs that violate the pre-condition are skipped. */
void _fuzz_average(int fuzz_a, int fuzz_b) {
    if (average_precondition(fuzz_a, fuzz_b)) {
        int _tmp = average(fuzz_a, fuzz_b);
        assert(average_postcondition(fuzz_a, fuzz_b, _tmp));
    }
}
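One way to connect such a wrapper to libFuzzer is through its standard entry point LLVMFuzzerTestOneInput. The sketch below is an assumption about how the pieces could fit together, not the harness used by the author: the check functions are re-declared locally under hypothetical names, and the function under test is the overflow-free variant so that the harness itself has defined behavior on all admissible inputs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Function under test (overflow-free variant, assumed for this sketch). */
static int average(int a, int b) { return a + (b - a) / 2; }

/* Hypothetical runtime checks derived from the contract. */
static int precondition(int a, int b) { return 0 <= a && a <= b; }
static int postcondition(int a, int b, int ret)
{
    return ((long long)a + (long long)b) / 2 == ret;
}

/* libFuzzer entry point: decode two ints from the raw input bytes. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    int a, b;

    if (size < 2 * sizeof(int))
        return 0;                      /* not enough bytes, skip */
    memcpy(&a, data, sizeof a);
    memcpy(&b, data + sizeof a, sizeof b);

    if (precondition(a, b)) {          /* only admissible inputs */
        int ret = average(a, b);
        assert(postcondition(a, b, ret));
    }
    return 0;
}
```

Swapping in an architecture-specific implementation for the function under test leaves the harness unchanged, since only the contract checks are shared.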

Logic errors. Can you see the contradiction?
The artificial example
/*@ requires 0 == 1;
    ensures \result == 0
         && \result == 1
         && \result == 2;
 */
int main(void) {
    int a = 1;
    return a / 0;
}
• The contradiction in the specification;
• Division-by-zero in the main function;
• Errors in specification may lead to missing errors in code.

The real example
logic Z Count{L}(int *a, Z m, Z n, int v);
axiom CountSectionEmpty:
∀ int *a, v, Z m, n;
n ≤ m ⇒ Count(a, m, n, v) == 0;
axiom CountSectionHit:
∀ int *a, v, Z n, m;
a[n] == v ⇒
Count(a, m, n + 1, v) == Count(a, m, n, v) + 1;


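int a = 5;
// by CountSectionEmpty, since n ≤ m in both cases:
assert Count(&a+1, 0, -1, 5) == 0
    && Count(&a+1, 0, 0, 5) == 0;
// by CountSectionHit with n = -1, since (&a+1)[-1] == a == 5:
assert Count(&a+1, 0, 0, 5) == Count(&a+1, 0, -1, 5) + 1;
assert 0 == 1;
• CountSectionHit has no guard m ≤ n, so it also applies to an empty section, where CountSectionEmpty already fixes the value to 0. Together the two axioms prove 0 == 1.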

How to check yourself
• Insert an incorrect assertion. It should not be proved;
• Perform a special transformation of a verification condition:
  • Try to prove it;
  • Negate it. Try to prove the negation;
  • If both can be proved, you’ve got a problem.
/*@ requires 0 == 1;
    ensures \result == 0
         && \result == 1
         && \result == 2;
 */
int main(void) {
    int a = 1;
    //@ assert 0 == 1;
    return a / 0;
}
