Fast and Precise Symbolic Analysis of Concurrency Bugs in Device Drivers

Pantazis Deligiannis
Alastair Donaldson
Zvonimir Rakamarić
We acknowledge funding by a gift from Intel Corporation and from NSF CCF 1346756

The relative error-rate in device
driver code is up to 10 times higher
than the rest of the Linux kernel [1]
[1] Chou et al. An empirical study of operating systems errors. SOSP 2001

85% of the system crashes in
Windows XP are caused by faulty
device drivers [2]
[2] Swift et al. Improving the reliability of commodity operating systems. SOSP 2003

19% of the bugs in Linux device
drivers are due to concurrency issues,
such as data races [3]
[3] Ryzhyk et al. Dingo: Taming device drivers. EuroSys 2009

Data races can lead to subtle, hard-to-debug errors in complex programs
such as device drivers
How can we detect them efficiently?

Overview of our approach
- Concurrent Driver: transformed into a Sequential Driver (sound over-approximation)
- Symbolic Lockset Analyser: identifies all potential races in the Sequential Driver
- Symbolic Verifier: precise but bounded exploration; exploits the race info to speed up, and outputs a trace
We developed in this work …

For each driver …
Concurrent Driver with entry points EP1, EP2, …, EPN
Consider every pair of entry points that can execute concurrently:
EP Pair (EP1, EP2), EP Pair (EP1, EP3), …, EP Pair (EPM, EPN)
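The pairing step can be sketched in a few lines of Python. This is only an illustration: the entry-point names and the `can_run_concurrently` predicate are hypothetical, and letting an entry point be paired with itself is an assumption (the predicate decides whether such a pair is kept).

```python
from itertools import combinations_with_replacement


def entry_point_pairs(entry_points, can_run_concurrently):
    """Return every unordered pair of entry points that may run concurrently."""
    return [
        (a, b)
        for a, b in combinations_with_replacement(entry_points, 2)
        if can_run_concurrently(a, b)
    ]


# Hypothetical driver with three entry points; we assume (for illustration
# only) that ep_open never overlaps with any other entry point.
entry_points = ["ep_open", "ep_read", "ep_write"]
pairs = entry_point_pairs(
    entry_points,
    lambda a, b: "ep_open" not in (a, b),
)
print(pairs)
```

Each resulting pair is then analysed on its own, which is what makes the analysis a candidate for parallelisation.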

For each pair …
EP Pair (EP1, EP2) is sequentialized into a Sequential Pair:
call EntryPoint1
call EntryPoint2
- Call the 2 entry points in sequence (the order, i.e. the schedule, does not matter due to lockset analysis)
- Soundly model the effects of any other concurrently running EPs by over-approximating read accesses to shared memory
- Instrument entry points with additional state to record locksets for symbolic lockset analysis
- Attempt to verify race-freedom by asserting that all accesses to each shared memory location are consistently protected by at least one common lock
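A minimal concrete sketch of the sequential-pair check follows. It is not the symbolic encoding used by the tool: it runs two ordinary Python functions one after the other, records locksets in a hypothetical `LocksetRecorder`, and then checks for a common lock. The class, the function names, and the rule that two reads never conflict are assumptions made for illustration.

```python
class LocksetRecorder:
    """Hypothetical instrumentation state attached to one entry point."""

    def __init__(self):
        self.held = set()      # locks currently held
        self.locksets = {}     # location -> locks held at every access so far
        self.writes = set()    # locations written at least once

    def lock(self, name):
        self.held.add(name)

    def unlock(self, name):
        self.held.discard(name)

    def access(self, loc, is_write=False):
        previous = self.locksets.get(loc)
        # First access copies the held locks; later accesses intersect.
        self.locksets[loc] = set(self.held) if previous is None else previous & self.held
        if is_write:
            self.writes.add(loc)


def check_pair(ep1, ep2):
    """Call both entry points in sequence and return possibly racy locations."""
    r1, r2 = LocksetRecorder(), LocksetRecorder()
    ep1(r1)  # call EntryPoint1
    ep2(r2)  # call EntryPoint2
    racy = []
    for loc in sorted(r1.locksets.keys() & r2.locksets.keys()):
        if loc not in r1.writes and loc not in r2.writes:
            continue  # assumption: two reads do not conflict
        if not (r1.locksets[loc] & r2.locksets[loc]):
            racy.append(loc)  # no common lock protects loc
    return racy


def ep_a(r):
    r.lock("M"); r.access("A", is_write=True); r.unlock("M")
    r.access("B")                  # unprotected read


def ep_b(r):
    r.lock("M"); r.access("A", is_write=True); r.unlock("M")
    r.access("B", is_write=True)   # unprotected write


print(check_pair(ep_a, ep_b))
```

Swapping the two calls gives the same answer, which mirrors the point that the order of the two entry points does not matter for the lockset check.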

Why pairwise analysis?
Good reason: we can symbolically analyse each pair of entry points
(EP1 EP2, EP1 EP3, …, EPM EPN) in parallel (but we leave this for future work)

Lockset analysis
Lightweight data race detection method:
- proposed in Eraser [4], a dynamic race detector
- key idea: track the set of locks that consistently protects each memory location S during program execution
- if the lockset of S ever becomes empty, report a potential race on S
- this is because an empty lockset suggests that S may be accessed simultaneously by two or more threads
[4] Savage et al. Eraser: A Dynamic Data Race Detector for Multithreaded Programs. TOCS 1997

Program          CLST1      CLST2    LSA        Note
Initial          { }        { }      { M, N }
T1
  lock (M);      { M }      { }      { M, N }
  lock (N);      { M, N }   { }      { M, N }
  write (A);     { M, N }   { }      { M, N }   compute lockset intersection at access point
  unlock (N);    { M }      { }      { M, N }
  write (A);     { M }      { }      { M }      compute lockset intersection at access point
  unlock (M);    { }        { }      { M }
T2
  lock (N);      { }        { N }    { M }
  write (A);     { }        { N }    { }        compute lockset intersection at access point
  unlock (N);    { }        { }      { }

warning: access to A may not be consistently protected
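The walkthrough above can be replayed with a small Python sketch. It reads CLST1/CLST2 as the locks currently held by T1/T2 and LSA as the candidate lockset of location A, which is an interpretation of the column headers; the trace encoding and the `replay` function are hypothetical names chosen for this example.

```python
def replay(trace, all_locks):
    """Replay (thread, op, arg) events and refine per-location locksets."""
    held = {}       # thread -> locks currently held (CLST1, CLST2)
    lockset = {}    # location -> candidate protecting locks (LSA)
    warnings = []   # (thread, location) pairs where the lockset became empty
    for thread, op, arg in trace:
        current = held.setdefault(thread, set())
        if op == "lock":
            current.add(arg)
        elif op == "unlock":
            current.discard(arg)
        elif op == "write":
            # Compute the lockset intersection at the access point.
            candidates = lockset.get(arg, set(all_locks)) & current
            lockset[arg] = candidates
            if not candidates:
                warnings.append((thread, arg))
    return lockset, warnings


trace = [
    ("T1", "lock", "M"), ("T1", "lock", "N"), ("T1", "write", "A"),
    ("T1", "unlock", "N"), ("T1", "write", "A"), ("T1", "unlock", "M"),
    ("T2", "lock", "N"), ("T2", "write", "A"), ("T2", "unlock", "N"),
]
lockset, warnings = replay(trace, {"M", "N"})
for thread, loc in warnings:
    print(f"warning: access to {loc} by {thread} may not be consistently protected")
```

On this trace the lockset of A shrinks from { M, N } to { M } at the second write in T1 and becomes empty at the write in T2, so a single warning is reported for T2.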

Advantages of lockset analysis:

- easy to implement, lightweight, has the
potential to scale well (in contrast with
happens-before based analyses)

Limitations of lockset analysis:

- imprecision (a violation of locking discipline is not always a race)
- code coverage in dynamic tools is limited by the execution paths that are explored
- to counter the latter, we apply lockset analysis in a symbolic context by analysing source code

We use Read and Write Sets during lockset analysis to keep track of
shared memory location accesses (omitted from the previous example for simplicity)
For some other (driver-specific) optimisations, read our paper!

Toolchain
A. Translation Phase: Linux driver source code in C, Linux Environmental Model, Chauffeur (entry point information), Clang / LLVM (llvm-IR), SMACK (Boogie IVL code)
B. Symbolic Lockset Analysis Phase: WHOOP (Instrumentation, Sequentialization, Invariant Generation), Boogie Verification Engine, Z3; outputs: Data Race Reports, and Boogie IVL code instrumented with yields
C. Bug-Finding Phase: CORRAL; outputs: Error Traces, or No Errors (Under Given Bounds)

Translation Phase
(toolchain figure, phase A: Linux driver source code in C, Linux Environmental Model, Chauffeur, Clang / LLVM, llvm-IR, SMACK, Boogie IVL code, entry point information)
Debugging information is preserved so that we can map bugs back to C source code

Symbolic Lockset Analysis Phase
(toolchain figure, phase B: Boogie IVL code, entry point information, WHOOP, Instrumentation, Sequentialization, Boogie Verification Engine, Z3, Data Race Reports, Boogie IVL code instrumented with yields)

Challenges and Solutions
- Aliasing is a big challenge: we exploit LLVM alias analyses and SMACK heap modelling
- Scalability is a big issue: we generate and infer invariants and perform summarisation for modular verification (each procedure verified in isolation)
read our paper!

Limitations
- imprecision, as our approach inherits the limitations of lockset analysis
- sound over-approximation of shared state and lock-free data structures can lead to false alarms
- we do not currently handle interrupt handlers in a special way; we just assume they execute concurrently at all times (which, however, is sound)

Improving precision
To improve Whoop's precision, we use Corral [5], an industrial-strength
bug-finder for device drivers from Microsoft Research that is currently
used by Microsoft in the backend of the Static Driver Verifier (SDV)
[5] A. Lal, S. Qadeer, and S. K. Lahiri. A solver for reachability modulo theories. CAV 2012

Bounded Bug-Finding
Corral uses the Lal-Reps sequentialisation [6] to transform a concurrent
program into a non-deterministic sequential program
The sequentialised program is bounded by a number of context switches per thread
Corral can then explore the sequential program to detect data races,
encoded as race-checking assertions
If a race is found, a trace (mapped to the original C code) is reported to the user
[6] A. Lal and T. W. Reps. Reducing concurrent analysis under a context bound to sequential analysis. CAV 2008

Bug-Finding Phase
(toolchain figure, phase C: Boogie IVL code instrumented with yields, CORRAL, Error Traces, No Errors)

State-space explosion
Corral inserts a context switch before each shared memory access and at
each synchronisation point (e.g. lock/unlock)
Problem: this can be too coarse, and the state space can explode in large drivers!

Sound partial-order reduction
If our symbolic lockset analysis finds that a given shared memory access
is not racy, then we do not instrument a context switch before this
memory access
This tames the sequentialisation and can greatly speed up Corral, as we
have found in our experiments
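The effect of the reduction on the instrumentation can be sketched as follows. This is an assumption-laden illustration, not the tool's implementation: operations are modelled as hypothetical `(op, arg)` tuples, a context switch is a `("yield", None)` marker, and `racy_locations` stands for whatever the lockset analysis could not prove race-free.

```python
def instrument_yields(ops, racy_locations, prune=True):
    """Insert a yield before synchronisation points and shared accesses.

    With prune=True, accesses to locations proven race-free get no yield.
    """
    out = []
    for op, arg in ops:
        is_sync = op in ("lock", "unlock")
        is_access = op in ("read", "write")
        if is_sync or (is_access and (not prune or arg in racy_locations)):
            out.append(("yield", None))
        out.append((op, arg))
    return out


def count_yields(ops):
    return sum(1 for op, _ in ops if op == "yield")


# Hypothetical entry point: A is always accessed under M, B is not.
ops = [("lock", "M"), ("write", "A"), ("unlock", "M"), ("write", "B")]
baseline = instrument_yields(ops, racy_locations={"B"}, prune=False)
reduced = instrument_yields(ops, racy_locations={"B"}, prune=True)
print(count_yields(baseline), count_yields(reduced))
```

Fewer yield points means fewer interleavings for the bounded verifier to explore, which is the intuition behind the speedups reported in the evaluation.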

Evaluation
We applied our technique to 16 drivers from the Linux 4.0 kernel:
- block, char, ethernet, nfc, usb and watchdog drivers (250 to 7300 LoC)
- by exploiting our symbolic lockset analysis, we can significantly accelerate Corral, by 1.5-20x
- in 1 driver there was a slowdown, because we did not manage to verify race freedom
- we detected potential races (we filed a bug report, but the races have not been confirmed yet)

                Whoop        Corral (CSB = 9)                  Whoop + Corral (CSB = 9)
Benchmarks      Time (sec)   # Context Switches  Time (sec)    # Context Switches  Time (sec)
generic_nvram   2.3          92                  197.5         29                  49.3
pc9736x_gpio    4.1          691                 27337.6       167                 1358.9
intel_scu_wd    2.4          314                 1571.7        196                 689.3
fs3270          3.2          321                 T.O. (10 hours)  211              8883.0
intel_nfcsim    3.6          732                 1539.9        601                 278.0
sx8             2.8          227                 21.4          217                 24.3
Read the paper for more results!
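As a quick check of the numbers above, the speedup for each benchmark can be computed as the Corral time divided by the Whoop + Corral time. Treating the ratio of these two columns as "the speedup" is an assumption (it leaves out Whoop's own analysis time from the first column), and fs3270 is excluded because Corral alone timed out.

```python
# (Corral time, Whoop + Corral time) in seconds, copied from the table.
times = {
    "generic_nvram": (197.5, 49.3),
    "pc9736x_gpio": (27337.6, 1358.9),
    "intel_scu_wd": (1571.7, 689.3),
    "intel_nfcsim": (1539.9, 278.0),
    "sx8": (21.4, 24.3),
}

speedup = {name: corral / combined for name, (corral, combined) in times.items()}

for name, factor in speedup.items():
    print(f"{name}: {factor:.2f}x")
```

The ratios come out at roughly 4.0x, 20.1x, 2.3x, 5.5x and 0.88x, which lines up with the reported 1.5-20x range and with sx8 being a slowdown. For fs3270 only a lower bound is available: 10 hours is 36000 seconds, so the speedup is at least about 4x.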

New / Future work
We applied our symbolic lockset analysis to >1000 concurrency benchmarks
from SVCOMP 2016:
- Early results: Corral runtime reduced by ~75%
- No bugs missed, which gives more confidence in our partial-order reduction!
We want to extend our Linux model so that we can apply our technique to
more drivers!
We want to experiment with applying our technique to other Linux modules
(e.g. filesystems)

Thanks!
http://www.doc.ic.ac.uk/~pd1113/
p.deligiannis@imperial.ac.uk
https://github.com/pdeligia/whoop
https://github.com/pdeligia/lockpwn
