Fast and Precise Symbolic Analysis of Concurrency Bugs in Device Drivers
1. Fast and Precise Symbolic Analysis of
Concurrency Bugs in Device Drivers
Pantazis Deligiannis
Alastair Donaldson
Zvonimir Rakamarić
We acknowledge funding by a gift from Intel Corporation and from NSF CCF 1346756
2. The relative error-rate in device
driver code is up to 10 times higher
than the rest of the Linux kernel [1]
[1] Chou et al. An empirical study of operating systems errors. SOSP 2001
3. 85% of the system crashes in
Windows XP are caused by faulty
device drivers [2]
[2] Swift et al. Improving the reliability of commodity operating systems. SOSP 2003
4. 19% of the bugs in Linux device
drivers are due to concurrency issues,
such as data races [3]
[3] Ryzhyk et al. Dingo: Taming device drivers. EuroSys 2009
5. Data races can lead to subtle, hard-to-debug
errors in complex programs
such as device drivers
How to efficiently detect them?
9. Overview of our approach
Concurrent Driver -> transform -> Sequential Driver (sound over-approximation)
Symbolic Lockset Analyser (identify all potential races) -> exploit race info to speed up -> Symbolic Verifier (precise but bounded exploration) -> trace
We developed in this work …
11. For each driver …
Concurrent Driver: EP1, EP2, …, EPN
Consider every pair of entry points that can execute concurrently
EP Pairs: (EP1, EP2), (EP1, EP3), …, (EPM, EPN)
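The pair enumeration can be sketched in a few lines of Python. This is only an illustration: the entry point names and the `can_run_concurrently` predicate are hypothetical, and the slides do not say how the tool decides which entry points may overlap.

```python
from itertools import combinations


def entry_point_pairs(entry_points, can_run_concurrently):
    """Return every unordered pair of distinct entry points that may overlap."""
    return [
        (a, b)
        for a, b in combinations(entry_points, 2)
        if can_run_concurrently(a, b)
    ]


# Hypothetical entry points of an imaginary driver.
entry_points = ["ep_open", "ep_read", "ep_write", "ep_release"]

# Hypothetical rule, for this sketch only: open and release never overlap.
exclusive = {frozenset({"ep_open", "ep_release"})}


def can_run_concurrently(a, b):
    return frozenset({a, b}) not in exclusive


pairs = entry_point_pairs(entry_points, can_run_concurrently)
for pair in pairs:
    print(pair)
```

Each resulting pair is then analysed on its own, which is what makes the later per-pair steps independent of each other.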
13. For each pair …
Sequentialize: EP Pair (EP1, EP2) -> Sequential Pair
call EntryPoint1
call EntryPoint2
Call the 2 entry points in sequence (the order, i.e. schedule, does not matter due to lockset analysis)
17. For each pair …
Sequential Pair
call EntryPoint1
call EntryPoint2
- Soundly model the effects of any other concurrently running EPs by over-approximating read accesses to shared memory
- Instrument entry points with additional state to record locksets for symbolic lockset analysis
- Attempt to verify race-freedom by asserting that all accesses to each shared memory location are consistently protected by at least one common lock
18. Why pairwise analysis?
EP Pairs: (EP1, EP2), (EP1, EP3), …, (EPM, EPN)
Good reason: we can symbolically analyse each pair of entry points in parallel (but we leave this for future work)
22. Lockset analysis
Lightweight data race detection method:
- proposed in Eraser [4], a dynamic race detector
- key idea: track sets of locks that consistently protect each memory location S during program execution
- if the lockset of S ever becomes empty, report a potential race on S
- this is because an empty lockset suggests that S may be accessed simultaneously by two or more threads
[4] Savage et al. Eraser: A Dynamic Data Race Detector for Multithreaded Programs. TOCS 1997
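The refinement rule from Eraser [4] can be written compactly. The notation is an assumption borrowed from the Eraser paper rather than from these slides: C(S) is the candidate lockset of location S, initialised to the set of all locks, and locks_held(t) is the set of locks thread t holds at the access.

```latex
\text{on each access to } S \text{ by thread } t:\quad
C(S) \leftarrow C(S) \cap \mathit{locks\_held}(t), \qquad
\text{report a potential race on } S \text{ if } C(S) = \emptyset
```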
33. Program           CLST1       CLST2    LSA
    Initial           { }         { }      { M, N }
    T1: lock (M);     { M }       { }      { M, N }
    T1: lock (N);     { M, N }    { }      { M, N }
    T1: write (A);    { M, N }    { }      { M, N }
    T1: unlock (N);   { M }       { }      { M, N }
    T1: write (A);    { M }       { }      { M }
    T1: unlock (M);   { }         { }      { M }
    T2: lock (N);     { }         { N }    { M }
    T2: write (A);    { }         { N }    { }
    T2: unlock (N);   { }         { }      { }
compute lockset intersection at access point
warning: access to A may not be consistently protected (LSA becomes empty at the write (A) in T2)
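The example can be replayed with a small Python sketch. It is a minimal dynamic lockset tracker in the spirit of Eraser, not the symbolic analysis used in this work; the `replay` function and its event tuples are hypothetical names chosen for illustration.

```python
def replay(trace, all_locks, locations):
    """Replay (thread, op, arg) events and track locksets step by step."""
    held = {}                                                # thread -> locks currently held (CLS)
    candidate = {loc: set(all_locks) for loc in locations}   # location -> lockset (LS)
    warnings = []                                            # (thread, location) with empty lockset
    history = []                                             # snapshot after every event

    for thread, op, arg in trace:
        locks = held.setdefault(thread, set())
        if op == "lock":
            locks.add(arg)
        elif op == "unlock":
            locks.discard(arg)
        else:
            # "read" or "write": intersect the location's lockset with the
            # locks held by the accessing thread.
            candidate[arg] &= locks
            if not candidate[arg]:
                warnings.append((thread, arg))
        history.append((
            thread, op, arg,
            sorted(locks),
            {loc: sorted(ls) for loc, ls in candidate.items()},
        ))
    return history, warnings


trace = [
    ("T1", "lock", "M"), ("T1", "lock", "N"), ("T1", "write", "A"),
    ("T1", "unlock", "N"), ("T1", "write", "A"), ("T1", "unlock", "M"),
    ("T2", "lock", "N"), ("T2", "write", "A"), ("T2", "unlock", "N"),
]

history, warnings = replay(trace, {"M", "N"}, ["A"])
for thread, op, arg, cls, ls in history:
    print(f"{thread}: {op}({arg})  CLS={cls}  LS_A={ls['A']}")
print("warnings:", warnings)
```

The only warning is raised at the write to A in T2, where the lockset of A becomes empty, matching the last rows of the table.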
39. Advantages of lockset analysis:
- easy to implement, lightweight, has the potential to scale well (in contrast with happens-before based analyses)
Limitations of lockset analysis:
- imprecision (a violation of locking discipline is not always a race)
- code coverage in dynamic tools is limited by the execution paths that are explored
- to counter the latter, we apply lockset analysis in a symbolic context by analysing source code
40. We use Read and Write Sets during lockset analysis to keep track of shared memory location accesses (omitted from the previous example for simplicity)
Some other (driver-specific) optimisations: read our paper!
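One way read and write sets can sharpen a pairwise check is sketched below. This is an assumption about how such sets are typically used, not a description of the tool's internals: the `Summary` type and `potential_races` function are hypothetical. A location is flagged only if both entry points access it, at least one of them writes it, and the two locksets share no lock.

```python
from dataclasses import dataclass, field


@dataclass
class Summary:
    """Hypothetical per-entry-point result of a lockset pass."""
    reads: set = field(default_factory=set)      # locations read
    writes: set = field(default_factory=set)     # locations written
    lockset: dict = field(default_factory=dict)  # location -> locks held at every access


def potential_races(s1, s2):
    """Locations that the pair of entry points may race on."""
    racy = set()
    shared = (s1.reads | s1.writes) & (s2.reads | s2.writes)
    for loc in shared:
        if loc not in s1.writes and loc not in s2.writes:
            continue  # two reads never race with each other
        if not (s1.lockset[loc] & s2.lockset[loc]):
            racy.add(loc)  # no common lock protects this location
    return racy


ep1 = Summary(reads={"B"}, writes={"A", "C"},
              lockset={"A": {"M"}, "B": set(), "C": {"M"}})
ep2 = Summary(reads={"B", "C"}, writes={"A"},
              lockset={"A": {"N"}, "B": set(), "C": {"M", "N"}})

racy = potential_races(ep1, ep2)
print(racy)
```

Here only A is flagged: B is read by both entry points without any write, and C is always accessed under the common lock M.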
41. Toolchain
A. Translation Phase: Linux driver source code in C and the Linux Environmental Model are compiled by Clang / LLVM to LLVM-IR, which SMACK translates to Boogie IVL code; Chauffeur extracts entry point information
B. Symbolic Lockset Analysis Phase: WHOOP (Instrumentation, Sequentialization, Invariant Generation) uses the Boogie Verification Engine and Z3 to produce Data Race Reports and Boogie IVL code, instrumented with yields
C. Bug-Finding Phase: CORRAL reports Error Traces or No Errors (Under Given Bounds)
46. Symbolic Lockset Analysis Phase
Inputs: Boogie IVL code (from SMACK), entry point information
WHOOP: Instrumentation, Sequentialization, Invariant Generation, using the Boogie Verification Engine and Z3
Outputs: Data Race Reports; Boogie IVL code, instrumented with yields (passed to CORRAL, which reports No Errors (Under Given Bounds) or Error Traces)
47. Challenges and Solutions
Aliasing is a big challenge — we exploit LLVM
alias analyses and SMACK heap modelling
Scalability is a big issue — we generate
and infer invariants and perform
summarisation for modular verification
(each procedure verified in isolation)
read our paper!
50. Limitations
- imprecision, as it inherits the limitations of lockset analysis
- sound over-approximation of shared state and lock-free data structures can lead to false alarms
- we do not currently handle interrupt handlers in a special way; we just assume they execute concurrently at all times (which, however, is sound)
51. Improving precision
To improve Whoop’s precision, we use Corral [5], an industrial-strength bug-finder for device drivers from Microsoft Research that is currently used by Microsoft in the backend of the Static Driver Verifier (SDV)
[5] A. Lal, S. Qadeer, and S. K. Lahiri. A solver for reachability modulo theories. CAV 2012
52. Bounded Bug-Finding
Corral uses the Lal-Reps sequentialisation [6] to transform a concurrent program into a non-deterministic sequential program
The sequentialised program is bounded by a number of context switches per thread
[6] A. Lal and T. W. Reps. Reducing concurrent analysis under a context bound to sequential analysis. CAV 2008
53. Bounded Bug-Finding
Corral can then explore the sequential
program to detect data races, encoded as
race-checking assertions
If a race is found, a trace (mapped on the
original C code) is reported to the user
56. State-space explosion
Corral inserts a context switch before each shared memory access and at each synchronisation point (e.g. lock/unlock)
Problem: this can be too coarse and the state space can explode in large drivers!
58. Sound partial-order reduction
If our symbolic lockset analysis finds that a given shared memory access is not racy, then we do not instrument a context switch before this memory access
This tames the sequentialisation and can greatly speed up Corral, as we have found in our experiments
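The effect of the reduction on instrumentation can be sketched as follows. This is a simplified model under stated assumptions: statements are (kind, target) tuples, a "yield" marker stands for a context-switch point, racy accesses are identified by location name, and yields are placed before synchronisation statements. None of these names come from Whoop or Corral.

```python
def instrument(statements, racy_locations, use_race_info=True):
    """Insert ("yield", None) markers as context-switch points."""
    out = []
    for kind, target in statements:
        is_sync = kind in ("lock", "unlock")
        is_access = kind in ("read", "write")
        # Without race info, every shared access gets a yield.
        # With race info, only accesses to possibly racy locations do.
        needs_yield = is_sync or (
            is_access and (not use_race_info or target in racy_locations)
        )
        if needs_yield:
            out.append(("yield", None))
        out.append((kind, target))
    return out


statements = [
    ("lock", "M"), ("write", "A"), ("write", "B"),
    ("unlock", "M"), ("write", "C"),
]
racy_locations = {"C"}  # hypothetical result of the lockset analysis

full = instrument(statements, racy_locations, use_race_info=False)
reduced = instrument(statements, racy_locations, use_race_info=True)


def count_yields(program):
    return sum(1 for kind, _ in program if kind == "yield")


print(count_yields(full), count_yields(reduced))
```

In this toy input the accesses to A and B are proven race-free, so the reduced program has three context-switch points instead of five while the original statements stay in the same order.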
59. Evaluation
We applied our technique to 16 drivers from the Linux 4.0 kernel:
- block, char, ethernet, nfc, usb and watchdog (250 to 7300 LoC)
- by exploiting our symbolic lockset analysis we can significantly accelerate Corral, by 1.5-20x!
- in 1 driver there was a slowdown, because we did not manage to verify race freedom
- detected potential races (filed a bug report, but they have not been confirmed yet)
60. Benchmarks     Whoop        Corral               Corral            Whoop + Corral       Whoop + Corral
                   Time (sec)   # Context Switches   Time (sec)        # Context Switches   Time (sec)
                                (CSB = 9)                              (CSB = 9)
    generic_nvram  2.3          92                   197.5             29                   49.3
    pc9736x_gpio   4.1          691                  27337.6           167                  1358.9
    intel_scu_wd   2.4          314                  1571.7            196                  689.3
    fs3270         3.2          321                  T.O. (10 hours)   211                  8883.0
    intel_nfcsim   3.6          732                  1539.9            601                  278.0
    sx8            2.8          227                  21.4              217                  24.3
Read the paper for more results!
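The speedups implied by the table can be recomputed directly. The sketch below assumes speedup means the Corral time divided by the Whoop + Corral time; the slides do not state whether the Whoop + Corral column already includes Whoop's own running time, so the columns are used as given.

```python
# Times in seconds copied from the table: (Corral, Whoop + Corral).
# None marks the run that timed out after 10 hours.
times = {
    "generic_nvram": (197.5, 49.3),
    "pc9736x_gpio": (27337.6, 1358.9),
    "intel_scu_wd": (1571.7, 689.3),
    "fs3270": (None, 8883.0),
    "intel_nfcsim": (1539.9, 278.0),
    "sx8": (21.4, 24.3),
}


def speedup(corral, whoop_corral):
    """Corral time divided by Whoop + Corral time; None if Corral timed out."""
    if corral is None:
        return None
    return corral / whoop_corral


speedups = {name: speedup(*pair) for name, pair in times.items()}
for name, value in speedups.items():
    print(name, "timeout" if value is None else f"{value:.2f}x")
```

Under this reading pc9736x_gpio comes out at roughly 20x, and sx8 falls below 1x, which is the one slowdown in this table.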
61. New / Future work
Applied our symbolic lockset analysis to >1000 concurrency benchmarks from SV-COMP 2016:
- Early results: reducing Corral runtime by ~75%
- No bugs missed, which gives more confidence in our partial-order reduction!
We want to extend our Linux model so that we can apply our technique to more drivers!
We want to experiment with applying our technique to other Linux modules (e.g. filesystems)