Aizatulin

2010 CRC PhD Student Conference

Verifying Implementations of Security Protocols in C
Mihhail Aizatulin
m.aizatulin@open.ac.uk

Supervisors Dr Andrew Gordon, adg@microsoft.com,
Dr Jan Jürjens, jan.jurjens@cs.tu-dortmund.de,
Prof Bashar Nuseibeh, B.Nuseibeh@open.ac.uk
Department Computing
Status Full-time
Probation viva Passed
Starting date November 2008
Our goal is the verification of cryptographic protocol implementations (such as
OpenSSL or Kerberos), motivated by the desire to minimise the gap between
verified and executable code. Very little has been done in this area. There are
numerous tools that find low-level bugs in code (such as buffer overflows and
division by zero), and there are verifiers for cryptographic protocols that work
on fairly abstract descriptions, but so far very few attempts have been made to
verify cryptographic security directly on the code, especially for low-level
languages like C.
We attempt to verify the protocol code by extracting an abstract model that
can be used in high-level cryptographic verification tools such as ProVerif or
CryptoVerif. This is the first such approach that we are aware of. We are
currently investigating the feasibility of the approach by extracting the model
from running code, using so-called concolic (concrete + symbolic) execution. We
run the protocol implementation normally, but at the same time we record all the
operations performed on binary values and then replay those operations on
symbolic values. The resulting symbolic expressions reveal the structure of the
messages sent to the network and the conditions that are checked for incoming
messages.
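The idea of pairing concrete and symbolic values can be illustrated with a small Python sketch. It is not the tool described here: the names `Traced`, `fresh`, `const`, `concat` and `hmac_sha1` are hypothetical, and for brevity the symbolic expression is built alongside the concrete computation rather than recorded first and replayed afterwards.

```python
from dataclasses import dataclass
import hashlib
import hmac


@dataclass(frozen=True)
class Traced:
    """A concrete bitstring paired with the symbolic expression that produced it."""
    value: bytes  # what the program actually computed
    expr: str     # how that value was built, in terms of symbols


def fresh(name: str, value: bytes) -> Traced:
    # An input (payload, key, network message) becomes an opaque symbol.
    return Traced(value, name)


def const(value: bytes) -> Traced:
    # Constants are rendered as hex strings.
    return Traced(value, value.hex())


def concat(*parts: Traced) -> Traced:
    # Concatenate the concrete values and, in parallel, the expressions.
    return Traced(b"".join(p.value for p in parts),
                  "|".join(p.expr for p in parts))


def hmac_sha1(msg: Traced, key: Traced) -> Traced:
    # The concrete side really computes the MAC; the symbolic side only
    # remembers that an HMAC was applied to these two expressions.
    digest = hmac.new(key.value, msg.value, hashlib.sha1).digest()
    return Traced(digest, f"HMAC(sha1, {msg.expr}, {key.expr})")


payload = fresh("payload1", b"hello")   # example values, chosen arbitrarily
key = fresh("key2", b"secret")
tag = hmac_sha1(concat(const(b"|Request"), payload), key)
message = concat(payload, tag)
print(message.expr)
# payload1|HMAC(sha1, 7c52657175657374|payload1, key2)
```

The concrete value `message.value` is what would go on the wire, while `message.expr` shows the message structure without depending on the particular payload or key used in the run.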
We are able to produce symbolic execution traces for the handshake
implemented in the OpenSSL library. To give an example of what the extracted
traces look like, consider a simple request-response protocol, protected by
hashing with a shared key:
A → B : m | hash('request' | m, kAB),
B → A : m′ | hash('response' | m | m′, kAB).
We implemented the protocol in about 600 lines of C code that calls the OpenSSL
cryptographic library. Our concolic execution tool produces a trace of 8 lines

write(i39)
payload1 = payload()
key2 = key()
write(i14|7c|payload1|HMAC(sha1, i7|7c52657175657374|payload1, key2))
msg3 = read()
var4 = msg3{5,23}
branchF((memcmp(msg3{28,20},
HMAC(sha1, i8|7c526573706f6e7365|i14|7c|payload1|var4, key2)) != i0))
accept(var4)

Figure 1: An excerpt from the symbolic client trace. X{start, len} denotes
the substring of X starting at start of length len. iN is an integer with value N
(width information is omitted), and branchT and branchF are the true or false
branches taken by the code.

for the client side, shown in Figure 1: we see the client sending the request
and checking the condition on the server response before accepting it.
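To make the trace easier to read, the following Python sketch reproduces the client-side message construction and check that the trace describes. It is an illustration only, not the 600-line C implementation. The hex constants 7c52657175657374 and 7c526573706f6e7365 decode to the ASCII strings "|Request" and "|Response". Since the trace omits width information, the 4-byte big-endian encoding of integers is an assumption, and all function names are hypothetical.

```python
import hashlib
import hmac
import struct

MAC_LEN = 20  # size of an HMAC-SHA1 tag in bytes


def u32(n: int) -> bytes:
    # ASSUMPTION: integers are 4 bytes, big-endian (the trace omits widths).
    return struct.pack(">I", n)


def mac(data: bytes, key: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha1).digest()


def field(body: bytes) -> bytes:
    # Length, then the separator byte 0x7c ('|'), then the body.
    return u32(len(body)) + b"|" + body


def make_request(payload: bytes, key: bytes) -> bytes:
    # write(i14|7c|payload1|HMAC(sha1, i7|7c52657175657374|payload1, key2))
    return field(payload) + mac(field(b"Request") + payload, key)


def make_response(request_payload: bytes, response: bytes, key: bytes) -> bytes:
    # Server side, included only so that the client check can be exercised.
    tag = mac(field(b"Response") + field(request_payload) + response, key)
    return field(response) + tag


def accept_response(msg: bytes, request_payload: bytes, key: bytes) -> bytes:
    if len(msg) < 5 or msg[4:5] != b"|":
        raise ValueError("malformed response")
    (resp_len,) = struct.unpack(">I", msg[:4])
    if len(msg) != 5 + resp_len + MAC_LEN:
        raise ValueError("malformed response")
    response = msg[5:5 + resp_len]   # var4 = msg3{5,23} in the trace
    received = msg[5 + resp_len:]    # msg3{28,20} in the trace
    expected = mac(field(b"Response") + field(request_payload) + response, key)
    # The trace shows memcmp; compare_digest is a constant-time substitute.
    if not hmac.compare_digest(received, expected):
        raise ValueError("MAC check failed")
    return response                  # accept(var4)


key = b"k" * 16            # example key, chosen arbitrarily
payload = b"A" * 14        # 14-byte request payload, as in i14
request = make_request(payload, key)
reply = make_response(payload, b"B" * 23, key)
print(len(request), len(reply))
# 39 48
```

Under the assumed 4-byte integers, a 14-byte payload gives a 39-byte request, and a 23-byte response payload sits at offset 5 with its tag at offset 28, which agrees with the constants i39, {5,23} and {28,20} in the trace.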
We are currently working on symbolic handling of buffer lengths and sound
handling of loops, as well as on making the extracted models compatible with
those understood by ProVerif and CryptoVerif, in particular by simplifying
away any remaining arithmetic expressions from the symbolic trace.
One obvious drawback of concolic execution is that it only follows the single
path that was actually taken by the code. This is enough to produce an accurate
model when there is only one main path. However, libraries like OpenSSL contain
multiple nontrivial paths. Thus, to achieve verification of those libraries, we
plan to move the analysis towards being fully static in the future.
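The single-path limitation can be seen in a toy Python sketch. The function `run_check`, the one-byte message tag and the `reject()` trace entry are all hypothetical and only mimic the trace notation of Figure 1.

```python
from typing import List


def run_check(msg: bytes) -> List[str]:
    """Run a toy message check and record only the branch that is taken."""
    trace = ["msg = read()"]
    if msg[:1] == b"\x01":
        # Only this branch appears in the trace when the tag byte is 01.
        trace.append("branchT(msg{0,1} == 01)")
        trace.append("accept(msg{1,%d})" % (len(msg) - 1))
    else:
        # The other branch is recorded only if a run happens to take it.
        trace.append("branchF(msg{0,1} == 01)")
        trace.append("reject()")
    return trace


accepted = run_check(b"\x01data")
rejected = run_check(b"\x02data")
print(accepted)
print(rejected)
```

Each run yields a model of one path only: nothing in `accepted` describes the rejecting behaviour, so covering both paths needs either more runs with suitable inputs or a static analysis of the code.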

Related Work One of the earliest attempts at security verification directly
on code is probably CSur [Goubault-Larrecq and Parrennes, 2005], which deals
directly with C protocol implementations. It translates programs into a set
of Horn clauses that are fed directly into a general-purpose theorem prover.
Unfortunately, it never went beyond some very simple implementations and has
not been developed since.
The work of [Jürjens, 2006] describes an approach to translating Java programs
in a manner similar to the above. In our work we try to separate reasoning about
pointers and integers from reasoning about cryptography, in the hope of
achieving greater scalability.
Some work has been done on verification of functional language
implementations, either by translating the programs directly into π-calculus
[Bhargavan et al., 2006; Bhargavan et al., 2008] or by designing a type system
that enforces security [Bengtson et al., 2008]. Unfortunately, it is not trivial
to adapt such approaches to C-like languages.
ASPIER [Chaki and Datta, 2008] uses model checking for verification and
has been applied to OpenSSL. However, it does not truly start from C code: any
code explicitly dealing with pointers needs to be replaced by abstract summaries
that presumably have to be written manually.
Concolic execution is widely used to drive automatic test generation, as in
[Cadar et al., 2008] or [Godefroid et al., 2008]. One difference in our concolic
execution is that we need to assign symbols to whole bitstrings, whereas the
testing frameworks usually assign symbols to single bytes. We believe that our
work could be adapted for testing of cryptographic software. Usual testing
approaches try to create an input that satisfies a set of equations resulting
from checks in the code. In the presence of cryptography such equations will
(hopefully) be impossible to solve, so a more abstract model like ours might be
useful.
A separate line of work deals with reconstruction of protocol message formats
from implementation binaries [Caballero et al., 2007; Lin et al., 2008; Wondracek
et al., 2008; Cui et al., 2008; Wang et al., 2009]. The goal is typically to
reconstruct field boundaries of a single message by observing how the binary
processes the message. Our premises and goals are different: we have the
advantage of starting from the source code, but in exchange we aim to reconstruct
the whole protocol flow instead of just a single message. Our reconstruction
needs to be sound to enable verification — all possible protocol flows should be
accounted for.

References
[Bengtson et al., 2008] Jesper Bengtson, Karthikeyan Bhargavan, Cédric
Fournet, Andrew D. Gordon, and Sergio Maffeis. Refinement types for secure
implementations. In CSF ’08: Proceedings of the 2008 21st IEEE Computer
Security Foundations Symposium, pages 17–32, Washington, DC, USA, 2008.
IEEE Computer Society.
[Bhargavan et al., 2006] Karthikeyan Bhargavan, Cédric Fournet, Andrew D.
Gordon, and Stephen Tse. Verified interoperable implementations of security
protocols. In CSFW ’06: Proceedings of the 19th IEEE workshop on Computer
Security Foundations, pages 139–152, Washington, DC, USA, 2006. IEEE
Computer Society.
[Bhargavan et al., 2008] Karthikeyan Bhargavan, Cédric Fournet, Ricardo Corin,
and Eugen Zalinescu. Cryptographically verified implementations for TLS.
In CCS ’08: Proceedings of the 15th ACM conference on Computer and
communications security, pages 459–468, New York, NY, USA, 2008. ACM.
[Caballero et al., 2007] Juan Caballero, Heng Yin, Zhenkai Liang, and Dawn
Song. Polyglot: automatic extraction of protocol message format using
dynamic binary analysis. In CCS ’07: Proceedings of the 14th ACM conference
on Computer and communications security, pages 317–329, New York, NY,
USA, 2007. ACM.
[Cadar et al., 2008] Cristian Cadar, Daniel Dunbar, and Dawson Engler. KLEE:
Unassisted and automatic generation of high-coverage tests for complex
systems programs. In USENIX Symposium on Operating Systems Design and
Implementation (OSDI 2008), San Diego, CA, December 2008.
[Chaki and Datta, 2008] Sagar Chaki and Anupam Datta. Aspier: An automated
framework for verifying security protocol implementations. Technical
Report 08-012, Carnegie Mellon University, October 2008.
[Cui et al., 2008] Weidong Cui, Marcus Peinado, Karl Chen, Helen J. Wang, and
Luis Irun-Briz. Tupni: automatic reverse engineering of input formats. In CCS
’08: Proceedings of the 15th ACM conference on Computer and communications
security, pages 391–402, New York, NY, USA, 2008. ACM.
[DBL, 2008] Proceedings of the Network and Distributed System Security
Symposium, NDSS 2008, San Diego, California, USA, 10th February - 13th February
2008. The Internet Society, 2008.
[Godefroid et al., 2008] Patrice Godefroid, Michael Y. Levin, and David A.
Molnar. Automated whitebox fuzz testing. In NDSS [2008].
[Goubault-Larrecq and Parrennes, 2005] J. Goubault-Larrecq and F. Parrennes.
Cryptographic protocol analysis on real C code. In Proceedings of the 6th
International Conference on Verification, Model Checking and Abstract
Interpretation (VMCAI’05), volume 3385 of Lecture Notes in Computer Science,
pages 363–379. Springer, 2005.
[Jürjens, 2006] Jan Jürjens. Security analysis of crypto-based Java programs
using automated theorem provers. In ASE ’06: Proceedings of the 21st
IEEE/ACM International Conference on Automated Software Engineering,
pages 167–176, Washington, DC, USA, 2006. IEEE Computer Society.
[Lin et al., 2008] Zhiqiang Lin, Xuxian Jiang, Dongyan Xu, and Xiangyu Zhang.
Automatic protocol format reverse engineering through context-aware
monitored execution. In NDSS [2008].
[Wang et al., 2009] Zhi Wang, Xuxian Jiang, Weidong Cui, Xinyuan Wang, and
Mike Grace. Reformat: Automatic reverse engineering of encrypted messages.
In Michael Backes and Peng Ning, editors, ESORICS, volume 5789 of Lecture
Notes in Computer Science, pages 200–215. Springer, 2009.
[Wondracek et al., 2008] Gilbert Wondracek, Paolo Milani Comparetti,
Christopher Kruegel, and Engin Kirda. Automatic Network Protocol Analysis. In
15th Symposium on Network and Distributed System Security (NDSS), 2008.
