Published on

Published in: Technology, Education
  • Be the first to comment

  • Be the first to like this

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide


  1. 1. 2010 CRC PhD Student Conference Verifying Implementations of Security Protocols in C Mihhail Aizatulin Supervisors Dr Andrew Gordon,, Dr Jan J¨rjens,, u Prof Bashar Nuseibeh, Department Computing Status Full-time Probation viva Passed Starting date November 2008 Our goal is verification of cryptographic protocol implementations (such as OpenSSL or Kerberos), motivated by the desire to minimise the gap between verified and executable code. Very little has been done in this area. There are numerous tools to find low-level bugs in code (such as buffer overflows and zero division) and there are verifiers for cryptographic protocols that work on fairly abstract descriptions, but so far very few attempts have been done to verify cryptographic security directly on the code, especially for low-level languages like C. We attempt to verify the protocol code by extracting an abstract model that can be used in high-level cryptographic verification tools such as ProVerif or CryptoVerif. This is the first such approach that we are aware of. Currently we investigate the feasibility of the approach by extracting the model from running code, using the so called concolic (concrete + symbolic) execution. We run the protocol implementation normally, but at the same time we record all the operations performed on binary values and then replay those operations on symbolic values. The resulting symbolic expressions reveal the structure of the messages sent to the network and the conditions that are checked for incoming messages. We are able to produce symbolic execution traces for the handshake imple- mented in the OpenSSL library. To give an example of what the extracted traces look like, consider a simple request-response protocol, protected by hashing with a shared key: A → B : m|hash(‘request’|m, kAB ), B → A : m |hash(‘response’|m|m , kAB ). We implemented the protocol in about 600 lines of C code, calling to the OpenSSL cryptographic library. Our concolic execution tool produces a trace of 8 lines Page 1 of 125
  2. 2. 2010 CRC PhD Student Conference write(i39) payload1 = payload() key2 = key() write(i14|7c|payload1|HMAC(sha1, i7|7c52657175657374|payload1, key2)) msg3 = read() var4 = msg3{5,23} branchF((memcmp(msg3{28,20}, HMAC(sha1, i8|7c526573706f6e7365|i14|7c|payload1|var4, key2)) != i0)) accept(var4) Figure 1: An excerpt from the symbolic client trace. X{start, len} denotes the substring of X starting at start of length len. iN is an integer with value N (width information is omitted), and branchT and branchF are the true or false branches taken by the code. for the client side shown in figure 1: we see the client sending the request and checking the condition on the server response before accepting it. We are currently working to implement symbolic handling of buffer lengths and sound handling of loops as well as making the extracted models compatible with those understood by ProVerif and CryptoVerif, in particular simplifying away any remaining arithmetic expressions from the symbolic trace. One obvious drawback of concolic execution is that it only follows the single path that was actually taken by the code. This is enough to produce an accurate model when there is only one main path, however, libraries like OpenSSL contain multiple nontrivial paths. Thus, to achieve verification of those libraries, we plan to move the analysis towards being fully static in future. Related Work One of the earliest security verification attempts directly on code is probably CSur [Goubault-Larrecq and Parrennes, 2005] that deals directly with C protocol implementations. It translates programs into a set of Horn clauses that are fed directly into a general purpose theorem prover. Unfortunately, it never went beyond some very simple implementations and has not been developed since. The work [J¨rjens, 2006] describes an approach of translating Java programs u in a manner similar to above. In our work we try to separate reasoning about pointers and integers from reasoning about cryptography, in hope to achieve greater scalability. Some work has been done on verification of functional language implementa- tions, either by translating the programs directly into π-calculus [Bhargavan et al., 2006; Bhargavan et al., 2008] or by designing a type system that enforces security [Bengtson et al., 2008]. Unfortunately, it is not trivial to adapt such approaches to C-like languages. ASPIER [Chaki and Datta, 2008] is using model checking for verification and has been applied to OpenSSL. However, it does not truly start from C code: any code explicitly dealing with pointers needs to be replaced by abstract summaries Page 2 of 125
  3. 3. 2010 CRC PhD Student Conference that presumably have to be written manually. Concolic execution is widely used to drive automatic test generation, like in [Cadar et al., 2008] or [Godefroid et al., 2008]. One difference in our concolic execution is that we need to assign symbols to whole bitstrings, whereas the testing frameworks usually assign symbols to single bytes. We believe that our work could be adapted for testing of cryptographic software. Usual testing approaches try to create an input that satisfies a set of equations resulting from checks in code. In presence of cryptography such equations will (hopefully) be impossible to solve, so a more abstract model like ours might be useful. A separate line of work deals with reconstruction of protocol message formats from implementation binaries [Caballero et al., 2007; Lin et al., 2008; Wondracek et al., 2008; Cui et al., 2008; Wang et al., 2009]. The goal is typically to reconstruct field boundaries of a single message by observing how the binary processes the message. Our premises and goals are different: we have the advantage of starting from the source code, but in exchange we aim to reconstruct the whole protocol flow instead of just a single message. Our reconstruction needs to be sound to enable verification — all possible protocol flows should be accounted for. References [Bengtson et al., 2008] Jesper Bengtson, Karthikeyan Bhargavan, C´dric Four- e net, Andrew D. Gordon, and Sergio Maffeis. Refinement types for secure implementations. In CSF ’08: Proceedings of the 2008 21st IEEE Computer Security Foundations Symposium, pages 17–32, Washington, DC, USA, 2008. IEEE Computer Society. [Bhargavan et al., 2006] Karthikeyan Bhargavan, C´dric Fournet, Andrew D. e Gordon, and Stephen Tse. Verified interoperable implementations of security protocols. In CSFW ’06: Proceedings of the 19th IEEE workshop on Computer Security Foundations, pages 139–152, Washington, DC, USA, 2006. IEEE Computer Society. [Bhargavan et al., 2008] Karthikeyan Bhargavan, C´dric Fournet, Ricardo Corin, e and Eugen Zalinescu. Cryptographically verified implementations for TLS. In CCS ’08: Proceedings of the 15th ACM conference on Computer and communications security, pages 459–468, New York, NY, USA, 2008. ACM. [Caballero et al., 2007] Juan Caballero, Heng Yin, Zhenkai Liang, and Dawn Song. Polyglot: automatic extraction of protocol message format using dynamic binary analysis. In CCS ’07: Proceedings of the 14th ACM conference on Computer and communications security, pages 317–329, New York, NY, USA, 2007. ACM. [Cadar et al., 2008] Cristian Cadar, Daniel Dunbar, and Dawson Engler. Klee: Unassisted and automatic generation of high-coverage tests for complex sys- Page 3 of 125
  4. 4. 2010 CRC PhD Student Conference tems programs. In USENIX Symposium on Operating Systems Design and Implementation (OSDI 2008), San Diego, CA, december 2008. [Chaki and Datta, 2008] Sagar Chaki and Anupam Datta. Aspier: An auto- mated framework for verifying security protocol implementations. Technical Report 08-012, Carnegie Mellon University, October 2008. [Cui et al., 2008] Weidong Cui, Marcus Peinado, Karl Chen, Helen J. Wang, and Luis Irun-Briz. Tupni: automatic reverse engineering of input formats. In CCS ’08: Proceedings of the 15th ACM conference on Computer and communications security, pages 391–402, New York, NY, USA, 2008. ACM. [DBL, 2008] Proceedings of the Network and Distributed System Security Sympo- sium, NDSS 2008, San Diego, California, USA, 10th February - 13th February 2008. The Internet Society, 2008. [Godefroid et al., 2008] Patrice Godefroid, Michael Y. Levin, and David A. Mol- nar. Automated whitebox fuzz testing. In NDSS [2008]. [Goubault-Larrecq and Parrennes, 2005] J. Goubault-Larrecq and F. Parrennes. Cryptographic protocol analysis on real C code. In Proceedings of the 6th International Conference on Verification, Model Checking and Abstract Inter- pretation (VMCAI’05), volume 3385 of Lecture Notes in Computer Science, pages 363–379. Springer, 2005. [J¨rjens, 2006] Jan J¨ rjens. Security analysis of crypto-based Java programs u u using automated theorem provers. In ASE ’06: Proceedings of the 21st IEEE/ACM International Conference on Automated Software Engineering, pages 167–176, Washington, DC, USA, 2006. IEEE Computer Society. [Lin et al., 2008] Zhiqiang Lin, Xuxian Jiang, Dongyan Xu, and Xiangyu Zhang. Automatic protocol format reverse engineering through context-aware moni- tored execution. In NDSS [2008]. [Wang et al., 2009] Zhi Wang, Xuxian Jiang, Weidong Cui, Xinyuan Wang, and Mike Grace. Reformat: Automatic reverse engineering of encrypted messages. In Michael Backes and Peng Ning, editors, ESORICS, volume 5789 of Lecture Notes in Computer Science, pages 200–215. Springer, 2009. [Wondracek et al., 2008] Gilbert Wondracek, Paolo Milani Comparetti, Christo- pher Kruegel, and Engin Kirda. Automatic Network Protocol Analysis. In 15th Symposium on Network and Distributed System Security (NDSS), 2008. Page 4 of 125