Habilitation draft


text of my habilitation (draft)


Published in: Technology, Education

Transcript

  • 1. A Logical Approach to Security Analysis of Distributed Systems Yannick Chevalier December 13, 2010
  • 3. Contents

    1 Introduction
      1.1 Information Management
      1.2 Information Management in Computer Systems
      1.3 Document Outline

    Part I: Domain

    2 Cryptographic Protocols
      2.1 Cryptographic Protocols
        2.1.1 Secured Communications
        2.1.2 RFCs
        2.1.3 Narrations
        2.1.4 Security Properties
        2.1.5 Formal methods
      2.2 Validation of Cryptographic Protocols
        2.2.1 Validation in a symbolic model
        2.2.2 Soundness w.r.t. a concrete model
      2.3 Refutation of Cryptographic Protocols
        2.3.1 Advantages over validation
        2.3.2 Personal Work on the Refutation of Cryptographic Protocols

    3 Web Services
      3.1 Web Services
        3.1.1 Basic services
        3.1.2 Software as a Service
        3.1.3 Security Policies
      3.2 Results achieved in the domain of Web Services

    Part II: Tools

    4 Fundamentals of First-Order Logic
      4.1 Facts, sentences, and truth
        4.1.1 Reasoning on facts
      4.2 Orders
        4.2.1 Definitions and first properties
        4.2.2 Orderings on terms and atoms
      4.3 Syntax
        4.3.1 Terms
        4.3.2 Substitutions
        4.3.3 Predicates
        4.3.4 Logical connectives and formulas
        4.3.5 Quantifiers
      4.4 Semantics of First-Order Logic
        4.4.1 Interpretation
        4.4.2 Satisfiability, validity
      4.5 Foundations of Resolution
        4.5.1 Skolemization
        4.5.2 Clauses
        4.5.3 Herbrand's theorem
        4.5.4 Concluding remarks
      4.6 Resolution
        4.6.1 Recognizing unsatisfiable theories
        4.6.2 Ground resolution
        4.6.3 Unification and Most General Unifiers
        4.6.4 Resolution
      4.7 First-order Logic with Equality
        4.7.1 Axiomatizing Equality in First-Order Logic
        4.7.2 Unification Modulo an Equational Theory
        4.7.3 Some properties of E-unification systems
      4.8 Conclusion

    5 Refinements of Resolution
      5.1 Ordered Resolution
        5.1.1 Liftable orderings
        5.1.2 Pre- and Post-ordered resolution
      5.2 Previous Work on Ordered Saturation
      5.3 Decidability of ground entailment problems
        5.3.1 Motivation
        5.3.2 Locality and Saturation
        5.3.3 Saturation
        5.3.4 Decidability of the ground entailment problem
        5.3.5 Conclusion and future works

    Part III: Modeling

    6 Symbolic models for Cryptographic Protocols
      6.1 Introduction
      6.2 Role-based Protocol Specifications
        6.2.1 Specification of messages and basic operations
        6.2.2 Role Specification
      6.3 Operational semantics for roles
      6.4 Compilation of role specifications
        6.4.1 Computation of a first implementation
        6.4.2 Computation of a prudent implementation
      6.5 Symbolic derivations
        6.5.1 Definitions
        6.5.2 Solutions of symbolic derivations
        6.5.3 Decision problems
        6.5.4 Relation with static equivalence
      6.6 Conclusion

    7 Proposition for WS Modeling
      7.1 Introduction
      7.2 The model
        7.2.1 Presentation of the car registration process (CRP)
        7.2.2 On the encoding of CRP into our framework
      7.3 Syntax
        7.3.1 Values and terms
        7.3.2 Access control rules
        7.3.3 Workflow
        7.3.4 Entities and states
        7.3.5 Example
      7.4 Semantics for access control
        7.4.1 Application of substitution in an entity
        7.4.2 Predicate evaluation
        7.4.3 Rule evaluation
      7.5 Workflow operational semantics
      7.6 Conclusion

    Part IV: Results Achieved

    8 Cryptographic Protocols Refutation
      8.1 Locality
        8.1.1 Locality
        8.1.2 Oracle Deduction Systems
        8.1.3 On the importance of locality
      8.2 Combination of decision procedures
        8.2.1 Presentation of the problem
        8.2.2 Symmetric Combination problem
        8.2.3 Asymmetric Combination problem
      8.3 Saturation-based decision procedures
        8.3.1 A special case of asymmetric combination
        8.3.2 Motivation
        8.3.3 Results obtained
      8.4 Research Directions
      8.5 Conclusion

    9 Web Services Orchestration & Choreography
      9.1 Trace-based Synthesis of an Orchestration
        9.1.1 Introduction
        9.1.2 Mediator synthesis
        9.1.3 Mediator prudent implementation
        9.1.4 Mediator validation
        9.1.5 Conclusion
      9.2 Trace-Based synthesis of a choreography
        9.2.1 Agent cooperation
        9.2.2 Book publishing
        9.2.3 Formal specification of the problem
        9.2.4 Solving the problem
        9.2.5 Signature and deduction systems
      9.3 Conclusion

    10 Equivalence of Cryptographic Protocols
      10.1 Introduction
      10.2 Finitary Deduction Systems
        10.2.1 Aware and stutter-free ASDs
        10.2.2 Sets of solutions
        10.2.3 Finitary deduction systems
      10.3 Decidability of Symbolic Equivalence for Finitary Deduction Systems
      10.4 Research directions

    Part V: Epilogue

    11 Research project
      11.1 From security to safety
      11.2 Reachability analysis and automated deduction
      11.3 Validation of aspect-oriented programs
  • 7. Chapter 1: Introduction

    "Anu granted him the totality of knowledge of all. He saw the Secret, discovered the Hidden, he brought information of (the time) before the Flood." (Epic of Gilgamesh)

    "The best things in life aren't things." (3:26 PM Jul 21st via UberTwitter, P. Hilton)

    1.1 Information Management

    In what is often considered the oldest written story, the main character is first described as a man of knowledge. The mysteries in ancient Greece also considered the possession of secret knowledge as a source of enlightenment. More prosaically, priests, astrologers, physicists and so on formed congregations based on their possession of unique knowledge, and the preservation of these congregations depended upon their monopoly on these pieces of useful knowledge, e.g. the computation of the areas allocated to peasants after each flood of the Nile. In ancient societies being able to retain and control secrets was thus a self-preservation issue for organizations.

    These ancient origins of information retention contrast with today's society, which emphasizes the instantaneous diffusion of information via platforms such as twitter.com or facebook.com. CEOs have their own blogs on their company's strategy¹ and, when facing a crisis, corporations try to be as open as possible to gain or recover the confidence of citizens, consumers and peers. In today's societies, being able to disseminate as much information as possible is now a survival issue for corporations and individuals.

    Of course the delineation between the necessity of preserving the secrecy of some information and the dissemination of information is not as coarse, and both aspects coexist in almost every society; think e.g. of advertising and patents.

    ¹ See http://www.wired.com/wired/archive/15.04/wired40_ceo.html for more context, the blog itself being at http://blog.redfin.com.

  • 8. This is particularly visible in today's complex industrial projects such as the development of a new plane, as demonstrated by Boeing with the 787 Dreamliner, which relies on contractors spread all over the world, some of whom are also contractors for its competitor Airbus.

    Thus the contrast between ancient and present-day societies also occurs routinely, as everyone, from the manager of a complex program involving contractors to the Facebook member, has to manage information, i.e. share it with partners or withhold it. One particular difficulty in the management of information is the lack of reliability of electronic systems. Facebook members have difficulties in adapting to the latest changes in Facebook access control policies, while information system specialists fear possible computer attacks on their information systems.

    1.2 Information Management in Computer Systems

    Choosing to share or disclose information in a face-to-face meeting is relatively easy, as it suffices to express it or not. When in a discussion one wants some information to be passed to some partners but not to others, it is still possible to skillfully resort to some common knowledge, ambiguities, or any type of non-verbal communication to disclose the information precisely to the intended person.

    The variety of possibilities offered to humans for direct communication is beyond the capacity of modern-day computers. Conversations between computer systems are message exchanges, and the lack of ambiguity in these is crucial to their proper functioning. Given that anyone who is willing to may participate, even passively and without the other participants being aware of it, in any conversation occurring over a medium such as the Internet, it would seem that computer users only have the choice of disclosing a piece of information to everyone or to no one, as groups did thousands of years ago.

    The role of cryptography is to provide computer systems with the ability humans naturally have to alter how information is expressed, so as to guarantee the identity of the participants who can extract meaningful information from the messages, or of the possible source of the message. Cryptographic protocols are predefined conversations in which the messages exchanged by the participants are protected by cryptographic operations. Most of my research work has consisted in determining whether a cryptographic protocol satisfies the guarantees it claims to achieve, and more precisely in trying to determine in a fixed setting whether the protocol fails to provide its users with its claimed guarantees.

    But as presented above, intelligent information management requires not only control over some pieces of information but also the proper dissemination of other pieces of information. For example the Web Services framework aims at maximizing the availability of information by making it accessible via on-line services. Here the notion of information is taken in the broad sense and denotes data as well as processes.

  • 9. A continuation of my research on cryptographic protocols has been the extension of some results to the Web Services framework. It consists in deciding, given the messages the putative Web Services are willing to exchange with one another, whether there exists an electronic conversation that satisfies everyone's information management policy. I have considered this problem from two different angles, depending on whether one is interested in the how, i.e. considers the structure of the exchangeable messages, or in the what, i.e. considers the conditions under which a participant agrees to disclose a piece of information to someone else.

    1.3 Document Outline

    In the rest of this section I describe more precisely the four parts that compose this document, namely: a) the domain of application of my research, which contains a short description of cryptographic protocols and Web Services, b) the first-order logic tools that I rely upon to solve problems in the aforementioned domain, c) a description of the formal modelling of cryptographic protocols and Web Services in frameworks based on first-order logic, and d) a summary of the results achieved.

    Domain. The first part contains the description of the two application domains of my work. The first one is the analysis of cryptographic protocols, on which I began to work under the supervision of Laurent Vigneron and Michaël Rusinowitch during my PhD. I present cryptographic protocols in Chapter 2 and survey the existing analysis methods. Chapter 3 is an introduction to Web Services biased towards our purpose, which is the analysis of their communications under security constraints.

    Tools. Both for didactic purposes and to serve as a reference for the later parts of this document, I begin Chapter 4 with an introduction to the basics of first-order logic by surveying classical skolemization, the compactness property, and resolution.
    The latter is of special importance to us as it permits one to prove automatically that a first-order theory is unsatisfiable (one says that resolution is refutationally complete), and thus by contradiction that a property is a logical consequence of other properties. This chapter ends with more advanced material on reasoning modulo an equational theory, concluding with the replacement properties that underlie a large part of my work on the analysis of cryptographic protocols. The refutational completeness of resolution is insufficient for the practical purpose of automated deduction as it relies on non-determinism, and the amount of computation required even for simple theories is too large even for modern-day computers. Refinements of resolution aim at reducing the non-determinism to turn this procedure into one suited to automated deduction, and in some cases permit one to obtain a decision procedure. We first present in Chapter 5 the classical result of Basin and Ganzinger that proves that for first-order theories in which all permitted resolution steps have been performed, the logical consequence problem is decidable.

  • 10. This result is based on a refinement of resolution that uses an ordering in which every atom without variables is greater than only a bounded number of other atoms. This presentation is followed by its (unpublished) extension to well-founded orderings, which I obtained with Mounira Kourjieh when solving cryptographic protocol analysis problems.

    Modelling. Now that the reader is equipped with a "survival toolkit" in first-order logic, I present the formal models on which the analysis is performed. Chapter 6 includes an article written in collaboration with M. Rusinowitch on the compilation of standard cryptographic protocol specifications into active frames. These are a simplified formal model of protocol participants in which only the global effects, not the individual operations, of the participant are taken into account. Also in this chapter I introduce symbolic derivations, in which all operations must be atomic. In contrast with active frames, which have an intuitive semantics, and with process calculi, which rely on standard programming constructions, symbolic derivations are designed to ease the reasoning on protocol participants and on the intruder, at the cost of making this model of computation harder to relate to standard constructions.

    In contrast with cryptographic protocols, in which entities usually terminate their participation in the protocol after a few execution steps, Web Services may exhibit a rich behavior. Trust negotiation in particular usually ends once a fixpoint is reached. Thus in order to take into account the access control part of the Web Service specifications we need to consider a framework in which loops are allowed. In collaboration with Philippe Balbiani and Marwa ElHouri I have proposed one such framework in [21, 22], from which Chapter 7 is extracted.

    Results obtained. The last part of this document presents the decidability or combination results I have obtained since my Ph.D. In a first chapter I present a synthesis of several results obtained around the decidability of the insecurity problem of cryptographic protocols when only a finite number of message exchanges by honest agents are allowed. Instead of focusing on each of the settings considered, I have tried to show how these different results are connected with one another. In doing so I have assumed that the reader is already familiar with the proofs and techniques employed in the articles [61, 67, 62].

    Then in Chapter 9 I present the results obtained while I was invited to the Cassis project at INRIA Nancy Grand Est. I worked there in collaboration with M. Rusinowitch, M. Turuani, and two Ph.D. students, Mohammed Anis Mekki and Tigran Avanesov. We worked on the application of the techniques developed primarily for cryptographic protocol analysis to basic orchestration problems; both kinds of problems are special reachability problems. With M.A. Mekki the study focused on building a complete tool that takes as input a description of the available services in an Alice&Bob-like notation and a description of the goal of the orchestration, and produces a deployment-ready validated orchestrator service. At the time of writing, that service is deployed as a Tomcat servlet, but all the cryptography is implemented within the body of the SOAP messages.

  • 11. With T. Avanesov we have considered a multi-intruder extension of the standard cryptographic protocol analysis setting. When performing security analysis, this setting permits us to model situations in which several intruders are willing to collaborate with one another but cannot communicate directly, and thus have to pass the information they want to exchange through honest agents. When composing Web Services, we look at a distributed orchestration problem: several partners are willing to collaborate, but they do not wish to share all the information they have. The problem then is to decide whether the participants' security policies are flexible enough to allow them to collectively implement the goal service. Generally speaking, this problem is strictly more difficult than standard orchestration (or cryptographic protocol analysis) given that, in addition to a decision procedure for the case of Dolev-Yao-like message manipulations, we have obtained an undecidability result when the equational theory that defines the operations is subterm and convergent.

    Finally in Chapter 10 I present some work on the equivalence of symbolic derivations. The problem is to determine whether an intruder can observe differences in the execution of two different protocols. A preliminary result obtained in collaboration with M. Rusinowitch was published in [75]. In that paper we provided a more succinct proof of the decidability of this problem for subterm convergent equational theories, a result originally obtained by M. Baudet [27]. In this chapter I present a criterion that actually permits one to reduce this equivalence problem to the reachability analysis performed when considering the usual trace properties. I believe that the reduction can easily be implemented in reachability analysis tools such as CL-AtSe or OFMC, and thus may be of practical interest.

    Epilogue. This document ends with a last chapter on the future research directions stemming from the results obtained so far. A one-sentence summary would be "more of the same, but differently". While I plan to continue the work around reachability analysis problems, I also plan to explore side directions further, namely:
      • to work on the potential applications to safety analysis;
      • to explore further the relation between reachability analysis and first-order automated reasoning techniques;
      • to obtain a comprehensive framework for service composition that also takes into account trust negotiation, and as a consequence to relate more formally the models for protocols and Web Services presented in this document;
      • to extend the modularity results obtained to address the modular verification of aspect-based programs.
  • 13. Part I: Domain
  • 14. Chapter 2: Cryptographic Protocols

    The starting point of the work presented in this document is the security analysis of cryptographic protocols. We describe in this chapter what these communicating programs are, which properties they guarantee, and how they are specified. We also present a short survey of the analyses they may be subjected to, with an emphasis on our domain of research.

    2.1 Cryptographic Protocols

    We present cryptographic protocols in this section. In Subsection 2.1.1 we present the setting in which they are specified: the participants, the electronic communications, and the cryptographic operations. Then in Subsection 2.1.2 we briefly present a short specification of a cryptographic protocol in a Request for Comments document issued by the Internet Engineering Task Force (IETF), a standardization body. Though we do not consider exclusively cryptographic protocols specified in such documents, this serves as the basis for our first formal model of cryptographic protocols, in which the participants and the discussion they are intended to have are specified by a narration, presented in Subsection 2.1.3. Then we present some of the standard properties they can guarantee in Subsection 2.1.4. Finally we explain in Subsection 2.1.5 how the correspondence between the narrations and their properties can be established.

    2.1.1 Secured Communications

    A cryptographic protocol defines which messages can be exchanged between participants. The advantage gained by reducing one's possible actions to those described in the protocol is the implicit guarantee that each participant behaving as prescribed is provided with security guarantees on the data he has exchanged. This guarantee is obtained via the clever use of cryptographic primitives.

    These are algorithms that rely on the asymmetry of information between individuals, and are classified according to the assumptions on this asymmetry.
  • 15. The most common types are:

    Secret key cryptosystems: this was the only type of cryptography until the 1970s. It relies on a secret piece of information, called a secret key, known only within a small group. Every member of this group can both cipher and decipher messages with the key, while agents outside of it can neither cipher nor decipher the encoded message. Instances of secret key cryptosystems are Enigma [214], DES [165], 3DES [169], and the current AES [170]. Given a message $M$ and a secret key $sk(k)$ we denote:
      $\mathrm{enc}_s(M, sk(k))$: the encryption of $M$ with the key $sk(k)$
      $\mathrm{dec}_s(M, sk(k))$: the decryption of $M$ with the key $sk(k)$

    Public key cryptosystems: the first (tentative) publication [158] on public key cryptography was met with skepticism, as in the words of a reviewer: "Experience shows that it is extremely dangerous to transmit key information in the clear."¹ The first accepted paper on the topic was the presentation by Diffie and Hellman [104] of a clever usage of exponentiation in modular arithmetic. The result of their analysis was the possibility to compute a pair of keys $(pk(k), sk(k))$ such that the messages encrypted with the key $pk(k)$ can be decrypted only with the key $sk(k)$, and such that $sk(k)$ cannot feasibly be computed from $pk(k)$. Thus the key $pk(k)$ can be published as a phone number would be, and any participant can send information that only the agent knowing the key $sk(k)$ can decrypt, i.e. understand. Examples of public-key cryptosystems include RSA [186, 31, 179, 180] and ElGamal [116]. Given a message $M$, a public key $pk(k)$ and a secret key $sk(k)$ we denote:
      $\mathrm{enc}_p(M, pk(k))$: the encryption of $M$ with the key $pk(k)$
      $\mathrm{dec}_p(M, sk(k))$: the decryption of $M$ with the key $sk(k)$

    Signature cryptosystems: the asymmetry of public key cryptosystems can also be employed to authenticate the creator of a message. The sender signs the message he wants to send with a secret key $sk(k)$. Anybody knowing the public key $pk(k)$ can then verify that the signature was composed with the key $sk(k)$, and thus originates from the possessor of that key. Given a message $M$, a public key $pk(k)$ and a secret key $sk(k)$ we denote:
      $\mathrm{sign}(M, sk(k))$: the signature of $M$ with the key $sk(k)$
      $\mathrm{verif}(M', M, pk(k))$: the check that $M'$ is the signature of $M$ with the inverse of the key $pk(k)$

    ¹ http://www.merkle.com/1974/
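    The notation above can be read as a small term algebra in which decryption and verification succeed only when the keys match. The following is a minimal sketch of that reading in Python, not the formal model used later in this document: the `Term` class, the `atom` helper, the function names and the choice of returning `None` on failure are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """A symbolic message: an operator name applied to argument terms."""
    op: str
    args: tuple = ()


def atom(name):
    """Hypothetical helper: a constant such as a message M or a key seed k."""
    return Term(name)


def sk(k):
    return Term("sk", (k,))


def pk(k):
    return Term("pk", (k,))


def enc_s(m, key):
    return Term("enc_s", (m, key))


def enc_p(m, key):
    return Term("enc_p", (m, key))


def sign(m, key):
    return Term("sign", (m, key))


def dec_s(c, key):
    """Symmetric decryption: succeeds only with the key used to encrypt."""
    if c.op == "enc_s" and c.args[1] == key:
        return c.args[0]
    return None


def dec_p(c, key):
    """Public-key decryption: sk(k) opens what was encrypted with pk(k)."""
    if c.op == "enc_p" and key.op == "sk" and c.args[1] == pk(key.args[0]):
        return c.args[0]
    return None


def verif(sig, m, key):
    """Check that sig is the signature of m under the inverse of pk(k)."""
    return (sig.op == "sign" and key.op == "pk"
            and sig.args == (m, sk(key.args[0])))


k, M = atom("k"), atom("M")
recovered = dec_p(enc_p(M, pk(k)), sk(k))
```

    In this sketch the cryptography is treated as perfect: a ciphertext is an opaque term unless the matching key is supplied, which is the usual assumption of symbolic analysis.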
  • 16. Other functions are employed to construct messages, such as the concatenation $\langle M_1, M_2 \rangle$ of two messages. We also consider the modeling of mathematical functions such as the bitwise exclusive-or or modular exponentiation, and will add the corresponding symbols as necessary.

    2.1.2 RFCs

    Cryptographic protocols are published and endorsed by various governmental or private organizations. These organizations can be formed to support one specific protocol or set of protocols, such as the "Liberty Alliance", or have a more general interest in one domain, such as the "Oasis Open consortium" or the "World Wide Web Consortium", for respectively the transmission and representation of information in the XML format or the Web. The Internet Engineering Task Force (IETF) is particularly important as an organization focusing on the basic protocols employed in computer-to-computer communications, and on the interoperability of their implementations. Transport Layer Security (TLS) [102, 103] is specified by a Request for Comments (RFC) document, as are some protocol proposals in early stages, such as RFC 2945, which describes the SRP Authentication and Key Exchange System. In the latter case implementation issues are not discussed, but the principle of the protocol is presented. Often such documents contain a finite state automaton describing the different states in which a program implementing the protocol can be as well as the possible actions in each state, and/or the intended sequence of messages between participants in the protocol, as in Figure 2.1.

      Client                                   Host
      U = <username>                 →
                                     ←         s = <salt from passwd file>
      Upon identifying himself to the host, the client will receive
      the salt stored on the host under his username.
      a = random()
      A = g^a % N                    →
                                               v = <stored password verifier>
                                               b = random()
                                     ←         B = (v + g^b) % N
      p = <raw password>
      x = SHA(s | SHA(U | ":" | p))
      S = (B - g^x)^(a + u*x) % N              S = (A * v^u)^b % N
      K = SHA_Interleave(S)                    K = SHA_Interleave(S)

    Figure 2.1: Annotated message sequence chart extracted from RFC 2945 (SRP Authentication and Key Exchange System)
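    To see why both columns of Figure 2.1 end with the same value S, recall that the stored verifier is v = g^x % N: the client computes (g^b)^(a + u*x) and the host computes (g^a * g^(x*u))^b, which are both g^(b*(a + u*x)) modulo N. The sketch below checks this arithmetic numerically. It is an illustration under stated assumptions, not an implementation of the RFC: the modulus `N`, the generator `g`, the byte encoding of B and the helper names are toy choices made here, the derivation of u from the first 32 bits of SHA1(B) follows my reading of RFC 2945, and the final SHA_Interleave step is omitted.

```python
import hashlib

# Toy parameters, assumed for illustration only (far too small to be secure).
N = 2039  # a small prime modulus
g = 7     # an arbitrarily chosen base


def H(*parts):
    """SHA-1 of the concatenation of the given byte strings."""
    return hashlib.sha1(b"".join(parts)).digest()


def to_int(data):
    return int.from_bytes(data, "big")


def srp_shared_secrets(username, password, salt, a, b):
    """Return the value S as computed by the client and by the host."""
    # x = SHA(s | SHA(U | ":" | p)), known to the client only.
    x = to_int(H(salt, H(username, b":", password)))
    # v is the password verifier stored on the host.
    v = pow(g, x, N)

    A = pow(g, a, N)              # sent by the client
    B = (v + pow(g, b, N)) % N    # sent by the host

    # u: first 32 bits of SHA1(B); the byte encoding of B is an assumption.
    n_bytes = (N.bit_length() + 7) // 8
    u = to_int(H(B.to_bytes(n_bytes, "big"))[:4])

    s_client = pow((B - pow(g, x, N)) % N, a + u * x, N)
    s_host = pow((A * pow(v, u, N)) % N, b, N)
    return s_client, s_host


s_client, s_host = srp_shared_secrets(b"alice", b"password123", b"salt",
                                      a=1234, b=5678)
agree = (s_client == s_host)
```

    The values of `a` and `b` are fixed here so that the check is reproducible; in the protocol they are fresh random values for each session.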
  • 17. 18 CHAPTER 2. CRYPTOGRAPHIC PROTOCOLS2.1.3 NarrationsThough in the Avispa and Avantssar we have worked on the definition of morecomplex protocol specification languages, the specification of a protocol by asingle sequence of messages as in [98, 148, 126, 162] is sufficient for most cryp-tographic protocols even though the internal computations of the agents is notspecified. In its simplest form, a narration is a sequence of message exchangesfollowed by the initial knowledge each participant must have to engage in theprotocol (Needham-Schroeder Public Key protocol, [166]): A→B:encp ( A, Na , KB ) B→A:encp ( Na , Nb , KA ) A→B:encp (Nb , KB ) where −1 A knows A, B, KA , KB , KA −1 B knows A, B, KA , KB , KBThe names A and B in this sequence do not refer to any particular individualbut to roles in the narration: common names instead of A and B are Client,Server, Initiator,. . . Actual participants in an instance (also called session) ofthe protocol play each one of the roles defined by the message exchange. We note that the messages Na and Nb are not in the knowledge of A norof B. These are nonces, i.e. random values created at the beginning of eachinstance of the protocol. Personal work: We present in Chapter 6 how these narrations can be given an operational semantics. The languages we have developed in the course of the Avispa and Avantssar projects did not need such developments given that the modeler of a protocol in HSPSL [64] or ASLan V.2 has to specify also the internal actions of the roles. Though it is often tedious to write such specifications, the language aims at a greater accuracy of the protocol model. 
We note that recent works such as [163] step back from this choice and return to simpler models.

2.1.4 Security Properties

Generally speaking [83], one can distinguish two kinds of properties for programs such as protocols:
  • Properties, which are defined by a set of possible executions of the protocol;
  • Hyper-properties, which are defined by a set of sets of possible executions of the protocol.

Our work principally focuses on properties of protocols such as:
  • Secrecy, i.e. determining whether one of the messages exchanged can be constructed by an attacker;
  • 18.
  • Authentication, i.e. determining whether the principals accept only the messages originating from the participants listed in the narration.

Example 1. The simplified [147] version of the Needham-Schroeder Public Key protocol (NSPK) [166] is vulnerable with respect to both secrecy and authentication. At the end of their respective executions, A and B should be assured that they have engaged in a conversation with one another and that the nonces $N_a$ and $N_b$ are kept secret. However, Lowe [147] found the following attack:

    $A \to I : \mathrm{enc}_p(\langle A, N_a \rangle, K_I)$
    $I(A) \to B : \mathrm{enc}_p(\langle A, N_a \rangle, K_B)$
    $B \to I(A) : \mathrm{enc}_p(\langle N_a, N_b \rangle, K_A)$
    $I \to A : \mathrm{enc}_p(\langle N_a, N_b \rangle, K_A)$
    $A \to I : \mathrm{enc}_p(N_b, K_I)$
    $I(A) \to B : \mathrm{enc}_p(N_b, K_B)$

In this attack A starts a legitimate instance of the protocol with an intruder, i.e. a dishonest agent I. This intruder then masquerades as A (the corresponding events are denoted I(A)) and initiates a session with B. B responds as if he were talking to A, and successfully ends his part of the protocol. However, in the course of his protocol instance B has accepted messages issued by I instead of A, hence an authentication failure. Furthermore, the nonces $N_a$ and $N_b$, which are believed by B to be a common secret shared with A, are actually known by I, hence a secrecy breach.

Personal work: Until recently I have worked only on the security analysis of properties such as secrecy and authentication. However, in a recent series of works I also consider the problem of security analysis w.r.t. the equivalence of protocols. This notion is employed to reason about anonymity, e-voting protocols, the abstraction of a concrete primitive by a perfect one, and so on. Chapter 10 includes these results, which are related to the refutation of cryptographic protocols.

2.1.5 Formal methods

We have worked on the formal analysis of cryptographic protocols. 
This means that given a specification such as a narration we build a logical model of the protocol and its environment consisting of three parts describing respectively:
  • the possible actions of agents behaving as prescribed by the roles in the protocol;
  • the possible actions of an attacker in the setting considered;
  • the property we want to verify.

The parallel execution of the roles and of the intruder is interpreted by a conjunction. Two types of logical analysis can then be performed:
  • 19.
Validation: one proves that the property is logically implied by the specifications of the protocol and of the intruder;

Refutation: one constrains the logical specifications, e.g. by imposing an initial state or by bounding the number of possible instances of the protocol, and proves that under these restrictions the property is not logically implied by the specifications of the protocol and of the intruder.

When we fail to refute a protocol, we can only conclude that under the constraints imposed there is no attack. Of course this does not mean that there is no attack when weaker constraints, or none, are imposed. Let us review some of the constraints routinely imposed:

Isolation: no protocol is executed concurrently with the one under scrutiny. While unrealistic, this assumption, or some weaker version of it, is needed given that for any protocol P one can construct a protocol P' [132] such that, when P' is executed concurrently with P, the attacker can discover a secret message exchanged in P. While this result is theoretical, as the second protocol has to be constructed from the first one, such attacks also often occur in practice [91]. In [50, 19] the isolation assumption is weakened into assuming, in some form or another, that no other protocol executed concurrently uses the same cryptographic data. Concerning the symbolic analysis of protocols, one can find in [163] similar assumptions employed to obtain the soundness of the composition of transport protocols. 
Similar conditions for sequential or parallel composability can also be found in [10, 88] and in other works, and can be traced back to the non-unifiability condition initially introduced for the decidability of secrecy in [185].

Soundness: the properties of cryptographic primitives are usually [119, 115, 184] expressed by games in which an intruder, modeled by a probabilistic Turing machine, cannot in a reasonable amount of time gain a significant advantage over a coin toss. For instance, in IND-CPA games the intruder is given a public key. He then chooses two messages $m_0$ and $m_1$, and is presented with the encryption of either $m_0$ or $m_1$. He wins the game if he can choose $m_0$ and $m_1$ such that he has strictly (footnote 2) more than a 50% chance of guessing which of the two was encrypted. While there are some attempts [23, 24] to directly interpret the constructions on messages in terms of probability distributions, the usual lifting of these properties into a symbolic world is problematic, given that they express what the intruder cannot do, whereas the symbolic analysis rests on the description of what the intruder can do. We present in Subsection 2.2.2 how the translation from the concrete cryptographic setting to the symbolic world can be justified.

Footnote 2: The actual condition is even more restrictive, and depends on the length of the key.
  • 20.
Bounds on the instances of the protocol: though in practice the number of distinct agents that can engage in an unbounded number of sessions of a cryptographic protocol is a priori unbounded, it has been proved [85] that if there is a secrecy (resp. authentication) failure in an arbitrary (w.r.t. the number of sessions and the agents participating in each session) instance of the protocol, then there is a secrecy (resp. authentication) failure with the same number of sessions but only 1 (resp. 2) distinct honest agents, in addition to the intruder, instantiating the roles of the protocol. Furthermore, Stoller [200, 201] remarked that essentially all “standard” protocols either had a flaw found when examining a couple of sessions or were safe. While this cannot be argued for cryptographic protocols in general [160], this remark led to the refutation-based methods in which one only tries to find an attack involving a couple of distinct instances of the protocol. We present in more detail in Section 2.3 the history of refutation with a bounded number of instances of the protocol.

2.2 Validation of Cryptographic Protocols

2.2.1 Validation in a symbolic model

Validation of cryptographic protocols is usually performed under the assumption that the protocol is executed in isolation, this assumption being justified by the work on the soundness w.r.t. the concrete cryptographic setting described in Subsection 2.2.2. Under this isolation hypothesis, validation of a protocol amounts to proving that for any number of parallel instances of the protocol, each instance provides the guarantees claimed by the protocol. This problem is usually treated by translating the descriptions of the intruder and of the honest agents into sets of (usually Horn) clauses, and by reducing the problem of the existence of an attack to a satisfiability problem.

This approach is successful in practice, see for example the ProVerif tool by B. 
Blanchet [38], and some decision procedures were also obtained. The satisfiability of sets of clauses in which each clause has either at most one variable or at most one function symbol is decidable [84], and a NEXPTIME bound is given in [194, 195]. This problem is DEXPTIME-complete if all the clauses are furthermore Horn clauses. The class of sets of clauses was later extended to take into account blind copy [90] while preserving decidability.

It was also extended to take into account the properties of an exclusive-or [196]. Although that article also proves that adding an abelian group addition operation leads to undecidability, such an extension was implemented in ProVerif in [137], and decidability was proven for some particular cases, including some group protocols.

2.2.2 Soundness w.r.t. a concrete model

Validation of a cryptographic protocol is done w.r.t. a given attacker model. However, there is no assurance that the modeled attacker is as strong as an attacker who can take advantage of the precise arithmetic relations between the
  • 21. messages, the keys, and so on. For example the Pollard ρ method [182] is based on the computation of collisions (different products having the same result) in a finite group and significantly speeds up the factorization of some integers. We thus have a discrepancy between the symbolic analysis of cryptographic primitives, which is conducted independently of the actual values of the messages exchanged and of the keys, and the analysis in the concrete setting, in which the attacker has access to the actual values of the messages and of the keys, with this additional information opening the possibility of additional attacks on a protocol.

There has been a lot of work trying to relate concrete settings to symbolic ones, starting with [177]. As demonstrated by e.g. [50], finding a good setting is a difficult and error-prone task. However, more recent works such as [19, 138, 139] have provided sound and usable definitions and cryptographic settings. If one agrees to the restrictions on the usage of cryptographic protocols and of keys imposed by these settings, there exists a cryptographic library that hides the concrete values of the keys by imposing the use of pointers instead of real data, and such that every useful manipulation of messages can be performed by calls to this library.

2.3 Refutation of Cryptographic Protocols

2.3.1 Advantages over validation

Validation of cryptographic protocols is undecidable even in the simplest settings, in which perfect cryptography is employed, the protocol is executed in isolation from other protocols, and either only a finite number of distinct values are exchanged or some typing system ensures that the complexity of the messages is bounded. 
Furthermore, the soundness of a validation procedure is hard to establish: though one can prove that in a given symbolic model there is no attack on a protocol, this result does not necessarily translate into the validation of a concrete version of the protocol, as was described in Subsection 2.2.2.

However, when trying to refute a protocol, the translation to the concrete level is simpler, as it suffices to prove that any action performed by the attacker in the symbolic model can be translated into an action of an attacker in the concrete model. Also, the restrictions imposed on the protocols to ensure the decidability of their validation are usually too strong for real-life case studies.

These reasons motivated the refutation of cryptographic protocols under constraints: instead of trying to prove that a protocol is valid, one tries to discover an attack when additional constraints on the protocol are imposed. In accordance with the observations by Stoller [200, 201], the most common constraint consists in: a) bounding the number of messages the honest participants can receive; and b) forcing each participant either to accept a message or to abort his execution of the protocol. These assumptions can be translated in terms of processes by imposing that the honest participants are modeled by processes without loops and in which the “else” branch of the conditional is always an
  • 22. abort. Usually one further imposes that the tests in the conditional must be (conjunctions of) positive equality tests. Another common restriction consists in bounding the complexity of the terms representing the messages.

Under these assumptions it is possible to devise decision procedures for the refutation of cryptographic protocols w.r.t. a model of the attacker. When conducting such an analysis one first has to provide the reader with a message model and a deduction model, and only then can one present a decision procedure w.r.t. these models. In more detail we have:

Message model: Messages are modeled by first-order terms, i.e. finite recursive structures defined by the application of some functions on terms and by constants. The first task in protocol refutation consists in defining the properties of these functions. For instance one should model that a bitwise exclusive-or operation $\oplus$ is commutative, i.e. that for all messages $x$ and $y$ the equality $x \oplus y = y \oplus x$ holds;

Deduction model: Then one has to model how the attacker can use the messages at his disposal to create new ones. This is usually done by assuming that the intruder can apply (a subset of) the symbols employed to define the messages to construct new messages. For example an asymmetric encryption algorithm can be employed by the intruder to construct new messages, but the pk(_) and sk(_) symbols, employed to denote the public and private keys, cannot be employed by the intruder to construct new keys;

Decision procedure: Finally one searches for a decision procedure applicable to all finite message exchanges in which the messages are as defined in the first point, when attacked by an intruder having the deduction power defined in the second point.

Since we attempt to refute protocols, the soundness of the message and deduction models is more important than their completeness. 
Forgetting some possible equalities or deductions may lead to an inconclusive analysis (stating that no attack is found under the current hypotheses), but having unsound equalities or deductions could lead to false positives, i.e. a valid protocol could be declared flawed.

2.3.2 Personal Work on the Refutation of Cryptographic Protocols

During my PhD I worked on the refutation of cryptographic protocols when the number of messages exchanged among the honest agents is bounded. In collaboration with Laurent Vigneron, I first extended Amadio and Lugiez's decision procedure [8] to take into account the case of non-atomic secret keys and implemented it in daTac [78]. Then we presented an abstraction of the parallel sessions of a cryptographic protocol [77, 79] in which it is possible to validate strong authentication, in contrast with other existing abstractions (e.g. [41]) in which replay attacks cannot be detected. This abstraction is based
  • 23. on a saturation of the protocol rules modeled as clauses, and on the extension of the intruder's deduction capacities with these so-called “oracle” rules, instead of simply checking the property in the saturated set of rules. Then, and before I finished my PhD, I worked with R. Küsters, M. Rusinowitch, and M. Turuani on the extension of the complexity result obtained in the case of perfect cryptography [190, 144] to the cases in which an exclusive-or [68, 61], an exponential for Diffie-Hellman [69, 62], commutative asymmetric encryption [60, 62], or oracle rules [63] were added to the standard set of intruder deduction rules. I finally presented a lazy constraint solving procedure [56] that extends the one in [78] to protocols in which an exclusive-or symbol appears. This procedure was implemented in CL-AtSe [208] by M. Turuani and M. Tüngerthal with some further optimization of the exclusive-or unification algorithm [207].

This series of results was however unsatisfactory, given that there was no result on the decidability of refutation when e.g. both an exponential and an exclusive-or appear in the protocol. In collaboration with M. Rusinowitch we have considered the problem of the combination of decision procedures for refutation, and presented a solution [70, 76] that reduces the refutation of protocols expressed over the union of two disjoint sets of operators and with ordering restrictions to problems of refutation in the individual signatures with the same kind of ordering constraints. We later extended this result to well-moded but non-disjoint unions of signatures in [71, 72]. In [11] the authors build upon the first combination result to obtain a similar one on the combination of static equivalence decision procedures, while [157, 136] obtain similar conditions for the combination on non-disjoint signatures, and [47] extends it to take into account some specific properties of homomorphisms. 
Finally let me mention that the well-moded constraint is rather general and intuitive, given that it was defined to model the properties of the exponential w.r.t. the abelian group of its exponents, but was also employed in [97] to model the relationship between access control and deductions on messages in PKCS#11.

When Mounira Kourjieh began her PhD under my supervision, we started to work on a novel research direction. As explained above, the traditional research on the relation between concrete and symbolic models of cryptographic primitives is based on the establishment of a set of assumptions on the use of these primitives and on the management of the keys, and on proving that under these assumptions one can build a complete symbolic model such that, if there is no flaw on the symbolic level, then there is no flaw on the concrete level. We remark that:
  • the approach may be too restrictive for real-life protocols, as it requires e.g. that the keys are created and managed by a trusted entity, the cryptographic library;
  • the soundness of validation in the symbolic model is hard to establish, given that one has to account for all the possible actions of the attackers. This is in contrast with the soundness of refutation, for which one only has to prove that the actions described in the symbolic setting are feasible in the concrete setting.
  • 24. For these two reasons we have tried to model the weaknesses of the cryptographic primitives when no assumption is made on the creation and management of keys: instead of restricting the concrete level to make it fit a symbolic model, we have augmented the symbolic model to take into account the known attacks on the concrete primitives. We have achieved decidability results for signatures in the multi-user setting [58] and the decidability (footnote 3) of the refutation for hash functions for which it is feasible to compute collisions [57]. This work is presented in more detail in Chapter 8.

Footnote 3: Under the assumption that the combination result of [71] on deduction systems also holds on extended deduction systems.
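As a closing illustration for this chapter, the message and deduction models of Subsection 2.3.1 can be sketched on Lowe's attack from Example 1. The code below is a minimal sketch and not one of the decision procedures cited above: terms are nested Python tuples, the intruder may build pairs and asymmetric encryptions but may not apply the pk and sk symbols, keys are assumed atomic, and all function names (`analyze`, `derivable`, `lowe_attack`) are hypothetical.

```python
def pair(a, b):
    return ("pair", a, b)


def enc(m, k):
    return ("enc", m, k)


def pk(agent):
    return ("pk", agent)


def sk(agent):
    return ("sk", agent)


def analyze(knowledge):
    """Close a set of terms under projection and decryption with known private keys."""
    known = set(knowledge)
    changed = True
    while changed:
        changed = False
        for t in list(known):
            new = set()
            if isinstance(t, tuple) and t[0] == "pair":
                new = {t[1], t[2]}
            elif isinstance(t, tuple) and t[0] == "enc":
                msg, key = t[1], t[2]
                # enc(m, pk(X)) can be opened only if sk(X) is known
                if isinstance(key, tuple) and key[0] == "pk" and sk(key[1]) in known:
                    new = {msg}
            if not new <= known:
                known |= new
                changed = True
    return known


def derivable(term, knowledge):
    """Can the intruder build `term` from `knowledge` using pairing and encryption?"""
    known = analyze(knowledge)

    def build(t):
        if t in known:
            return True
        # pk(_) and sk(_) are deliberately absent: the intruder cannot forge keys
        if isinstance(t, tuple) and t[0] in ("pair", "enc"):
            return build(t[1]) and build(t[2])
        return False

    return build(term)


def lowe_attack():
    """Replay the six messages of Example 1, checking each message sent by I."""
    know = {"A", "B", "I", pk("A"), pk("B"), pk("I"), sk("I")}
    know.add(enc(pair("A", "Na"), pk("I")))                   # A -> I
    assert derivable(enc(pair("A", "Na"), pk("B")), know)     # I(A) -> B
    reply = enc(pair("Na", "Nb"), pk("A"))
    know.add(reply)                                           # B -> I(A)
    assert not derivable("Nb", know)                          # Nb still protected
    assert derivable(reply, know)                             # I -> A (forwarded)
    know.add(enc("Nb", pk("I")))                              # A -> I
    assert derivable(enc("Nb", pk("B")), know)                # I(A) -> B
    return analyze(know)


if __name__ == "__main__":
    print("Nb" in lowe_attack())
```

In this sketch the secrecy breach corresponds to the nonce Nb belonging to the analyzed intruder knowledge at the end of the trace, whereas an intruder who merely observes an honest session between A and B cannot derive it.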
  • 26. Chapter 3

Web Services

As a continuation of my work on cryptographic protocols I began research on Web Services when I arrived in Toulouse in 2004. While at first they were simply viewed as cryptographic protocols exchanging XML messages, this very active area turned out to be the source of a variety of research problems related to the modeling of the access control policy and of the workflow of Business Processes. Also of interest is the emerging development of modular methods for the validation of Web Services. We introduce Web Services in this chapter with a short historical introduction, followed by a description of the aspects of concern to my research. I conclude it with a summary of my research on this topic.

3.1 Web Services

3.1.1 Basic services (footnote 1)

The usual characterization of Web Services defines a Web Service as an application that communicates with remote clients using the HTTP [114] transport protocol. The principle of having applications executed on a server computer and used by remote clients is not an original one, as it was already present in Sun's mid-90's motto “The network is the computer”. However the first implementations were impractical, for several reasons:

Footnote 1: This historical discussion is based, among other sources, on http://www.ibm.com/developerworks/webservices/library/ws-arc3/.

  • Sun's proposal was to code all the applications in Java to ensure interoperability.
  • The Corba (Common Object Request Broker Architecture) framework aimed at independence from Java, but suffered from the choice of a binary encoding of data (which implies the difficulty
  • 27. for different vendors to provide interoperable solutions) and of a dedicated transport protocol called IIOP [159] that imposes constraints on the programmer and limits interoperability to platforms understanding it.

These limitations have not prevented Java and Corba from being successful in closed environments, but were too strong for the overall adoption of these solutions for client/server communications.

Given the workforce needed to specify, standardize, and implement a protocol interoperably on a variety of platforms, a natural choice for the transport protocol was to rely on an off-the-shelf, widely implemented protocol. HTTP stood out among other possibilities because a) it is an open protocol, b) client interfaces are already provided by existing Web browsers, c) these Web browsers also already support scripting languages, and d) its traffic is in most cases not blocked by firewalls. Furthermore, when employed in combination with the TLS [102, 103] protocol, it provides the basic security guarantees of server authentication and confidentiality. One usually differentiates between SOAP and REST Web Services. The former are based on SOAP, an application-level transport protocol that relies on the POST/GET HTTP verbs. In addition to these verbs the REST Web Services also use the PUT/DELETE ones, but do not need the extra abstraction provided by the SOAP protocol.

Another characterization of Web Services (starting from WSDL 2.0 [187]) is the description of an available service in the Web Services Description Language. This is a language in which the individual functionalities, called operations, are advertised together with a description of their input and output messages, as well as a description of how one can connect to the service. An important point is that for Web Services described in WSDL, HTTP is not the only possible transport protocol. 
Originally WSDL [81] was designed to describe Web Services communicating using the SOAP [120] protocol, an application-level protocol originally running on top of HTTP. Bindings of SOAP to other protocols such as JMS or SMTP have since been defined, and with WSDL 2.0 the application-level transport protocol is not necessarily SOAP anymore.

Example 2. Amazon S3 (footnote 3) (Simple Storage Service) provides users with a storage space as well as with operations enabling a user to set an access control policy on her files and to add, view, and remove files from the store. It is available both in the REST style and in the SOAP style.

Model. In the rest of this document we consider an abstraction of Web Services in which the exact transport protocol employed is irrelevant, assuming that one could describe the messages more precisely whenever one wants to consider the exact binding employed. As a result, a Web Service is akin to a role specification in which request/response pairs of messages are defined, but without necessarily any constraint on the order in which the requests are received.

Footnote 3: API description available at http://docs.amazonwebservices.com/AmazonS3/latest/API/.
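The abstraction described in the Model paragraph above can be pictured with a small sketch: a service is a set of named operations, each mapping a request to a response, and requests may arrive in any order. The class `AbstractService` and the toy storage operations below are hypothetical names introduced for illustration only; they are loosely inspired by Example 2 and do not reproduce the API of any real storage service or any transport binding.

```python
class AbstractService:
    """A service seen as request/response pairs, independent of the transport."""

    def __init__(self, operations):
        # operations: mapping from operation name to a function request -> response
        self.operations = dict(operations)

    def invoke(self, name, request):
        handler = self.operations.get(name)
        if handler is None:
            return {"status": "fault", "reason": "unknown operation " + name}
        return handler(request)


def make_store_service():
    """Toy storage service with three operations sharing a private store."""
    files = {}

    def put(request):
        files[request["key"]] = request["data"]
        return {"status": "ok"}

    def get(request):
        if request["key"] not in files:
            return {"status": "fault", "reason": "no such key"}
        return {"status": "ok", "data": files[request["key"]]}

    def delete(request):
        files.pop(request["key"], None)
        return {"status": "ok"}

    return AbstractService({"put": put, "get": get, "delete": delete})


if __name__ == "__main__":
    store = make_store_service()
    # No ordering is imposed: a "get" before any "put" is a legal request.
    print(store.invoke("get", {"key": "report"}))
    print(store.invoke("put", {"key": "report", "data": "draft"}))
    print(store.invoke("get", {"key": "report"}))
```

A Business Process, as discussed in the next subsection, would then correspond to adding an ordering on these invocations.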
  • 28. 3.1.2 Software as a Service

WSDL defines which functionalities a service offers as well as how one communicates with the service. However, since their inception, Web Services have gradually turned from remotely accessible libraries into full-fledged applications. The general idea is to transform existing applications, or create new ones, by writing independent software components and by establishing communication sequences between these components. The goal is to:
  • ease the deployment of new applications and the development of new components;
  • ease the changes in an application by containing each one in a single component;
  • rely on the fact that each component is remotely accessible to gain flexibility on the hardware infrastructure, i.e. the actual computers running the components, for example by relying on a Web server to dispatch a request to the computer on which the application is deployed.

The separation into atomic components necessitates a way to glue these components into applications. This glue is called a business process, and is written in a language in which, besides the usual assignment, conditional, and loop constructs, there exist basic constructs to invoke a remote service. Some of these languages are scripting languages such as Python or Ruby, but we have chosen to focus on BPEL [128] (Business Process Execution Language) because of its natural integration with the WSDL description of a service: the services invoked are referenced using their WSDL description, and the process itself can be advertised by publishing a WSDL description of it.

A current trend is also to employ Web Services to outsource the computers on which a corporation's applications are executed. That is, the services are not hosted on a computer belonging to the corporation but on computers provided by a third party, who in return receives a payment according to the resources used by the applications. 
A merit of this cloud computing approach is the low initial cost of deployment of services as well as the reduced uncertainty on the running cost/customer ratio, a crucial benefit in today's economic environment.

Model. When analyzing the security of a Web Service, we simply model Business Processes with an ordering on the possible input and output messages. But when considering the access control policy of services we introduce a process description language which is a simplified version of BPEL, see Chapter 7.

3.1.3 Security Policies

In general terms, a policy controls the possible invocations of the operations of a service, such as its Quality of Service, or its business logic. In a framework such as JBOSS, even the business process can be encoded as a policy over the
  • 29. acceptable requests. Instead of analyzing policies in general, we focus on two types of security-related policies:
  • the message-level security policy, which expresses how the data transmitted to and from the service has to be cryptographically secured;
  • the access control policy, which is expressed at the level of the application and expresses when an invocation is legitimate.

Message Protection

There are two main ways to secure the communications of a service with its partners: a) to impose that the transport protocol must be secured, and b) to impose the usage of cryptographic primitives to protect the sensitive parts of the transmitted messages.

Given that there exist secure transport protocols such as TLS, one could wonder why one would need to further protect the messages. The main motivation for this extra protection is the fact that the protection provided by TLS is a point-to-point one, whereas complex service interactions depend upon end-to-end security. A simple example is the payment of an item purchased on the Internet. One does not necessarily trust the e-commerce web site enough to send it one's credit card information, even though this information has to be transmitted to the bank to complete the transaction. Thus the client has to send to the e-commerce web site her credit card information cryptographically protected in such a way that: a) this web site will be able to employ the protected data to complete the transaction with the bank, but also b) this web site will not be able to derive the credit card information from the data. Other applications include digital contract signing, electronic bidding, etc.

Model. Cryptographically protected messages are simply cryptographic protocol messages. 
When analyzing access control policies, which rely on the payload of messages rather than on the cryptography employed to secure the messages, we partially abstract the message layer by simply assuming that the payload is signed, encrypted, both, or neither by a user, and that the transport protocol is either secured or not. See Chapter 7.

Authentication–Assertion–Authorization

Access control consists in determining whether a given entity has the right, under the actual known circumstances, to perform a given action on a protected object. Access control rules emit opinions on whether the access should be granted or denied, and an access control policy gathers these opinions and uses a policy combination algorithm to grant or deny the access to the resource. A rule is said to be applicable to a request if it emits a grant or deny opinion. In the simplest form rules are totally ordered, and the opinion of the first applicable rule is the resulting opinion of the set of rules, but other combination algorithms can be found e.g. in [173].
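The combination of rule opinions described above can be sketched as follows. This is a minimal illustration under stated assumptions: a request is a plain dictionary, a rule returns "grant", "deny", or None when it is not applicable, and the default opinion when no rule applies is an assumption of this sketch. The rule names are hypothetical, and `deny_overrides` is only written in the spirit of the combination algorithms of [173], not copied from that specification.

```python
from typing import Callable, Optional, Sequence

Request = dict
# A rule emits "grant", "deny", or None when it is not applicable to the request.
Rule = Callable[[Request], Optional[str]]


def first_applicable(rules: Sequence[Rule], request: Request, default: str = "deny") -> str:
    """Totally ordered rules: the first applicable rule decides."""
    for rule in rules:
        opinion = rule(request)
        if opinion is not None:
            return opinion
    return default  # assumption: deny when no rule is applicable


def deny_overrides(rules: Sequence[Rule], request: Request, default: str = "deny") -> str:
    """Order-independent alternative: any deny opinion wins over grant opinions."""
    opinions = [rule(request) for rule in rules]
    if "deny" in opinions:
        return "deny"
    if "grant" in opinions:
        return "grant"
    return default


# Hypothetical rules used for illustration only.
def deny_visitor_writes(request: Request) -> Optional[str]:
    if request.get("role") == "visitor" and request.get("action") == "write":
        return "deny"
    return None


def grant_doctors(request: Request) -> Optional[str]:
    return "grant" if request.get("role") == "doctor" else None


def deny_archived(request: Request) -> Optional[str]:
    return "deny" if request.get("archived") else None


if __name__ == "__main__":
    request = {"role": "doctor", "action": "write", "archived": True}
    rules = [grant_doctors, deny_archived]
    print(first_applicable(rules, request))  # the first rule decides
    print(deny_overrides(rules, request))    # the deny opinion wins
```

The two functions disagree on the request of the usage example, which shows why the choice of the combination algorithm is part of the policy itself.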
  • 30. Expressibility. Just as Object Oriented programming simplifies the management of objects by organizing them in a hierarchy, a lot of research on access control is focused on the simplest ways to write rules that are both sound w.r.t. the desired policies and easy to write and understand. In this line we note the RBAC (Role Based Access Control) framework proposed by Ferraiolo and Kuhn [113], which organizes individuals according to the administrative role they have (doctor, visitor, etc.) together with a role hierarchy that defines the inheritance of permissions from a junior role r to a senior role r'. Access control decisions are based uniquely on the role played by the requester, on the action, and on the object in the request. OrBAC [129] refines this model by introducing a hierarchy of contexts in which a request has to be analyzed as well as a hierarchy on objects. These models often yield very simple policies but at the expense of expressibility. For example in pure RBAC it is not possible to express that the same individual, regardless of her role, shall not perform two different actions in the same execution context (this is called dynamic separation of duty). On the other side of the spectrum, ABAC (Attribute-Based Access Control) provides no hierarchy, and the decision is based solely on the values of a set of attributes extracted from the request and from the environment. This implies that every aspect that can influence an access control decision has to be modeled by a valued attribute, and thus that this type of access control system, while being able to express any kind of policy, is hard to deploy and manage. Its versatility nonetheless made it the system of choice for Web Service access control systems such as XACML [173], especially in the currently developed XACML 3.0 version, with its WS profile [9].

Layered model of Access Control. 
A layered model has emerged over the years from the industry best practices as well as from the availability of dedicated systems. Access control in distributed systems is now viewed as consisting of three interacting components:

Authentication: the first phase is implemented in applications such as Shibboleth and consists in the authentication of users. That is, a user has to authenticate to one such server using e.g. his login and password or a more complex authentication protocol, and once the authentication constraints imposed on the server are satisfied (e.g. the user has provided a valid certificate authenticating his signature verification key and has responded successfully to a challenge-response protocol) the server issues a token that can be employed by the user to prove his identity to other services. Alternatively, in the case of SAML Single Sign-On, the server will authenticate the user to other services.

Assertions: once the user is identified he can negotiate with security services to obtain assertions that qualify him. For example a user can use his identity to activate a role and thereby obtain a role membership credential. This credential can then be employed to gain new ones expressing permissions associated with this role.
  • 31. Authorization: finally, when trying to execute an action on a resource, the user decorates his request with the necessary credentials, and an authorization decision is taken based on the value and origin of the provided attributes.

Model. Given that we are less interested in a user-friendly access control system than in the analysis of the access control policy of a set of Web Services, we have adopted a formal model of attribute-based access control. We have abstracted away the authentication phase by using secure channels providing authentication, and are left with the modeling of the assertion collection part and of the authorization part of access control. We present in Chapter 7 a comprehensive model of a distributed access control system for Web Services where the rules are furthermore modeled as Horn clauses.

3.2 Results achieved in the domain of Web Services

I have collaborated with Marwa El Houri, a PhD student I supervised, and Philippe Balbiani on the definition of a formal model for the analysis of Web Services [110]. Our final proposal consists in modeling each component in a Web Service infrastructure by a communicating entity, i.e. an agent that has:
    • a store that models a memory, a database, the history of the service, etc.;
    • a trust negotiation policy that indicates which credentials the entity is ready to share with which other entities on which kind of channel;
    • a workflow which consists in a set of tasks. Tasks are recursively defined, and an authorization rule controls each invocation of a task.
Depending on the part of an infrastructure (a database system, a human agent, a trust negotiation engine or a Business Process Engine) modeled by an entity, some of the above parts may be empty.

This model permits us to seamlessly encode Role-Based Access Control with (dynamic) separation or binding of duties constraints as well as advanced features such as all surveyed kinds of delegation [110].
We have also enriched it with cryptographic primitives and secure channels to enable the validation of a given set of entities w.r.t. untrusted users [110].

In collaboration with Mohammed Anis Mekki (a PhD student I co-supervise with M. Rusinowitch) and M. Rusinowitch we have considered the orchestration problem for a set of services. This problem consists in building, given a finite set of available services, an orchestrator that communicates with these services to achieve a given goal. I detail this work in Chapter 9. Also presented in that chapter is the work in collaboration with Tigran Avanesov, M. Rusinowitch and Mathieu Turuani on the choreography problem for services which
  • 32. consists in computing, again given a set of available services and a goal, a sequence of communications for each of the available services such that the goal is satisfied once every participating service has completed its sequence of communications.
  • 34. Part II Tools
  • 35. Chapter 4 Fundamentals of First-Order Logic

We introduce in this chapter the formalism and notions that will be employed in the rest of this document. This chapter is aimed at presenting first-order logic with an emphasis on resolution, and should be read as a basis for a course on first-order logic oriented towards resolution and its applications. This focus means that significant though unrelated notions are lacking. The interested reader can find in particular complements on sequent calculus and semantic tableaux in [94]. This chapter ends with the definition of equational theories, a more advanced concept that we need to analyze cryptographic protocols. In particular we extend the unification notions introduced together with resolution to unification modulo an equational theory. We also prove a few important facts on equational unification.

4.1 Facts, sentences, and truth

4.1.1 Reasoning on facts

Consider the following sentences:
    • It is summer or the temperature is cold;
    • It is not summer or the weather is rainy.
We rely on the excluded-middle law¹, which states that a fact can only be true or false.

¹ In Scottish courts the result of a criminal prosecution can be either proven (meaning guilty), not proven, or not guilty. In this case we can have at the same time that the result of the prosecution is not “proven” and is not “not proven”. Beyond the anecdote, logics with no excluded-middle law (intuitionistic logic, linear logic, ...) have been employed fruitfully

As a consequence we can reason on the possible truth value of the fact “It
  • 36. is summer”. If it is true then the fact “It is not summer” must be false. Since the second sentence is true, one can deduce that the weather is rainy. But it may also be the case that the fact “It is summer” is false. Since the first sentence is true, we must then have that the temperature is cold. As a conclusion of these two sentences, either the temperature is cold or the weather is rainy.

Generally speaking, if A, B1, ..., Bn, C1, ..., Ck are facts, and the sentences:
    • A or B1 or ... or Bn;
    • not(A) or C1 or ... or Ck
are true, then if A is true, not(A) must be false, and thus C1 or ... or Ck is true since the second sentence is. Symmetrically, if A is false we must have B1 or ... or Bn because the first sentence is true. In both cases the sentence B1 or ... or Bn or C1 or ... or Ck is true. This reasoning is sound: if the assumptions are true then the conclusion must be true.

This reasoning can also be conducted if there is no alternative in one of the sentences. Assume the following two sentences are true:
    • It is day or it is night;
    • It is not day.
One ought to conclude that it is night. Another special case is when there is no alternative in both sentences. For instance assume the following two sentences are true:
    • It is day;
    • It is not day.
By following the general scheme given above we deduce that a sentence with no facts must be true. But common sense also tells us that the assumption that both sentences are true does not hold: a fact and its negation cannot both be true. We reconcile these two conclusions by imposing that a sentence with no facts must always be false, and rely on the soundness of our deduction mechanism to deduce (by contrapositive reasoning) that if the conclusion is false then one of the premises must be false. In this case, i.e. when in a set of sentences at least one must be false whatever truth value is chosen for the facts, we say that this set is inconsistent.
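The deduction scheme above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the text: a sentence is encoded as a frozenset of pairs (fact, polarity), the sentence with no facts is the empty frozenset, and the names `resolve` and `inconsistent` are hypothetical.

```python
def resolve(c1, c2, fact):
    """Case analysis on `fact`: from (fact or B...) and (not fact or C...),
    deduce (B... or C...). Returns None when the scheme does not apply."""
    pos, neg = (fact, True), (fact, False)
    if pos in c1 and neg in c2:
        return (c1 - {pos}) | (c2 - {neg})
    return None


def inconsistent(sentences):
    """Apply the scheme until nothing new is deduced; the set is inconsistent
    as soon as the sentence with no facts (always false) is deduced."""
    known = {frozenset(s) for s in sentences}
    while True:
        new = set()
        for c1 in known:
            for c2 in known:
                for fact, positive in c1:
                    if not positive:
                        continue
                    deduced = resolve(c1, c2, fact)
                    if deduced is None:
                        continue
                    if not deduced:          # sentence with no facts
                        return True
                    new.add(deduced)
        if new <= known:                     # nothing new: stop
            return False
        known |= new


# The sentences used as examples in the text (encoding is an assumption).
SUMMER_OR_COLD = frozenset({("summer", True), ("cold", True)})
NOT_SUMMER_OR_RAINY = frozenset({("summer", False), ("rainy", True)})
DAY_OR_NIGHT = frozenset({("day", True), ("night", True)})
DAY = frozenset({("day", True)})
NOT_DAY = frozenset({("day", False)})

COLD_OR_RAINY = resolve(SUMMER_OR_COLD, NOT_SUMMER_OR_RAINY, "summer")
NIGHT = resolve(DAY_OR_NIGHT, NOT_DAY, "day")
```

Here `COLD_OR_RAINY` is the sentence “the temperature is cold or the weather is rainy”, `NIGHT` is “it is night”, and `inconsistent([DAY, NOT_DAY])` detects the last special case. The loop terminates because only finitely many sentences can be built from a finite set of facts.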
The case-based reasoning on sentences illustrated above is called resolution. It was introduced by Robinson [3] as a reasoning mechanism for the whole of first-order logic, in which one can e.g. axiomatize Zermelo-Fraenkel set theory.

¹ (continued) to reason about the existence of a proof of a theorem, a proof of the negation of a theorem, and the absence of proof for both a theorem and its negation.

Outline of this chapter. We begin this chapter with a section on orders, and review some definitions and properties. Then we define in Section 4.3 the language employed to describe sentences. We give a semantics to first-order
  • 37. logic sentences by defining how the language constructs are interpreted. We present in Section 4.5 some of the mathematical properties of first-order logic, namely that it suffices to consider finite sets of universally quantified clauses, where each clause is a disjunction of facts, and that it suffices to consider the truth in particular interpretations called Herbrand’s interpretations. Then we present in Section 4.6 a calculus on finite sets of clauses that recognizes the finite sets of clauses that are always false. We present in Section 4.7 how to integrate an equality predicate in this setting.

4.2 Orders

4.2.1 Definitions and first properties

Orderings and pre-orderings. A strict ordering < on a set S is a transitive, anti-reflexive, and anti-symmetric relation on the elements of this set. An ordering ≤ is the union of a strict ordering and of the equality relation. An equivalence is a transitive, symmetric and reflexive relation. A pre-ordering is the transitive closure of the union of an equivalence relation with a strict ordering.

A strict ordering < on a set S is said to be total whenever for any two elements e1, e2 ∈ S we have either e1 = e2, or e1 < e2, or e2 < e1. It is said to be well-founded whenever there is no infinite strictly decreasing sequence e1 > ... > en > .... These definitions are extended as usual to orderings and pre-orderings. We call an element e maximal (respectively strictly maximal) with respect to a set η of elements if for any element e′ in η we have e′ ⊁ e (respectively e′ ⋡ e).

Extension to sets and multisets. Any ordering ≻ on a set E can be extended to an ordering ≻set on finite subsets of E as follows: given two finite subsets η1 and η2 of E we define η1 ≻set η2 if (i) η1 ≠ η2, and (ii) for every e ∈ η2 \ η1 there exists e′ ∈ η1 \ η2 such that e′ ≻ e. Given a set, any smaller set is obtained by replacing an element by a (possibly empty) set of strictly smaller elements.
Similarly, any ordering ≻ on a set E can be extended to an ordering ≻mul on finite multisets over E as follows: let ξ1 and ξ2 be two finite multisets over E. As usual we denote ξ(e) the number of occurrences of e in the multiset ξ, and we let > denote the standard “greater-than” relation on the natural numbers. We define ξ1 ≻mul ξ2 if (i) ξ1 ≠ ξ2 and (ii) whenever ξ2(e) > ξ1(e) then ξ1(e′) > ξ2(e′), for some e′ such that e′ ≻ e.

Given a multiset, any smaller multiset is obtained by replacing an occurrence of an element by occurrences of smaller elements. We call an element e maximal (respectively strictly maximal) with respect to a multiset ξ of elements if for any element e′ in ξ we have e′ ⊁ e (respectively e′ ⋡ e).

If the ordering ≻ is total (resp. well-founded), so is its multiset extension. It is easy to see that in turn this implies that if the ordering ≻ is total (resp. well-founded), so is its set extension.
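The definition of the multiset extension can be checked on small examples with the following sketch. It is an illustration only: multisets are represented with `collections.Counter`, the base ordering is passed as a Python function `greater(a, b)`, and the name `multiset_greater` is hypothetical.

```python
from collections import Counter


def multiset_greater(xi1, xi2, greater):
    """Return True when xi1 is greater than xi2 in the multiset extension
    of the strict ordering `greater`.

    (i)  the two multisets must differ;
    (ii) every element occurring more often in xi2 than in xi1 must be
         dominated by an element occurring more often in xi1 than in xi2.
    """
    c1, c2 = Counter(xi1), Counter(xi2)
    if c1 == c2:
        return False
    for e in c2:
        if c2[e] > c1[e]:
            dominated = any(c1[d] > c2[d] and greater(d, e) for d in c1)
            if not dominated:
                return False
    return True


def nat_greater(a, b):
    """Base ordering used in the examples: the usual order on naturals."""
    return a > b


# Replacing one occurrence of 5 by several smaller elements yields a
# smaller multiset.
FIVE_BEATS_SMALLER = multiset_greater([5], [4, 4, 3], nat_greater)
SMALLER_BEATS_FIVE = multiset_greater([4, 4, 3], [5], nat_greater)
```

With these definitions `FIVE_BEATS_SMALLER` holds and `SMALLER_BEATS_FIVE` does not, in line with the remark that a smaller multiset is obtained by replacing an occurrence of an element by occurrences of smaller ones.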
  • 38. 4.2.2 Orderings on terms and atoms

Lemma 4.1. Let ≻t be a complete simplification ordering over terms, and assume that ≻a is compatible with ≻t. Then:
    1. ≻a is well-founded;
    2. ≻a is monotone;
    3. B ≺a A implies Var(B) ⊆ Var(A).

Proof. We recall that the ordering ≻a is compatible with the complete simplification ordering ≻t and that ≻a is total on ground atoms.
    1. Let us prove that ≻a is well-founded. By contradiction, there otherwise exists an infinite descending chain of atoms A0 ≻a A1 ≻a .... Since the ordering is total on terms, by the compatibility of ≻a with ≻t we deduce that there is an infinite descending chain of terms t0 ≻t t1 ≻t ... where ti is a term occurring in the atom Ai. Thus ≻t is not well-founded, which contradicts the assumption that ≻t is a complete simplification ordering.
    2. Let A, B be two atoms such that B ≺a A. Suppose that A = I(t1, ..., tn) and B = I′(s1, ..., sm). By the compatibility of ≻a with ≻t, for all i ∈ {1, ..., m} there is j ∈ {1, ..., n} such that si ≺t tj, and then, by monotonicity of ≻t, siσ ≺t tjσ for any substitution σ. Again by the compatibility of ≻a with ≻t, we deduce that Bσ ≺a Aσ for any σ, and hence the monotonicity of ≻a.
    3. Let A, B be two atoms such that B ≺a A. The compatibility of ≻a with ≻t implies that for each term tB occurring in B there exists a term tA occurring in A such that tB ≺t tA. Since ≻t has the subterm property, this implies Var(tB) ⊆ Var(tA). We conclude that Var(B) ⊆ Var(A).

4.3 Syntax

We have adopted a bottom-up presentation of the constructions employed to define the language of first-order logic. We first define the terms in Subsection 4.3.1. Then we introduce the predicate symbols in Subsection 4.3.3. At this point we have defined the atoms (called facts in the introduction of this chapter) that are the basic elements of first-order logic. A formula is the arrangement of atoms using the logical connectives defined in Subsection 4.3.4.
Quantifiers are then introduced in Subsection 4.3.5 to make the meaning of formulas precise. Finally we introduce clauses, which are formulas of a special form and correspond to the sentences in the introduction.
  • 39. 4.3.1 Terms

Definition 1. (Signature) Let F be a finite or denumerable set. A signature α is a mapping from F to the set of natural numbers ℕ. The image α(f) of an element f ∈ F is called its arity.

A signature α employed to define terms is called a functional signature. Its domain is then called a set of function symbols. Given a functional signature α, the constants are the elements e ∈ F of arity 0.

We denote T(α, X) the set of terms built on a functional signature α and a denumerable set of variables X. A term is an expression built in a finite number of steps such that:
    • constants and variables are terms;
    • if t1, ..., tn are terms and α(f) = n then f(t1, ..., tn) is a term.
Given a term t we denote Var(t) (resp. Const(t)) the set of variables (resp. constants) occurring in t. A term t is ground if Var(t) = ∅.

Example 3. For instance we can choose a functional signature mapping every rational number to 0, the symbol “minus” to 2, the symbol “abs” to 1, and the symbol f to 1. A term in this signature is an expression t such as abs(minus(x, f(1/2))).

4.3.2 Substitutions

A substitution is a function that replaces the variables occurring in a term by other terms. It can be thought of as similar to an assignment in imperative languages, since the effect of an instruction:
    x := 1
is to replace the value of the variable x with the term 1. However some care needs to be taken when considering assignments such as:
    x := x + 1
since one needs to distinguish the current value of x, employed to compute the expression on the right-hand side, and the next value of x that will be the result of the sum.

We avoid such intricacies by imposing that a variable changed by a substitution does not occur in a term in the image of the same substitution. A simple way to obtain this is to mandate that a substitution must be an idempotent function, i.e. that applying it twice yields the same result as applying it only once.
Another point is that we want the application of a substitution to be effectively computable in finite time. Accordingly we require substitutions to be functions that change only a finite number of variables. There are two ways to mandate this:
  • 40.
    • The first one is to define substitutions as partial functions from variables to terms, and to impose that they have a finite domain;
    • The second possibility is to say that substitutions are total functions but with a finite support set, i.e. there exists only a finite set of variables x such that σ(x) ≠ x.

Definition 2. (Substitutions) A substitution σ : X → T(α, X) is an idempotent function such that the set {x ∈ X | x ≠ σ(x)} is finite. A substitution σ is ground if σ(x) ≠ x implies that σ(x) is a ground term.

We extend substitutions homomorphically to terms in T(α, X) by defining:

$$\sigma(t) = \begin{cases} \sigma(t) & \text{if } t \in \mathcal{X} \\ f(\sigma(t_1), \ldots, \sigma(t_n)) & \text{if } t = f(t_1, \ldots, t_n) \end{cases}$$

Finally we improve the readability of this document by writing the application of a substitution σ on a term t in the postfix notation tσ. The application of first the substitution σ and then the substitution τ on t is thus written tστ instead of τ(σ(t)). Since substitutions are endomorphisms on the algebra of terms, they can be composed, and the composition is associative.

Positions. It is often convenient to refer to a specific subterm in a term t. This is achieved by using positions, which can be viewed as pointers to the subterms of t and are finite sequences of integers. They are defined as follows:
    • the set of positions of a constant or of a variable contains only one position, which is denoted ε and is the empty sequence of integers;
    • if t1, ..., tn are terms with respective sets of positions P1, ..., Pn, then the set of positions of the term f(t1, ..., tn) is:

$$\{\varepsilon\} \cup \bigcup_{i=1}^{n} \{\, i \cdot p \mid p \in P_i \,\}$$

The set of the positions in a term t is denoted Pos(t).

Let t be a term, and p ∈ Pos(t) be a position. We define recursively the subterm of t at position p, denoted t|p, and the symbol at position p, denoted Symb(t, p), as follows:
    • t|ε = t and Symb(f(t1, ..., tn), ε) = f;
    • f(t1, ..., tn)|i·p = ti|p and Symb(f(t1, ..., tn), i · p) = Symb(ti, p).
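The notions of term, substitution and position introduced above can be sketched as follows. The encoding is an assumption made for this illustration only: a variable is a Python string, a term f(t1, ..., tn) is the tuple ("f", t1, ..., tn), a constant c is the 1-tuple ("c",), a substitution is a dictionary listing only the variables it changes, and all function names are hypothetical.

```python
def apply(sigma, t):
    """Homomorphic extension of the substitution `sigma` to the term `t`."""
    if isinstance(t, str):                       # t is a variable
        return sigma.get(t, t)
    return (t[0],) + tuple(apply(sigma, s) for s in t[1:])


def is_idempotent(sigma):
    """Applying `sigma` twice must give the same result as applying it once."""
    return all(apply(sigma, image) == image for image in sigma.values())


def variables(t):
    """Var(t): the set of variables occurring in `t`."""
    if isinstance(t, str):
        return {t}
    return set().union(*(variables(s) for s in t[1:]))


def positions(t):
    """Pos(t): positions are tuples of integers, () standing for epsilon."""
    if isinstance(t, str):
        return [()]
    return [()] + [(i,) + p
                   for i, s in enumerate(t[1:], start=1)
                   for p in positions(s)]


def subterm(t, p):
    """t|p: the subterm of `t` at position `p`."""
    for i in p:
        t = t[i]                                 # argument i is stored at index i
    return t


def symbol(t, p):
    """Symb(t, p): the symbol at position `p` (a variable stands for itself)."""
    s = subterm(t, p)
    return s if isinstance(s, str) else s[0]


# The term abs(minus(x, f(1/2))) of Example 3.
TERM = ("abs", ("minus", "x", ("f", ("1/2",))))
SIGMA = {"x": ("f", "y")}          # idempotent: x does not occur in f(y)
NOT_IDEMPOTENT = {"x": ("f", "x")} # mirrors the assignment x := x + 1
```

With this encoding, `positions(TERM)` enumerates ε, 1, 1·1, 1·2 and 1·2·1, and `is_idempotent(NOT_IDEMPOTENT)` fails because x occurs in its own image.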
  • 41. 4.3.3 Predicates

The terms on a signature α are related to one another by relations. While the usual examples of relations are “... is smaller than ...” or “... is equal to ...”, the principle of relational database systems is to model each aspect of a problem by a relation called a table.

A signature employed to define predicate symbols is called a relational signature. Given a relational signature β and a functional signature α, a (β, α)-atom is an expression p(t1, ..., tn) where β(p) = n and t1, ..., tn ∈ T(α, X).

Example 4. Besides the functional signature of Example 3 let us consider the following predicate signature:
    β = {inf ↦ 2}
Under this choice the expressions
    inf(abs(minus(x, x′)), λ)
    inf(abs(minus(f(x), f(x′))), ε)
are (β, α)-atoms.

Given an atom a = p(t1, ..., tn) we denote Var(a) (resp. Const(a)) the set Var(t1) ∪ ... ∪ Var(tn) (resp. Const(t1) ∪ ... ∪ Const(tn)).

4.3.4 Logical connectives and formulas

Let α be a functional signature and β be a relational signature. Formulas express truth relations between (β, α)-atoms. One may for instance write that two atoms must be both true, or that at least one must be true, etc. We call the functions that relate the atoms to one another logical connectives. If one denotes true with the symbol ⊤ and false with the symbol ⊥, these connectives can be a priori any function f : {⊥, ⊤}^n → {⊥, ⊤} where n is the number of connected atoms. However, defining one function for each arrangement of atoms one wishes to express would be tedious. Fortunately it has long been noted that every such function can be written as the composition of three logical connectives:
    • a ∨ b: is false iff a and b are false;
    • a ∧ b: is true iff a and b are true;
    • ¬a: is true iff a is false.
For example the logical implication a ⇒ b, which is read “a implies b”, can be written ¬a ∨ b. Note that this implication does not have the causation meaning associated with implication in natural languages.
It simply means that either the value of the atom a is false (an implication with a false premise is always true) or else that the value of the atom b must be true.

The (β, α)-formulas are the expressions built in a finite number of steps such that:
  • 42.
    • a (β, α)-atom is a (β, α)-formula;
    • if f1, f2 are (β, α)-formulas then f1 ∨ f2 and f1 ∧ f2 are (β, α)-formulas;
    • if f is a (β, α)-formula then ¬f is a (β, α)-formula.

Example 5. Continuing Examples 3 and 4, a formula is an expression like:
    ¬(inf(abs(minus(x, x′)), λ)) ∨ inf(abs(minus(f(x), f(x′))), ε)

Given a formula ϕ in which the atoms a1, ..., an occur we denote Var(ϕ) (resp. Const(ϕ)) the set Var(a1) ∪ ... ∪ Var(an) (resp. Const(a1) ∪ ... ∪ Const(an)).

4.3.5 Quantifiers

The definition of (β, α)-formulas is still ambiguous. When one writes a(x) ∨ b(x) it is not clear whether one means that for some value c of x it is true that a(c) ∨ b(c), or that whatever the value c of x is it is true that a(c) ∨ b(c). In order to make the meaning of the variables in the formulas precise one introduces existential (for some value of) and universal (for all values of) quantifiers, denoted respectively ∃ and ∀. Formally,
    • a (β, α)-formula is a (β, α)-quantified formula with an empty set of quantified variables;
    • if ϕ is a (β, α)-quantified formula with a set of quantified variables Q and x ∈ Var(ϕ) \ Q then ∃xϕ is a (β, α)-quantified formula with a set of quantified variables Q ∪ {x};
    • if ϕ is a (β, α)-quantified formula with a set of quantified variables Q and x ∈ Var(ϕ) \ Q then ∀xϕ is a (β, α)-quantified formula with a set of quantified variables Q ∪ {x}.
A (β, α)-quantified formula in which every variable is quantified is called a (β, α)-sentence. Note that in the traditional presentation of sentences in first-order logic the quantifiers may be interleaved with the logical connectives. The price of the added complexity (in terms of defining the semantics, the quantified variables, the handling of variable name clashes, etc.) is however paid for nothing: any (β, α)-sentence in the standard setting is logically equivalent to a formula in the simpler language described above.
An equivalent formula can be effectively computed by algorithms that rewrite sentences in prenex normal form (see [146, 151, 94], for example).

Example 6. We complete the formula in the preceding example by quantifying the variables occurring in two different ways, thereby obtaining two different sentences:
    ∀x ∀ε ∃λ ∀x′, ¬(inf(abs(minus(x, x′)), λ)) ∨ inf(abs(minus(f(x), f(x′))), ε)
    ∀ε ∃λ ∀x ∀x′, ¬(inf(abs(minus(x, x′)), λ)) ∨ inf(abs(minus(f(x), f(x′))), ε)
  • 43. The educated reader should by now have noticed that we have given the usual definitions of continuity and uniform continuity in a normed space. We leave as an exercise the determination of an arrangement of quantifiers expressing that the function f is a) bounded, or b) constant.

4.4 Semantics of First-Order Logic

4.4.1 Interpretation

Giving a semantics to a logic means defining when a formula is true. Since the meaning of quantifiers and logical connectives is fixed, it suffices to define when an atom is true. This is achieved by interpreting the symbols occurring in a formula.

Definition 3. (Interpretation) Let α (resp. β) be a functional (resp. relational) signature, and X be a set of variables. A (β, α)-interpretation I is defined by²:
    • a non-empty set D_I, called the domain of the interpretation;
    • for each predicate symbol p in the domain of β a function I(p) : D_I^β(p) → {⊤, ⊥};
    • for each function symbol f in the domain of α a function I(f) : D_I^α(f) → D_I.

Given an interpretation I of domain D_I, a valuation v is a mapping from the set of variables to elements in D_I. Valuations are extended homomorphically on terms, atoms, and formulas as expected.

The truth value of a sentence ϕ in an interpretation I of domain D_I is denoted [[ϕ]]I and is determined as follows:
    • if ϕ = ∃xψ(x) then [[ϕ]]I = ⊤ if, and only if, there exists a valuation v of domain {x} such that [[v(ψ(x))]]I = ⊤;
    • if ϕ = ∀xψ(x) then [[ϕ]]I = ⊤ if, and only if, for all c ∈ D_I we have [[vc(ψ(x))]]I = ⊤, where vc is the valuation mapping x to c;
    • if ϕ = ϕ1 ∧ ϕ2 then [[ϕ]]I = ⊤ if, and only if, [[ϕ1]]I = ⊤ and [[ϕ2]]I = ⊤;
    • if ϕ = ϕ1 ∨ ϕ2 then [[ϕ]]I = ⊤ if, and only if, [[ϕ1]]I = ⊤ or [[ϕ2]]I = ⊤;
    • if ϕ = ¬ϕ1 then [[ϕ]]I = ⊤ if, and only if, [[ϕ1]]I = ⊥;
    • if ϕ = p(t1, ..., tn) then [[ϕ]]I = I(p)([[t1]]I, ..., [[tn]]I);
    • given a valuation v we have [[x]]I = v(x) if x is a variable. Otherwise we must have t = f(t1, ..., tn), and we define [[t]]I = I(f)([[t1]]I, ..., [[tn]]I).

² We note that the interpretation of a variable is not defined. While usually interpretations are extended over variables with valuations (functions mapping variables in the formula to elements in the domain of the interpretation) we have chosen to instantiate in the formulas the variables by the elements of the domain. Given that this interleaving is not defined formally, this instantiation should be thought of as syntactic sugar.

  • 44. Note that since all the variables in a sentence are bound by a quantifier and all quantifiers appear first, every variable in the formula is in the domain of a valuation when evaluating an atom. An interpretation that makes a sentence true is called a model of this sentence.

Definition 4. (Model) Let ϕ be a first-order sentence and I be an interpretation with [[ϕ]]I = ⊤. We say that I is a model of ϕ, and denote I |= ϕ.

Given two formulas ϕ and ψ we also denote ϕ |= ψ the fact that for every model I of ϕ we have I |= ψ.

Example 7. For instance, consider the following exercise:
    Prove that the function f : ℝ → ℝ defined by f : x ↦ x² is continuous.
As already noted, the first formula of Example 6 is the definition of continuity if one considers the interpretation I:
    • with a domain ℝ;
    • I(inf) = <, the usual order on ℝ;
    • I(abs) = x ↦ |x|, the function that associates to an element of ℝ its absolute value;
    • I(minus) = (x, y) ↦ x − y, the usual subtraction in ℝ.
This interpretation is not complete as it lacks the interpretation of the function symbol f. This last part is contained in the statement of the exercise, with I(f) = x ↦ x².

4.4.2 Satisfiability, validity

It is clear that the truth of a formula depends on the chosen interpretation. For instance the first (resp. second) formula of Example 6 is true in the interpretation I of Example 7 if, and only if, f is interpreted by a continuous (resp. uniformly continuous) function.
The goal of automated reasoning techniques for first-order logic is to decide, given a sentence ϕ, whether:
    • there exists at least one interpretation in which ϕ is true;
    • or ϕ is true in all interpretations.
In the former case we say the sentence is satisfiable, and in the latter case that it is valid.

Definition 5. (Satisfiability, validity) A sentence ϕ is
  • 45.
    • satisfiable if there exists one interpretation in which ϕ is true;
    • valid if it is true in any interpretation.

Example 8. The definition of continuity is certainly satisfiable since it is true in every interpretation I in which I(f) is a continuous function, but it is not valid since it will be false if one interprets f with a non-continuous function.

For the sake of completeness we also say that a sentence is unsatisfiable if it is not satisfiable (i.e. it is false in every interpretation), and falsifiable if it is not valid (i.e. it is false in some interpretation).

Logical equivalence. Let us now define the notion of logical equivalence that we have employed in Section 4.3.5 when stating that every first-order sentence in which the quantifiers are scattered in the formula, such as ∀x((∃y p(x, y)) ∨ (∀z p(x, z))), is logically equivalent to a sentence in which all the quantifiers appear in sequence at the beginning of the formula, e.g. ∀x∃y∀z(p(x, y) ∨ p(x, z)).

Definition 6. (Logical equivalence) Two first-order logic sentences ϕ and ψ are logically equivalent if, and only if, for every interpretation I we have:
    [[ϕ]]I = [[ψ]]I

4.5 Foundations of Resolution

The logical equivalence between two first-order sentences means that they have exactly the same set of models. However as long as one is concerned with satisfiability or validity (by considering the negation of the formula), the relevant notion is the one of having or not a model. A second equivalence between first-order sentences, called equisatisfiability, reflects this importance. Two formulas ϕ and ψ are equisatisfiable when ϕ is satisfiable if, and only if, ψ is satisfiable. This equivalence relation is very coarse since it defines only two equivalence classes. It is however very useful when considering algorithms that have to decide whether a given formula is satisfiable.
Indeed, this notion allows such algorithms to transform sentences into non-logically equivalent ones as long as the transformations performed change a sentence into an equisatisfiable one. In particular skolemization, the first building block of automated reasoning techniques in first-order logic, transforms any first-order sentence into an equisatisfiable first-order sentence with no existential quantification. We then prove that when considering their satisfiability it suffices to interpret these sets of universally quantified clauses in Herbrand’s interpretations, i.e. interpretations that identify the functions in the domain with the function symbols in the formula. Then we prove that to prove the unsatisfiability of a finite set of clauses it suffices to prove the unsatisfiability of a finite set of instances of these clauses.
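The notions of interpretation, satisfiability and validity used above can be illustrated by brute force when the domain is fixed and finite. The sketch below is not a decision procedure for first-order logic, since it only explores interpretations over one given domain. Its encoding is an assumption of this illustration: a sentence is a quantifier prefix (a list of pairs) followed by a quantifier-free formula built from nested tuples, atoms only take variables as arguments, and all names are hypothetical.

```python
from itertools import product


def eval_matrix(phi, interp, v):
    """Truth value of a quantifier-free formula under the valuation `v`."""
    tag = phi[0]
    if tag == "atom":
        _, pred, args = phi
        return tuple(v[x] for x in args) in interp["preds"][pred]
    if tag == "not":
        return not eval_matrix(phi[1], interp, v)
    if tag == "and":
        return eval_matrix(phi[1], interp, v) and eval_matrix(phi[2], interp, v)
    if tag == "or":
        return eval_matrix(phi[1], interp, v) or eval_matrix(phi[2], interp, v)
    raise ValueError(f"unknown connective: {tag}")


def eval_sentence(prefix, matrix, interp, v=None):
    """Truth value of a sentence whose quantifiers all appear first."""
    v = dict(v or {})
    if not prefix:
        return eval_matrix(matrix, interp, v)
    (quantifier, x), rest = prefix[0], prefix[1:]
    results = (eval_sentence(rest, matrix, interp, {**v, x: c})
               for c in interp["domain"])
    return all(results) if quantifier == "forall" else any(results)


def all_interpretations(domain, pred, arity):
    """Every interpretation of one predicate symbol over a finite domain."""
    tuples = list(product(domain, repeat=arity))
    for bits in product([False, True], repeat=len(tuples)):
        chosen = {t for t, b in zip(tuples, bits) if b}
        yield {"domain": domain, "preds": {pred: chosen}}


DOMAIN = [0, 1]
P_XY = ("atom", "p", ("x", "y"))
FORALL_EXISTS = [("forall", "x"), ("exists", "y")]
EXISTS_FORALL = [("exists", "y"), ("forall", "x")]
# p is interpreted as "x differs from y" on the domain {0, 1}.
DIFFERENT = {"domain": DOMAIN, "preds": {"p": {(0, 1), (1, 0)}}}

SATISFIABLE = any(eval_sentence(FORALL_EXISTS, P_XY, i)
                  for i in all_interpretations(DOMAIN, "p", 2))
TRUE_EVERYWHERE = all(eval_sentence(FORALL_EXISTS, P_XY, i)
                      for i in all_interpretations(DOMAIN, "p", 2))
```

The interpretation `DIFFERENT` is a model of ∀x∃y p(x, y) but not of ∃y∀x p(x, y), which mirrors the role played by the order of quantifiers in Example 6. Over this domain the first sentence is true in some interpretations and false in others, so `SATISFIABLE` holds while `TRUE_EVERYWHERE` does not.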
  • 46. 4.5.1 Skolemization

Skolemization, in spite of its name, is an operation naturally performed when facing a logical problem. Let us consider an example of skolemization.

Example 9. Let us continue Example 7. To prove that the function f : x ↦ x² is continuous, one usually gives an explicit bound α (the variable written λ in Example 6) such that whenever |x − x′| < α the inequality |f(x) − f(x′)| < ε holds. Given the quantifications, this bound depends on the values of x and of ε. For instance one can reason as follows:
    • if x = 0 then α = √ε satisfies the condition;
    • otherwise it suffices to look for a bound α ≤ |x|. This bound implies that x, x′ are of the same sign, and 0 < |x + x′| < 2·|x|. We have:

$$|x^2 - x'^2| < \varepsilon \iff |x - x'| \cdot |x + x'| < \varepsilon \iff |x - x'| < \frac{\varepsilon}{|x + x'|}$$

      Since ε/(2·|x|) < ε/|x + x′|, this inequality holds as soon as:

$$|x - x'| < \frac{\varepsilon}{2 \cdot |x|}$$

      Thus if x ≠ 0 it suffices to set α = min(|x|, ε/(2·|x|)).

In order to prove that the formula is satisfiable we have instantiated the existentially quantified variable α by a function of x and ε. While this construction seems to be an ad hoc solution of the problem, it is actually a very general technique that works for any interpretation.

Lemma 4.2. (Skolemization) Let ϕ = ∀x1 ... ∀xn ∃y ψ(x1, ..., xn, y) be a first-order (β, α)-sentence. Let α′ be the function extending α on a function symbol f ∉ Dom(α) with α′(f) = n. Then ϕ is satisfiable if, and only if, ϕ′ = ∀x1 ... ∀xn (ψ(x1, ..., xn, f(x1, ..., xn))) is satisfiable.

Proof. ⇒ Assume there exists an interpretation I of domain D ≠ ∅ such that I |= ϕ. By definition of the evaluation of a formula in an interpretation, for all n-tuples a = (a1, ..., an) ∈ D^n we have I |= ∃yψ(a1, ..., an, y) = ∃yϕa(y). For a ∈ D^n let Sa be the set of values c ∈ D such that I |= ϕa(c), and let:

$$S = \prod_{a \in D^n} S_a$$

Since for all a ∈ D^n we have I |= ∃yϕa(y), all the sets Sa are non-empty.
Since D ≠ ∅, the set S is the product of a non-empty family of non-empty sets and is thus itself non-empty (this is an alternative statement of the Axiom of Choice). It thus contains an element s = (sa) indexed by the tuples a ∈ D^n. Let f^I′ : D^n → D be the function a ↦ sa. Let I′ be the interpretation of the same
  • 47. domain D as I, equal to I on the symbols in the domains of the signatures α and β, and such that I′(f) = f^I′. By construction I′ is a model of ϕ′.

⇐ Let I′ be a model of ϕ′, and let f^I′ = I′(f). By definition every occurrence of f in ϕ′ is in the term f(x1, ..., xn). Thus for every (a1, ..., an) ∈ D^n there exists in D an element b = f^I′(a1, ..., an) such that ψ(a1, ..., an, b) evaluates to ⊤ in I′. Thus I′ is an interpretation that satisfies ϕ.

The skolemization lemma can be iterated on a sentence to remove every existential quantifier from the left to the right. Since each iteration transforms a sentence into an equisatisfiable one we obtain the following theorem.

Theorem 4.1. (Skolem, [198]) Every first-order sentence ϕ is equivalent with respect to satisfiability to a universally quantified sentence.

Since the variables in a universally quantified sentence are all bound by the same quantifier we will often, in the rest of this document and when this introduces no ambiguity, write sentences without the quantifiers.

4.5.2 Clauses

The logical connectives we have employed to relate the atoms to one another in a formula share some properties known as de Morgan laws. Among these we note especially the following ones:

Laws that move the negation down:
    ¬(a ∧ b) ≡ ¬a ∨ ¬b
    ¬(a ∨ b) ≡ ¬a ∧ ¬b

Laws that move the disjunction down:
    a ∨ (b ∧ c) ≡ (a ∨ b) ∧ (a ∨ c)
    (b ∧ c) ∨ a ≡ (b ∨ a) ∧ (c ∨ a)

It is clear that using these laws and the fact that ¬¬x ≡ x it is possible to:
    • first push the negation downward so that a formula is written as disjunctions and conjunctions of atoms or negations of atoms. We call literals the formulas that are either atoms or the negation of an atom;
    • then push the disjunction downward, resulting in a formula which is a conjunction of disjunctions of literals.

In order to complete our transformation of sentences we need another lemma that permits us to push quantifications downwards.
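The two rewriting steps on quantifier-free formulas described above (negations pushed down, then disjunctions pushed down) can be sketched as follows. The encoding is an assumption made for this illustration: formulas are nested tuples tagged "atom", "not", "and" or "or", atoms are propositional names, and the function names are hypothetical. The helper `equivalent` merely compares truth tables to check that the rewriting preserves the meaning of a formula.

```python
from itertools import product


def push_negations(phi):
    """Apply the laws moving negation down, together with not not x = x."""
    tag = phi[0]
    if tag == "atom":
        return phi
    if tag in ("and", "or"):
        return (tag, push_negations(phi[1]), push_negations(phi[2]))
    sub = phi[1]                                  # here phi = ("not", sub)
    if sub[0] == "atom":
        return phi                                # a literal
    if sub[0] == "not":
        return push_negations(sub[1])
    dual = "or" if sub[0] == "and" else "and"
    return (dual,
            push_negations(("not", sub[1])),
            push_negations(("not", sub[2])))


def push_disjunctions(phi):
    """Apply the laws moving disjunction down; negations already pushed."""
    tag = phi[0]
    if tag == "and":
        return ("and", push_disjunctions(phi[1]), push_disjunctions(phi[2]))
    if tag == "or":
        a, b = push_disjunctions(phi[1]), push_disjunctions(phi[2])
        if a[0] == "and":                         # (a1 and a2) or b
            return ("and",
                    push_disjunctions(("or", a[1], b)),
                    push_disjunctions(("or", a[2], b)))
        if b[0] == "and":                         # a or (b1 and b2)
            return ("and",
                    push_disjunctions(("or", a, b[1])),
                    push_disjunctions(("or", a, b[2])))
        return ("or", a, b)
    return phi                                    # a literal


def atoms(phi):
    if phi[0] == "atom":
        return {phi[1]}
    return set().union(*(atoms(sub) for sub in phi[1:]))


def holds(phi, val):
    tag = phi[0]
    if tag == "atom":
        return val[phi[1]]
    if tag == "not":
        return not holds(phi[1], val)
    if tag == "and":
        return holds(phi[1], val) and holds(phi[2], val)
    return holds(phi[1], val) or holds(phi[2], val)


def equivalent(phi, psi):
    """Compare the truth tables of two propositional formulas."""
    names = sorted(atoms(phi) | atoms(psi))
    return all(holds(phi, dict(zip(names, bits))) == holds(psi, dict(zip(names, bits)))
               for bits in product([False, True], repeat=len(names)))


A, B, C = ("atom", "a"), ("atom", "b"), ("atom", "c")
FORMULA = ("not", ("and", A, ("or", B, ("not", C))))   # not(a and (b or not c))
CLAUSAL = push_disjunctions(push_negations(FORMULA))
```

On this example `CLAUSAL` is (¬a ∨ ¬b) ∧ (¬a ∨ c), a conjunction of disjunctions of literals with the same truth table as `FORMULA`. Note that distributing disjunctions may increase the size of a formula exponentially, which this sketch does not try to avoid.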
  • 48. Lemma 4.3. The formulas ∀x(ϕ(x) ∧ ψ(x)) and (∀xϕ(x)) ∧ (∀xψ(x)) are logically equivalent.

Proof. We prove only that every model of ∀x(ϕ(x) ∧ ψ(x)) is a model of (∀xϕ(x)) ∧ (∀xψ(x)), the converse being similar. Let I be a model of ∀x(ϕ(x) ∧ ψ(x)) with a domain D ≠ ∅. By definition for all a ∈ D we have [[ϕ(a) ∧ ψ(a)]]_I = ⊤, and thus by definition of the evaluation of ∧, for all a ∈ D we have [[ϕ(a)]]_I = ⊤ and [[ψ(a)]]_I = ⊤. Thus,
  • For every a ∈ D we have [[ψ(a)]]_I = ⊤, and thus I |= ∀xψ(x);
  • For every a ∈ D we have [[ϕ(a)]]_I = ⊤, and thus I |= ∀xϕ(x);
Thus by definition of the evaluation of the ∧ connective we have I |= (∀xψ(x)) ∧ (∀xϕ(x)).

We are now ready to sum up the transformations applied. First, we define a clause as a universally quantified disjunction of literals, i.e. a formula of the type:
  ∀x1, …, ∀xn, l1 ∨ … ∨ lk
where each literal li is either an atom p(t1, …, tm) or its negation ¬p(t1, …, tm). Defining a first-order theory as a conjunction of clauses, the transformations described in this section imply the following theorem. Given that a theory is always a conjunction of clauses it is also viewed as a finite set of clauses.

Theorem 4.2. Every first-order sentence can be effectively transformed into an equisatisfiable first-order theory.

4.5.3 Herbrand’s theorem

We have seen that there are two distinct levels to first-order logic: a) the language level in which formulas are defined; and b) the interpretation level in which the symbols of a formula are interpreted as functions on a non-empty domain. In order to avoid heavy notations we have already mixed both levels when proving the correctness of skolemization, noting that it is possible to avoid this interleaving of notations by completing the interpretation with an explicit function that maps every variable to an element of the domain.
The question then arises as to whether one could go further and equate the symbols of the language with those of the interpretation, or if a strict separation should be kept.

To answer this question we first introduce a special domain, called the Herbrand’s domain of a theory T, constructed as follows.

The functional signature of a first-order theory T is denoted α_T and is a function mapping every function symbol appearing in T to its arity. Additionally, if no constant (i.e. symbol of arity 0) occurs in a formula of T we extend α_T to a symbol a not occurring in T with α_T(a) = 0.

This construction permits one to define the Herbrand’s domain H_T of a theory T as the set of terms T(α_T). In particular we note that this domain is
  • 49. never empty, and is finite if, and only if, every function symbol occurring in T is of arity 0.

Example 10. Assume:
  T = ∀x ∀ε ∀x′ ¬(|x − x′| < g(x, ε)) ∨ |f(x) − f(x′)| < ε
Since T does not contain any constant its functional signature is the function α:
  α = {a ↦ 0, | | ↦ 1, f ↦ 1, − ↦ 2, g ↦ 2}
The Herbrand’s domain H_T is the set of terms:
  a, |a|, f(a), a − a, g(a, a), ||a||, f(|a|), …
One easily sees that the Herbrand’s domain of a first-order theory is denumerable, the proof being left as an exercise to the reader.

Given a relational signature β_T describing the arity of the predicate symbols occurring in the clauses of T and the Herbrand’s domain H_T we define the Herbrand’s universe U_T to be the set of atoms p(t1, …, tn) where β_T(p) = n and t1, …, tn ∈ H_T. A term in H_T or an atom in U_T is said to be ground.

Definition 7. (Herbrand’s interpretation) A Herbrand’s interpretation of a first-order theory T is an interpretation I in which the domain is the Herbrand’s domain H_T of T and such that, for every function symbol f of arity n occurring in T, we have I(f) = (t1, …, tn) ∈ H_T^n ↦ f(t1, …, tn) ∈ H_T.

Thus in a Herbrand’s interpretation the terms are both syntax and semantics as they occur in the domain and in the formula. We note that since every interpretation of T must interpret the function symbols occurring in T, the Herbrand’s domain can be viewed as the set of all the expressions definable in all interpretations of T. Accordingly, given an interpretation I there exists an embedding Θ_I of the Herbrand’s universe into the set of distinct atoms in I. Since Θ_I is a mapping, the preimages of the atoms of the interpretation are disjoint. Thus the truth value of an atom in the interpretation I can be mapped to the truth value of the atoms in a Herbrand’s interpretation which are in its preimage. For these reasons Herbrand’s universes are called the canonical models of first-order logic.

Given a clause C = ∀x1 …
∀xn l1 ∨ … ∨ lk of T, a ground instance of C is a clause l1σ ∨ … ∨ lkσ where σ is a substitution mapping the variables x1, …, xn to ground terms t1, …, tn of the Herbrand’s domain. We let T^{H_T} be the set of all ground instances of all clauses in T.

Lemma 4.4. (Lemma 1.6.1 in [146]) A theory T is satisfiable if, and only if, T^{H_T} is satisfied by a Herbrand’s interpretation.

Proof. ⇒ First let us prove that if T is satisfiable then T^{H_T} is satisfied by a Herbrand’s interpretation. Let I be a model of T of domain D ≠ ∅. If a
  • 50. constant a was added to the function symbols occurring in T, fix some c ∈ D and set I(a) = c. Since I(f) is defined for every function symbol occurring in T, by structural induction on the terms, it is trivial that I can be extended into a mapping Θ : H_T → D. We build a Herbrand’s model U of T^{H_T} as follows: for each predicate symbol p of arity n and for all ground terms t1, …, tn ∈ H_T let
  U(p(t1, …, tn)) = I(p)(Θ(t1), …, Θ(tn))
By contradiction assume that U is not a model of T^{H_T}. By definition there exists a clause C = ∀x1 … ∀xn l1 ∨ … ∨ lk of T and a ground substitution σ mapping the variables x1, …, xn to ground terms t1, …, tn of the Herbrand’s domain such that:
  U(l1σ ∨ … ∨ lkσ) = ⊥
Reordering the literals if necessary let us fix the notations with atoms a1, …, a_{k′}, b_{k′+1}, …, b_k such that:
  l_iσ = a_i if i ≤ k′
  l_iσ = ¬b_i if i > k′
We have U(a1) = … = U(a_{k′}) = ⊥ and U(b_{k′+1}) = … = U(b_k) = ⊤. By construction every atom a_i, b_i has an image by Θ. By definition of U we have:
  I(Θ(a_i)) = ⊥    I(Θ(b_i)) = ⊤
and thus I(l1σ ∨ … ∨ lkσ) = ⊥. There is an instance of a clause of T which is not evaluated to true by I, which contradicts the fact that I is a model of T. Thus U is a Herbrand’s model of T^{H_T}.

⇐ Trivial, since we assume the existence of an interpretation in which all instances of all clauses in T are satisfied.

Lemma 4.4 reduces the general problem of the (un)satisfiability of a first-order theory to the particular case of the existence of a Herbrand’s model. The cost to pay for this reduction is that we are now looking for a model of an infinite set of ground clauses. We now follow Quine [183] to prove that it actually suffices to consider finite sets of ground instances to derive the (un)satisfiability of this infinite set of ground clauses. The proof relies on the notion of condemnation.

Definition 8.
(Condemnation) Let S be a finite set of ground clauses in which the atoms ξ1, …, ξk occur, and let I be a truth-value assignment I(ξ1), …, I(ξl) with l ≤ k. We say that I condemns S if I cannot be extended to a truth-value assignment I′ on ξ1, …, ξk satisfying S.

We note that when k = l the truth-value assignment condemns the finite set of ground clauses if, and only if, it does not satisfy this set. Actually we can relate condemnation with satisfiability even more tightly.
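Definition 8 can be checked by brute force on small examples. The sketch below makes assumptions that are not in the text: a ground clause is a `frozenset` of literals, a literal is a pair `(atom, sign)` with `sign` equal to `True` for an atom and `False` for its negation, and a truth-value assignment is a dictionary from atoms to booleans. The function names are hypothetical.

```python
from itertools import product


def satisfies(assignment, clauses):
    """True if every clause has at least one literal made true by the assignment."""
    return all(
        any(assignment[atom] == sign for atom, sign in clause)
        for clause in clauses
    )


def condemns(partial, clauses):
    """True if no extension of `partial` to all atoms of `clauses` satisfies them."""
    atoms = sorted({atom for clause in clauses for atom, _ in clause})
    free = [atom for atom in atoms if atom not in partial]
    for values in product([True, False], repeat=len(free)):
        full = dict(partial)
        full.update(zip(free, values))
        if satisfies(full, clauses):
            return False          # found a satisfying extension
    return True


def clause(*literals):
    return frozenset(literals)


# S = {a ∨ b, ¬b}: the assignment a = ⊥ condemns S, the assignment a = ⊤ does not.
S = {clause(("a", True), ("b", True)), clause(("b", False))}
```

The enumeration of extensions is exponential in the number of unassigned atoms, which is acceptable only for illustration.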
  • 51. Lemma 4.5. Let S be a finite set of ground clauses. If S is unsatisfiable then every truth-value assignment condemns S. Conversely, if there exists a set of atoms Ξ such that every truth-value assignment on Ξ condemns S then S is unsatisfiable.

Proof. ⇒ By contraposition. Let S be a finite set of clauses and assume there exists a finite truth-value assignment I that does not condemn S. Then by definition I can be extended into a truth assignment that satisfies S, and thus S is satisfiable.

⇐ Assume that there exists a set of atoms Ξ such that every truth-value assignment on Ξ condemns S. Then in particular every extension to the atoms of S of a truth-value assignment on Ξ does not satisfy S, and thus no truth-value assignment on the atoms of S satisfies S. Hence S is unsatisfiable.

Herbrand’s Theorem, at least the version we give here and whose proof follows [183], relates the unsatisfiability of a theory to the unsatisfiability of finite sets of ground instances of its clauses in the Herbrand’s domain.

Theorem 4.3. (Herbrand) A first-order theory T is unsatisfiable if, and only if, there exists a finite subset of T^{H_T} not satisfied by any Herbrand’s interpretation.

Proof. ⇐ If there is a finite unsatisfiable subset of T^{H_T} then by definition T^{H_T} is unsatisfiable, and thus by the contrapositive of the direct direction of Lemma 4.4 the theory T is unsatisfiable.

⇒ By the contrapositive of the converse direction of Lemma 4.4, T unsatisfiable implies that T^{H_T} is not satisfied by any Herbrand’s interpretation. Let ξ1, ξ2, … be an enumeration of the ground atoms in the Herbrand’s universe of T, and let us consider the interpretation I that maps the sequence of atoms ξ1, ξ2, … to the truth values t1, t2, … such that:
  t_i = ⊤ iff the truth-value assignment t1, …, t_{i−1}, ⊤ does not condemn any finite subset of clause instances.
Since T^{H_T} is unsatisfiable there exists at least one instance C of a clause of T which is not satisfied by the truth-value assignment we have just defined.
Let ξj be the atom in C that is enumerated last. By maximality the truth value of all atoms occurring in C is determined by t1, …, tj. Since C is not satisfied by the truth assignment t1, … it is not satisfied by the truth assignment t1, …, tj. A fortiori we note that t1, …, tj condemns a finite subset {C} of clause instances. This yields the existence of a finite j such that t1, …, tj condemns a finite subset of clause instances.

Let h be a minimal integer such that t1, …, th condemns a finite subset of clause instances. For that h we must have th = ⊥ by the choice of the sequence of truth values. So:
  (i) t1, …, t_{h−1}, ⊥ condemns a finite subset ω of clause instances;
  (ii) Since we have not chosen th = ⊤, by definition of the sequence we also have that t1, …, t_{h−1}, ⊤ condemns a finite subset ω′ of clause instances.
  • 52. This implies that if h > 1 the truth-value assignment t1, …, t_{h−1} condemns the finite subset of clause instances ω ∪ ω′, which contradicts the minimality of h. Thus we must have h = 1. But then the points (i) and (ii) above imply that regardless of whether one chooses t1 = ⊤ or t1 = ⊥ the finite set of clause instances ω ∪ ω′ is condemned by t1. Since there is no truth-value assignment that satisfies ω ∪ ω′ this is a finite unsatisfiable subset of T^{H_T}.

The direct part of the proof actually proves an important property of first-order logic known as compactness, in which the interpretation is not restricted to be a Herbrand’s interpretation.

Theorem 4.4. (Compactness theorem) A set of clauses is unsatisfiable if, and only if, there exists a finite and unsatisfiable set of clause instances.

4.5.4 Concluding remarks

The theorem we have attributed to Herbrand is quite different from the original statement by Herbrand, who considered the provability of a first-order theory. The standard proof for our statement of Herbrand’s theorem is based on the finiteness of proofs, and thus relies on the notion of provability. Formally, if S is a set of formulas, S ⊢ A denotes the existence of a proof (which is a finite list of formulas) of the formula A from S in a predicate calculus whose language includes the symbols of S ∪ {A}. A set S of formulas is inconsistent if there exists a formula A such that S ⊢ A ∧ ¬A. If S is not inconsistent it is consistent. Consistency, a syntactic notion given that one is interested in the manipulation of formulas, is related to satisfiability by the following theorem.

Theorem 4.5. (Gödel Completeness Theorem) A first-order theory T is consistent if, and only if, it is satisfiable.

This theorem implies the existence of a finite proof of A ∧ ¬A for an unsatisfiable theory T.
The formulas in this proof provide an example of a finite set of unsatisfiable instances of the clauses in T when T is unsatisfiable, and thus yield the compactness theorem (Theorem 4.4). This theorem is then employed to directly obtain a finite unsatisfiable subset of clause instances from T^{H_T}.

Instead of this usual proof we have preferred to present the approach of Quine [183], which is purely model-theoretic and based on an enumeration of the set of atoms in a Herbrand’s interpretation. In particular we believe that his proof of the compactness lemma is an excellent introduction to resolution as well as to the ordering refinements of resolution. We note that this model-theoretic approach was also followed in the second chapter of [146] in a presentation based on semantic trees. That presentation opened the way to the semantic trees approach that eventually led to completeness results of ordered paramodulation and superposition [189]. We refrain from going further down that road to focus on our own results, even though some are based on these ordering refinements.
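As a closing illustration of this section, the Herbrand’s domain of Example 10 can be enumerated by increasing number of symbols, which also makes its denumerability concrete. This is a sketch under assumptions that are not in the text: the symbols | | and − are given the hypothetical ASCII names `abs` and `minus`, and a ground term is a tuple whose first component is the function symbol.

```python
from functools import lru_cache
from itertools import product

# Functional signature of Example 10, with ASCII stand-ins for | | and −.
SIGNATURE = {"a": 0, "abs": 1, "f": 1, "minus": 2, "g": 2}


def compositions(total, parts):
    """Yield the tuples of `parts` positive integers whose sum is `total`."""
    if parts == 0:
        if total == 0:
            yield ()
        return
    for first in range(1, total - parts + 2):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


@lru_cache(maxsize=None)
def terms_of_size(size):
    """All ground terms written with exactly `size` function symbols."""
    result = []
    for symbol, arity in SIGNATURE.items():
        # One symbol for the root, the remaining ones split among the arguments.
        for sizes in compositions(size - 1, arity):
            for args in product(*(terms_of_size(s) for s in sizes)):
                result.append((symbol,) + args)
    return tuple(result)
```

Each call returns a finite set of terms, and every ground term has a finite size, so listing `terms_of_size(1)`, `terms_of_size(2)`, and so on eventually reaches every element of the domain.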
  • 53. 4.6 Resolution

While knowing that a first-order sentence is valid certainly seems important, it is less obvious why anyone would be interested in sentences that are always false. The main rationale for this interest is that the negation of an always-true sentence is an always-false sentence. Thus to prove that a sentence is valid it suffices to prove that its negation is unsatisfiable.

The resolution method was defined by Robinson [3] to turn the mathematical proof of the existence of a finite unsatisfiable set of ground clauses into a procedure that searches for a finite witness set. In this section we first present a generic procedure that recognizes unsatisfiable theories in Subsection 4.6.1, and discuss its shortcomings. Then we present ground resolution in Subsection 4.6.2 as a procedure that turns Quine’s proof of Herbrand’s Theorem into an effective method. The abstraction from ground instances relies on unification, and more precisely on the existence of most general unifiers, which are defined in Subsection 4.6.3. These most general unifiers are employed in Subsection 4.6.4 to simulate ground resolution on finite sets of ground instances by resolution.

4.6.1 Recognizing unsatisfiable theories

Assume that a first-order theory T is unsatisfiable. Then by Theorem 4.4 there exists a finite unsatisfiable set of ground instances of clauses in T. This provides a procedure that recognizes the unsatisfiable first-order theories, described in Algorithm 4.1.
Algorithm 4.1: Naive algorithm recognizing whether T is unsatisfiable
    for all finite sets S of ground instances of clauses in T do
        if S is unsatisfiable then
            return theory unsatisfiable
        end if
    end for

This algorithm is effective in the sense that:
  • it is possible to enumerate all the terms in the Herbrand’s domain of the theory T, for example by first enumerating all the terms with one symbol, then all the terms with 2 symbols, and so on, given that each of these sets is finite;
  • it is thus possible to enumerate all the ground atoms by enumerating first the ground atoms in which the predicate symbol takes as arguments the first term, then the first two terms, and so on. Since the number of predicate symbols is finite each of these sets is finite;
  • it is thus possible to enumerate all the ground instances of clauses in T by considering first all the ground instances that contain only the first atom,
  • 54. then all the ground instances that contain the first and the second atom, and so on. Since each clause contains a finite number of atoms, and since the number of clauses is finite, each set in this enumeration is finite.
  • it is thus possible to enumerate all the finite sets of ground instances of clauses in T by first enumerating the singleton set containing the first clause, then the sets contained in the set of the first two clauses, and so on. Since the number of subsets of a finite set is finite, each of these sets is finite.
Then checking whether a finite set of ground clauses is unsatisfiable can be done by looking at all the possible interpretations, e.g. by writing a truth table.

Given that this algorithm blindly enumerates all the possible instances of a first-order theory T, it is clear that it is not adequate for recognizing unsatisfiable theories in practice. The resolution principle was introduced by Robinson [3] to guess efficiently subsets of clause instances that might be unsatisfiable. Before presenting resolution in Subsection 4.6.4 we present in Subsection 4.6.2 an alternative approach to truth tables to check for the unsatisfiability of a finite set of ground clauses, called ground resolution.

4.6.2 Ground resolution

Let S = {C1, …, Cn} be a finite set of ground clauses. Since S is finite the set of atoms occurring in S is finite. Informally, the ground resolution principle consists in reducing the set S to an equisatisfiable finite set of clauses S′ where the number of distinct atoms occurring in S′ is strictly less than the number of distinct atoms occurring in S.
This overall reduction is called the resolution on ξk of S, and consists in the eager application in order of each of the following rules (written modulo a permutation of literals):

Ground elimination on ξk: Remove from S all the ground clauses ξk ∨ ¬ξk ∨ C;
Ground factorization of ξk: From a ground clause l ∨ l ∨ C deduce the clause l ∨ C where l is the literal ξk or ¬ξk;
Ground resolution on ξk: From the two ground clauses ξk ∨ C1 and ¬ξk ∨ C2 form the clause C1 ∨ C2.

Since a clause eliminated by ground elimination on ξk is satisfied whatever the truth assignment to ξk is, it is clear that a set of clauses S is unsatisfiable if, and only if, S \ {C′ ∈ S | C′ = ξk ∨ ¬ξk ∨ C for some C} is unsatisfiable.

Lemma 4.6. A truth-value assignment satisfies l ∨ l ∨ C if, and only if, it satisfies l ∨ C.

Proof. Let I be a truth-value assignment. By definition of the interpretation of disjunctions, if [[l]]_I = ⊤ then [[l ∨ l ∨ C]]_I = [[l ∨ C]]_I = ⊤. If [[l]]_I = ⊥ then [[l ∨ l ∨ C]]_I = [[l ∨ C]]_I = [[C]]_I.
  • 55. Lemma 4.7. For any atom ξ occurring neither in C1 nor in C2, a truth-value assignment that does not satisfy C1 ∨ C2 condemns {ξ ∨ C1, ¬ξ ∨ C2}.

Proof. By contrapositive reasoning. Let I be a truth-value assignment with [[C1 ∨ ξ]]_I = [[C2 ∨ ¬ξ]]_I = ⊤. Then if [[ξ]]_I = ⊤ we have [[C2 ∨ ¬ξ]]_I = [[C2]]_I = ⊤, and thus [[C1 ∨ C2]]_I = ⊤ by definition of the interpretation of the disjunction. The same reasoning applies if [[ξ]]_I = ⊥.

Also, if S is a set of ground clauses on which the ground elimination on ξk has been performed, then every clause C ∈ S contains the literal ξk, or its negation ¬ξk, or neither of them, but never both. Then, applying ground factorization on ξk on this set yields a set of clauses in which every clause contains at most one occurrence of a literal ξk or ¬ξk. Thus, and wlog, we can assume the set S can be written as the disjoint union of three sets of clauses S+, S−, S0 such that:
  S+ = {ξk ∨ C | ξk ∨ C ∈ S and the atom ξk does not occur in C}
  S− = {¬ξk ∨ C | ¬ξk ∨ C ∈ S and the atom ξk does not occur in C}
  S0 = S \ (S+ ∪ S−)
The eager application of the ground resolution on ξk on clauses of S is called the resolution on ξk of S, is denoted Res_gr(ξk, S), and is the set of clauses:
  Res_gr(ξk, S) = S0 ∪ {C ∨ C′ | ξk ∨ C ∈ S+ and ¬ξk ∨ C′ ∈ S−}
With respect to satisfiability, this principle is sound, that is if Res_gr(ξk, S) is unsatisfiable then S is unsatisfiable, and complete, that is if S is unsatisfiable then Res_gr(ξk, S) is unsatisfiable. Let us prove these simple facts.

Lemma 4.8. (Soundness) Assume S is a set of clauses on which ground elimination and factorization on ξk have been eagerly applied. If Res_gr(ξk, S) is unsatisfiable then S is unsatisfiable.

Proof. Assume Res_gr(ξk, S) is unsatisfiable, i.e. for each truth-value assignment I = t1, …, t_{k−1} to the atoms ξ1, …, ξ_{k−1} there exists a clause C_I ∈ Res_gr(ξk, S) which is not satisfied by I. Writing C_I as the disjunction of literals l1 ∨ …
∨ lm, this means that I interprets each of these li as false. If C_I ∈ S0 then we have found a clause in S which is condemned by I. Otherwise by definition we have C_I = C ∨ C′ with C1 = ξk ∨ C and C2 = ¬ξk ∨ C′ in S. It is then clear, by Lemma 4.7, that the subset {C1, C2} of S is condemned by I. Thus every interpretation I = t1, …, t_{k−1} condemns a non-empty set of clauses in S, and thus S is unsatisfiable by Lemma 4.5.

Lemma 4.9. (Completeness) If S is unsatisfiable then Res_gr(ξk, S) is unsatisfiable.

Proof. Since S is unsatisfiable every truth-value assignment I = t1, …, t_{k−1} to the atoms ξ1, …, ξ_{k−1} condemns S by Lemma 4.5. Thus for every interpretation I on ξ1, …, ξ_{k−1} the set of subsets of S condemned by I is not empty. Let us choose a minimal one (for inclusion) U_I.
  • 56. Claim 1. For every I either U_I = {C} with C ∈ S0 or U_I ⊆ S+ ∪ S−.

Proof of the claim. If U_I ∩ S0 ≠ ∅ then this intersection contains a clause C. Since the atom ξk does not occur in C, this clause is either satisfied or not satisfied by I. In the first case U_I is not minimal since every extension of I satisfies C. In the second case C is also condemned by I, and thus the minimality of U_I for inclusion implies U_I = {C}. ♦

Claim 2. If U_I ⊆ S+ ∪ S− then U_I ∩ S+ ≠ ∅ and U_I ∩ S− ≠ ∅.

Proof of the claim. Assume U_I ⊆ S+ ∪ S− and wlog U_I ∩ S+ ≠ ∅. If U_I ∩ S− = ∅ then the extension t1, …, t_{k−1}, ⊤ of I satisfies U_I, thereby contradicting that U_I is condemned by I. ♦

Claim 3. Assume ξk ∨ C ∈ U_I ∩ S+ and ¬ξk ∨ C′ ∈ U_I ∩ S−. Then C ∨ C′ is not satisfied by I.

Proof of the claim. If I satisfies C (resp. C′) then every extension of I satisfies ξk ∨ C (resp. ¬ξk ∨ C′). This would contradict the minimality of U_I. Thus I satisfies neither C nor C′, and thus I does not satisfy C ∨ C′. ♦

It is now clear that S unsatisfiable implies Res_gr(ξk, S) unsatisfiable. Indeed for every interpretation I = t1, …, t_{k−1}, in the first case of Claim 1 I does not satisfy a clause in S0 ⊆ Res_gr(ξk, S) and in the second case it does not satisfy a clause in Res_gr(ξk, S) \ S0 by Claim 3. Thus Res_gr(ξk, S) is unsatisfiable.

We note that since the clauses are normalized the atom ξk does not occur in Res_gr(ξk, S) for any finite set of ground clauses S. Since only finitely many atoms occur in S it is clear that applying resolution on a set of ground clauses S terminates with a set of clauses that does not contain any atom, and therefore any literal. There are two possibilities for this set:
  • the obvious one is that the final set is empty. In this case we note that every clause in this set is (vacuously) satisfied, and thus this final set is satisfiable;
  • another possibility is that this set contains a clause which is an empty disjunction of literals.
Since a clause is interpreted as true only if at least one of its literals is interpreted as true, this clause is unsatisfiable.

The clause which is an empty disjunction of literals is denoted [ ].

Example 11. (Satisfiable set of clauses) Consider the set S = {a, a ∨ b, a ∨ ¬b}. We have:
  Res_gr(b, S) = {a, a ∨ a} = {a, a} = {a}
  Res_gr(a, S) = ∅
  Res_gr(a, Res_gr(b, S)) = ∅
Since the final set is empty we conclude that S is satisfiable.
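The reduction Res_gr and its iteration over all atoms can be sketched in a few lines. The encoding is an assumption that is not in the text: a ground clause is a `frozenset` of literals `(atom, sign)`, so that ground factorization is performed implicitly by the set representation, and the empty clause [ ] is the empty `frozenset`. The function names are hypothetical.

```python
def res_gr(atom, clauses):
    """Resolution on `atom` of a finite set of ground clauses."""
    pos, neg = (atom, True), (atom, False)
    # Ground elimination: drop the clauses containing both ξ and ¬ξ.
    kept = {c for c in clauses if not (pos in c and neg in c)}
    s_plus = {c for c in kept if pos in c}
    s_minus = {c for c in kept if neg in c}
    s_zero = kept - s_plus - s_minus
    # Ground resolution: one resolvent per pair taken in S+ × S−.
    resolvents = {(c1 - {pos}) | (c2 - {neg}) for c1 in s_plus for c2 in s_minus}
    return s_zero | resolvents


def unsatisfiable(clauses):
    """Eliminate every atom in turn, then look for the empty clause."""
    current = set(clauses)
    for atom in sorted({a for c in current for a, _ in c}):
        current = res_gr(atom, current)
    return frozenset() in current


def clause(*literals):
    return frozenset(literals)


A, NOT_A = ("a", True), ("a", False)
B, NOT_B = ("b", True), ("b", False)
S_SAT = {clause(A), clause(A, B), clause(A, NOT_B)}        # the set of Example 11
S_UNSAT = {clause(NOT_A), clause(A, B), clause(A, NOT_B)}  # same set with ¬a instead of a
```

On the set of Example 11, `res_gr("b", S_SAT)` yields the single clause a and `res_gr("a", S_SAT)` yields the empty set, in agreement with the computation above. The order in which the atoms are eliminated is arbitrary here; the sketch simply sorts them.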
  • 57. Example 12. (Unsatisfiable set of clauses) Consider the set S = {¬a, a ∨ b, a ∨ ¬b}. We have:
  Res_gr(b, S) = {¬a, a ∨ a} = {¬a, a}
  Res_gr(a, S) = {¬b, b}
  Res_gr(a, Res_gr(b, S)) = {[ ]}

We summarize the results of this section with the following theorem.

Theorem 4.6. Let S be a finite set of ground clauses over the atoms ξ1, …, ξk. Then S is unsatisfiable if, and only if, Res_gr(ξ1, … Res_gr(ξk, S)) contains the empty clause.

4.6.3 Unification and Most General Unifiers

In the rest of this section we will try to apply the ground resolution and factorization rules before knowing the ground instance of the clauses. This implies we have to be able to describe the set of equal ground instances of two distinct atoms, and furthermore to describe this set with one atom. The process of computing this new atom is called unification. Since the proofs and algorithms in this subsection apply to atoms as well as to terms, we will consider only the case of the unification of terms.

Example 13. Consider the two terms t1 = f(x, g(y, a)) and t2 = f(z, v). Though they are different, we have:
  • If σ = {x ↦ b, y ↦ b, z ↦ b, v ↦ g(b, a)} then t1σ = t2σ;
  • If τ = {x ↦ c, y ↦ b, z ↦ c, v ↦ g(b, a)} then t1τ = t2τ;
  • Actually for any term t, for the substitution θ_t = {x ↦ t, y ↦ b, z ↦ t, v ↦ g(b, a)} we have t1θ_t = t2θ_t;
  • Even more generally, for any terms t, t′, for the substitution θ_{t,t′} = {x ↦ t, y ↦ t′, z ↦ t, v ↦ g(t′, a)} we have t1θ_{t,t′} = t2θ_{t,t′};
  • Instead of quantifying universally on terms, we can use two variables x1 and x2, form the substitution σ_{x1,x2} = {x ↦ x1, y ↦ x2, z ↦ x1, v ↦ g(x2, a)}, and remark that:
    – t1σ_{x1,x2} = t2σ_{x1,x2}, and thus σ_{x1,x2} makes the terms equal;
    – For any substitution τ_{t,t′} = {x1 ↦ t, x2 ↦ t′} we have σ_{x1,x2}τ_{t,t′} = θ_{t,t′}.

Example 13 leads us to the definition of several notions. First let us name the substitutions that equalize two terms.

Definition 9.
(Unifier) A substitution σ is a unifier of two terms t, t′ if tσ = t′σ. Given two terms t, t′ we denote by Σ(t, t′) the set of unifiers of t and t′.
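Definition 9 and the substitutions of Example 13 can be checked mechanically. The sketch below relies on an encoding that is an assumption, not part of the text: a variable is a Python string, any other term is a tuple whose first component is the function symbol (so the constant a is `("a",)`), and a substitution is a dictionary from variables to terms. The function names are hypothetical.

```python
def apply_subst(term, subst):
    """Return the term obtained by applying the substitution to `term`."""
    if isinstance(term, str):                      # a variable
        return subst.get(term, term)
    return (term[0],) + tuple(apply_subst(arg, subst) for arg in term[1:])


def is_unifier(subst, left, right):
    """True if the substitution makes the two terms syntactically equal."""
    return apply_subst(left, subst) == apply_subst(right, subst)


A, B, C = ("a",), ("b",), ("c",)
T1 = ("f", "x", ("g", "y", A))                     # t1 = f(x, g(y, a))
T2 = ("f", "z", "v")                               # t2 = f(z, v)

SIGMA = {"x": B, "y": B, "z": B, "v": ("g", B, A)}
SIGMA_X1X2 = {"x": "x1", "y": "x2", "z": "x1", "v": ("g", "x2", A)}
TAU = {"x1": C, "x2": B}

# Applying SIGMA_X1X2 and then TAU gives the same instance as the
# substitution τ of Example 13, namely f(c, g(b, a)).
INSTANCE = apply_subst(apply_subst(T1, SIGMA_X1X2), TAU)
```

Both `SIGMA` and `SIGMA_X1X2` are unifiers of the two terms, while the identity substitution is not, since the terms are different.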
  • 58. In Example 13 the unifier σ_{x1,x2} could be composed with other substitutions to obtain new unifiers.

Definition 10. (Generalization) A substitution σ is more general than a substitution θ, and we denote σ ≤_mgt θ, if there exists a substitution τ such that στ = θ.

The ≤_mgt relation on substitutions has several properties. We write σ ≡_mgt τ if σ ≤_mgt τ and τ ≤_mgt σ.

Lemma 4.10. (Properties of ≤_mgt)
  • ≤_mgt is a pre-order on substitutions;
  • σ ≡_mgt τ implies that there exists a substitution θ = {x1 ↦ y1, …, xn ↦ yn}, with x1, …, xn, y1, …, yn pairwise distinct variables, such that σ = τθ;
  • ≤_mgt is a well-founded ordering on substitutions modulo ≡_mgt.

Proof.
  • To prove that ≤_mgt is a pre-order we have to prove that:
    – this relation is reflexive, i.e. for every substitution σ we have σ ≤_mgt σ;
    – this relation is transitive, i.e. for all substitutions σ, τ, θ we have that σ ≤_mgt τ and τ ≤_mgt θ implies σ ≤_mgt θ.
    The first point is trivial if we consider the identity substitution that maps every variable to itself. To prove the second point it suffices to remark that the hypotheses imply the existence of two substitutions η_{σ,τ} and η_{τ,θ} such that ση_{σ,τ} = τ and τη_{τ,θ} = θ. Thus σ(η_{σ,τ}η_{τ,θ}) = θ by associativity of substitution composition.
  • We note that if σ ≡_mgt τ there exist by definition two substitutions θ1, θ2 such that:
      σθ1 = τ    τθ2 = σ
    and thus σ = σθ1θ2. Thus on each variable x in the image of σ we have xθ1θ2 = x. If θ1 maps x to a term f(t1, …, tn) we have xθ1θ2 = f(t1θ2, …, tnθ2) ≠ x. Thus θ1 must map x to a variable y, and with the same reasoning θ2 must also map y to x. Furthermore θ1θ2 is a one-to-one correspondence from and to Var(σ). Thus there exists a set of variables V with |V| = |Var(σ)| such that θ1 is a one-to-one correspondence from Var(σ) to V, and θ2 is the inverse one-to-one correspondence from V to Var(σ).
  • We associate to each substitution σ the number m_σ of function symbols employed to write σ. If τ maps at least one variable to a term f(t1, …, tn) we have m_{στ} > m_σ. Since the ordering on positive integers is well-founded, if there exists an infinite chain σ1, σ2, … for ≤_mgt there exists an index i0 such that j > i0 implies m_{σj} = m_{σi0}. Thus every substitution θ_{j,j+1} with
  • 59. σ_{j+1} = σ_jθ_{j,j+1} maps a variable to a variable, and thus the number of variables in the σ_j for j > i0 is decreasing, and thus becomes constant after an index j0. Thus for all j > j0 the substitution θ_{j,j+1} is a one-to-one correspondence between variables, and therefore for j > j0 all the σ_j are equivalent modulo ≡_mgt.

Given the second point of Lemma 4.10 we usually say “modulo a renaming of variables” rather than writing explicitly ≡_mgt. Since we have a pre-ordering on substitutions we can consider the minimal elements in this ordering. Getting back to Example 13, these minimal elements are like σ_{x1,x2} since by definition of the ordering every unifier can be written as the composition of a minimal unifier and another substitution.

Definition 11. (Most general unifiers) The set of most general unifiers of t and t′ is denoted Σ_mgu(t, t′) and is the set of minimal elements for ≤_mgt of Σ(t, t′).

When defining resolution in [3] Robinson proved the following lemma.

Lemma 4.11. (Uniqueness of most general unifiers) Given two terms t, t′ either Σ_mgu(t, t′) = ∅ or all elements in it are equal modulo a renaming of variables.

The proof of Lemma 4.11 is constructive in the sense that it results from the direct computation of a unifier whose instances form the set of all unifiers. Before presenting this algorithm let us prove a sequence of lemmas that justify its soundness.

Lemma 4.12. (Extension of equality) Assume t, t′ have a unifier σ. Then for all p ∈ Pos(t) ∩ Pos(t′) we have (t)|p σ = (t′)|p σ.

Proof. The equality tσ = t′σ means that for every position p ∈ Pos(tσ) we have (tσ)|p = (t′σ)|p. If p ∈ Pos(t) (resp. p ∈ Pos(t′)) we have t|p σ = (tσ)|p (resp. t′|p σ = (t′σ)|p). Hence the equality.

A consequence is the following lemma that relates the subterms of t and t′.

Lemma 4.13. (No clash) Assume t, t′ have a unifier σ. Then for all p ∈ Pos(t) ∩ Pos(t′) we have either Symb(t, p) = Symb(t′, p) or at least one of {Symb(t, p), Symb(t′, p)} is a variable x.

Proof.
For p ∈ Pos(t) ∩ Pos(t′) we have t|p σ = t′|p σ. Assume Symb(t, p) is not a variable, and thus is a function symbol f. By definition the equality of terms implies the equality of their root symbols, and thus f is the root of t′|p σ. Two cases can occur:
  • If Symb(t′, p) is a function symbol g, then since the root symbol of t′|p σ is f we must have g = f;
  • Otherwise Symb(t′, p) is a variable, and thus t′|p is a variable.
  • 60. Lemma 4.14. (Variable replacement) Assume there exists p ∈ Pos(t) ∩ Pos(t′) such that t|p = x ∈ X and t′|p = y ∈ X. Let θ = {x ↦ y}. Then every unifier σ of t and t′ is a unifier of tθ and t′.

Proof. For every unifier σ we must have by Lemma 4.12 t|p σ = t′|p σ, and thus xσ = yσ. Hence θσ = σ, and tθσ = tσ = t′σ.

Lemma 4.15. (Term replacement) Let t and t′ be two unifiable terms, and assume there exists p ∈ Pos(t) ∩ Pos(t′) such that t|p = x and t′|p is a non-variable term. Then we have:
  • x ∉ Var(t′|p);
  • The substitution θ = {x ↦ t′|p} is such that:
    – Σ(t, t′) ⊆ Σ(tθ, t′θ);
    – Every unifier σ ∈ Σ(tθ, t′θ) with xσ = xθσ is in Σ(t, t′).

Proof.
  • For every unifier σ of t and t′ we have xσ = t′|p σ. However since t′|p is not a variable, if x ∈ Var(t′|p) then xσ is also a strict subterm of t′|p σ, which is a contradiction.
  • For any unifier σ of t and t′ we must have xσ = t′|p σ = (xθ)σ. Given the definition of θ, for every variable y ≠ x we have yθσ = yσ. Thus for every variable z we have zσ = zθσ, and therefore every unifier of t and t′ is a unifier of tθ and t′θ. Conversely, if a unifier σ of tθ and t′θ is such that xσ = xθσ it is clear that it is also a unifier of t and t′.

We are now ready to present a unification algorithm for two terms t and t′. The procedure we present is recursive, and certainly not fit for the real computation of most general unifiers, which can be done in linear time [152].
One easily proves, when invoking the procedure of Algorithm 4.2 with the identity substitution, that:
  • At each step the domain of θ is disjoint from Var(t) ∪ Var(t′);
  • The number of variables in Var(t) ∪ Var(t′) strictly decreases at each iteration, which ensures the termination of the procedure;
  • When Unif(t, t′, Id) is invoked, at each subsequent call of Unif(t1, t2, θ) we have Σ(t, t′) = {θσ | σ ∈ Σ(t1, t2)};
  • Consequently, this procedure always halts, and when it returns a substitution θ on the invocation Unif(t, t′, Id) we have tθ = t′θ and for every substitution σ ∈ Σ(t, t′) there exists τ such that θτ = σ.

Thus the returned substitution is smaller for ≤_mgt than any substitution in Σ(t, t′). This proves Lemma 4.11. From now on this substitution will be denoted, when Σ(t, t′) ≠ ∅, mgu(t, t′).
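Since the listing of the recursive procedure is not reproduced here, the following is a minimal sketch of a textbook recursive unification with occurs check, written in the spirit of Lemmas 4.13 to 4.15. It is not claimed to be the Unif procedure of Algorithm 4.2. The encoding is an assumption: a variable is a string, any other term is a tuple headed by its function symbol, and a substitution is a dictionary. The function names are hypothetical.

```python
def apply(term, subst):
    """Apply a substitution to a term."""
    if isinstance(term, str):                      # a variable
        return subst.get(term, term)
    return (term[0],) + tuple(apply(arg, subst) for arg in term[1:])


def occurs(var, term):
    """True if the variable occurs in the term."""
    if isinstance(term, str):
        return var == term
    return any(occurs(var, arg) for arg in term[1:])


def bind(var, term, theta):
    """Extend theta with var ↦ term, or fail on the occurs check (Lemma 4.15)."""
    if occurs(var, term):
        return None
    new = {var: term}
    extended = {x: apply(t, new) for x, t in theta.items()}
    extended[var] = term
    return extended


def unify(t1, t2, theta=None):
    """Return a most general unifier of t1 and t2 as a dict, or None if none exists."""
    theta = {} if theta is None else theta
    t1, t2 = apply(t1, theta), apply(t2, theta)
    if t1 == t2:
        return theta
    if isinstance(t1, str):
        return bind(t1, t2, theta)
    if isinstance(t2, str):
        return bind(t2, t1, theta)
    if t1[0] != t2[0] or len(t1) != len(t2):
        return None                                # clash of symbols (Lemma 4.13)
    for arg1, arg2 in zip(t1[1:], t2[1:]):
        theta = unify(arg1, arg2, theta)
        if theta is None:
            return None
    return theta


A = ("a",)
T1 = ("f", "x", ("g", "y", A))                     # f(x, g(y, a)) from Example 13
T2 = ("f", "z", "v")                               # f(z, v)
MGU = unify(T1, T2)
```

On the terms of Example 13 the sketch returns {x ↦ z, v ↦ g(y, a)}, which equals σ_{x1,x2} modulo a renaming of variables once restricted to the terms at hand. The repeated application of substitutions makes this version exponential in the worst case.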
  • 61. Properties of unification
    We now state the property of unification that is critical for lifting ground resolution to resolution.
    Lemma 4.16. Let t and t′ be two terms such that Var(t) ∩ Var(t′) = ∅ and such that there exist two substitutions σ and τ with tσ = t′τ. Then t and t′ have a most general unifier.
    Proof. Consider the set S of couples of terms (t, t′) with Var(t) ∩ Var(t′) = ∅ such that there exist σ, τ with tσ = t′τ but t and t′ do not have a mgu. The lemma states that the set S is empty. Let us prove this emptiness by contradiction. Assume S ≠ ∅ and consider the ordering on couples (t1, t1′) < (t2, t2′) iff t1 is a subterm of t2 and t1′ is a subterm of t2′. Since the subterm ordering is well-founded, this ordering on pairs is well-founded. Thus S ≠ ∅ implies that S has a minimal element (t, t′).
    First let us note that neither t nor t′ can be a variable, for if e.g. t is a variable, then Var(t) ∩ Var(t′) = ∅ implies that t ∉ Var(t′) and thus the unification of t, t′ terminates immediately and returns the mgu {t ↦ t′} by Lemma 4.15.
    Thus we must have t = ft(t1, . . . , tn) and t′ = ft′(t1′, . . . , tm′) for some function symbols ft, ft′ of respective arities n and m. Then since tσ = t′τ we must have ft = ft′ and n = m. Thus if t and t′ do not have a mgu, there exists 1 ≤ i ≤ n such that ti and ti′ do not have a mgu. But then the couple (ti, ti′) is in S, and contradicts the minimality of (t, t′). Thus S must be empty.
    4.6.4 Resolution
    When considering Algorithm 4.1, ground resolution is of little help, given that it comes into action only once a finite set of ground instances has been chosen. In his presentation of Resolution in [3] Robinson comments on Herbrand’s Theorem by saying that to be of effective use one would need a “. . . benevolent and omniscient demon who could provide us, in reasonable time, with a proof set . . . ” (footnote 4: a proof set is a set of atoms with which the clauses are instantiated). Resolution is then presented as one such demon who computes the ground instances of the clauses in the theory T while applying ground resolution. It is based on ground resolution but relies on most general unifiers to build incrementally the instances of the clauses. One difficulty of not knowing the ground instance is that the normalization phase of ground resolution cannot be conducted deterministically: one does not know whether the instances of two literals in a clause are equal. Given the importance of normalization for the completeness of resolution, we introduce a factorization rule that non-deterministically guesses the common instances of literals by trying to unify literals and, when succeeding, adds the “normalized” clause to the set of clauses. Then we present a resolution rule, also based on unification and also applied non-deterministically, that guesses when a ground resolution rule can be applied between two instances of two clauses. Then we prove that applying non-deterministically these two rules
  • 62. permits one to simulate the operations of labeled resolution. This simulation implies that the empty clause is reachable by resolution and factorization from a set of clauses S if, and only if, S is unsatisfiable.
    Definition 12. (Factor) Let C = L1 ∨ L2 ∨ C′ be a clause and assume σ = mgu(L1, L2). Then (L1 ∨ C′)σ is a factor of C.
    Definition 13. (Resolvent) Let L1 ∨ C, ¬L2 ∨ C′ be two clauses of disjoint sets of variables and assume σ = mgu(L1, L2). Then (C ∨ C′)σ is a resolvent of L1 ∨ C and ¬L2 ∨ C′.
    The computation of a factor of a given clause is called factorization, and the computation of the resolvent of two clauses is called resolution. The application of the factorization rule on a set of clauses S consists in:
      (i) selecting a clause C in S;
      (ii) trying to apply the rule (a) of Figure 4.1 on C;
      (iii) when succeeding, adding the factor of C to S.
    Similarly, the application of the resolution rule on a set of clauses S consists in:
      (i) selecting two clauses C1 and C2 in S;
      (ii) renaming the variables of C2 so that the sets of variables of C1 and C2 are disjoint;
      (iii) trying to apply the rule (b) of Figure 4.1 on C1 and C2;
      (iv) when succeeding, adding the resolvent of C1 and C2 to S.
    We call resolution the iterated application of the factorization and resolution rules.
    Figure 4.1: The (a) factorization and (b) resolution rules
      (a) Factorization Fac(L1, L2, C): from L1 ∨ L2 ∨ C with σ = mgu(L1, L2), infer (L2 ∨ C)σ.
      (b) Resolution Res(L1, L2, L1 ∨ C, ¬L2 ∨ C′): from L1 ∨ C and ¬L2 ∨ C′ with σ = mgu(L1, L2), infer (C ∨ C′)σ.
    Definition 14. (Simulation relation) Let S be a set of clauses and Sg be a set of ground clauses. We say that S simulates Sg if for every Cg ∈ Sg there exists C ∈ S and a ground substitution σ such that Cσ = Cg modulo a reordering of literals.
    Assume a set of clauses S is unsatisfiable. Then by Herbrand’s Theorem there exists a finite set Sg of ground instances of clauses in S which is unsatisfiable. The set S trivially simulates Sg. Since Sg is a finite and unsatisfiable set of ground clauses, Theorem 4.6 implies that a finite sequence of normalization and ground resolution ends with a set of clauses that contains the empty clause [ ].
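The ground procedure invoked here can be sketched in Python as the successive elimination of atoms. This is only an illustration under assumptions that are not in the text: a ground literal is encoded as a pair (sign, atom), a clause as a frozenset of literals (so that ground factorization is implicit), and the names `is_tautology`, `res_gr` and `unsatisfiable` are hypothetical.

```python
def is_tautology(clause):
    """A clause containing an atom and its negation is trivially satisfiable."""
    return any((not sign, atom) in clause for sign, atom in clause)

def res_gr(atom, clauses):
    """Add all resolvents on `atom`, then drop every clause containing `atom`."""
    clauses = {c for c in clauses if not is_tautology(c)}
    pos = [c for c in clauses if (True, atom) in c]
    neg = [c for c in clauses if (False, atom) in c]
    rest = {c for c in clauses
            if (True, atom) not in c and (False, atom) not in c}
    for c in pos:
        for d in neg:
            resolvent = (c - {(True, atom)}) | (d - {(False, atom)})
            if not is_tautology(resolvent):
                rest.add(resolvent)
    return rest

def unsatisfiable(clauses):
    """Eliminate the atoms one after the other and look for the empty clause."""
    current = set(clauses)
    atoms = sorted({atom for c in current for _, atom in c})
    for atom in reversed(atoms):
        current = res_gr(atom, current)
    return frozenset() in current

# {p or q, not p or q, not q} is unsatisfiable; {p or q, not p} is satisfiable.
s1 = [frozenset({(True, "p"), (True, "q")}),
      frozenset({(False, "p"), (True, "q")}),
      frozenset({(False, "q")})]
s2 = [frozenset({(True, "p"), (True, "q")}),
      frozenset({(False, "p")})]
```

The order in which the atoms are eliminated is arbitrary in this sketch; only finiteness of the set of atoms is used.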
  • 63. Lemma 4.17. (Lifting lemma) Let l1 ∨ C1 and ¬l2 ∨ C2 be two clauses with Var(l1 ∨ C1) ∩ Var(¬l2 ∨ C2) = ∅, and σ1, σ2 be two ground substitutions such that l1σ1 = l2σ2. Then there exist two substitutions θ and τ such that:
      • θ is the most general unifier of l1 and l2;
      • (C1 ∨ C2)θτ = C1σ1 ∨ C2σ2.
    Proof. The hypothesis implies in particular that Var(l1) ∩ Var(l2) = ∅. Thus by Lemma 4.16, θ = mgu(l1, l2) is defined and there exists τ0 such that xθτ0 = xσ1 for x ∈ Var(l1) and xθτ0 = xσ2 for x ∈ Var(l2). We extend τ0 into a substitution τ on variables in (Var(C1) ∪ Var(C2)) \ (Var(l1) ∪ Var(l2)) by setting xτ = xσ1 (resp. xτ = xσ2) if x ∈ Var(C1) \ Var(l1) (resp. x ∈ Var(C2) \ Var(l2)).
    Lemma 4.18. Let C = l1 ∨ l2 ∨ C′ and assume there exists a ground substitution σ with l1σ = l2σ. Then there exists a most general unifier θ of l1 and l2, and l1σ ∨ C′σ is a ground instance of l1θ ∨ C′θ.
    Proof. Since l1σ = l2σ the atoms l1 and l2 are unifiable, and thus θ = mgu(l1, l2) is defined. Since θ is a most general unifier of l1 and l2 and σ is a unifier of l1 and l2, there exists a substitution τ such that θτ = σ. Hence l1σ ∨ C′σ is a ground instance of l1θ ∨ C′θ.
    Lemma 4.17 states that the ground resolvent of the ground instances of two clauses with disjoint sets of variables is a ground instance of a resolvent of these two clauses. Similarly Lemma 4.18 states that the ground factor of a ground instance of a clause C is a ground instance of a factor of the clause C.
    As a consequence for each transformation applied on a set of ground clauses simulated by S (except the elimination of a trivially satisfiable clause or of the clauses that contain the resolved atom, but this does not compromise the simulation) there exists a corresponding application of the factorization or resolution rule on S that preserves the simulation relation. There is only a finite number of ground factorizations and resolutions applicable on any given finite set of ground instances of clauses in S. If the finite set of ground instances is unsatisfiable then the final simulated set of ground clauses contains [ ] by Theorem 4.6. Since the clause [ ] can only be simulated by itself modulo a reordering of literals we have the following theorem.
    Theorem 4.7. (Completeness of resolution) Let S be a finite and unsatisfiable set of clauses. Then there exists a finite sequence of applications of the resolution and factorization rules that reaches a set of clauses S′ that contains [ ].
    We note that if Sg is a finite and unsatisfiable set of ground instances of S it is possible to apply a resolution or factorization rule on S that has no ground counterpart. Also some clauses are eliminated when applying ground resolution. Thus the set of clauses we obtain from S by applying factorization and resolution rules typically contains clauses that do not simulate any ground clause obtained from Sg. The next theorem states that while that may be true, the addition to S of these “non-simulating” clauses never turns S into an unsatisfiable set of clauses unless S is unsatisfiable before the application of any rule.
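The factorization and resolution rules applied to S in this discussion can be sketched as follows. The encoding is an assumption made for illustration only: variables are strings, atoms and terms are tuples headed by their symbol, a literal is a pair (sign, atom), and a clause is a frozenset of literals. The helper names `mgu`, `factors` and `resolvents` are hypothetical, and the two clauses passed to `resolvents` are assumed to have already been renamed apart.

```python
def is_var(t):
    return isinstance(t, str)

def subst(t, s):
    if is_var(t):
        return s.get(t, t)
    return (t[0],) + tuple(subst(a, s) for a in t[1:])

def occurs(x, t):
    return x == t if is_var(t) else any(occurs(x, a) for a in t[1:])

def mgu(t1, t2, s=None):
    """Most general unifier of two atoms or terms, or None."""
    s = {} if s is None else s
    t1, t2 = subst(t1, s), subst(t2, s)
    if t1 == t2:
        return s
    if is_var(t2) and not is_var(t1):
        t1, t2 = t2, t1
    if is_var(t1):
        if occurs(t1, t2):
            return None
        s = {v: subst(u, {t1: t2}) for v, u in s.items()}
        s[t1] = t2
        return s
    if t1[0] != t2[0] or len(t1) != len(t2):
        return None
    for a, b in zip(t1[1:], t2[1:]):
        s = mgu(a, b, s)
        if s is None:
            return None
    return s

def subst_clause(clause, s):
    return frozenset((sign, subst(atom, s)) for sign, atom in clause)

def factors(clause):
    """Factors obtained by unifying two literals of the same sign."""
    lits, out = list(clause), set()
    for i, (s1, a1) in enumerate(lits):
        for s2, a2 in lits[i + 1:]:
            sigma = mgu(a1, a2) if s1 == s2 else None
            if sigma is not None:
                out.add(subst_clause(clause, sigma))  # the two literals merge
    return out

def resolvents(c1, c2):
    """Resolvents on a positive literal of c1 and a negative literal of c2."""
    out = set()
    for s1, a1 in c1:
        for s2, a2 in c2:
            sigma = mgu(a1, a2) if (s1 and not s2) else None
            if sigma is not None:
                rest = (c1 - {(s1, a1)}) | (c2 - {(s2, a2)})
                out.add(subst_clause(rest, sigma))
    return out

# P(X) or Q(X), resolved with not P(f(a)), gives Q(f(a)).
c1 = frozenset({(True, ("P", "X")), (True, ("Q", "X"))})
c2 = frozenset({(False, ("P", ("f", ("a",))))})
c3 = frozenset({(True, ("P", "X")), (True, ("P", ("f", "Y")))})
```

Representing clauses as sets makes the two unified literals of a factor collapse automatically, which is one possible way to read the "normalized" clause mentioned in the text.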
  • 64. Theorem 4.8. (Soundness of resolution) Let S be a finite set of clauses and C be either a factor of a clause in S or the resolvent of two clauses in S. If S ∪ {C} is unsatisfiable then S is unsatisfiable.
    Proof. Let S′ = S ∪ {C} where C is either a factor of a clause in S or the resolvent of two clauses in S, and by contrapositive reasoning assume that S is satisfiable. By Theorem 4.3 there exists a Herbrand interpretation I that satisfies every instance of a clause in S. Assume that I does not satisfy every instance of a clause in S′. By construction of S′ there exists a ground substitution σ such that I does not satisfy the clause Cσ.
      • If C is a factor of a clause Cf ∈ S then Lemma 4.6 implies that Cfσ is also not satisfied by I, a contradiction with the assumption that I is a model of S;
      • If C is the resolvent of two clauses ξ1 ∨ C1, ¬ξ2 ∨ C2 ∈ S obtained by applying the substitution θ, i.e. C = (C1 ∨ C2)θ, then let τ = θσ. We have that I does not satisfy any literal in (C1 ∨ C2)τ whereas it satisfies both (ξ1 ∨ C1)τ and (¬ξ2 ∨ C2)τ. Since ξ1τ = ξ2τ, a case-based analysis on whether I satisfies ξ1τ or ¬ξ2τ yields a contradiction.
    We thus have the soundness of the factorization and resolution rules. If starting from a set S a finite sequence of applications of these rules reaches a set S′ containing [ ] then S is unsatisfiable. And if S is unsatisfiable one such finite sequence exists.
    Theorem 4.9. Let S be a finite set of clauses. Then S is unsatisfiable if, and only if, there exists a finite sequence of applications of the resolution and factorization rules that reaches a set of clauses S′ that contains [ ].
    Note that in Theorems 4.7 and 4.8 we mentioned the existence of a finite sequence of applications of the rules Fac(L1, L2, C) and Res(L1, L2, C1, C2), but never stated that we were sure to apply this sequence. However there is always a finite number of choices for applying resolution or factorization on each set of clauses obtained from S. It is thus possible to enumerate all the possible rule applications starting from S. While this enumeration is in general infinite, it will reach the empty clause if, and only if, the starting set of clauses is unsatisfiable.
    4.7 First-order Logic with Equality
    In Herbrand’s theorem, the cornerstone of the reduction of any interpretation satisfying a theory T to a Herbrand interpretation satisfying T is that in the latter domain, the function symbols are interpreted as one-to-one functions of disjoint image. For this reason Herbrand interpretations fail to capture natively simple facts such as 1 + 1 = 2: the terms on the two sides of the
  • 65. equality are syntactically distinct, and thus this atom may be interpreted as true or false.
    It is obvious that for expressiveness reasons, it is important to handle the equality symbol efficiently to be able to reason on algebraic structures. We review in this section additional clauses that can be added to a theory T to ensure that in any interpretation I satisfying T the equality atoms are interpreted as they should (e.g. that x = y implies y = x and f(x) = f(y)). Then we present the special case of equational theories, which are sets of universally quantified unit positive clauses, and are the core of my work on the refutation of cryptographic protocols.
    4.7.1 Axiomatizing Equality in First-Order Logic
    The first approach consists in adding, to a first-order theory T that contains the equality predicate, clauses that express its properties. Since equality is a congruence it must satisfy the following axioms w.r.t. the function and predicate symbols defined in an interpretation I:
      Reflexivity: ∀x, x = x;
      Symmetry: ∀x∀y, x = y ⇒ y = x;
      Transitivity: ∀x∀y∀z, (x = y ∧ y = z) ⇒ x = z;
      Congruence on functions: For every function symbol f of arity n, for every 1 ≤ i ≤ n we have ∀x1 . . . ∀xn∀y, xi = y ⇒ f(x1, . . . , xi−1, xi, xi+1, . . . , xn) = f(x1, . . . , xi−1, y, xi+1, . . . , xn);
      Congruence on atoms: For every predicate symbol p of arity n, for every 1 ≤ i ≤ n we have ∀x1 . . . ∀xn∀y, (xi = y ∧ p(x1, . . . , xi−1, xi, xi+1, . . . , xn)) ⇒ p(x1, . . . , xi−1, y, xi+1, . . . , xn).
    This set of clauses is called K and was given by [53]. While it is complete, the Congruence on atoms clauses can be resolved with any clause. The ensuing combinatorial explosion makes it an impractical choice for automated theorem proving. Since it is practical to reason modulo these equations, given a first-order theory T we denote I |== T the fact that I |= T ∪ K.
    4.7.2 Unification Modulo an Equational Theory
    A fruitful research direction is to consider extensions of the resolution rule, such as paramodulation [216] and its superposition [44, 141] variant, that take into account the properties of the equality predicate. However in many cases the clauses that contain the equality predicate contain only one positive literal.
  • 66. Example 14. In order to model lists one can use one nullary function symbol “elist”, and one binary function symbol “cons”. The usual list operations “head” and “tail” can be modeled by the clauses:
      ∀x∀l, head(cons(x, l)) = x
      ∀x∀l, tail(cons(x, l)) = l
    Definition 15. (Equational theory) An equational theory E is a conjunction of clauses ∀x1 . . . ∀xn, t = s where t and s are terms with variables among x1, . . . , xn.
    Plotkin [181] was the first to notice that when reasoning modulo an equational theory it suffices to consider the terms in the Herbrand domain modulo the equations. As a consequence the only adaptation needed w.r.t. our presentation of first-order logic is to consider unification modulo the equalities in the equational theory.
    Definition 16. (E-unifiers) Let E be an equational theory. We say that two terms t and s are E-equal, and denote s =E t, if E |== t = s. We say that a substitution σ is an E-unifier of s and t if E |== tσ = sσ.
    We say that two terms that have an E-unifier are E-unifiable. We extend the notion of unifier to conjunctions of equations as follows.
    Definition 17. (Unification systems) Let E be an equational theory. An E-unification system S is a finite set of equations denoted by {ui =? vi}i∈{1,...,n} with terms ui, vi ∈ T(F, X). It is satisfied by a substitution σ, and we note σ |=E S, if for all i ∈ {1, . . . , n} we have uiσ =E viσ.
    One easily proves that the definition of unifiers in Section 4.6.3 corresponds to the case where the equational theory E is an empty set of clauses. As in Section 4.6.3 we denote ΣE(t, t′) the set of E-unifiers of t and t′. Also, we say that a substitution σ is more general than a substitution τ modulo E, and index the mgt ordering of Section 4.6.3 with E, if there exists a substitution θ such that for every variable x we have xσθ =E xτ.
    Example 15. Consider the equational theory E = {f(x, f(y, y)) = x}. Then the substitution σ = {x ↦ f(y, z)} is more general than the substitution τ = {z ↦ f(v, v), x ↦ y} since, taking θ = {z ↦ f(v, v)}, for every variable w we have wσθ =E wτ.
    As Example 15 demonstrates we can have two unifiers that instantiate one another but are not a renaming one of the other, as was the case in Lemma 4.10. Since the relation between unifiers that are instances one of the other is more complex than in the case of the empty theory, we introduce the notion of complete set of unifiers.
    Definition 18. (Complete set of unifiers) Let E be an equational theory and t, t′ be two terms. We say that a subset S of ΣE(t, t′) is a complete set of unifiers of t and t′ if, for every substitution σ ∈ ΣE(t, t′) there exists a substitution τ ∈ S and a substitution θ such that τθ =E σ.
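The claim of Example 15 can be checked mechanically with a small sketch. It assumes that the single equation of E is oriented from left to right into the rewrite rule f(x, f(y, y)) → x, and it only relies on the fact that two terms with the same normal form are E-equal. The tuple encoding of terms, the candidate θ = {z ↦ f(v, v)} and the function names are assumptions of this illustration.

```python
def is_var(t):
    # Assumed encoding: variables are strings, other terms are tuples.
    return isinstance(t, str)

def subst(t, s):
    if is_var(t):
        return s.get(t, t)
    return (t[0],) + tuple(subst(a, s) for a in t[1:])

def normalize(t):
    """Innermost rewriting with the rule f(x, f(y, y)) -> x."""
    if is_var(t):
        return t
    t = (t[0],) + tuple(normalize(a) for a in t[1:])
    if t[0] == "f" and len(t) == 3:
        right = t[2]
        if (not is_var(right) and right[0] == "f"
                and len(right) == 3 and right[1] == right[2]):
            return t[1]  # the first argument is already in normal form
    return t

def f(a, b):
    return ("f", a, b)

sigma = {"x": f("y", "z")}
tau = {"z": f("v", "v"), "x": "y"}
theta = {"z": f("v", "v")}

# For each variable w, compare the normal forms of w.sigma.theta and w.tau.
checks = {w: normalize(subst(subst(w, sigma), theta)) == normalize(subst(w, tau))
          for w in ("x", "y", "z", "v")}
```

Every entry of `checks` being true shows that this θ witnesses that σ is more general than τ modulo E on the variables considered.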
  • 67. Example 16. In the empty theory, if Σ(t, t′) ≠ ∅ and if σ = mgu(t, t′), then both {σ} and {σθ | θ renaming of variables} are complete sets of unifiers of t and t′.
    As shown by Example 16 complete sets of unifiers may include redundancies. In order to obtain in the case of the empty theory the notion of unique most general unifier we thus consider minimal (for inclusion) complete sets of unifiers. One easily proves that such sets do not contain two substitutions of which one is the instance of the other.
    Lemma 4.19. Let E be an equational theory, t, t′ be two terms, and S, S′ be two minimal complete sets of unifiers of t and t′. Then S and S′ have the same cardinality.
    Proof. By definition of complete sets of unifiers, there exist two functions f, g such that:
      f : S → S′, σ ↦ σ′
      g : S′ → S, τ′ ↦ τ
    and f(σ) (resp. g(τ′)) is more general than σ (resp. τ′). Assume by contradiction that f is not injective. Then there exist σ1, σ2 ∈ S with σ1 ≠ σ2 such that f(σ1) = f(σ2) = σ′, and let σ = g(σ′). By definition of the “more general than” relation there exist three substitutions θ1, θ2, θ such that:
      σ′ = σθ
      σ1 = σ′θ1 = σθθ1
      σ2 = σ′θ2 = σθθ2
    Since σ1 ≠ σ2 let us assume wlog that σ ≠ σ1. By removing σ1 we still have a complete set of unifiers, which contradicts the minimality of S. Thus f must be injective. The same reasoning can be applied on g, and thus g is also injective. Since there are two injective functions from S to S′ and from S′ to S there exists a bijection between S and S′. Consequently these two sets have the same cardinality.
    An informal consequence of Lemma 4.19 is that there is no reason to favor one minimal complete set of unifiers over another. Given that we have actually proved that the “more general than modulo E” relation between elements in S and S′ is a bijection (since every function whose graph is contained in this relation must be injective) the different minimal complete sets of unifiers contain essentially the same substitutions.
    Definition 19. (Most general E-unifiers) Let E be an equational theory and t, t′ be two terms. We denote mguE(t, t′) a minimal complete set of unifiers of t and t′.
    As described above, the finiteness or even the existence of a minimal complete set of unifiers of two terms unifiable modulo E is not guaranteed. We classify the equational theories according to the possible cardinality of this set.
  • 68. Definition 20. Let E be an equational theory and t, t′ be any two E-unifiable terms. We say that:
      • E is nullary if mguE(t, t′) does not necessarily exist;
      • If mguE(t, t′) necessarily exists, we say that:
          – E is unary if mguE(t, t′) must be a singleton;
          – Otherwise, E is finitary if mguE(t, t′) must be a finite set;
          – Otherwise, E is infinitary if mguE(t, t′) can be an infinite set.
    Also, unification systems are classified w.r.t. the terms occurring in them. Let E be an equational theory in which the non-variable symbols occurring in the equations of E are in a signature F. We say that a unification system S is:
      Elementary if the terms occurring in S are in T(F, X);
      With constants if the terms occurring in S are built from symbols in F, variables, and nullary symbols not in F;
      General if the terms occurring in S are built from symbols in F, variables, and arbitrary symbols not in F.
    Accordingly we say that a symbol occurring in a term t is free (w.r.t. the equational theory E defined over the signature F) if it is not a symbol in F. In the rest of this document and when reasoning modulo an equational theory we denote C a denumerable set of free constants, i.e. nullary symbols not occurring in any equation of E.
    4.7.3 Some properties of E-unification systems
    There exist few properties that are common to all equational theories. However some of them are instrumental in our work on the analysis of cryptographic protocols, and are presented here. In the rest of this section, we assume that E is an equational theory defined by equations over a signature F, that C is a denumerable set of constants not occurring in F, and that T(F, X) and T(F) denote respectively the sets of terms and of ground terms built over the signature F ∪ C.
    Existence of a convergent rewriting relation
    We shall first introduce the notion of ordered rewriting [100]. Let < be a simplification ordering on T(F) (see footnote 5) assumed to be total on T(F) and such that the minimum for < is a constant cmin ∈ C. Given a possibly infinite set of equations O on the signature T(F) we define the ordered rewriting relation →O by s →O s′ iff there exists a position p in s, an equation l = r in O and a substitution τ such that s = s[p ← lτ], s′ = s[p ← rτ], and lτ > rτ.
    Footnote 5: by definition < satisfies for all s, t, u ∈ T(F): s < t[s], and s < u, t|p = s imply t < t[p ← u].
  • 69. It has been shown (see [100]) that by applying the unfailing completion procedure [123] to a set of equations E one can derive a (possibly infinite) set of equations O such that:
      1. the congruence relations =O and =E are equal on T(F);
      2. →O is convergent (i.e. terminating and confluent, see footnote 6) on T(F).
    We shall say that O is an o-completion of E.
    The relation →O being convergent on ground terms we define (t)↓O as the unique normal form of the ground term t for →O. Given a ground substitution σ we denote by (σ)↓O the substitution with the same support such that for all variables x ∈ Supp(σ) we have x(σ)↓O = (xσ)↓O. A substitution σ is normal if σ = (σ)↓O.
    Replacement
    An important property of E-unification systems, whose proof can be found in [70], is the following replacement property. Given terms u, v, t, we denote by tδu,v the parallel replacement of all occurrences of u by v in t. Given a substitution σ we denote by σδu,v the substitution such that x(σδu,v) = (xσ)δu,v for every variable x.
    Remark 1. A replacement behaves like a substitution, with the main difference being that it replaces a term, and not a variable, with another term. The use of replacements instead of substitutions is mandatory from a technical point of view: unfailing completion provides one with a convergent rewriting system on ground terms when they are totally ordered with a simplification ordering. Non-ground terms are generally speaking never totally ordered by a simplification ordering, the rationale being that two distinct variables cannot be ordered by a liftable ordering (proof left to the reader).
    Let us first extend the notion of free constant w.r.t. an equational theory E. Let T be a set of terms. We say that a term t is bound by σ in T whenever there exists r ∈ T \ X such that rσ =∅ t. A term t is σ-free in T if it is not bound by σ in T. We say that t is bound in T if there exists σ such that t is bound by σ in T. Otherwise we say that t is free in T. Given an equational theory E let us define:
      TE = ⋃(r=s ∈ E) (Sub(r) ∪ Sub(s))
    We say that a term t is bound (resp. free) in E if t is bound (resp. free) in TE. Given a term t and an equational theory E we call the factors of t, and denote Factors(t), the set of maximal strict subterms of t which are free in E. First let us note an important result that has a trivial proof.
    Footnote 6: if two terms t1, t2 are equal modulo =O there exists a term t3 reachable from both t1 and t2 by a sequence of ordered rewriting steps.
  • 70. Lemma 4.20. (Subterms and Substitutions) Let t be a term and σ be a substitution of domain Var(t). Then:
      Sub(tσ) = (Sub(t) \ X)σ ∪ Sub(σ)
    Proof. By induction on the structure of terms. The lemma is trivial for variables and constants. For the induction case it suffices to note:
      Sub(f(t1σ, . . . , tnσ)) = {f(t1σ, . . . , tnσ)} ∪ ⋃(i=1..n) Sub(tiσ)
                              = {f(t1, . . . , tn)}σ ∪ ⋃(i=1..n) ((Sub(ti) \ X)σ ∪ Sub(σ))
                              = ({f(t1, . . . , tn)}σ ∪ ⋃(i=1..n) (Sub(ti) \ X)σ) ∪ Sub(σ)
                              = (Sub(f(t1, . . . , tn)) \ X)σ ∪ Sub(σ)
    I.e. if a term t is σ-free in Sub(r) then every occurrence of t in rσ is “in” the instance of a variable. In order to demonstrate its usage we reference it explicitly in the proof of the next lemma. Since it is trivial, Lemma 4.20 will subsequently be employed without being referred to.
    Lemma 4.21. (Replacement of free subterms) Let t be a σ-free term in Sub(r). Then for every term u we have:
      (rσ)δt,u = r(σδt,u)
    Proof. Since t is σ-free in Sub(r) we have t ∉ (Sub(r) \ X)σ. Thus by Lemma 4.20 for every position p such that (rσ)|p = t there exists a variable x ∈ Var(r) such that t ∈ Sub(xσ). Thus this variable must be in a position q ≤ p, and there exists a position q′ such that (xσ)|q′ = t and q · q′ = p. Thus we have (x(σδt,u))|q′ = u and thus r(σδt,u)|p = u. Since this is true for every position p such that (rσ)|p = t, all the replacements performed when computing (rσ)δt,u are performed when computing r(σδt,u).
    Conversely for every position q′ and every variable x ∈ Var(r) at position q such that (xσ)|q′ = t there is an occurrence of t in rσ at position q · q′. Thus we do not apply more replacements in r(σδt,u) than in (rσ)δt,u.
    Lemma 4.22. (Replacement lemma) Let E be a consistent equational theory, r, s be two ground terms such that r =E s and such that the factors of r and s are in normal form modulo E. Let t be a free term in E which is in normal form modulo E, and u be any ground term. Then rδt,u =E sδt,u.
  • 71. Proof. By contradiction let us assume the set Ω of couples (r, s) which are counterexamples to the lemma is not empty. Since for each (r, s) ∈ Ω we have r =E s and since =E is a congruence, let µ(r, s) be the minimal number of equations in E to apply to rewrite r into s. Since Ω cannot contain a couple (r, r) (for which the lemma would be trivially true) the minimum of µ over Ω is strictly positive. This minimum cannot be greater than or equal to 2 for otherwise we would have r =1E r′ =E s (where =1E denotes the equality after the application of exactly one equation in E) with r ≠ r′ and r′ ≠ s, and thus either rδt,u ≠E r′δt,u or r′δt,u ≠E sδt,u. We thus have both µ(r, r′) < µ(r, s) and µ(r′, s) < µ(r, s). Since at least one of these couples must be in Ω we contradict the minimality of µ(r, s).
    Thus if Ω ≠ ∅ there exist two terms r, s whose factors are in normal form, a term t free in E, and a term u such that r =1E s but rδt,u ≠E sδt,u. We have:
      • We recall that t is a free term in E in normal form. Thus by definition of factors every occurrence of t in r, s must be a subterm of a factor;
      • Let g = d be the equation in E applied at position p in r that yields the term s, i.e. there exists a substitution σ such that r|p = gσ, and s = r[p ← dσ]. Since t is a free term in E it is free in Sub(g, d);
      • Thus by Lemma 4.21 we have (gσ)δt,u = g(σδt,u) and (dσ)δt,u = d(σδt,u);
      • Thus the same equation can be applied at the same position between rδt,u and sδt,u with the substitution σδt,u, and therefore rδt,u =E sδt,u;
      • This contradicts the membership of the couple (r, s) in Ω.
    Thus we must have Ω = ∅, which proves the lemma.
    When studying terms modulo an equational theory an interesting point to consider is the conditions under which one can “combine” Lemmas 4.21 and 4.22 to obtain a replacement lemma for solutions of a unification system modulo an equational theory. The main difficulty here is that Lemma 4.22 assumes that the factors are already in normal form. However when one considers an arbitrary set of equations it is not true, in general, that a bottom-up rewriting strategy is complete. One way to recover completeness for such a strategy is to use ordered rewriting with the o-completion of the equational theory. The complete proof of the following lemma can be found in [70, 76].
    Lemma 4.23. For any equational theory E, if an E-unification system S is satisfied by a substitution σ, and c is any constant in C away from S, then for any term t, σδc,t is also a solution of S.
    The proof of Lemma 4.23 consists in first analyzing the unfailing completion algorithm to prove that no free constant occurs in the equations of the ordered completion of a theory E, and thus that c free in E implies that c is free in any o-completion of E. One then considers a sequence of ordered rewriting transitions from a term t to its normal form and proves that rewriting commutes with the replacement δc,t.
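The replacement operation δ and the role of the σ-freeness hypothesis of Lemma 4.21 can be illustrated on a small instance. The sketch below works in the empty theory only, uses an assumed tuple encoding of terms (variables as strings), and the names `replace`, `replace_subst` and `sigma_free` are hypothetical.

```python
def is_var(t):
    return isinstance(t, str)

def subst(t, s):
    if is_var(t):
        return s.get(t, t)
    return (t[0],) + tuple(subst(a, s) for a in t[1:])

def replace(term, t, u):
    """Parallel replacement of every occurrence of t by u in term."""
    if term == t:
        return u
    if is_var(term):
        return term
    return (term[0],) + tuple(replace(a, t, u) for a in term[1:])

def replace_subst(s, t, u):
    """The substitution x -> (x s) with t replaced by u."""
    return {x: replace(v, t, u) for x, v in s.items()}

def sigma_free(t, r, s):
    """True if no non-variable subterm r1 of r satisfies r1.s == t."""
    if is_var(r):
        return True
    if subst(r, s) == t:
        return False
    return all(sigma_free(t, a, s) for a in r[1:])

t = ("m", ("a",))   # the term being replaced
u = ("b",)          # the term replacing it

# Case 1: t only appears inside the instance of the variable X.
r1, sigma1 = ("h", "X", ("k", "X")), {"X": ("m", ("a",))}
lhs1 = replace(subst(r1, sigma1), t, u)
rhs1 = subst(r1, replace_subst(sigma1, t, u))

# Case 2: t is bound by sigma2 in Sub(r2), and the equality fails.
r2, sigma2 = ("m", "X"), {"X": ("a",)}
lhs2 = replace(subst(r2, sigma2), t, u)
rhs2 = subst(r2, replace_subst(sigma2, t, u))
```

In the first case both sides are equal, as the lemma predicts; the second case shows why the hypothesis cannot simply be dropped.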
  • 72. For the empty theory this lemma admits a kind of reciprocal:
    Lemma 4.24. If σ satisfies a ∅-unification system S and for all s ∈ Sub(S) we have sσ ≠ t then for any constant c not occurring in t, (sσ)δt,c = s(σδt,c). Hence σδt,c is also a solution of S.
    Proof. By structural induction on the term s. If s is a constant, sσ ≠ t implies s ≠ t and thus s = (sσ)δt,c = s(σδt,c). If s is a variable we simply apply the definition of replacement to get (sσ)δt,c = s(σδt,c). If s = f(s1, . . . , sn), sσ ≠ t implies (f(s1, . . . , sn)σ)δt,c = f((s1σ)δt,c, . . . , (snσ)δt,c) and we apply the induction hypothesis to (siσ)δt,c.
    4.8 Conclusion
    The material presented in this chapter is classical, and could have been referenced instead of included. However, given its importance as the background of all my work on cryptographic protocols and Web Services, I hope that the choice of including this material, with a focus on the points on which the rest of this document depends, makes it easier to read.
  • 73. Algorithm 4.2: A procedure Unif(t, t′, θ) computing the mgu of tθ and t′θ
      if ∀p ∈ Pos(t) ∩ Pos(t′), Symb(t, p) = Symb(t′, p) then
          {the terms are syntactically equal}
          return θ
      else
          {there exists p ∈ Pos(t) ∩ Pos(t′) with Symb(t, p) ≠ Symb(t′, p)}
          let p ∈ Pos(t) ∩ Pos(t′) be such that Symb(t, p) ≠ Symb(t′, p)
          if Symb(t, p) ∉ X ∧ Symb(t′, p) ∉ X then
              {terms not unifiable by Lemma 4.13}
              return error, clash found
          else if Symb(t, p) ∈ X ∧ Symb(t′, p) ∈ X then
              {Two variables, substitution by Lemma 4.14}
              let σ = {Symb(t, p) ↦ Symb(t′, p)}
              return Unif(tσ, t′σ, θσ ∪ σ)
          else if Symb(t, p) ∈ X ∧ Symb(t′, p) ∉ X then
              {One variable, one term, substitution or fail by Lemma 4.15}
              if Symb(t, p) ∈ Var(t′|p) then
                  return error, occur-check failed
              else
                  let σ = {Symb(t, p) ↦ t′|p}
                  return Unif(tσ, t′σ, θσ ∪ σ)
              end if
          else
              {Symb(t, p) ∉ X ∧ Symb(t′, p) ∈ X}
              {One variable, one term, substitution or fail by Lemma 4.15}
              if Symb(t′, p) ∈ Var(t|p) then
                  return error, occur-check failed
              else
                  let σ = {Symb(t′, p) ↦ t|p}
                  return Unif(tσ, t′σ, θσ ∪ σ)
              end if
          end if
      end if
  • 75. Chapter 5
    Refinements of Resolution
    Refinements of resolution are restrictions on the possible factorization or resolution inferences between clauses, as well as simplifications on the set of clauses under scrutiny. The first motive for the introduction of these restrictions was practical as it accelerated the search of the empty clause (see the discussion in [95]). It later turned out that in some cases resolution with refinements starting from a theory T terminates with a set of clauses T′ that is not unsatisfiable. These sets are called saturated w.r.t. the refinement adopted, and can be employed to decide whether the theory T entails a sentence ϕ [112]. The goal of this chapter is to present the refinement proposed in collaboration with Mounira Kourjieh. To this end we do not provide an overview of all existing refinements as the one in [18] but instead focus on the ones related to our own.
    5.1 Ordered Resolution
    5.1.1 Liftable orderings
    While resolution is much more efficient than the naive algorithm to prove that a finite set of clauses is unsatisfiable, its degree of non-determinism still makes it unfit as soon as the theory under scrutiny has more than a few clauses each with few literals. In Chapter 4 we have proved the following theorem on finite sets of ground clauses.
    Theorem 4.6, p. 59. Let S be a finite set of ground clauses over the atoms ξ1, . . . , ξk. Then S is unsatisfiable if, and only if, Resgr(ξ1, . . . Resgr(ξk, S)) contains the empty clause.
    We remark that the atoms ξ1, . . . , ξk can be chosen in an arbitrary order. Thus let us assume an arbitrary ordering over the atoms in the Herbrand universe of a theory T.
  • 76. Corollary 5.1. (of Theorem 4.6) Consider an arbitrary ordering over the atoms in the Herbrand universe of a theory T. Let S be a finite set of ground instances of clauses in T, and ξ1, . . . , ξk be the atoms occurring in S. If for all 1 ≤ i ≤ k the atom ξi is maximal for this ordering in {ξ1, . . . , ξi}, then S is unsatisfiable if, and only if, Resgr(ξ1, . . . Resgr(ξk, S)) contains the empty clause.
    We recall that the operation Resgr(ξ, S) consists in applying eagerly the ground factorization on ξ on the clauses in S, then adding all the resolvents of resolution on ξ between the obtained clauses, and finally removing all the clauses that contain the atom ξ. Thus by definition the atom ξ does not occur in Resgr(ξ, S), and therefore at each step i in Resgr(ξ1, . . . Resgr(ξk, S)) the atom ξi on which ground resolution and factorization are applied is maximal for the ordering w.r.t. the atoms ξ1, . . . , ξi of Resgr(ξi+1, . . . Resgr(ξk, S)).
    As usual this corollary on a finite set Sg of ground instances of clauses in T is not sufficient to derive a practical procedure testing whether T is unsatisfiable. However we know that the set S of clauses in T simulates Sg, and that the lifting lemmas 4.17 and 4.18 extend this simulation to the clauses computed by ground resolution and factorization on Sg. To restrict the usage of factorization and resolution it suffices to import the ordering constraints in a finite set of ground clauses to a set of clauses that simulates it. This is the role of the restriction to liftable orderings, which preserve maximality in the following sense.
    Definition 21. (Liftable orderings) An ordering on atoms is liftable if, and only if, for all atoms ξ1, ξ2 and for all substitutions σ, whenever ξ1σ is greater than ξ2σ for this ordering, ξ1 is greater than ξ2.
    Lemma 5.1. (Preservation of maximality) Let l ∨ C be a clause and σ be a ground substitution. If the atom ξσ in lσ is maximal for a liftable atom ordering w.r.t. the atoms occurring in Cσ, then the atom occurring in l is maximal w.r.t. the atoms occurring in C.
    Proof. Let ξ be the atom occurring in l and assume ξσ is maximal for a liftable ordering among the atoms ξ1σ, . . . , ξkσ occurring in Cσ. By maximality, for 1 ≤ i ≤ k the atom ξσ is greater than ξiσ. Since the ordering is liftable this implies that for 1 ≤ i ≤ k the atom ξ is greater than ξi. Thus the atom occurring in l is maximal w.r.t. the atoms occurring in C.
    5.1.2 Pre- and Post-ordered resolution
    We elaborate on Lemma 5.1 to define factorization and resolution rules in which the atom in the factored or resolved literal is maximal w.r.t. the other atoms occurring in the clause(s). We have two flavors of such rules depending on whether the maximality is tested before or after the most general unifier is applied on the clauses.
    Post-ordered resolution
    We consider the two following rules applicable on a set of clauses S given a liftable ordering on atoms:
  • 77. Post-ordered factorization: If l_1 ∨ l_2 ∨ C is a clause and ξ_i is the atom occurring in l_i for i ∈ {1, 2}, then if σ = mgu(l_1, l_2), and if both ξ_1σ and ξ_2σ are maximal w.r.t. the atoms occurring in Cσ, then l_1σ ∨ Cσ is a post-ordered factor of l_1 ∨ l_2 ∨ C;
Post-ordered resolution: If ξ_1 ∨ C_1 and ¬ξ_2 ∨ C_2 are two clauses such that σ = mgu(ξ_1, ξ_2) and ξ_1σ (resp. ξ_2σ) is maximal w.r.t. the atoms occurring in C_1σ (resp. C_2σ), then (C_1 ∨ C_2)σ is a post-ordered resolvent of ξ_1 ∨ C_1 and ¬ξ_2 ∨ C_2.
We call post-ordered resolution the iterated application of the post-ordered factorization and resolution rules.
We note that whenever a post-ordered factorization or resolution rule can be applied on one or two clauses, then factorization or resolution can be applied on the same clauses and yields the same factor or resolvent. Thus Theorem 4.8 implies that if an iterated application of the post-ordered factorization and resolution rules on a set of clauses S reaches the empty clause [ ], then S is unsatisfiable. However, since we have restricted the possible applications of factorization and resolution, the completeness part of Theorem 4.8 does not carry over automatically. It is however preserved thanks to Corollary 5.1 and Lemma 5.1.
Theorem 5.1. (Completeness of post-ordered resolution) If S is an unsatisfiable set of clauses there exists a finite sequence of applications of post-ordered factorization and resolution starting from S and reaching the empty clause [ ].
Proof. By Theorem 4.4, S unsatisfiable implies that there exists an unsatisfiable finite set S_g of ground instances of clauses in S. By definition of the simulation relation, S simulates S_g. By Corollary 5.1 there exists a finite sequence of ground factorization and resolution rules starting from S_g that reaches the empty clause and such that, for each rule application:
ground factorization on l_g ∨ l_g ∨ C_g: let ξ_g be the atom occurring in l_g and ξ′_g an atom occurring in C_g. We have ξ_g ⊀_a ξ′_g;
ground resolution between ξ_g ∨ C_g and ¬ξ_g ∨ C′_g: for every atom ξ′_g occurring in C_g or C′_g we have ξ_g ⊀_a ξ′_g.
Let S′_g be a finite ground unsatisfiable set of clauses and S′ be such that S′ simulates S′_g. Let us prove that for every application, with the above restrictions, of the ground factorization or resolution rule on S′_g there exists a post-ordered factorization or resolution rule applicable on S′ that preserves the simulation.
Factorization. Assume l_g ∨ l_g ∨ C_g ∈ S′_g, let ξ_g be the atom occurring in l_g, and ξ′_g be an atom occurring in C_g. Since S′ simulates S′_g there exists a clause l_1 ∨ l_2 ∨ C ∈ S′ and a ground substitution σ such that l_1σ = l_2σ = l_g and Cσ = C_g. By Lemma 4.18 there exists θ = mgu(l_1, l_2) and a ground substitution τ such that ((l_1 ∨ C)θ)τ = l_g ∨ C_g. By Lemma 5.1 the atom occurring in l_1θ is maximal for ≺_a w.r.t. the atoms occurring in Cθ. Thus (l_1 ∨ C)θ is a post-ordered factor of a clause in S′ that simulates l_g ∨ C_g.
  • 78. Resolution. Assume ξ_g ∨ C, ¬ξ_g ∨ C′ ∈ S′_g, and that ξ_g is maximal w.r.t. the atoms occurring in C and C′. Since S′ simulates S′_g there exist by Lemma 4.17 two clauses ξ_1 ∨ C_1, ¬ξ_2 ∨ C_2 ∈ S′ and two substitutions θ and τ such that:
• ((ξ_1 ∨ C_1)θ)τ = ξ_g ∨ C and ((¬ξ_2 ∨ C_2)θ)τ = ¬ξ_g ∨ C′;
• ξ_1θ = ξ_2θ.
By Lemma 5.1 ξ_1θ is maximal w.r.t. the atoms occurring in C_1θ and C_2θ, and thus (C_1 ∨ C_2)θ is a post-ordered resolvent of ξ_1 ∨ C_1 and ¬ξ_2 ∨ C_2 ∈ S′ that simulates C ∨ C′.
Thus if S is unsatisfiable there exists a finite sequence of post-ordered factorization and resolution rule applications that reaches a set of clauses containing [ ].
Pre-ordered Resolution
When implementing a resolution theorem prover, it can be costly to test after each tentative factorization or resolution whether the factored or resolved atom is maximal. Thus one sometimes prefers to compute the set of maximal atoms in a clause only once, and to compute the ordered factors and resolvents w.r.t. the maximal atoms found. This schema corresponds to the two following rules, applicable on a set of clauses S given a liftable ordering ≺_a:
Pre-ordered factorization: If l_1 ∨ l_2 ∨ C is a clause and ξ_i is the atom occurring in l_i for i ∈ {1, 2}, then if σ = mgu(l_1, l_2), and if both ξ_1 and ξ_2 are maximal w.r.t. the atoms occurring in C, then l_1σ ∨ Cσ is a pre-ordered factor of l_1 ∨ l_2 ∨ C;
Pre-ordered resolution: If ξ_1 ∨ C_1 and ¬ξ_2 ∨ C_2 are two clauses such that σ = mgu(ξ_1, ξ_2) and ξ_1 (resp. ξ_2) is maximal w.r.t. the atoms occurring in C_1 (resp. C_2), then (C_1 ∨ C_2)σ is a pre-ordered resolvent of ξ_1 ∨ C_1 and ¬ξ_2 ∨ C_2.
We call pre-ordered resolution the iterated application of the pre-ordered factorization and resolution rules.
We note that every pre-ordered factorization rule application is a factorization rule application, and every pre-ordered resolution rule application is a resolution rule application. Thus the soundness of resolution implies the soundness of pre-ordered resolution.
Also we note that, since the ordering is liftable, Lemma 5.1 implies that every post-ordered factorization rule application is a pre-ordered factorization rule application, and that every post-ordered resolution rule application is a pre-ordered resolution rule application. Thus the completeness of post-ordered resolution implies the completeness of pre-ordered resolution.
Theorem 5.2. (Soundness and completeness of pre-ordered resolution) A set S of clauses is unsatisfiable if, and only if, there exists a finite sequence of pre-ordered factorization and resolution rule applications starting from S and reaching a set of clauses containing [ ].
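The ground procedure behind Theorem 4.6 and Corollary 5.1 can be sketched in a few lines of Python. This is only an illustration under assumptions of ours: ground clauses are encoded as frozensets of (atom, polarity) pairs, so ground factorization is implicit in the set representation, and the names res_gr and unsatisfiable are not taken from the text.

```python
from itertools import product

def res_gr(atom, clauses):
    """One elimination step Res_gr(atom, S) on a set of ground clauses.

    A clause is a frozenset of (atom, polarity) literals, so duplicate
    literals collapse and ground factorization needs no separate step.
    """
    pos = [c for c in clauses if (atom, True) in c]
    neg = [c for c in clauses if (atom, False) in c]
    # Clauses that do not mention the atom are kept unchanged.
    kept = {c for c in clauses
            if (atom, True) not in c and (atom, False) not in c}
    for p, n in product(pos, neg):
        resolvent = (p - {(atom, True)}) | (n - {(atom, False)})
        # Resolvents still mentioning the atom come from tautologies: drop them.
        if (atom, True) not in resolvent and (atom, False) not in resolvent:
            kept.add(resolvent)
    return kept

def unsatisfiable(clauses, atoms_increasing):
    """Eliminate atoms from the largest to the smallest, then look for [ ].

    atoms_increasing lists every atom of the clause set, smallest first;
    eliminating the largest atom first matches Corollary 5.1.
    """
    current = set(clauses)
    for atom in reversed(atoms_increasing):
        current = res_gr(atom, current)
    return frozenset() in current

# Hypothetical example: {p}, {not p or q}, {not q} has no model.
p, q = "p", "q"
unsat_set = [frozenset({(p, True)}),
             frozenset({(p, False), (q, True)}),
             frozenset({(q, False)})]
sat_set = [frozenset({(p, True), (q, True)})]
```

Calling unsatisfiable(unsat_set, [p, q]) eliminates q first and then p. Since Theorem 4.6 allows any elimination order, the ordering only matters once the ground steps have to be lifted to non-ground clauses.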
  • 79. Conclusion
These completeness theorems were first proved in [153, 154, 135], using either the inverse method [153, 154] or semantic trees [135]. Another approach of note to prove completeness consists in explicitly building a Herbrand interpretation [18]. The argument we have employed is a variation of the one in [135], but without the machinery of semantic trees. In particular we use an ordering on the atoms, whereas [153, 154] employs an ordering on the literals. The major difference with [135] is that we first obtain a finite set of atoms from Herbrand's Theorem and then consider an ordering on this set, whereas Kowalski and Hayes obtain this set of atoms once an infinite semantic tree is built.
5.2 Previous Work on Ordered Saturation
When a resolvent C between two clauses of S is added to S we obtain an equisatisfiable set of clauses. Thinking in terms of procedures, we however want more than mere equisatisfiability, i.e. we want to ensure that some sort of progress happens when the resolvent is added. This notion of progress was formalized by Bachmair and Ganzinger in [17] by using an ordering on clauses. They remarked that the resolvent obtained by post-ordered resolution between two clauses is smaller, for a well-founded ordering on clauses based on the ordering on atoms, than one of the premises. This remark led to a criterion that permits one to remove a clause from a set of clauses when it does not contribute to progress. Later this result was built upon in [26] by defining a clause C to be redundant in S if it is entailed by a set of instances of clauses in S which are each smaller than C.
Let ≺_a be an atom ordering total on ground atoms and compatible with a term ordering ≺_t. Equipped with this definition Basin and Ganzinger have proved that a set S of clauses saturated by post-ordered resolution w.r.t. ≺_a is local w.r.t. ≺_a if S is reductive w.r.t. ≺_a and ≺_t, i.e. if for each ground instance C of a clause in S, if A is maximal in C, then for each atom B in C and for each term t occurring in B, there exists a term s occurring in A such that t ⪯_t s.
As a consequence of this locality result w.r.t. a total, well-founded atom ordering compatible with a term ordering ≺_t, Basin and Ganzinger proved that if a set of clauses S is reductive w.r.t. ≺_a and ≺_t and if, for every ground atom A, there exists only a bounded number of ground atoms smaller than A, then the ground entailment problems are decidable for S, i.e. the function
entailment(S, C) = Sat if S |= C, and Unsat otherwise
can be computed. The last part of the proof is trivial: by locality and the boundedness assumption, if S |= C then there exists a refutation of ¬C ∪ S in which only atoms smaller for ≺_a than those occurring in C occur. It then suffices to form all the ground instances of the clauses in S that satisfy this criterion.
  • 80. This construction yields a finite set of ground clauses whose unsatisfiability can be decided.
Introduction to our contribution. In contrast with this approach, I have proposed with Mounira Kourjieh an extension to finite sets of clauses of our work on saturated deduction systems (presented in Chapter 8). We removed the assumptions that ≺_a and ≺_t are total on ground atoms and terms¹, and replaced reductiveness and compatibility by the (admittedly more restrictive) liftability of the atom ordering and the condition that A ≺_a B implies Var(A) ⊆ Var(B). But more importantly, we removed the boundedness assumption, i.e. we do not assume that for every ground atom A there exists only a bounded number of ground atoms smaller than A. Having replaced the totality on ground terms, reductiveness and boundedness² assumptions by liftability and variable inclusion, we prove that if a set of clauses is saturated by ordered resolution w.r.t. a suitable ordering ≺_a then its ground entailment problem is decidable. We present this approach in the rest of this chapter. The short version of this result was presented at LPAR 16, in Dakar.
5.3 Decidability of ground entailment problems
5.3.1 Motivation
In [26, 25], D. Basin and H. Ganzinger showed that the ordered saturation of a set S of Horn clauses w.r.t. a well-founded and liftable ordering is not sufficient to obtain the decidability of the ground entailment problem for S, as demonstrated by the following example.
Example 17. (Uwe Waldmann, presented in [26, 25]) Let S be an arbitrary set of clauses and C be a ground clause. Construct S′ and C′ such that S′ consists of the clauses q() ∨ D for D ∈ S, and let C′ = q() ∨ C. Choose any ordering such that q() is the maximal atom, thereby implying that every proof of S′ |= C′ is order local. The ground entailment problem S |= C is trivially reducible to S′ |= C′. Since the former is in general undecidable, so is the latter problem. Thus there exist order local sets of Horn clauses whose ground entailment problem is undecidable.
Let ≺_a be an atom ordering. We note that in Example 17 it is possible to choose the ordering ≺_a to be well-founded and liftable. Let us prove that if one assumes, in addition to liftability and well-foundedness of ≺_a, that A ≺_a B implies Var(A) ⊆ Var(B), then ground entailment problems become decidable.
As usual we assume a functional signature F and a relational signature P, and denote T(F, X) the set of terms over F, and T(F) the Herbrand domain
¹ As remarked by Basin and Ganzinger in [26], the totality assumption does not lose generality when the ordering is bounded, as one can then try all the total extensions of the atom ordering. This construction is however not effective if the boundedness condition is removed.
² I insist, given that a majority of the reviewers of our submissions of this result insisted that it is entailed by the one by Basin and Ganzinger, or that the proof is the same.
  • 81. associated to the signature F. Given a clause C we denote atoms(C) the set of the atoms occurring in C, called its domain. We extend the notion of domain to sets of clauses as expected, with atoms(S) = ∪_{C∈S} atoms(C). We say that a clause is a unit clause if it contains only one literal. Given a clause C = l_1 ∨ … ∨ l_k we denote ¬C the set of unit clauses {¬l_1, …, ¬l_k}.
Ground entailment problem. We are interested in this section in giving conditions such that it is possible to decide whether a ground clause C is a logical consequence of a set of clauses S. Let us now formally define this problem. Given a set of clauses S, the ground entailment problem for S is the following decision problem:
Ground Entailment_S(C)
Input: a ground clause C
Output: Sat if and only if S |= C
Example 18. Let us consider the ordering on atoms defined as the closure under substitution (stability) of the ordering p(x, t(x, y)) ≺_a p(s(x), y), for any term t(x, y) having variables x and y. One easily sees that this atom ordering is well-founded (the length of a chain starting from an atom p(t_1, t_2) is bounded by the size of t_1) and that A ≺_a B implies Var(A) ⊆ Var(B). The quantification over any term t however implies that an atom may have an infinite number of atoms smaller than itself.
5.3.2 Locality and Saturation
Our presentation follows the historical development of first the notion of (subterm) locality as introduced by Givan and McAllester in [118] for sets of Horn clauses, and then the notion of order locality as defined by Basin and Ganzinger in [26, 25].
Subterm locality. Givan and McAllester's work [118] is based on Horn clauses. The local entailment of a clause C by a set of clauses S, denoted S |=_l C, means that there exists a finite set S^g of ground instances of clauses in S such that S^g, ¬C is unsatisfiable and such that every term occurring in a clause in S^g is a subterm of some term occurring in C.
A set of Horn clauses S is subterm local if for every ground Horn clause C we have S |= C if and only if S |=_l C. It is proved in [118] that if a set S of Horn clauses is finite and subterm local then its ground entailment problem is decidable in polynomial time.
Order locality. Basin and Ganzinger [26, 25] generalized Givan and McAllester's work by allowing any strict well-founded term ordering ≺_t over terms, and full (not Horn) clauses. Again, a set of clauses S is said to locally entail a ground
  • 82. clause C, which is denoted S |=_{≺_t} C, whenever there exists a finite set S^g of ground instances of clauses in S such that S^g, ¬C is unsatisfiable and such that every term occurring in a clause in S^g is smaller for ≺_t than a term occurring in C.
A set of clauses S is order local for the term ordering ≺_t whenever for every ground clause C we have S |= C iff S |=_{≺_t} C.
Given a term ordering ≺_t we can have at the same time (as e.g. for a lexicographic or recursive path ordering) that ≺_t is well-founded and is such that for some ground term t there exists an infinite set of terms t′ such that t′ ≺_t t. We remark that in this case order locality does not imply the decidability of ground entailment problems.
However it is often sufficient to consider term orderings of finite complexity. A term ordering ≺_t is said to be of complexity (f, g) whenever for each clause of size n (the size of a term is the number of nodes in its dag representation, and the size of a clause is the sum of the sizes of its terms) there exist O(f(n)) terms that are smaller than or equal (under ≺_t) to a term in the clause, and that may be enumerated in time g(n). It is easy to see that if ≺_t is of complexity (f, g) then each ground term has finitely many smaller terms, which may be enumerated in finite time [26, 25].
Theorem 5.3. (Basin, Ganzinger [26, 25]) If S is a set of Horn clauses that is order local with respect to a term ordering ≺_t of complexity (f, g) then the ground entailment problem for S is decidable.
The work we present can be considered as a weakening of the conditions under which order locality implies decidability. On the one hand Basin and Ganzinger mandate that the atom ordering must be total and well-founded on ground atoms, compatible with a term ordering of finite complexity, and that the set of clauses has to be reductive w.r.t. the atom and term orderings. On the other hand we do not consider the ordering on terms and assume that the ordering on atoms is well-founded, liftable and such that A ≺_a B implies Var(A) ⊆ Var(B).
5.3.3 Saturation
As specified above, we consider an atom ordering ≺_a which is liftable, well-founded and such that A ≺_a B implies Var(A) ⊆ Var(B).
Rewriting atoms
Definition. Rewriting systems are usually defined over terms and are employed to model equational theories. In contrast with this standard setting, we consider rewriting systems on atoms to define finitely branching orderings on atoms.
Definition 22. A rewriting system on atoms R based on ≺_a is a set of couples (L, R) where L and R are atoms with R ≺_a L. Each couple (L, R) is called a rewriting rule and is denoted L → R.
  • 83. We say that an atom A rewrites to B by the rewriting system on atoms R, or more simply that A rewrites to B by R, whenever there exists a rewrite rule L → R ∈ R and a substitution σ such that Lσ = A and Rσ = B. We denote this A →_R B. When R is a singleton {L → R} we simply write A →_{L→R} B.
Ordering defined by a rewriting system. Given a rewriting system on atoms R and an atom A we denote A↓_R the set of atoms reachable from A when applying rules in R. This notion is extended to sets of atoms by denoting S↓_R the union, for every atom A occurring in S, of the sets A↓_R. We let A↓⁻_R be the set A↓_R \ {A}. We write A ≺_R B whenever A ∈ B↓⁻_R.
Lemma 5.2. If R is a finite atom rewriting system based on ≺_a then for every ground atom C the set C↓_R is finite.
Proof. Consider the (infinite) directed graph whose vertices are ground atoms, and in which there is an edge from A to B whenever A →_R B. First we note that since in every rewrite rule L → R we have Var(R) ⊆ Var(L), every atom A has at most |R| successors. Second we note that A →_R B implies B ≺_a A, and thus this graph is acyclic. Also, the fact that ≺_a is well-founded implies that this graph does not contain any infinite path. Consider the (potentially infinite) tree built from the vertex C by considering the possible paths to all other nodes. We note that this tree is finitely branching and that every path in it is finite. Thus by König's lemma this tree has only a finite number of vertices. Since all atoms in C↓_R must be by definition vertices in this tree, we have that C↓_R is finite.
Rewriting systems defined by sets of clauses. Let S be a set of clauses. We define an atom rewriting system R(S) that captures the ordering relations between atoms in the clauses of S.
Definition 23. (Rewriting system based on a set of clauses) Let S be a finite set of clauses. The atom rewriting system R(S) is defined as the set of rewriting rules L → R such that there exists a clause C ∈ S with:
• L, R are two distinct atoms of C;
• We have R ≺_a L.
First let us remark that since S is finite we also have that R(S) is finite. We also remark that if S ⊆ S′, then R(S) ⊆ R(S′). Further, since the ordering ≺_a is liftable, we have that A →_R B also implies B ≺_a A.
As a consequence, since the ordering ≺_a is well-founded we conclude that the rewriting system R(S) is terminating for any finite set of clauses S. Furthermore, given two sets of clauses S and S′ and their associated rewriting systems R(S) and R(S′), we note that since the ordering ≺_a is fixed the union R(S) ∪ R(S′) is also terminating. We note that, given this definition, when adding to a set of clauses S a finite set of unit clauses S′ we have R(S) = R(S ∪ S′).
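To make the set A↓_R of Lemma 5.2 concrete, here is a small Python sketch that computes it for a ground atom. The encoding is an assumption of this example and not taken from the text: a compound term or atom is a tuple (symbol, arg_1, …), a constant is a lowercase string and a variable is a string starting with an uppercase letter. Termination of the search relies on the rules being based on a well-founded ordering, which the code does not check.

```python
def is_var(t):
    """Variables are strings starting with an uppercase letter (sketch convention)."""
    return isinstance(t, str) and t[:1].isupper()

def match(pattern, term, subst):
    """Extend subst so that pattern, once instantiated, equals the ground term.

    Returns the extended substitution, or None when no match exists.
    """
    if is_var(pattern):
        if pattern in subst:
            return subst if subst[pattern] == term else None
        return {**subst, pattern: term}
    if isinstance(pattern, str) or isinstance(term, str):
        return subst if pattern == term else None
    if pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for sub_pattern, sub_term in zip(pattern[1:], term[1:]):
        subst = match(sub_pattern, sub_term, subst)
        if subst is None:
            return None
    return subst

def substitute(t, subst):
    """Apply a substitution; assumes every variable of t is bound in subst."""
    if is_var(t):
        return subst[t]
    if isinstance(t, str):
        return t
    return (t[0],) + tuple(substitute(arg, subst) for arg in t[1:])

def reachable(atom, rules):
    """Return A↓_R for a ground atom A: all atoms reachable by the rules, A included.

    Each rule is a pair (L, R) with Var(R) ⊆ Var(L); a rule applies when
    L matches the whole atom, as in the definition of A →_R B.
    """
    seen = {atom}
    todo = [atom]
    while todo:
        current = todo.pop()
        for left, right in rules:
            subst = match(left, current, {})
            if subst is not None:
                successor = substitute(right, subst)
                if successor not in seen:
                    seen.add(successor)
                    todo.append(successor)
    return seen

# Hypothetical rule in the spirit of Example 18: p(s(X), Y) -> p(X, f(X, Y)).
rules = [(("p", ("s", "X"), "Y"), ("p", "X", ("f", "X", "Y")))]
start = ("p", ("s", ("s", "0")), "a")
down = reachable(start, rules)
```

Here down contains the start atom and the two atoms obtained by peeling off one s at a time; the search stops when the first argument is the constant 0 because the left-hand side no longer matches.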
  • 84. Redundancy
First let us define local entailment, i.e. the entailment by instances in which the atoms are smaller than those in the conclusion.
Definition 24. (Local entailment) Let S be a set of clauses, C be a clause and 𝒜 be a set of ground atoms. We say that S 𝒜-locally entails C whenever there exists an unsatisfiable finite set S_g of ground instances of S ∪ ¬C such that every atom A occurring in S_g is in 𝒜. We denote S |=_𝒜 C the 𝒜-local entailment of C by S.
Of course by definition S |=_𝒜 C for some set 𝒜 implies S |= C. The problem is to prove that the converse holds for some specific set 𝒜. We say that a substitution σ is a grounding of a clause C for a set of clauses S if:
• the domain of σ is the set of variables occurring in C;
• σ is one-to-one and maps each variable x to a constant c_x that does not occur in S or C.
We denote σ_{S,C} a substitution grounding C for the set of clauses S. Using these notations we have the following lemmas.
Lemma 5.3. Let S be a set of clauses and C be a clause. Using the above notations we have S |= Cσ_{S,C} iff S |= C.
Proof. Assume S |= Cσ_{S,C}. By Herbrand's theorem there exists a finite unsatisfiable set S_g of ground instances of S ∪ ¬Cσ_{S,C}. Let σ be an arbitrary ground substitution whose domain is Var(C), and δ_σ be the replacement of every constant c_x = xσ_{S,C} by xσ. By completeness of ground resolution there exists a finite sequence of resolution and factorization steps that deduces the empty clause from S_g. Since no constant c_x appears in S nor in C, this finite sequence can also be applied on S_gδ_σ to deduce the empty clause. By correctness of resolution this implies that no ground instance (¬C)σ of ¬C is satisfied in a model of S. Since an interpretation satisfies either a ground clause or its negation, this implies that all models of S are models of Cσ for any ground substitution σ. Thus we have S |= C. Conversely if S |= C then in particular S |= Cσ_{S,C}.
Lemma 5.4 follows immediately.
Lemma 5.4. The problem consisting in determining, given a finite set S of clauses, a ground clause C and a finite atom rewriting system R, whether S |=_{C↓_R} C is decidable.
Proof. It suffices to remark that, since C↓_R is finite by Lemma 5.2, the set of all instances of clauses in S with atoms occurring in C↓_R is finite.
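The finiteness argument in the proof of Lemma 5.4 can be illustrated by enumerating, for one clause, all its ground instances whose atoms belong to a given finite set. The Python sketch below assumes an encoding of our own (tuples for compound terms, uppercase strings for variables, clauses as lists of (atom, polarity) pairs) and a hypothetical function name instances_within; it shows only the enumeration step, not the subsequent unsatisfiability test.

```python
def is_var(t):
    """Variables are strings starting with an uppercase letter (sketch convention)."""
    return isinstance(t, str) and t[:1].isupper()

def match(pattern, term, subst):
    """Extend subst so that the instantiated pattern equals the ground term, else None."""
    if is_var(pattern):
        if pattern in subst:
            return subst if subst[pattern] == term else None
        return {**subst, pattern: term}
    if isinstance(pattern, str) or isinstance(term, str):
        return subst if pattern == term else None
    if pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for sub_pattern, sub_term in zip(pattern[1:], term[1:]):
        subst = match(sub_pattern, sub_term, subst)
        if subst is None:
            return None
    return subst

def substitute(t, subst):
    """Apply a substitution; assumes every variable of t is bound in subst."""
    if is_var(t):
        return subst[t]
    if isinstance(t, str):
        return t
    return (t[0],) + tuple(substitute(arg, subst) for arg in t[1:])

def instances_within(clause, allowed):
    """All ground instances of clause whose atoms all belong to the set allowed.

    clause is a list of (atom, polarity) literals; the result is a set of
    ground clauses, each a frozenset of (atom, polarity) literals.
    """
    found = set()

    def extend(index, subst):
        if index == len(clause):
            found.add(frozenset((substitute(atom, subst), polarity)
                                for atom, polarity in clause))
            return
        atom, _ = clause[index]
        # Try to map the current atom onto each allowed ground atom in turn.
        for ground in allowed:
            extended = match(atom, ground, subst)
            if extended is not None:
                extend(index + 1, extended)

    extend(0, {})
    return found

# Hypothetical clause: not p(X) or p(s(X)), with three allowed ground atoms.
clause = [(("p", "X"), False), (("p", ("s", "X")), True)]
allowed = {("p", "0"), ("p", ("s", "0")), ("p", ("s", ("s", "0")))}
instances = instances_within(clause, allowed)
```

Since every variable of a clause occurs in one of its atoms, each instance found is ground, and the number of candidates is bounded by |allowed| raised to the number of literals, which is the finiteness used in the proof.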
  • 85. Redundancy. When defining a redundant inference we allow, among the clauses demonstrating the redundancy of the inference, clauses that are strictly bigger than the entailed clause.
Definition 25. (Redundancy) Let R be a finite set of atom rewriting rules.
• A ground clause C is R-redundant in a set of clauses S if S |=_{C↓_R} C.
• A non-ground clause C is R-redundant in a set of clauses S if all its instances are R-redundant in S;
• Consider an inference by ordered resolution with premises C′, C″ and conclusion C, where the resolved atom is A. We say this inference is R-redundant in the set of clauses S if either C′ or C″ is R-redundant in S, or S |=_𝒜 Cσ_{S,C} with 𝒜 = Cσ_{S,C}↓_R ∪ Aσ_{S,C}↓⁻_R.
We note that this notion can be employed to relate a priori (pre-ordered) and a posteriori (post-ordered) resolution.
Lemma 5.5. Let C_1, C_2 be two clauses and let σ be a substitution such that the inference with premises C_1σ, C_2σ and conclusion C is an inference by a priori ordered resolution. Let R = R(C_1σ) ∪ R(C_2σ). Then this inference is R-redundant or is an inference by a posteriori ordered resolution.
Proof. Assume this is not an inference by a posteriori ordered resolution. Then the resolved atom A is not maximal for ≺_a in the set of atoms of C. Thus there exists in C_1σ or C_2σ an atom B with A ≺_a B. By definition we thus have B → A ∈ R. As a consequence all the atoms in C_1σ, C_2σ are in C↓_R. By definition this inference is R-redundant in {C_1, C_2}.
We may now define our notion of saturation up to redundancy for ordered resolution.
Definition 26. (Saturated sets of clauses) Let R be an atom rewriting system. We say that a set of clauses S is R-saturated up to redundancy under ordered resolution if any inference by ordered resolution from premises in S is R-redundant in S and if:
1. R(S) ⊆ R;
2. For each a priori ordered resolution inference between two clauses C_1, C_2 of S with substitution σ and conclusion C, if the resolved atom Aσ is not maximal in C_1σ, C_2σ then we have R(C_1σ, C_2σ) ⊆ R.
Let us now present a procedure that, starting from a finite set of clauses S, and provided it terminates, constructs a finite set S′ of clauses and an atom rewriting system R such that every ground entailment problem for a clause C is C↓_R-local. That is to say, for all ground clauses C, S |= C iff S′ |=_{C↓_R} C.
Saturation
Let us now present our saturation algorithm. Let S be a set of clauses, and ≺_a be a liftable, well-founded ordering on atoms such that A ≺_a B implies Var(A) ⊆ Var(B).
  • 86. Saturation procedure. The procedure starts from the couple (S, R(S)) and is iterated until a fixed point is reached. Each step is a transformation (S_1, R_1) → (S_2, R_2) constructed as follows:
• Let C_1, C_2 be two clauses in S_1, and C be the conclusion of an ordered resolution inference on C_1, C_2 where the substitution employed is σ and the resolved atom is Aσ.
• Three cases are possible:
Non-maximality: If Aσ is not maximal for ≺_a in the atoms of C_1σ, C_2σ then S_2 = S_1 and R_2 = R_1 ∪ R({C_1σ, C_2σ});
Redundancy: If S_1 |=_{C↓_{R_1}} C, then S_2 = S_1 and R_2 = R_1;
Discovery: Otherwise a new clause useful for establishing local proofs has been discovered. In this case we set S_2 = S_1 ∪ {C} and R_2 = R_1 ∪ R(C).
A sequence of steps is fair [18] if every possible inference by a priori ordered resolution is eventually performed.
Definition 27. (Result of the saturation procedure) Given a finite set of clauses S and an atom ordering ≺_a, we denote min_{≺_a}(S) a couple (S′, R) obtained by a fair sequence of steps of the saturation procedure, in case it terminates.
First let us prove that the procedure actually constructs a saturated set of clauses.
Proposition 5.1. Let S be a finite set of clauses and ≺_a be a liftable, well-founded atom ordering such that A ≺_a B implies Var(A) ⊆ Var(B). If the saturation procedure terminates on S and min_{≺_a}(S) = (S′, R) then S′ is R-saturated.
Proof. Assume there exist two clauses C_1, C_2 ∈ S′ and a substitution σ such that the inference with premises C_1σ, C_2σ and conclusion C is not R-redundant. In the saturation algorithm it thus falls into one of the non-maximality or discovery cases.
non-maximality: Assume the resolved atom A is not maximal in the atoms of C_1σ, C_2σ. Then this inference is not an inference by a posteriori ordered resolution. By Lemma 5.5 it is thus R(C_1σ) ∪ R(C_2σ)-redundant. Since it is not R-redundant we must have R(C_1σ) ∪ R(C_2σ) ⊈ R. This implies that (S′, R) is not a result of the saturation algorithm.
discovery: If (S′, R) were a result of the saturation algorithm we would have had C ∈ S′, which would trivially (for any atom rewriting system) have implied that the inference was redundant in S′.
As a consequence every inference between two clauses of (S′, R) must be R-redundant. We leave the conditions on R to the reader. Thus the set S′ is R-saturated by Definition 26.
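The control flow of the saturation procedure (non-maximality, redundancy, discovery) can be sketched as a loop parameterised by callables. Everything below is a hedged illustration: the function saturate and its parameters are names of ours, the toy instantiation works on ground clauses only with a hypothetical total order a < b < c, and the redundancy test is a crude stand-in (membership in S) for the local entailment test of the text.

```python
def saturate(clauses, rules_of, inferences, is_maximal, locally_entailed):
    """Skeleton of the saturation loop; all callables are assumptions of this sketch.

    rules_of(clause)          -> set of rewrite rules (L, R) with R smaller than L
    inferences(S)             -> iterable of (premise1, premise2, atom, conclusion),
                                 with premises already instantiated
    is_maximal(atom, c1, c2)  -> True when the resolved atom is maximal in the premises
    locally_entailed(S, R, c) -> stand-in for the local entailment test
    """
    S = set(clauses)
    R = set()
    for clause in S:
        R |= rules_of(clause)
    changed = True
    while changed:
        changed = False
        for c1, c2, atom, conclusion in list(inferences(S)):
            if not is_maximal(atom, c1, c2):
                # Non-maximality: only the rewriting system may grow.
                new_rules = rules_of(c1) | rules_of(c2)
                if not new_rules <= R:
                    R |= new_rules
                    changed = True
            elif locally_entailed(S, R, conclusion):
                # Redundancy: nothing is added.
                continue
            else:
                # Discovery: keep the conclusion together with its rules.
                S.add(conclusion)
                R |= rules_of(conclusion)
                changed = True
    return S, R

# Toy ground instantiation; clauses are frozensets of (atom, polarity) literals.
RANK = {"a": 0, "b": 1, "c": 2}  # hypothetical atom ordering a < b < c

def atoms(clause):
    return {atom for atom, _ in clause}

def rules_of(clause):
    return {(l, r) for l in atoms(clause) for r in atoms(clause)
            if RANK[r] < RANK[l]}

def inferences(S):
    for c1 in S:
        for c2 in S:
            for atom, positive in c1:
                if positive and (atom, False) in c2:
                    yield (c1, c2, atom,
                           (c1 - {(atom, True)}) | (c2 - {(atom, False)}))

def is_maximal(atom, c1, c2):
    return all(RANK[b] <= RANK[atom] for b in atoms(c1) | atoms(c2))

def locally_entailed(S, R, conclusion):
    return conclusion in S  # crude stand-in, not the test of Definition 24

start = [frozenset({("c", True)}),
         frozenset({("c", False), ("b", True)}),
         frozenset({("b", False), ("a", True)})]
S_sat, R_sat = saturate(start, rules_of, inferences, is_maximal, locally_entailed)
```

On this input the loop discovers the unit clauses b and then a, while the resolution on b between the second and third clauses falls in the non-maximality case and adds no clause. A faithful implementation would replace locally_entailed by a decision procedure in the spirit of Lemma 5.4 and inferences by unification-based a priori ordered resolution.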
  • 87. 5.3.4 Decidability of the ground entailment problem
We consider in this section an R-saturated set of clauses S. In spite of the differences in definitions we prove that, as in [26, 25], saturation implies locality in our sense. The spirit of the proof is a combination of those in [59, 26, 25].
Proposition 5.2. Let S be an R-saturated set of clauses, and C be a ground clause. Then S |= C implies S |=_{C↓_R} C.
Proof. Assume that S |= C, and let 𝒯 be the set of unsatisfiable finite sets of ground instances of S ∪ ¬C. By Herbrand's Theorem we know that 𝒯 ≠ ∅. Let 𝒯_min ⊆ 𝒯 be the set of finite sets T such that the set atoms(T)↓_R \ atoms(C)↓_R is minimal for the extension to sets of atoms of the ordering ≺_a. If this set of atoms is empty then we are done, as each T ∈ 𝒯_min is then an unsatisfiable finite set of ground instances of S ∪ ¬C in which all atoms are in C↓_R.
Otherwise for any T ∈ 𝒯_min the set of atoms in T is finite and therefore atoms(T)↓_R is also finite by Lemma 5.2. Thus we can consider a maximal element A (the same for all T in 𝒯_min) in atoms(T)↓_R \ C↓_R. Since A is maximal we also have that A is an atom occurring in T for each T ∈ 𝒯_min.
Claim 4. For any T ∈ 𝒯_min the atom A is maximal in atoms(T) for the ordering ≺_R.
Proof of the claim. By contradiction, if this were not the case there would exist B ∈ atoms(T) with A ≺_R B. Since A is maximal in atoms(T)↓_R \ C↓_R, B would not be in this set. Since B ∈ atoms(T) this would imply B ∈ C↓_R. By definition we would then have A ∈ C↓_R, which would contradict A ∈ atoms(T)↓_R \ C↓_R. ♦
Let T be in 𝒯_min, and let Leaves⁺_A be the set of clauses in T that contain the atom A, and Leaves⁻_A be the subset of clauses of T that do not contain A. Let us consider the set Leaves of all possible conclusions of resolution on A between clauses in Leaves⁺_A. The set of ground clauses Leaves ∪ Leaves⁻_A is also unsatisfiable.
Claim 5. Each clause C_A ∈ Leaves⁺_A is an instance with a substitution σ of a clause C^s_A ∈ S that has a maximal atom A^s for ≺_a with A^sσ = A.
Proof of the claim. By definition C_A is either an instance of a clause in S or of a clause in ¬C. Since A is not an atom occurring in C the latter case is excluded. Thus there exist C^s_A ∈ S, an atom A^s ∈ C^s_A, and a substitution σ such that A^sσ = A and C^s_Aσ = C_A. Finally, if A^s is not maximal for ≺_a in C^s_A then it is not maximal for ≺_R, and thus A cannot be maximal for ≺_R in the atoms of C_A. This would contradict the fact that A is maximal for ≺_R among the atoms occurring in T. ♦
Thus every resolution on A between clauses in Leaves⁺_A is an instance with substitution σ of an a priori ordered resolution inference between two clauses C_1 and C_2 of S. Let C_3 ∈ Leaves be its conclusion. Since S is R-saturated
  • 88. each such inference is redundant. We note that A maximal in atoms(T) for ≺_R and the fact that S is saturated (second point of the ordering condition) for R imply that A cannot be smaller for ≺_R than an atom in C_3. Thus for each conclusion C_3 we can define a set 𝒮(C_3) which is either:
• the singleton {C_3} if C_3 is an instance of a clause in S;
• or a set S^g_{C_3} of instances of clauses of S whose atoms are in C_3↓_R ∪ A↓⁻_R and that entails C_3.
The set of ground clauses S^g = Leaves⁻_A ∪ ⋃_{C_3 ∈ Leaves} 𝒮(C_3) is unsatisfiable. By construction we have atoms(S^g)↓_R ⊆ (atoms(T) \ {A})↓_R ∪ A↓⁻_R. Since A is maximal in atoms(T) for ≺_R and A is not in C↓_R, this implies that atoms(S^g)↓_R \ C↓_R ≺_a atoms(T)↓_R \ C↓_R. This contradicts the fact that T is in the set of minimal elements 𝒯_min.
We have already noted that S |=_{C↓_R} C trivially implies S |= C. As a consequence of Lemma 5.4 and of Proposition 5.2 we thus have the following proposition.
Proposition 5.3. If S is an R-saturated set of clauses then the ground entailment problems for S are decidable.
Our final theorem is a self-contained reformulation of the above proposition using the initial set of clauses.
Theorem 5.4. Let ≺_a be a well-founded, liftable atom ordering such that for any two atoms A and B we have A ≺_a B implies Var(A) ⊆ Var(B). Let S be a set of clauses, and assume that saturation terminates using the atom ordering ≺_a. Then the ground entailment problems for S are decidable.
Proof. Let (S′, R) be the result of the saturation of S with the ordering ≺_a. Since S ⊆ S′, for every ground clause C we have S |= C implies S′ |= C. Conversely, since all clauses in S′ \ S are logical consequences of S, we have S′ |= C implies S |= C. By Proposition 5.3 S′ |= C is decidable, hence so is the equivalent problem S |= C.
5.3.5 Conclusion and future works
We have presented in this section an extension of a result by Basin and Ganzinger [26, 25]. The relaxation of the hypotheses on the ordering may lead to a further extension for resolution modulo an equational theory [124, 168, 209]. We believe
  • 89. the technique employed can be extended to add a reflexivity or transitivity axiom to an already saturated theory. We also thank Chris Lynch [150] for pointing out to us, by means of a counter-example, that the method cannot be extended as is to superposition. Finally, we believe that a consequence of our proof is that saturated theories are complete for contextual deduction [43, 167], which may help in the resolution of [101], though further work is needed to confirm this conjecture.
  • 91. Part III Modeling
  • 92. Chapter 6 Symbolic models for Cryptographic Protocols

We begin in this chapter the presentation of the core of our work on the symbolic analysis of cryptographic protocols. We first associate to each narration a logical model called an active frame. Though it is not, strictly speaking, a first-order theory as are the protocol models in [126], it nonetheless captures the essential message exchange features of cryptographic protocols. From these active frames we can derive the constraint systems routinely employed [8, 161, 55] to model a finite execution of a protocol. The compilation process described in this section was published in [74]. We have included it in this document to have a self-contained presentation of our work. We then present a more refined model of the internal computations of a protocol participant, the symbolic derivations, which was originally introduced in [65].

6.1 Introduction

Cryptographic protocols are designed to prescribe message exchanges between agents in a hostile environment in order to guarantee some security properties such as confidentiality. There are many apparently similar ways to describe a given security protocol. However one has to be precise when specifying how a message should be interpreted and processed by an agent, since overlooking subtle details may lead to dramatic flaws. The main issues are the following:
    • What parts of a received message should be extracted and checked by an agent?
  • 93.
    • What actions should be performed by an agent to compute an answer?

These questions are often either partially or not at all addressed in common protocol descriptions such as the protocol narrations of Subsection 2.1.3, p. 18. An example is the Needham-Schroeder Public Key protocol [166], which is conveniently specified by the following text:

\[
\begin{array}{l}
A \to B : \mathrm{encp}(\langle A, N_a \rangle, K_B)\\
B \to A : \mathrm{encp}(\langle N_a, N_b \rangle, K_A)\\
A \to B : \mathrm{encp}(N_b, K_B)\\
\text{where}\\
\quad A \text{ knows } A, B, K_A, K_B, K_A^{-1}\\
\quad B \text{ knows } A, B, K_A, K_B, K_B^{-1}
\end{array}
\]

Protocol narrations are also a textual representation of Message Sequence Charts (MSC), which are employed e.g. in RFCs (see Subsection 2.1.2, p. 17). We claim that all internal computations specified in RFCs, and more generally most such annotations, can be computed automatically from the protocol narration. Our goal in this chapter is to give an operational semantics to (or, equivalently, to compile) protocol narrations so that internal actions (excluding e.g. storing a value in a special list for a use external to the protocol) are described.

Related works. Although many works have been dedicated to verifying cryptographic protocols in various formalisms, only a few have considered the different problem of extracting operational (unambiguous) role definitions from protocol descriptions. Operational roles are expressed as multiset rewrite rules in CAPSL [99] and CASRUL [126], or as sequential processes of the spi-calculus with pattern-matching [49]. This extraction is also used for end-point projection in [156, 155]. A pioneering work in this area is the one by Carlsen [51], who proposed a translation of protocol narrations into CKT5 [36], a modal logic of communication, knowledge and time.

Compiling narrations to roles has been extended beyond perfect encryption primitives to algebraic theories in [55, 162]. An advantage of [162] is that it supports implicit decryption, which may lead to more efficient secrecy decision procedures. We can note that, although these works have very similar goals, all their operational role computations are ad hoc and lack a uniform principle. In particular they essentially re-implemented previously known techniques.

Our work. Another motivation of this chapter is the existing amount of work on the security analysis of cryptographic protocols with various cryptographic primitives. In these settings one considers operational models of the protocols given without any justification. In particular there is no guarantee that the operational model considered represents a prudent implementation of the protocol. A first result of this chapter is the formalization of the notions of implementation and prudent implementation, in the sense that the receiver checks (and correlates) the reachable parts of the received messages.
  • 94. As a consequence of these definitions we can relate the problems of computing a (prudent) implementation to classic decision problems, namely reachability and static equivalence problems. In particular we describe how, given a deduction system, an algorithm solving the reachability problems for this deduction system can be employed to compute an implementation, and how an algorithm solving the refinement problem can be employed to compute a prudent implementation. This paves the way for using tools such as Yapa [29] to automatically compile cryptographic protocols.

6.2 Role-based Protocol Specifications

First we show how we derive from a narration a plain role-based specification. Then the specification will be refined in the following sections.

6.2.1 Specification of messages and basic operations

We consider a slight variation of the basic notions from Chapter 4. We consider an infinite set of free constants $\mathcal{C}$ and an infinite set of variables $\mathcal{X}$. For each signature $\mathcal{F}$ (i.e. a set of function symbols with arities), we denote by $\mathcal{T}(\mathcal{F})$ (resp. $\mathcal{T}(\mathcal{F}, \mathcal{X})$) the set of terms over $\mathcal{F} \cup \mathcal{C}$ (resp. $\mathcal{F} \cup \mathcal{C} \cup \mathcal{X}$). The former is called the set of ground terms over $\mathcal{F}$, while the latter is simply called the set of terms over $\mathcal{F}$. Variables are denoted by $x, y$, terms are denoted by $s, t, u, v$, and finite sets of terms are written $E, F, \ldots$, and decorations thereof, respectively.

In a signature $\mathcal{F}$ a constant is either a free constant in $\mathcal{C}$ or a function symbol of arity 0 in $\mathcal{F}$.

Deduction systems

Given its importance, let us recall the fundamental assumption underlying symbolic protocol analysis:

Fundamental assumption. Our work on the analysis of cryptographic protocols relies on the assumption that all the agents operate on messages via a message manipulation library.

Thus we have a signature $\mathcal{F}$ containing the function symbols employed to denote the messages. In particular the functions of the library form a subset $\mathcal{F}_p$ of $\mathcal{F}$.

Definition 28. (Deduction systems) A deduction system is defined by a triple $(\mathcal{E}, \mathcal{F}, \mathcal{F}_p)$ where $\mathcal{E}$ is an equational presentation on a signature $\mathcal{F}$ and $\mathcal{F}_p$ is a subset of public constructors in $\mathcal{F}$.

Example 19. For instance the following deduction system models public key cryptography:

\[
(\{\mathrm{decp}(\mathrm{encp}(x, y), y^{-1}) = x\},\ \{\mathrm{decp}(\_, \_), \mathrm{encp}(\_, \_), \_^{-1}\},\ \{\mathrm{decp}(\_, \_), \mathrm{encp}(\_, \_)\})
\]
  • 95. The equational theory is reduced here to a single equation that expresses that one can decrypt a ciphertext when the inverse key is available.

Remark 2. The fact that we model the application of a function by equations implies that, by transitivity of the equality, all the results $f(t_1, \ldots, t_n)$ of a function $f$ on a given sequence of arguments $t_1, \ldots, t_n$ are equal. Thus we can only model deterministic functions. This is not problematic for modelling non-deterministic cryptographic primitives, as it suffices to add an argument representing the random part of the algorithm. However there are some cases in which we want to model the ambiguity of a function. For these specific cases we have introduced extended deduction systems [65, 57], but have chosen not to present them in depth in this document in order to preserve its uniformity.

These extended deduction systems were introduced in [65] to model the non-determinism in the handling of some messages by honest participants. The difference with standard deduction systems is that instead of deducing $f(x_1\sigma, \ldots, x_n\sigma)$ from any terms $x_1\sigma, \ldots, x_n\sigma$ when $f$ is a public symbol, extended deductions deduce a term $(t\sigma){\downarrow}$ from the terms $(t_1\sigma){\downarrow}, \ldots, (t_n\sigma){\downarrow}$. The only constraint (omitting a technical detail for the sake of clarity) is that for every substitution $\sigma$ every constant occurring in $t\sigma$ must occur in at least one of the $(t_i\sigma){\downarrow}$.

Contexts. Let $\mathcal{D}$ be a deduction system. A $\mathcal{D}$-context $C[x_1, \ldots, x_n]$ is a term in which all symbols are public and such that its nullary symbols are either public non-free constants or variables.

6.2.2 Role Specification

We present in this subsection how protocol narrations are transformed into sets of roles. A role can be viewed as the projection of the protocol on a principal. The core of a role is a strand, which is a standard notion in cryptographic protocol modeling [111].

A strand is a finite sequence of messages, each with a label (or polarity) $!$ or $?$. Messages with label $!$ (resp. $?$) are said to be "sent" (resp. "received"). A strand is positive iff all its labels are $!$. Given a list of messages $l = m_1, \ldots, m_n$ we write $?l$ (resp. $!l$) as a short-hand for $?m_1, \ldots, ?m_n$ (resp. $!m_1, \ldots, !m_n$).

Definition 29. A role specification is an expression $A(l) : \nu n.(S)$ where $A$ is a name, $l$ is a sequence of constants (called the role parameters), $n$ is a sequence of constants (called the nonces of the role), and $S$ is a strand. Given a role $r$ we denote by $\mathrm{nonces}(r)$ the nonces $n$ of $r$ and by $\mathrm{strand}(r)$ the strand $S$ of $r$.

Example 20. For example, the initiator of the NSPK protocol is modeled, at this point, with the role:

\[
\nu N_a.(?N_a, ?A, ?B, ?K_A, ?K_B, ?K_A^{-1},\ !\mathrm{msg}(B, \mathrm{encp}(\langle A, N_a \rangle, K_B)),\ ?\mathrm{msg}(B, \mathrm{encp}(\langle N_a, N_b \rangle, K_A)),\ !\mathrm{msg}(B, \mathrm{encp}(N_b, K_B)))
\]
  • 96. with the equational theory of public key cryptography, plus the equations $\{\pi_1(\langle x, y \rangle) = x,\ \pi_2(\langle x, y \rangle) = y\}$.

Note that nothing guarantees in general that a protocol defined as a set of roles is executable. For instance some analysis is necessary to see whether a role can derive the required inverse keys for examining the content of a received ciphertext. We also stress that role specifications do not contain any variables. The symbols $N_a, A, \ldots$ in the above example are constants, and the messages occurring in the role specification are all ground terms.

Plain roles extracted from a narration. From a protocol narration where each nonce originates uniquely we can extract almost directly a set of roles, called plain roles, as follows. The constants occurring in the initial knowledge of a role are the parameters of the strand describing this role. We model this initial knowledge by a sequence of receptions (from an unspecified agent) of each term in the initial knowledge. In order to encode narrations we assume that we have in the signature three public function symbols $\mathrm{msg}(\_, \_)$, $\mathrm{partner}(\_)$ and $\mathrm{payload}(\_)$ satisfying the equational theory:

\[
\mathrm{partner}(\mathrm{msg}(x, y)) = x \qquad \mathrm{payload}(\mathrm{msg}(x, y)) = y
\]

For every agent name $A$ in the protocol narration, a role specification for $A$ is $A(l) : \nu\, \mathrm{nonces}(S).(?\mathrm{nonces}(S), ?K, S^A)$, where $K$ is such that "$A$ knows $K$" occurs in the protocol narration and $l$ is the set of constants in $K$. The set $\mathrm{nonces}(S)$ and the strand $S^A$ are computed as follows:

Computation of $S^A$: Initially $S^A_0 = \emptyset$. On the $(n+1)$-th line $S \to R : M$ do

\[
S^A_{n+1} = \begin{cases}
S^A_n,\ !\mathrm{msg}(R, M) & \text{if } A = S\\
S^A_n,\ ?\mathrm{msg}(S, M) & \text{if } A = R\\
S^A_n & \text{otherwise}
\end{cases}
\]

Computation of $\mathrm{nonces}(A)$: This set contains each constant $N$ that appears in the strand $?K, S^A$ inside a message labelled $!$ and such that $N$ does not occur in previous messages (with any polarity).

This computation always extracts role specifications from a given protocol narration, and it has the property that every constant appears in a received message before appearing in a sent message. Since a nonce is to be created within an instance of a role, we reject protocol narrations from which the algorithm described above extracts two different roles $A$ and $B$ with $\mathrm{nonces}(A) \cap \mathrm{nonces}(B) \neq \emptyset$.

Example 20 is a plain role that can be derived by applying the algorithm to the NSPK protocol narration. We now define the input of a role specification, which informally is the sequence of messages sent to a role as defined by the protocol narration.
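The extraction of plain roles described above can be illustrated by a small sketch. It is only an illustration under stated assumptions: terms are encoded as nested Python tuples whose first component is the function symbol, constants are strings, and the names `plain_strand`, `role_nonces`, `NSPK` and `KNOWS` are hypothetical helpers introduced here, not part of the original development.

```python
# Terms are nested tuples ("symbol", arg1, ...); constants are plain strings.
# All names below are illustrative assumptions, not taken from the thesis.

def constants(term):
    """Return the set of constants occurring in a term."""
    if isinstance(term, str):
        return {term}
    result = set()
    for arg in term[1:]:
        result |= constants(arg)
    return result

def plain_strand(agent, narration):
    """Project the narration lines 'S -> R : M' on one agent (the strand S^A)."""
    strand = []
    for sender, receiver, message in narration:
        if agent == sender:
            strand.append(("!", ("msg", receiver, message)))
        elif agent == receiver:
            strand.append(("?", ("msg", sender, message)))
        # otherwise the line does not concern this agent
    return strand

def role_nonces(knowledge, strand):
    """Constants whose first occurrence in the strand ?K, S^A is in a sent message."""
    seen = set()
    for term in knowledge:
        seen |= constants(term)
    fresh = []
    for polarity, message in strand:
        new = constants(message) - seen
        if polarity == "!":
            fresh.extend(sorted(new))
        seen |= new
    return fresh

def pair(x, y):
    return ("pair", x, y)

def encp(m, k):
    return ("encp", m, k)

# The NSPK narration and the initial knowledge of each agent.
NSPK = [
    ("A", "B", encp(pair("A", "Na"), "KB")),
    ("B", "A", encp(pair("Na", "Nb"), "KA")),
    ("A", "B", encp("Nb", "KB")),
]
KNOWS = {
    "A": ["A", "B", "KA", "KB", ("inv", "KA")],
    "B": ["A", "B", "KA", "KB", ("inv", "KB")],
}

for agent in ("A", "B"):
    strand = plain_strand(agent, NSPK)
    print(agent, role_nonces(KNOWS[agent], strand), strand)
```

On this narration the sketch assigns the nonce Na to role A and Nb to role B, so the two sets of nonces are disjoint and the narration is not rejected.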
  • 97. Definition 30. Let $r = \nu N.({}^{!}_{?}M_i)_{1 \le i \le n}$ be a role specification, and let $(R_1, \ldots, R_k)$ be the subsequence of the messages $M_i$ labeled with $?$. The input of $r$ is denoted $\mathrm{input}(r)$ and is the positive strand $(!R_1, \ldots, !R_k)$.

In the next section we define a target for the compilation of role specifications. Then we compute constraints to be satisfied by sent and received messages; adding these constraints to the specification makes it executable in the safest possible way w.r.t. its initial specification.

6.3 Operational semantics for roles

In Section 6.2 we have defined roles and shown how they can be extracted from protocol narrations. In this section we define what an implementation of a role is, and in Section 6.4 we will show how to compute such an implementation from a protocol narration.

Intuitively an operational model for a role has to reflect the possible manipulations on messages performed by a program implementing the role. These operations are specified here by a deduction system $\mathcal{D} = (\mathcal{E}, \mathcal{F}, S)$ where the set of public functions $S$, a subset of the signature $\mathcal{F}$, is defined by equations in the equational theory $\mathcal{E}$.

Active frames. We now introduce the set of implementations of a role specification as active frames. An active frame extends the notion of role by specifying how a message to be sent is constructed from already known messages, and how a received message is checked to ascertain its conformity w.r.t. already known messages. The notation $!v_i$ (resp. $?v_i$) refers to a message stored in variable $v_i$ which is sent (resp. received).

Definition 31. Given a deduction system $\mathcal{D}$ with equational theory $\mathcal{E}$, a $\mathcal{D}$-active frame is a sequence $(T_i)_{1 \le i \le k}$ where

\[
T_i = \begin{cases}
!v_i \text{ with } v_i \stackrel{?}{=} C_i[v_1, \ldots, v_{i-1}] & \text{(send)}\\
?v_i \text{ with } S_i(v_1, \ldots, v_i) & \text{(receive)}
\end{cases}
\]

where $C_i[v_1, \ldots, v_{i-1}]$ denotes a context over variables $v_1, \ldots, v_{i-1}$ and $S_i(v_1, \ldots, v_i)$ denotes an $\mathcal{E}$-unification system over variables $v_1, \ldots, v_i$. Each variable $v_i$ occurring with polarity $?$ is an input variable of the active frame.

Example 21. The following is an active frame, denoted $\varphi_a$, that can be employed to model the role $A$ in the NSPK protocol:

\[
\begin{array}{l}
(?v_{N_a},\ ?v_A,\ ?v_B,\ ?v_{K_A},\ ?v_{K_B},\ ?v_{K_A^{-1}},\\
\ !v_{msg_1} \text{ with } v_{msg_1} \stackrel{?}{=} \mathrm{msg}(v_B, \mathrm{encp}(\langle v_A, v_{N_a} \rangle, v_{K_B})),\\
\ ?v_r \text{ with } \emptyset,\\
\ !v_{msg_2} \text{ with } v_{msg_2} \stackrel{?}{=} \mathrm{msg}(v_B, \mathrm{encp}(\pi_2(\mathrm{decp}(\mathrm{payload}(v_r), v_{K_A^{-1}})), v_{K_B})))
\end{array}
\]
  • 98. Compilation is the computation of an active frame from a role specification such that, when receiving messages as intended by the role specification, the active frame emits responses equal modulo the equational theory to the responses issued in the role specification. More formally, we have the following:

Definition 32. Let $\mathcal{D}$ be a deduction system with equational theory $\mathcal{E}$. Let $\varphi = (T_i)_{1 \le i \le k}$ be an active frame, where the $T_i$'s are as in Definition 31, and where the input variables are $r_1, \ldots, r_n$. Let $s$ be a positive strand $!M_1, \ldots, !M_n$. Let $\sigma_{\varphi,s}$ be the substitution $\{r_i \mapsto M_i\}$ and $S$ be the union of the $\mathcal{E}$-unification systems in $\varphi$. The evaluation of $\varphi$ on $s$ is denoted $\varphi \cdot s$ and is the strand $(m_1, \ldots, m_k)$ where:

\[
m_i = \begin{cases}
!C_i[m_1, \ldots, m_{i-1}] & \text{if } v_i \text{ has label } ! \text{ in } T_i\\
?v_i\sigma_{\varphi,s} & \text{if } v_i \text{ has label } ? \text{ in } T_i
\end{cases}
\]

We say that $\varphi$ accepts $s$ if $S\sigma_{\varphi,s}$ is satisfiable.

To simplify notations, the application of a $\mathcal{D}$-context $C[x_1, \ldots, x_n]$ on a positive strand $s = (!t_1, \ldots, !t_n)$ of length $n$ is denoted $C \cdot s$ and is the term $C[t_1, \ldots, t_n]$.

Example 22. Let $r$ be the role specification of role $A$ in NSPK as given in Example 20 and $\varphi_a$ be the active frame of Example 21. Let $M$ be the message $\mathrm{msg}(B, \mathrm{encp}(\langle N_a, N_b \rangle, K_A))$. We have:

\[
\mathrm{input}(r) = (!N_a, !A, !B, !K_A, !K_B, !K_A^{-1}, !M)
\]

and $\varphi_a \cdot \mathrm{input}(r)$ is the strand:

\[
(?N_a, ?A, ?B, ?K_A, ?K_B, ?K_A^{-1},\ !\mathrm{msg}(B, \mathrm{encp}(\langle A, N_a \rangle, K_B)),\ ?M,\ !\mathrm{msg}(B, \mathrm{encp}(\pi_2(\mathrm{decp}(\mathrm{payload}(M), K_A^{-1})), K_B)))
\]

Modulo the equational theory, this strand is equal to the strand:

\[
(?N_a, ?A, ?B, ?K_A, ?K_B, ?K_A^{-1},\ !\mathrm{msg}(B, \mathrm{encp}(\langle A, N_a \rangle, K_B)),\ ?M,\ !\mathrm{msg}(B, \mathrm{encp}(N_b, K_B)))
\]

It is not coincidental that in Example 22 the strands $\varphi_a \cdot \mathrm{input}(r)$ and $\mathrm{strand}(r)$ are equal: it means that within the active frame, the sent messages are composed from received ones in such a way that when receiving the messages expected in the protocol narration, the role responds with the messages intended by the protocol narration. This fact gives us a criterion to define what an implementation of a role is.

Definition 33. An active frame $\varphi$ is an implementation of a role specification $r$ if $\varphi$ accepts $\mathrm{input}(r)$ and $\varphi \cdot \mathrm{input}(r) =_{\mathcal{E}} \mathrm{strand}(r)$. If a role admits an implementation we say this role is executable.
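The evaluation of an active frame on a strand, and the implementation criterion of Definition 33, can be sketched as follows for the NSPK initiator. This is a minimal sketch under assumptions: terms are nested tuples, equality modulo the equational theory is decided by computing normal forms for the few equations used in this chapter, the checks attached to a reception are pairs of terms to be compared, and every Python name (`normalize`, `evaluate`, `is_implementation`, `PHI_A`, `ROLE_A`) is hypothetical.

```python
# Equations handled: decp(encp(x, y), inv(y)) = x, pi1/pi2 on pairs,
# partner/payload on msg. Variables of a frame are strings such as "vB".
PROJECTIONS = {"pi1": ("pair", 1), "pi2": ("pair", 2),
               "partner": ("msg", 1), "payload": ("msg", 2)}

def normalize(term):
    """Bottom-up normal form of a ground term for the equations above."""
    if isinstance(term, str):
        return term
    head, args = term[0], [normalize(a) for a in term[1:]]
    inner = args[0] if args else None
    if isinstance(inner, tuple):
        if head == "decp" and inner[0] == "encp" and args[1] == ("inv", inner[2]):
            return inner[1]
        if head in PROJECTIONS and inner[0] == PROJECTIONS[head][0]:
            return inner[PROJECTIONS[head][1]]
    return (head, *args)

def substitute(term, env):
    """Replace the variables bound in env by their values."""
    if isinstance(term, str):
        return env.get(term, term)
    return (term[0], *(substitute(a, env) for a in term[1:]))

def evaluate(frame, inputs):
    """Evaluate a frame on a list of received messages; None if a check fails."""
    env, result, inputs = {}, [], iter(inputs)
    for polarity, var, spec in frame:
        if polarity == "?":
            env[var] = normalize(next(inputs))
            for left, right in spec:          # equalities to be checked
                if normalize(substitute(left, env)) != normalize(substitute(right, env)):
                    return None
        else:
            env[var] = normalize(substitute(spec, env))   # spec is a context
        result.append((polarity, env[var]))
    return result

def is_implementation(frame, strand):
    """Frame accepts input(r) and evaluates to strand(r) modulo the equations."""
    received = [m for polarity, m in strand if polarity == "?"]
    result = evaluate(frame, received)
    return result is not None and result == [(p, normalize(m)) for p, m in strand]

M = ("msg", "B", ("encp", ("pair", "Na", "Nb"), "KA"))
ROLE_A = [("?", "Na"), ("?", "A"), ("?", "B"), ("?", "KA"), ("?", "KB"),
          ("?", ("inv", "KA")),
          ("!", ("msg", "B", ("encp", ("pair", "A", "Na"), "KB"))),
          ("?", M),
          ("!", ("msg", "B", ("encp", "Nb", "KB")))]
PHI_A = [("?", "vNa", []), ("?", "vA", []), ("?", "vB", []),
         ("?", "vKA", []), ("?", "vKB", []), ("?", "vKAinv", []),
         ("!", "vmsg1", ("msg", "vB", ("encp", ("pair", "vA", "vNa"), "vKB"))),
         ("?", "vr", []),
         ("!", "vmsg2", ("msg", "vB",
                         ("encp", ("pi2", ("decp", ("payload", "vr"), "vKAinv")), "vKB")))]

print(is_implementation(PHI_A, ROLE_A))
```

Comparing normal forms is adequate here only because the equations considered are oriented into a convergent rewrite system; for other equational theories a dedicated decision procedure would be needed.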
  • 99. Example. The active frame $\varphi_a$ defined above is a possible implementation of the initiator role in NSPK. However this implementation does not check the conformity of the messages with the intended patterns, e.g. it neither checks that $v_r$ is really an encryption with the public key $v_{K_A}$ of a pair, nor that the first argument of the encrypted pair has the same value as the nonce $v_{N_a}$. In Section 6.4 we show not only how to compute an active frame when the role specification is executable, but also how to ensure that all the possible checks are performed.

6.4 Compilation of role specifications

Usually the compilation of a specification is defined by a compilation algorithm. An originality of this work is that we present the result of the compilation as the solution to decision problems. This has the advantage of providing for free a notion of prudent implementation, as explained below.

6.4.1 Computation of a first implementation

Let us first present how to compute an implementation of a role specification in which no check is performed, as given in the preceding example. To build such an implementation we need to compute for every sent message $m$ a context $C_m$ that evaluates to $m$ when applied to the previously received ones. This reachability problem is unsolvable in general. Hence we have to consider systems that admit a reachability algorithm, formally defined below:

Definition 34. Given a deduction system $\mathcal{D}$ with equational theory $\mathcal{E}$, a $\mathcal{D}$-reachability algorithm $A_{\mathcal{D}}$ computes, given a positive strand $s$ of length $n$ and a term $t$, a $\mathcal{D}$-context $A_{\mathcal{D}}(s, t) = C[x_1, \ldots, x_n]$ such that $C \cdot s =_{\mathcal{E}} t$ if such a context exists, and $\bot$ otherwise.

We will show that several interesting theories admit a reachability algorithm. This algorithm can be employed as an oracle to compute the contexts in sent messages and therefore to derive an implementation of a role specification $r$. We thus have the following theorem.

Theorem 6.1. If there exists a $\mathcal{D}$-reachability algorithm then it can be decided whether a role specification $r$ is executable and, if so, one can compute an implementation of $r$.

Proof sketch. Let $r = ({}^{!}_{?}M_i)_{i \in \{1, \ldots, n\}}$ be an executable role specification. By definition there exists an active frame $\varphi$ that implements $r$, i.e. for each sent message $M_i$, there exists a context $C_i$ such that $C_i[M_1, \ldots, M_{i-1}]$ is equal to $M_i$ modulo the equational theory. Thus if there exists a $\mathcal{D}$-reachability algorithm $A_{\mathcal{D}}$, the result $A_{\mathcal{D}}((M_1, \ldots, M_{i-1}), M_i)$ cannot be $\bot$ by definition. As a consequence, $A_{\mathcal{D}}((M_1, \ldots, M_{i-1}), M_i)$ is a context $C_i[x_1, \ldots, x_{i-1}]$. Thus for every index $i$ such that $M_i$ is sent we can compute a context $C_i$ that, when applied on previous messages, yields the message to send. We thus have an implementation of the role specification.
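To make the role of the reachability oracle concrete, here is a hedged sketch of such an algorithm for the specific public key theory of this chapter only (pairs, `msg`, and `encp`/`decp` with inverse keys). It follows the usual decompose-then-compose strategy and is not claimed to be the algorithm used in the cited works; the names `recipes` and `reach`, the tuple encoding of terms, and the variable names `x1`, `x2`, ... are assumptions made for this illustration.

```python
# Terms are nested tuples ("symbol", arg1, ...); constants are strings.
# Public symbols: pair, pi1, pi2, msg, partner, payload, encp, decp.
# The inverse-key symbol "inv" is not public, so it is never applied.

def recipes(strand):
    """Map every term obtained by decomposing the strand to a context over x1..xn."""
    known = {}
    for i, term in enumerate(strand):
        known.setdefault(term, f"x{i + 1}")
    changed = True
    while changed:
        changed = False
        for term, ctx in list(known.items()):
            parts = []
            if isinstance(term, tuple):
                if term[0] == "pair":
                    parts = [(term[1], ("pi1", ctx)), (term[2], ("pi2", ctx))]
                elif term[0] == "msg":
                    parts = [(term[1], ("partner", ctx)), (term[2], ("payload", ctx))]
                elif term[0] == "encp" and ("inv", term[2]) in known:
                    parts = [(term[1], ("decp", ctx, known[("inv", term[2])]))]
            for sub, sub_ctx in parts:
                if sub not in known:
                    known[sub] = sub_ctx
                    changed = True
    return known

def reach(strand, target):
    """Return a context building target from the strand, or None (standing for bottom)."""
    known = recipes(strand)

    def build(term):
        if term in known:
            return known[term]
        if isinstance(term, tuple) and term[0] in ("pair", "msg", "encp"):
            args = [build(arg) for arg in term[1:]]
            if all(arg is not None for arg in args):
                return (term[0], *args)
        return None

    return build(target)

# Messages known to the NSPK initiator before its last emission.
M = ("msg", "B", ("encp", ("pair", "Na", "Nb"), "KA"))
PREFIX = ["Na", "A", "B", "KA", "KB", ("inv", "KA"), M]

print(reach(PREFIX, ("msg", "B", ("encp", "Nb", "KB"))))
```

Applied to the last message of the initiator, the sketch returns a context that extracts the payload of the seventh message, decrypts it with the sixth, takes the second projection, and re-encrypts the result with the fifth, which matches the shape of the last emission in Example 21.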
  • 100. 6.4.2 Computation of a prudent implementation

We note that having an implementation of a role specification is of little use w.r.t. the security analysis of a protocol. For example the active frame of Example 21 is an implementation of the initiator of the NSPK protocol but it will accept any message from the intruder without aborting.

Any of the algorithms proposed so far for the compilation of cryptographic protocols would at least require that the role checks that the received message contains the nonce sent at the first step. We now present an algorithm that computes this kind of check for arbitrary deduction systems. It formalizes a check as an equation between contexts over the messages received so far, including the initial knowledge. For example, reusing the notations of Example 21, it computes that upon reception of the message the initiator must, among other tests, check the validity of the equation:

\[
\pi_1(\mathrm{decp}(\mathrm{payload}(v_r), v_{K_A^{-1}})) \stackrel{?}{=} v_{N_a}
\]

Let us first formalize what an acceptable message is by a refinement relation on sequences of messages. We will say that a strand $s$ refines a strand $s'$ if any observable equality of messages in strand $s'$ can be observed in $s$ using the same tests. To put it formally:

Definition 35. A positive strand $s = (!M_1, \ldots, !M_n)$ refines a positive strand $s' = (!M'_1, \ldots, !M'_n)$ if, for any pair of contexts $(C_1[x_1, \ldots, x_n], C_2[x_1, \ldots, x_n])$, one has $C_1 \cdot s' = C_2 \cdot s'$ implies $C_1 \cdot s = C_2 \cdot s$.

For instance the strand $s = (!\mathrm{encp}(\mathrm{encp}(a, k'), k), !\mathrm{encp}(a, k'), !k, !k', !a)$ refines $s' = (!\mathrm{encp}(\mathrm{encp}(a, k'), k), !\mathrm{encp}(a, k'), !k, !k', !a)$ since all equalities that can be checked on $s'$ can be checked on $s$. We can now define an implementation $\varphi$ to be prudent if every equality satisfied by the sequence of messages of the role specification is satisfied by any sequence of messages accepted by $\varphi$.

Definition 36. Let $r$ be a role specification and $\varphi$ be an implementation of $r$. We say that $\varphi$ is prudent if any positive strand $s$ accepted by $\varphi$ is a refinement of $\mathrm{input}(r)$.

Most deduction systems considered in the context of cryptographic protocol analysis have the property that it is possible to compute, given a positive strand, a finite set of context pairs that summarizes all possible equalities in the sense of the next definition. Let us first introduce a notation: given a positive strand $s$ we let $P_s$ be the set of context pairs $(C_1, C_2)$ such that $C_1 \cdot s = C_2 \cdot s$.

Definition 37. A deduction system $\mathcal{D}$ has the finite basis property if for each positive strand $s$ one can compute a finite set $P^f_s$ of pairs of $\mathcal{D}$-contexts such that, for each positive strand $s'$:

\[
P^f_s \subseteq P_{s'} \quad \text{iff} \quad P_s \subseteq P_{s'}
\]
  • 101. Let us now assume that a deduction system $\mathcal{D}$ has the finite basis property. There thus exists an algorithm $A'_{\mathcal{D}}(s)$ that takes a positive strand $s$ as input, computes a finite set $P^f_s$ of context pairs $(C[x_1, \ldots, x_n], C'[x_1, \ldots, x_n])$ and returns as a result the $\mathcal{E}$-unification system $S_s : \{C[x_1, \ldots, x_n] \stackrel{?}{=} C'[x_1, \ldots, x_n] \mid (C, C') \in P^f_s\}$. For any positive strand $s' = (!m_1, \ldots, !m_n)$ of length $n$, let $\sigma_{s'}$ be the substitution $\{x_i \mapsto m_i\}_{1 \le i \le n}$. By definition of $S_s$ we have that $\sigma_{s'} \models S_s$ if and only if $s'$ is a refinement of $s$. Given the preceding definition of $A_{\mathcal{D}}(s, t)$, we are now ready to present our algorithm for the compilation of role specifications into active frames.

Algorithm. Let $r$ be a role specification with $\mathrm{strand}(r) = ({}^{!}_{?}M_1, \ldots, {}^{!}_{?}M_n)$ and let $s = (!M_1, \ldots, !M_n)$. Let us introduce two notations to simplify the writing of the algorithm: we write $r(i)$ to denote the $i$-th labelled message ${}^{!}_{?}M_i$ in $r$, and $s^i$ to denote the prefix $(!M_1, \ldots, !M_i)$ of $s$. Compute, for $1 \le i \le n$:

\[
T_i = \begin{cases}
!v_i \text{ with } v_i \stackrel{?}{=} A_{\mathcal{D}}(s^{i-1}, M_i) & \text{if } r(i) = {!M_i}\\
?v_i \text{ with } A'_{\mathcal{D}}(s^i) & \text{if } r(i) = {?M_i}
\end{cases}
\]

and return the active frame $\varphi_r = (T_i)_{1 \le i \le n}$. By construction we have the following theorem.

Theorem 6.2. Let $\mathcal{D}$ be a deduction system such that $\mathcal{D}$-ground reachability is decidable and $\mathcal{D}$ has the finite basis property. Then for any executable role specification $r$ one can compute a prudent implementation $\varphi$.

6.5 Symbolic derivations

Active frames are sufficient to express the relationships between input and output messages in a role implementation as well as to describe precisely which messages are acceptable by a prudent implementation. However they do not describe precisely the internal computations of an implementation. For example the usage of contexts means that the output is computed only from the messages received and the initial knowledge, and thus that already computed values have to be re-computed every time they are employed. Also, active frames do not provide us with a communication model, i.e. a way to describe the messages exchanged during an execution of a protocol. We now introduce symbolic derivations, a structure in which one can express both the communications and the internal computations at the expense of heavier notations.

6.5.1 Definitions

Symbolic derivations. Given a deduction system $(\mathcal{F}, \mathcal{P}, \mathcal{E})$, a role applies public symbols in $\mathcal{P}$ to construct a response from its initial knowledge and from messages received so far. Additionally, it may test equalities between messages to check the well-formedness of a message. Hence the activity of a role can be expressed by a fixed symbolic derivation:
  • 102. Definition 38. (Symbolic Derivations) A symbolic derivation for a deduction system $(\mathcal{F}, \mathcal{P}, \mathcal{E})$ is a tuple $(\mathcal{V}, S, K, \mathit{In}, \mathit{Out})$ where $\mathcal{V}$ is a mapping from a finite ordered set $(\mathit{Ind}, <)$ to a set of variables $\mathrm{Var}(\mathcal{V})$, $K$ is a set of ground terms (the initial knowledge), $\mathit{In}$ is a subset of $\mathit{Ind}$, $\mathit{Out}$ is a multiset of elements of $\mathit{Ind}$ and $S$ is a set of equations.

The set $\mathit{Ind}$ represents internal states of the symbolic derivation. We impose that any $i \in \mathit{Ind}$ denotes a state of one of the following kinds:

Deduction state: There exists a public symbol $f \in \mathcal{P}$ of arity $n$ such that $S$ contains the equation $\mathcal{V}(i) \stackrel{?}{=} f(\mathcal{V}(\alpha_1), \ldots, \mathcal{V}(\alpha_n))$ with $\alpha_j < i$ for $j \in \{1, \ldots, n\}$.

Re-use state: Otherwise, if there exists $j < i$ with $\mathcal{V}(j) = \mathcal{V}(i)$;

Memory state: Otherwise, if there exists $t$ in $K$ and an equation $\mathcal{V}(i) \stackrel{?}{=} t$ in $S$;

Reception state: Otherwise, we must have $i \in \mathit{In}$.

Additionally, a state $i$ is also an emission state if $i \in \mathit{Out}$.

A symbolic derivation is closed if it has no reception state. A substitution $\sigma$ satisfies a closed symbolic derivation if $\sigma \models_{\mathcal{E}} S$.

Remark 3. We believe that using symbolic derivations instead of more standard constraint systems permits one to simplify the proofs by having a more homogeneous framework. There is however one drawback to their usage. While most of the time it is convenient to identify the order of deduction of messages with their send/receive order, building in this identification too strictly would prevent us from expressing simple problems. Re-use states are employed to reorder the deduced messages to fit an order of sending messages which can be different. For example consider an intruder that knows (after reception) two messages $a$ and $b$ received in that order, and that has to send first $b$, then $a$. Since the states in a symbolic derivation have to be ordered, we have to use at least one re-use state (for $a$) to be able to consider a sending of $a$ after the sending of $b$. We note that re-use states that are not employed in a connection can be safely eliminated without changing the deductions, the definition of the knowledge, or the tests in the unification system.

Remark 4. Symbolic derivations were originally defined in [65] w.r.t. extended deduction systems. We refer the interested reader to [65] for the exact definition in that case.

Example 23. Let us consider the following cryptographic protocol for the deduction system DY where $\mathcal{F}_D$ and $\mathcal{P}_D$ have been extended by a free public symbol $f$:

\[
\begin{array}{l}
A \to B : \mathrm{encp}(N_a, \mathrm{pk}(B))\\
B \to A : \mathrm{encp}(f(N_a), \mathrm{pk}(A))\\
\text{where}\\
\quad A \text{ knows } A, B, \mathrm{pk}(B), \mathrm{pk}(A), \mathrm{sk}(A)\\
\quad B \text{ knows } A, B, \mathrm{pk}(A), \mathrm{pk}(B), \mathrm{sk}(B)
\end{array}
\]
  • 103. Let us define a symbolic derivation for role $B$:

\[
\begin{array}{l}
\mathit{Ind} = \{0, \ldots, 8\} \qquad \mathcal{V} = i \in \mathit{Ind} \mapsto x_i\\
K = \{A, B, \mathrm{pk}(A), \mathrm{pk}(B), \mathrm{sk}(B)\}\\
\mathit{In} = \{5\} \qquad \mathit{Out} = \{8\}\\
S = \{x_0 \stackrel{?}{=} A,\ x_1 \stackrel{?}{=} B,\ x_2 \stackrel{?}{=} \mathrm{pk}(A),\ x_3 \stackrel{?}{=} \mathrm{pk}(B),\ x_4 \stackrel{?}{=} \mathrm{sk}(B),\\
\qquad x_6 \stackrel{?}{=} \mathrm{decp}(x_5, x_4),\ x_7 \stackrel{?}{=} f(x_6),\ x_8 \stackrel{?}{=} \mathrm{encp}(x_7, x_2)\}
\end{array}
\]

The set of deduction states is $\{6, 7, 8\}$, there is no re-use state, the set of memory states is $\{0, \ldots, 4\}$ and the only reception state is 5. Assuming that the role $B$ tests whether the received message is a cipher, one may add a further deduction state 9 with $x_9 \stackrel{?}{=} \mathrm{encp}(x_6, x_3)$ and an equation $x_5 \stackrel{?}{=} x_9$.

In addition we assume that two symbolic derivations do not share any variable, and that equality between symbolic derivations is defined modulo a renaming of variables. We represent graphically a symbolic derivation as follows:

[Figure: a symbolic derivation drawn as the sequence $\mathcal{V}(1), \ldots, \mathcal{V}(i), \ldots, \mathcal{V}(n)$, with the deduction of $\mathcal{V}(i)$ shown above the sequence and the unification system $S$ below it.]

    • The sequence of variables $\mathcal{V}(1), \ldots, \mathcal{V}(n)$ represents the sequence $\mathcal{V}(\mathit{Ind})$;
    • an arrow pointing to $\mathcal{V}(i)$ means that $i \in \mathit{In}$, as is the case for $\mathcal{V}(1)$ in the above figure;
    • an arrow pointing away from $\mathcal{V}(i)$ means that $i \in \mathit{Out}$, as is the case for $\mathcal{V}(n)$ in the above figure;
    • $S$ is the unification system.

Let us now consider the ordered completion of the equational theory $\mathcal{E}$. Since ordered rewriting is convergent on ground terms, one can define for every ground term $t$ a normal form $(t){\downarrow}$. We rely on this normal form to prove that every closed symbolic derivation defines in a unique way the terms deduced.

Lemma 6.1. Let $\mathcal{I}$ be a deduction system, and consider a closed and satisfiable $\mathcal{I}$-symbolic derivation $\mathcal{C} = (\mathcal{V}, S, K, \mathit{In}, \mathit{Out})$. Then there exists a unique ground substitution $\sigma$ in normal form of support $\mathrm{Image}(\mathcal{V})$ such that any unifier of $S$ is an extension of $\sigma$.

Proof. Since the symbolic derivation $\mathcal{C} = (\mathcal{V}, S, K, \mathit{In}, \mathit{Out})$ is closed, it has by definition no input states, and thus all states are either knowledge, re-use or deduction states. We proceed by induction on the set of indices $\mathit{Ind}$ ordered by $<$.
  • 104. 6.5. SYMBOLIC DERIVATIONS 107Base case: Assume i is a minimal element in Ind. By minimality i cannot be a re-use state. If it is a knowledge state then by definition there exists in ? S an equation V(i) = t, with t a ground term in normal form, and thus for every unifier τ of S we must have V(i)τ = t. If i is a deduction state, and since it is minimal, the public symbol employed must be of arity 0 and hence is a constant, i.e. again a ground term t. In both cases there exists a unique ground substitution σ in normal form defined on {V(i)} and such that any unifier of S is an extension of σ.Induction case: Assume there exists a unique ground substitution σ in normal form with support: {V(j) | j i} such that any unifier of S is an extension of σ. If i is a re-use state, we note that V(i) is already in the support of σ, and we are done. If it is a knowledge state, reasoning as in the basic case permits us to extend σ to V(i) if necessary. If it is a deduction ? state then there exists in S an equation V(i) = f (V(j1 ), . . . , V(jn )) with j1 , . . . , jn i that has to be satisfied by every unifier θ of S. By induction every such unifier has to be equal to σ on {V(j1 ), . . . , V(jn )}. Thus for every unifier θ of S we have V(i)θ =E f (V(j1 )θ, . . . , V(jn )θ). By induction f (V(j1 )θ, . . . , V(jn )θ) =E f (V(j1 )σ, . . . , V(jn )σ) and thus we must have V(i)θ = (f (V(j1 )σ, . . . , V(jn )σ))↓. Therefore σ can be uniquely extended on V(i) by setting V(i)σ = (f (V(j1 )σ, . . . , V(jn )σ))↓ which is again a ground term. By Lemma 6.1, if a derivation is closed, then for every i ∈ Ind the variableV(i) is instantiated by a ground term. Figuratively we say that a term t isknown at step i in a closed symbolic derivation if there exists j ≤ i such thatV(j) is instantiated by t.Ground symbolic derivations. An important case when considering pro-tocol refutation is the one in which the attacker cannot alter the messagesexchanged among the honest participants. 
This case can be employed either to model a weaker attacker or, when trying to refute a cryptographic protocol, to guess first which messages are sent by the attacker and then check whether these guesses correspond to messages the attacker can actually send.

Definition 39. (Ground symbolic derivation) We say that a symbolic derivation Ch = (Vh, Sh, Kh, Inh, Outh) is a ground symbolic derivation whenever Sh is satisfiable and there exists a ground substitution σ such that, for every unifier τ of Sh and every i ∈ Indh, we have Vh(i)σ = Vh(i)τ.

In other words the input and output messages of a ground symbolic derivation are fixed ground terms. We note that since Ch is not closed, and in spite of Sh being satisfiable, it is not necessarily true that Ch has a solution. Also, a simple analysis of the cases in the proof of Lemma 6.1 shows that it suffices to assume that σ is defined only on the indices i ∈ Inh.
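The inductive construction used in the proof of Lemma 6.1 can be read as an evaluation procedure. The Python sketch below is only an illustration under stated assumptions: the encoding of terms as nested tuples, the names `normalize` and `evaluate`, the state kinds, and the single rewrite rule decp(encp(m, pk(a)), sk(a)) → m are all hypothetical choices made for the example, not part of the formal development.

```python
# Terms: a constant is a string, an application is a tuple (symbol, arg1, ..., argn).

def normalize(term):
    """Innermost normal form for the single assumed rule
    decp(encp(m, pk(a)), sk(a)) -> m."""
    if isinstance(term, str):
        return term
    symbol, *args = term
    args = [normalize(arg) for arg in args]
    if symbol == "decp" and len(args) == 2:
        cipher, key = args
        if (isinstance(cipher, tuple) and len(cipher) == 3 and cipher[0] == "encp"
                and isinstance(cipher[2], tuple) and cipher[2][0] == "pk"
                and isinstance(key, tuple) and key[0] == "sk"
                and cipher[2][1:] == key[1:]):
            return cipher[1]
    return (symbol, *args)


def evaluate(states):
    """Compute the ground substitution of a closed derivation.

    `states` lists (variable, kind, payload) in an order compatible with the
    ordering on indices. Kinds (assumed encoding):
      "knowledge": payload is a ground term;
      "deduction": payload is (public_symbol, [argument variables]);
      "reuse":     the variable was already defined at an earlier state.
    """
    sigma = {}
    for var, kind, payload in states:
        if kind == "knowledge":
            sigma[var] = normalize(payload)
        elif kind == "deduction":
            symbol, arg_vars = payload
            # Arguments are already ground and in normal form by induction.
            sigma[var] = normalize((symbol, *(sigma[v] for v in arg_vars)))
        elif kind == "reuse":
            if var not in sigma:
                raise ValueError(f"{var} re-used before being defined")
        else:
            raise ValueError(f"unknown state kind: {kind}")
    return sigma


# Fragment inspired by the role B derivation; x5 is treated as already known,
# which stands for the value supplied by the attacker in a closed connection.
example = [
    ("x3", "knowledge", ("pk", "B")),
    ("x4", "knowledge", ("sk", "B")),
    ("x5", "knowledge", ("encp", "n", ("pk", "B"))),
    ("x6", "deduction", ("decp", ["x5", "x4"])),
    ("x7", "deduction", ("f", ["x6"])),
]
sigma = evaluate(example)
```

Each variable receives exactly one value, computed from values fixed at earlier states, which mirrors the uniqueness argument of the lemma. A state that refers to an undefined variable raises an error, which corresponds to a derivation that is not closed.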
  • 105. Connection. We express the communication between two agents, each represented by a symbolic derivation, by connecting these symbolic derivations. This operation consists in identifying some input variables of one derivation with some output variables of the other and vice-versa. This connection should be compatible with the variable orderings inherited from each symbolic derivation, as detailed in the following definition:

Definition 40. Let C1, C2 be two symbolic derivations with, for i ∈ {1, 2}, Ci = (Vi, Si, Ki, Ini, Outi), with disjoint sets of variables and index sets (Ind1, ≺1) and (Ind2, ≺2) respectively. Let I1, I2 be subsets of In1, In2, and O1, O2 be sub-multisets of Out1, Out2 respectively. Assume that there is a monotone bijection φ from I1 ∪ I2 to O1 ∪ O2 such that φ(I1) = O2 and φ(I2) = O1. A connection of C1 and C2 over the connection function φ, denoted C1 ∘φ C2, is a symbolic derivation

\[
C = (V,\ \varphi(S_1 \cup S_2),\ K_1 \cup K_2,\ (\mathrm{In}_1 \cup \mathrm{In}_2) \setminus (I_1 \cup I_2),\ (\mathrm{Out}_1 \cup \mathrm{Out}_2) \setminus (O_1 \cup O_2))
\]

where:
  • (Ind, ≺) is defined by:
      – Ind = (Ind1 ∖ I1) ∪ (Ind2 ∖ I2);
      – ≺ is the transitive closure of the relation ≺1 ∪ ≺2;
  • φ is extended to a renaming of variables in Var(V1) ∪ Var(V2) such that φ(V1(i)) = V2(j) (resp. φ(V2(i)) = V1(j)) if i ∈ I1 (resp. I2) and φ(i) = j.

When the exact connection function in a connection does not matter, is uniquely defined, or is described otherwise, we will omit the subscript and denote it C1 ∘ C2. A connection is satisfiable if the resulting symbolic derivation is satisfiable.

Example 24. Let Ch be the symbolic derivation in Example 23:

\[
\begin{aligned}
\mathrm{Ind}_h &= \{0, \ldots, 8\} \qquad V_h = i \in \mathrm{Ind}_h \mapsto x_i \\
K_h &= \{A, B, pk(A), pk(B), sk(B)\} \\
\mathrm{In}_h &= \{5\} \qquad \mathrm{Out}_h = \{0, \ldots, 8, 8\} \\
S_h &= \{\, x_0 \stackrel{?}{=} A,\ x_1 \stackrel{?}{=} B,\ x_2 \stackrel{?}{=} pk(A),\ x_3 \stackrel{?}{=} pk(B),\ x_4 \stackrel{?}{=} sk(B), \\
&\qquad x_6 \stackrel{?}{=} dec_p(x_5, x_4),\ x_7 \stackrel{?}{=} f(x_6),\ x_8 \stackrel{?}{=} enc_p(x_7, x_2) \,\}
\end{aligned}
\]

We model the initial knowledge of the intruder with another symbolic derivation CK:

\[
\begin{aligned}
\mathrm{Ind}_K &= \{0_k, \ldots, 3_k\} \qquad V_K = i_k \in \mathrm{Ind}_K \mapsto y_i \\
K_K &= \{A, B, pk(A), pk(B)\} \\
\mathrm{In}_K &= \emptyset \qquad \mathrm{Out}_K = \mathrm{Ind}_K \\
S_K &= \{\, y_0 \stackrel{?}{=} A,\ y_1 \stackrel{?}{=} B,\ y_2 \stackrel{?}{=} pk(A),\ y_3 \stackrel{?}{=} pk(B) \,\}
\end{aligned}
\]
  • 106. and we let C′ be the following derivation:

\[
\begin{aligned}
\mathrm{Ind}' &= \{0', \ldots, 8'\} \qquad V' = i' \in \mathrm{Ind}' \mapsto z_i \\
K' &= \{n\} \subset C_{new} \\
\mathrm{In}' &= \{0', \ldots, 3', 8'\} \qquad \mathrm{Out}' = \{5'\} \cup \mathrm{Ind}' \\
S' &= \{\, z_4 \stackrel{?}{=} n,\ z_5 \stackrel{?}{=} enc_p(z_4, z_3),\ z_6 \stackrel{?}{=} f(z_4),\ z_7 \stackrel{?}{=} enc_p(z_6, z_2),\ z_8 \stackrel{?}{=} z_7 \,\}
\end{aligned}
\]

Let φ be the mapping from 0k, …, 3k, 5′, 8′ to 0′, …, 3′, 5, 8 respectively and ψ be a function of empty domain. Then we have (Ch ∘ψ CK) ∘φ C′:

\[
\begin{aligned}
\mathrm{Ind} &= \{0, \ldots, 4,\ 0_k, \ldots, 3_k,\ 4', \ldots, 7',\ 6, 7, 8\} \\
V &= V_h|_{\mathrm{Ind}} \cup V_K|_{\mathrm{Ind}} \cup V'|_{\mathrm{Ind}} \\
K &= \{A, B, pk(A), pk(B), sk(B), n\} \\
\mathrm{In} &= \emptyset \qquad \mathrm{Out} = \mathrm{Ind} \\
S &= \{\, x_0 \stackrel{?}{=} A,\ x_1 \stackrel{?}{=} B,\ x_2 \stackrel{?}{=} pk(A),\ x_3 \stackrel{?}{=} pk(B),\ x_4 \stackrel{?}{=} sk(B), \\
&\qquad x_6 \stackrel{?}{=} dec_p(x_5, x_4),\ x_7 \stackrel{?}{=} f(x_6),\ x_8 \stackrel{?}{=} enc_p(x_7, x_2), \\
&\qquad y_0 \stackrel{?}{=} A,\ y_1 \stackrel{?}{=} B,\ y_2 \stackrel{?}{=} pk(A),\ y_3 \stackrel{?}{=} pk(B), \\
&\qquad z_4 \stackrel{?}{=} n,\ z_5 \stackrel{?}{=} enc_p(z_4, z_3),\ z_6 \stackrel{?}{=} f(z_4),\ z_7 \stackrel{?}{=} enc_p(z_6, z_2),\ z_8 \stackrel{?}{=} z_7 \,\}
\end{aligned}
\]

with the ordering given by the two chains:

\[
0 \prec 1 \prec 2 \prec 3 \prec 4 \prec 5' \prec 6 \prec 7 \prec 8
\qquad\text{and}\qquad
0_k \prec \cdots \prec 3_k \prec 4' \prec \cdots \prec 7' \prec 8
\]

The connection of two symbolic derivations C1 and C2 identifies variables in the input of one with variables in the output of the other. Variables that have been identified are removed from the input/output set of the resulting symbolic derivation C. The set of equality constraints of C is the union of the equality constraints in C1 and C2, plus equalities stemming from the identification of input and output.

[Figure: the connection C = C1 ∘ C2 of two symbolic derivations with constraint systems S1 and S2.]

One easily checks that a connection of two symbolic derivations is also a symbolic derivation. Also, the associativity of function composition applied to the connection functions implies the associativity of the connection of symbolic derivations.
  • 107. Since connection functions are bijective, we will also identify C ∘ C′ and C′ ∘ C. Thus when we compose several symbolic derivations, we will freely re-arrange or remove parentheses.

Traces. Let C1 and C2 be two I-symbolic derivations and ϕ be a connection such that C = C1 ∘ϕ C2 = (V, S, K, In, Out) is closed. Lemma 6.1 implies that there exists a unique ground substitution τ in normal form such that any unifier σ of S1 ∪ S2 is equal to τ on the image of V. We denote by Tr_{C1 ∘ϕ C2}(C′) the restriction of this substitution τ to the variables in the sequence of C′, for C′ ∈ {C1, C2, C1 ∘ϕ C2}, and call it the trace of the connection on C′. In the rest of this chapter we will always assume that trace substitutions are in normal form.

6.5.2 Solutions of symbolic derivations

Honest and attacker symbolic derivations

We consider two types of symbolic derivations, one that is employed to model honest agents, and one to model an attacker.

Honest derivations. We do not impose constraints on the symbolic derivations representing honest principals, except for the avoidance of constants in Cnew, since these constants are employed to model new values created by an attacker. We assume that nonces created by the honest agents are created at the beginning of their execution and are constants away from Cnew.

Definition 41. (Honest symbolic derivations) A symbolic derivation C is an honest symbolic derivation, or HSD, if the constants appearing in C are away from Cnew.

Example 25. The symbolic derivation for role B in Example 23 is honest.

Attacker derivations.
We consider an attacker modeled by a symbolic derivation in which only the following actions are possible:
  • create a fresh, random value;
  • receive a message from, or send a message to, one of the honest participants;
  • deduce a new message from the set of already known messages;
and which satisfies the following two constraints:
  • every state is in Out, given that the intruder should be able to observe his own knowledge;
  • the set of states is totally ordered, given that we consider an actual execution.

The definition of attacker symbolic derivations models these constraints:

Definition 42. (Attacker symbolic derivations) A symbolic derivation C = (V, S, K, In, Out) is an attacker symbolic derivation, or ASD, if
  • 108.
  • (Ind, ≺) is a total order;
  • Out contains at least one occurrence of each index in Ind;
  • K is a subset of Cnew; and
  • S contains only equations of the form:
      – Test equation: V(i) ≟ V(j) for i, j ∈ Ind;
      – Deduction at state i: V(i) ≟ f(V(i1), …, V(in)), with i1, …, in ≺ i, and f a public symbol;
      – Nonce creation at state i: V(i) ≟ ci with ci ∈ Cnew.

The fact that the initial knowledge of the attacker is empty but for the nonces is not a restriction when analyzing protocols, as one can see from Ex. 24, and is justified in Sec. 6.5.4.

Example 26. The following derivation C′ is an ASD for the same deduction system as Example 23:

\[
\begin{aligned}
\mathrm{Ind}' &= \{0', \ldots, 8'\} \qquad V' = i' \in \mathrm{Ind}' \mapsto z_i \\
K' &= \{n\} \subset C_{new} \\
\mathrm{In}' &= \{0', \ldots, 3', 8'\} \qquad \mathrm{Out}' = \{5'\} \cup \mathrm{Ind}' \\
S' &= \{\, z_4 \stackrel{?}{=} n,\ z_5 \stackrel{?}{=} enc_p(z_4, z_3),\ z_6 \stackrel{?}{=} f(z_4),\ z_7 \stackrel{?}{=} enc_p(z_6, z_2),\ z_8 \stackrel{?}{=} z_7 \,\}
\end{aligned}
\]

Informally the ASD expresses that the attacker receives some key k, creates a nonce n, and sends the encrypted nonce to a role B as in Example 23. Then the attacker tries to check that applying f to n gives a term equal to the decryption of B’s response.

Solutions of a symbolic derivation. Given a symbolic derivation Ch we denote by Ch⋆ the set of couples (C, ϕ) where C is an ASD and ϕ is a connection function between C and Ch such that Ch ∘ C is closed and satisfiable. In that case we say that C is a solution of Ch, and we sometimes improperly refer to Ch⋆ as the set of solutions of Ch.

Example 27. In Example 24 the ASD C′ is a solution of Ch ∘ CK since (Ch ∘ψ CK) ∘φ C′ has no input variables and S is satisfiable (by simply propagating the equalities x0 = A, x1 = B, …).
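The syntactic conditions of Definition 42 lend themselves to a mechanical check. The following Python sketch is a hedged illustration: the function name `is_asd`, the encoding of indices as strings, the representation of the total order as a list, and the assumption that encp, decp and f are the public symbols of the deduction system are all choices made for this example only.

```python
def is_asd(ind, out, knowledge, equations, c_new, public):
    """Check the conditions of an attacker symbolic derivation.

    ind:        list of indices, in increasing order (a list encodes a total order)
    out:        list (multiset) of output indices
    knowledge:  set of constants K
    equations:  dict index -> ("test", j) | ("deduce", f, [i1, ..., in]) | ("nonce", c)
                (indices without an equation are inputs)
    c_new:      set of constants reserved for the attacker
    public:     set of public symbols
    """
    position = {index: rank for rank, index in enumerate(ind)}
    if len(position) != len(ind):          # an index listed twice: not an order
        return False
    if not set(ind) <= set(out):           # every state must be observable
        return False
    if not set(knowledge) <= set(c_new):   # K is a subset of Cnew
        return False
    for index, equation in equations.items():
        if index not in position:
            return False
        kind = equation[0]
        if kind == "test":
            if equation[1] not in position:
                return False
        elif kind == "deduce":
            _, symbol, args = equation
            if symbol not in public:
                return False
            # Every argument must come from a strictly earlier state.
            if any(j not in position or position[j] >= position[index] for j in args):
                return False
        elif kind == "nonce":
            if equation[1] not in c_new:
                return False
        else:
            return False
    return True


# Encoding of the derivation C' of Example 26.
ind = [f"{i}'" for i in range(9)]
out = ["5'"] + ind
equations = {
    "4'": ("nonce", "n"),
    "5'": ("deduce", "encp", ["4'", "3'"]),
    "6'": ("deduce", "f", ["4'"]),
    "7'": ("deduce", "encp", ["6'", "2'"]),
    "8'": ("test", "7'"),
}
example_is_asd = is_asd(ind, out, {"n"}, equations, c_new={"n"},
                        public={"encp", "decp", "f"})
```

The check is purely syntactic: it says nothing about whether a connection with an honest derivation is closed or satisfiable, which is what being a solution requires.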
  • 109. 6.5.3 Decision problems

Satisfiability. Though it is expressed using different notations, the problem of the existence of a secrecy attack on a protocol execution with a finite number of messages is equivalent, in the setting of this chapter, to the satisfiability problem below. It has been shown to be NP-complete in [190] for the standard Dolev-Yao deduction system.

I-Satisfiability
  Input: a HSD C
  Output: Sat iff C⋆ ≠ ∅, where C⋆ denotes the set of solutions of C

A variant of I-satisfiability is its restriction to inputs C that are ground symbolic derivations, which we call I-ground satisfiability.

I-Ground Satisfiability
  Input: a ground HSD C
  Output: Sat iff C⋆ ≠ ∅

Equivalence. As a special case of a hyperproperty we are interested in the equivalence of HSDs w.r.t. an active intruder.

Definition 43. Two HSDs Ch and Ch′ are symbolically equivalent iff Ch⋆ = Ch′⋆, i.e. iff they have the same set of solutions.

Thanks to Lemma 10.3, p. 200 we will see that when the states in the HSDs are totally ordered this notion is the same as the one of symbolic equivalence in [54].

I-Symbolic Equivalence
  Input: two honest I-symbolic derivations Ch and Ch′
  Output: Sat iff Ch⋆ = Ch′⋆

Again it is possible to define a ground version of the I-symbolic equivalence problem, in which the input consists of two ground symbolic derivations.

I-Ground Symbolic Equivalence
  Input: two honest I-ground symbolic derivations Ch and Ch′
  Output: Sat iff Ch⋆ = Ch′⋆

Remark. Let us remark that it makes sense to compare Ch and Ch′ only if there exists a bijection between the input and output states of these derivations such that every closed connection between an ASD and Ch can be mapped, using this bijection, to a closed connection between the same ASD and Ch′. In order to simplify notations we implicitly quantify over all connection functions such that a composition is closed and satisfiable, and consider the same connection (modulo the bijection) with the two HSDs Ch and Ch′.
  • 110. 6.5.4 Relation with static equivalence

The problem we consider is whether two cryptographic processes, represented by HSDs in our setting, are observationally equivalent, in the sense that an attacker cannot build a sequence of interactions that would produce different results when applied to the two processes. Solving this problem has many applications. For instance, if the two processes only differ by a data value, this shows that this data is confidential. In [5] the observational equivalence problem for an attacker who does not interact with the honest agents is reduced to the one of the static equivalence between two sequences of messages.

In the broader setting in which an attacker interacts online with the honest participants, [89] reduces the observational equivalence to trace equivalence for a class of processes corresponding to honest symbolic derivations. Their trace equivalence corresponds to symbolic equivalence in our setting.

Static equivalence.

Contexts. Let us first recall the notion of static equivalence between frames as introduced in [5]. A frame is a substitution σ of finite support {x1, …, xn} hiding a finite sequence c of constants, which is denoted νc · σ. A public constructor is a function symbol f of arity k such that, if the intruder knows t1, …, tk, he also knows f(t1, …, tk). A public context M over the frame νc · σ is a term whose variables are in the support of σ, whose constants are away from c and whose other symbols are public constructors. Finally, equality is defined modulo an equational theory E.

Constants. Without loss of generality, we can assume that all free constants in a context M are away from those appearing in σ: the rationale for this assumption is that if a free constant c0 is in σ but not in c we can always consider the public contexts on the frame νc, c0 · ({x0 ↦ c0} ∪ σ), which are the same (but for the replacement of c0 by x0) as those on the frame νc · σ.
This motivates the splitting of the set of free constants into two sets, C and Cnew, where C designates those free constants that can be used by honest users, and Cnew those that can be used by an attacker. We emphasize here that, as in [5], the attacker can manipulate terms containing constants in C. We have just ensured that these constants have to be passed explicitly to the attacker through the substitution σ. When considering symbolic derivations, this translates into imposing that the knowledge of an ASD must contain only constants in Cnew.

Let us now recast the definition of static equivalence, as stated in [5], according to these assumptions.

Definition 44. (Static equivalence) Two frames ϕ = νc · σ and ψ = νc′ · τ that have the same domain are statically equivalent if for any public contexts M and N whose constants are away from c ∪ c′ one has Mσ =E Nσ iff one has Mτ =E Nτ.

The definition of contexts corresponds to the notion of derivation in the following sense: we define I to be the deduction system defined over a signature
  • 111. F, modulo an equational theory E, with P equal to the set of public symbols. We note that, given the possible deductions, the quantification is over all symbolic derivations that take as input terms in the frame and constants away from these frames, and thus in Cnew. Static equivalence states that any couple (M, N) of contexts yields the same result in one frame iff it yields the same result in the other frame. This suggests expressing static equivalence of frames in terms of sets of solutions of symbolic derivations as follows.

First, to a substitution σ of finite support x1, …, xn we associate the closed symbolic derivation:

\[
C_\sigma = (V,\ \{V(i) \stackrel{?}{=} x_i\sigma\}_{i=1,\ldots,n},\ \mathrm{Image}(\sigma),\ \emptyset,\ \{1, \ldots, n\})
\]

with V of support {1, …, n}. To represent the construction of contexts by the attacker, we consider symbolic derivations CI = (VI, SI, cI, InI, ∅), with |InI| = n, and cI a finite subset of Cnew. The equality of two contexts M and N over σ can then be translated as the satisfiability of the following composition of symbolic derivations:

[Figure: a solution of Cσ. An attacker derivation receives the outputs V(1), …, V(n) of Cσ, whose constraints are {V(i) ≟ xiσ} for i ∈ {1, …, n}, builds the contexts M and N at states iM and iN, and contains the test equation V′(iM) ≟ V′(iN).]

Clearly, two frames νc · σ and νc · τ are statically equivalent, with the standard definition, iff for any ASD C′, C′ ∘ Cσ is closed and satisfiable iff C′ ∘ Cτ is closed and satisfiable. In our notation this is translated into the equality Cσ⋆ = Cτ⋆ of the sets of solutions, and the problem of deciding whether two closed frames are in static equivalence is the same problem as deciding whether two closed symbolic derivations are symbolically equivalent.

Relation with ground symbolic equivalence. One could have expected to have a definition of static equivalence in terms of ground symbolic equivalence. But such a definition would have made the problem more difficult.
Indeed, it has only been shown in [4] that, when there exists at least one free function symbol, the decidability of static equivalence implies the decidability of ground satisfiability. This was taken into account in [11], where it is proven that ground symbolic equivalence (in lieu of static equivalence) is modular.

Equational theories and equivalence

The original problem one is interested in is whether two cryptographic processes are bisimilar for an external observer. In [5] this problem is reduced to the one of the static equivalence between two sequences of ground messages. However the cryptographic operations considered were total, which means e.g. that a decryption applied on a message with a key always returns a message even
  • 112. when the decryption key does not match the encryption key. As a result, the observer is not aware of whether a cryptographic operation is successful. We note that under these assumptions the frames:

\[
\varphi = \nu a, k \cdot \{x_1 \mapsto enc(a, k),\ x_2 \mapsto k^{-1}\}
\qquad
\psi = \nu a, k', k \cdot \{x_1 \mapsto enc(a, k'),\ x_2 \mapsto k^{-1}\}
\]

are equivalent when assuming that an observer has no way to differentiate a =E dec(x1, x2) · ϕ and dec(enc(a, k′), k⁻¹) = dec(x1, x2) · ψ. This is e.g. the case when neither padding nor any other security measure permits one to check that the decryption has succeeded. But when one assumes that the cryptographic primitives abstracted by the enc and dec symbols are such that dec(enc(a, k′), k⁻¹) can be detected to be an incorrect decryption result (for example because it does not have a correct padding), the two frames ϕ and ψ shall be distinguishable. The choice between the two models shall be made on a per-operation basis and affects both the HSDs and the ASDs:

HSDs: In the second case, it makes sense to assume that there is no “decomposition” symbol in the honest symbolic derivations considered (assuming thereby that in a prudent implementation a raised exception would have stopped the execution), while in the first case this distinction is irrelevant.

ASDs: In the second case, we have to ensure that the traces seen by the intruder are equivalent w.r.t. the equational rules applied on the contexts constructed by the intruder, i.e. we have to ensure that the unification system is normalized in the same way when composing an ASD with two HSDs. Remembering that the equational theory models an arbitrary set of functions with the possibility of recursive calls, there is no generic way to ensure that one can check that the same functions are successfully called. However there is an important class of equational theories, namely those for which some complete narrowing strategy terminates, for which one can “symbolically” compute the possible function calls.
This was employed in the specific case of subterm equational theories in [75]. Technically, one guesses a set of narrowing steps on the unification system of an ASD before composing it with the HSDs. In the first case, one does not guess the normalization steps before composing, and just relies on the satisfiability of the unification system.

6.6 Conclusion

We have presented a formal model of cryptographic protocols which is amenable to security analysis via the resolution of some decision problems. However this model is defined for protocols described by narrations, and such a description is not always possible. Examples outside the scope of the translation presented include:
  • protocols with loops, in which a sequence of actions can be repeated until some criterion is satisfied;
  • 113.
  • protocols that do not fail silently when an unacceptable message is received;
  • protocols manipulating parameterized messages of unbounded size;
  • group protocols, which are parameterized by the (unbounded number of) members of a group, in which both the data and the actions can be parameterized;
  • protocols in which the participants have access to sets of pieces of data, e.g.:
      – certificate revocation lists;
      – databases, encoded by sets of messages;
      – sets of nonces already used;
      – timestamps;
      – ...

This list is not exhaustive, but most unabstracted cryptographic protocols already fall into one or another category. The AVISPA and Avantssar tools can partially handle some of these extensions, but we note that there is barely any published article on these extensions except with very strong limitations. For example:
  • T. Trüderung [206] has proposed an extension to finite protocols in which the knowledge of the intruder is defined by a regular tree language instead of being just a finite set of terms. It permits one to partially encode the messages acceptable by Web Services, though the possible manipulations on the messages by the honest participants are severely limited. An interesting extension of this work would be to consider the case in which the keys are not atomic;
  • R. Küsters and T. Wilke [140] consider the case in which the honest participants are modeled by regular transducers, i.e. finite state automata rewriting the received message into a response. They proved the decidability of the analysis for a class of regular transducers, and the undecidability for several extensions of this class;
  • N. Chridi, M. Turuani, and M. Rusinowitch [80] have considered a setting in which the restrictions on the possible manipulations by the honest participants are relaxed by using a severe tagging discipline;
  • While these two works impose restrictions on the messages, I have considered in collaboration with D. Lugiez and M. Rusinowitch the case in which honest participants can test the presence of a piece of data in a database [66] by using positive subterm constraints. However, in contrast with the two previously mentioned works, the setting adopted does not permit one to express constraints imposing e.g. that a message contains a sequence of messages of a particular type.
  • 114. The extension of these results to real protocols is still open, and promises to be a challenging direction for future research.
  • 116. Chapter 7

Proposition for WS Modeling

We present in this chapter a framework in which one can express the access control policy of a service as well as the transition rules dealing with both the access control policy on a workflow and its dynamic evolution. Each service is protected by a trust negotiation policy that controls the accessibility of the credentials used in the decision making in other services. Unlike most access control policies, which are based solely on roles, we chose an attribute-based framework, leading to more flexibility in the characterization of users. The strength of this framework is its ability to control and check the access control aspect of the services and its dynamic evolution based on an exchange of credentials. We provide a unified framework for reasoning on access control policies, trust negotiation and workflows.

7.1 Introduction

There is an increasingly widespread acceptance of Service-Oriented Architecture as a paradigm for integrating software applications within and across organizational boundaries. In this paradigm, independently developed and operated applications and resources are exposed as (Web) services. These services communicate with one another by passing messages over HTTP, SOAP, etc. A fundamental advantage of this paradigm is the possibility to orchestrate existing services in order to create new business services adapted to a given task. Several languages (WS-CDL [131], WSBPEL [128], BPMN [213], ...) have been proposed to describe the workflow of an orchestrating service. These languages can be given an operational semantics in terms of (extensions of) the π-calculus [149] or Petri nets [122].
  • 117. For business, security and legal reasons, it is necessary to control, within a workflow and on the workflow interface, in which contexts an action can be executed. This implies that, together with the workflow defining the orchestrating service, one has to provide an application-level security policy describing the role, separation of duty and other constraints to be enforced in the workflow. In order to foster agility (i.e. to specify the process so that it can be employed in a variety of environments) one usually adds a trust negotiation layer so that principals can get the chance to prove that they are legitimate users of the service.

Given the skills required to implement these aspects, they are usually separated into a security token server, an XACML firewall, a Business Process management system, plus additional ones for aspects abstracted in this paper. We have chosen to describe services with logical entities that gather all the aspects pertaining to one application or resource. The main originality of this work is the interplay between workflow execution and access control which is permitted by this unified framework. It permits us to express naturally the constraints that are encountered when dealing with real-life business processes.

Related works. There already exist some works aiming at adding an access control aspect to workflows. In [35, 175] the access control is specified with roles that can execute activities, users that have attributes allowing them to enter roles, and an ordering on activities. We believe that the RBAC-WS-BPEL language is significantly less expressive than our proposal.
In particular it does not provide for dynamic separation of duty constraints, or other complex constraints based on the documents exchanged and the environment of execution. In [133] a framework is proposed in which even messages are interpreted as mobile processes, and in which processes communicate with one another to exchange credentials. The trust negotiation rules and their evaluation are similar to what we propose, but the workflow description is absent, and thus we believe it to be much harder to express fine access control policies that depend on the execution so far of a process. Moreover the overall architecture is completely different. In [121, 30, 107] the workflow is embedded within the access control system, i.e. the possible evolutions of a process are embedded in the access control rules. Another point is that there is no notion of local state, which is replaced by the proof of reachability of a state. This approach implies that one does not follow exactly how many times a given task is executed.

In Sect. 7.2 we give an informal description of the model. We present the access control rules and the workflow in Sect. 7.3. Section 7.4 gives the semantics of access control rules and Section 7.5 presents the operational semantics of the workflow.

7.2 The model

Our aim is to develop a language that is capable of managing access control policies and state evolution in a distributed environment. In this section we
  • 118. present the structure of our framework by defining the different constituents of the model.

7.2.1 Presentation of the car registration process (CRP)

Before giving a formal description of the model, we present a concrete case study [202] to illustrate the use of this dynamic framework. Mike is a citizen and wants to register his newly purchased car. To do so he sends a completed registration form to the car registration office along with all the necessary documents. The car registration office acts as a portal between the employees that study the document form and make a decision on one hand, and the central repository where the forms are to be stored on the other hand. The car registration office allows employees to access and store documents in its local repository. When a request form is studied and a decision is made, the document has to be stored in the central repository and the citizen has to be notified of the decision through the car registration office. Employees can access documents in the central repository, and they can store documents in the central repository only if they have a certificate from their boss. The Registration office central authority provides the needed certificates for both the employees and the head of the car registration office. Employees can access the documents in the local repository, make comments and store them back in the local repository at all times. Once a decision is taken, the document shall be stored in the central repository and the citizen is to be notified.

7.2.2 On the encoding of CRP into our framework

An overall view leads us to define three distinct concepts upon which the model is built.

An entity is an abstract service formed of a set of access control rules, a set of negotiation rules, a repository containing certificates and documents, and a workflow that orchestrates the state evolution. In addition, an entity possesses a set of local identifiers that can be used in any rule within the entity.
The access control policy of the entity is state-based and attribute-based, i.e. the decisions are taken by examining its local state and provided certificates. In the above example we can distinguish between four different entities, namely the car registration office (CRO), the central repository (CR), the central authority (CA) and the employee (Empl), each having its own access control policy and set of permitted actions. For example, the access control policy of CR states that an employee can store a document if a certificate from his/her boss certifies that he/she can store documents in the central repository, whereas in the CRO a certificate stating that the user is an employee is enough to allow the user to store a document in the local repository.

A local state associates values to the local identifiers and to the workflow variables. The local state of an entity evolves depending on the actions performed
  • 119. by users of that entity. Certificates can be added or modified and possibly removed according to the transition policy of the entity, and messages can be received, stored or sent. In contrast with e.g. the applied π-calculus, the local state is not encoded by active substitutions within the workflow. The rationale for this choice is that the value of local identifiers is to be employed both within the workflow and within the trust negotiation system, and that using active substitutions would have significantly increased the intricacy of the trust negotiation part.

Certificates and documents are used as a base for access control decision making within an entity. However we distinguish between the documents in general and the certificates as follows: the documents contain information on the resources and are internally modified or directly sent to the concerned entity, while certificates provide information on the users and are negotiated with other entities.

We define a document to be a list of couples (att, v) where att ∈ ATT, the set of attributes (e.g. subject, object, value, rank, action, ...), and v ∈ VAL, the associated set of attribute values.

Note that this modeling of documents assumes an abstraction phase in which the properties of a document that pertain to access control are defined w.r.t. the document’s content, and then represented as attributes of this document. One could e.g. define how a requester name can be extracted from a form by an XPath expression, and set the requester attribute of the form to the result of the evaluation of this XPath query on the form. For example, the document representing a car registration form will be viewed as a set of attributes such as

    {(issuer, Citizen), (requestId, ID), (decision, V), (comments, Txt), ...}

A certificate is a more sensitive structure since it is exchanged via some trust negotiation policy.
That is why we choose to model a certificate as a document that holds the attributes (e.g. the role of a subject) with four additional parameters. Namely, every certificate has a certifier cert, which represents the entity that signs it, a recipient recp that specifies the intended audience, an issuer iss, and a subject subj on which the certificate specifies attributes. Note that we do not represent in a certificate which entity sends or receives it, nor which entity it is sent to or received from. As such we define a certificate to be an object of the form:

    (Cert, Recp, Iss, Subj, {(att, v)}att∈ATT)

In order to simplify notation, C.cert, C.recp, C.iss and C.subj represent respectively the first, second, third and fourth argument of a certificate. We assume the existence of two special constants ⊥ and any with the following interpretation:
  • If C.cert = any the certificate is not signed, and if C.recp = any the document part is not encrypted. Otherwise the certificate is respectively signed with the certifier’s signature key, and the set of attributes is encrypted with the receiver’s public key;
  • 120. 
    • For any attribute att ∉ {cert, recp, iss, subj}, we have C.att = ⊥ iff the attribute is not defined in the document.
    Example: The certificate "Peter says John is Employee and has 5 years experience", certified by ca, is represented by the 5-tuple
        (ca, any, peter, john, {(role, empl), (exper, 5)})
    In the example above we assume that the certificate can be transmitted among the entities with no restrictions on the recipient. The extra parameters associated to a certificate are often necessary to prevent attacks on the identity of the certificate subject. Unlike documents, certificates are not supposed to be modified. Accordingly any modification of a certificate is to be done by the issuer iss of the certificate and certified by the certifying authority mentioned in cert.
    The specification of the recipient is independent from the trust policy of the entities, which determines to whom the certificate can be sent. A certificate may have both a sending policy and a receiving policy, which basically depend on the security infrastructure, i.e. on which other entities one entity can communicate with securely. The sending policy is decided by the entity holding the certificate, whereas the receiving policy is defined by the entities receiving a certificate, which are supposed to determine what certificates to expect when making a decision.
    Workflow. The last feature introduced in our framework has to do with the dynamic aspect of the language. The access control policy controls the permission of certain tasks based on a set of preconditions evaluated in the current state of the entity. However these tasks have an effect on the state of the entity and therefore on the subsequent access control decisions.
    In short, the entities have a core layer characterized by the capacity to execute actions triggered by internal access control rules (and possibly by reception of a request from the network). 
    The preconditions for action execution depend on constraints provided by the workflow, but also on certificate retrieval. The workflow is the orchestrator of the entity: it manages the communication of messages and indicates the possible transitions in the core of the entity. Finally the trust policy can be viewed as an access control policy on the certificates within the entity; it manages the trust establishment.
    7.3 Syntax
    In this section we give a formal description of the model. We start by defining the syntax that shall be used before defining the access control rules and the workflow.
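    The certificate structure described informally above can be sketched as a small data type. This is only an illustrative Python sketch, not part of the formal model: the class name `Certificate`, the helper methods and the constants `BOTTOM` and `ANY` are names assumed here to stand for the 5-tuple, the value ⊥ and the constant any.

```python
from dataclasses import dataclass

BOTTOM = "⊥"   # assumed encoding of the undefined value ⊥
ANY = "any"    # assumed encoding of the special constant any


@dataclass(frozen=True)
class Certificate:
    """Sketch of the 5-tuple (Cert, Recp, Iss, Subj, {(att, v)})."""
    cert: str      # certifier, i.e. the entity that signs the certificate
    recp: str      # recipient, i.e. the intended audience
    iss: str       # issuer
    subj: str      # subject the attributes are about
    attrs: tuple   # the document part, as a tuple of (att, v) pairs

    def get(self, att):
        """C.att: one of the four parameters, or a document attribute."""
        if att in ("cert", "recp", "iss", "subj"):
            return getattr(self, att)
        # A document attribute that is not defined evaluates to ⊥.
        return dict(self.attrs).get(att, BOTTOM)

    def is_signed(self):
        # C.cert = any means that the certificate is not signed.
        return self.cert != ANY

    def is_encrypted(self):
        # C.recp = any means that the document part is not encrypted.
        return self.recp != ANY


# "Peter says John is Employee and has 5 years experience", certified by ca.
example = Certificate("ca", ANY, "peter", "john",
                      (("role", "empl"), ("exper", 5)))
```

    With this encoding, `example.get("role")` yields `empl` while an attribute absent from the document part, such as `rank`, yields ⊥.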
  • 121. 7.3.1 Values and terms
    Before presenting the formal model, we define the syntax for the access control rules. The values correspond to terms that can be memorized by an entity, while messages are employed to exchange values between entities.
    Ground terms. We consider a set C of constants denoted using the Prolog convention (names begin with a lowercase letter for constants, and with an uppercase letter for variables). We let Att ⊆ C be the set of attributes, and Act ⊆ C be a set of action names. We define:
    • Ground atomic values A := ⊥ | any | self | c where c ∈ C;
    • Ground attributes are pairs (a, t) where t is a ground atomic value and a ∈ Att;
    • Ground documents D are finite sets of ground attributes;
    • Ground certificates are 5-tuples (t1, t2, t3, t4, D) where t1, t2, t3 and t4 are ground atomic values denoting respectively the certifier, the recipient, the issuer and the subject, and D is a ground document;
    • Ground values are either ground atomic values, ground documents or ground certificates.
    The type discipline defined by this grammar ensures that given a finite number n of constants, there is at most an exponential number of possible different ground documents, and thus an exponential number of different ground certificates.
    Variables, substitutions and terms. We assume that we have a denumerable set V of typed variables denoted using the Prolog convention. The type of a variable can be one of {atomic, document, certificate}. A ground substitution is a mapping from variables to ground values. A ground substitution is well-typed whenever it maps variables to ground values of the same type. The domain of a substitution is the set of variables on which it is defined. Finally, a value is either a ground value, a variable, or X.a where X is of type document or certificate and a is an attribute.
    Lists and tasks. 
    We structure information within the entities by using lists and sets of values, which are denoted respectively v1 · ... · vn and {v1, ..., vn}. If all values in a list or set are ground we say that the list or set is ground. In order to represent in the access control policy the invocations of sub-processes, we define tasks that are denoted τ(v1, ..., vn), where τ ∈ Act and the vi are values. A term is either a value, a list, a set or a task. A term is ground if it is a ground value, list, set or task. If the maximal arity in tasks and lists is fixed, there exists at most an exponential (w.r.t. the number of constants) number of different ground tasks and ground lists, a doubly exponential number of sets,
  • 122. and thus a doubly exponential number of terms. Given a set C of constants we denote H(C) the set of ground terms built over these constants. We note that this set is at most of doubly exponential size w.r.t. the number of constants.
    Messages and certificate messages. Messages are employed to exchange ground terms between entities. We distinguish two kinds of messages:
    • A certificate message CM is a triple cert(C, t1, t2) where C is a ground certificate and t1, t2 are ground terms denoting the sender and receiver respectively;
    • A message has the form msg(L, t1, t2, τ) where L is a ground list, t1, t2 are atomic values denoting the sender and receiver respectively, and τ ∈ Act.
    7.3.2 Access control rules
    The entity has two sets of rules: one is responsible for the protection of the certificate exchange and the other manages the permissions for the tasks that can be executed within the entity. Although both are represented by predicate logic rules, their purpose and semantics are different. We shall first present the rules that govern the trust negotiation. We then define the access control rules that govern the dynamic evolution of the entities. The rule evaluation semantics will be presented in Sect. 7.4.
    Trust negotiation.
    In a distributed environment entities need to exchange information in order to validate the decision of another entity. This is done via certificates containing information (which may be sensitive) about the users or resources that act on its behalf in other entities. We model this exchange via a trust negotiation mechanism where each entity can set its own trust policy for the disclosure of certificates to other entities. The trust negotiation is triggered by a request that usually arises either during the evaluation of an access control rule or during a negotiation session. These rules have the form:
        put(C, t) ← body
    where put(C, t) allows the disclosure of certificate C (i.e. a value of type certificate) to an entity t (a value of type atomic) whenever the conditions in the body of the rule are satisfied.
    Access control policy.
    When writing a Business Process, one usually differentiates between atomic actions, tasks [117] which are defined by partial orderings on atomic actions, and business roles which are entities to which a set of tasks is assigned. We
  • 123. have chosen instead to consider only the notion of task as a named process that encompasses the notions of activity, task and role. The access control aspect is woven into the workflow by checking, whenever a task is initiated, whether it is permitted by the access control policy.
    This access control policy governs the decision making prior to the execution of actions and consists of a set of rules of the form
        Permit(τ(v1, ..., vn)) ← body
    where τ is an action name and v1, ..., vn are the parameters of the task, which are values of any type. Permit(τ(v1, ..., vn)) allows the execution of the task τ when the conditions in the body of the rule are satisfied with the instance of the parameters v1, ..., vn. Note however that since access control rules are only evaluated when a task is initiated, it is possible that the body of the rule is satisfied with an instance σ of the parameters, but that the task cannot be executed with this instance because it is not ready to be executed in the workflow.
    Evaluation of conditions.
    The conditions in the body of the rules are defined as follows:
        body := ⊤ | Test | body ∧ body | body ∨ body
        Test := has(t, S) | get(C, t) | t = t′ | t ≠ t′
    with C a certificate, v an atomic value, S a set and t, t′ values.
    has(t, S) queries the given set S for the value t. It returns true if t is in the set S and false otherwise;
    t = t′ (resp. t ≠ t′) returns true if the relation is satisfied, false otherwise. This is used e.g. to check for an attribute value such as C.name = John, for attribute matching such as C1.name = C.sender, or to check that an attribute is undefined, as in C.value = ⊥;
    get(C, t) involves negotiating certificates with other entities. 
    get(C, t) initiates a trust negotiation mechanism with the entity t and returns true if the entity t agrees to disclose the certificate C.
    In our running example, a possible trust negotiation policy is:
    T1: The roles are public and can be sent to anyone (words beginning with capital letters denote variables):
        put((ca, any, ca, U, {(role, Z)}), E) ← has((ca, any, ca, U, {(role, Z)}), orgCert)
    T2: Alternatively, one could mandate that these certificates are only readable by users trusted by organization org:
        put((ca, U, ca, X, {(role, Z)}), E) ← has((ca, X, ca, X, {(role, Z)}), orgCert) ∧ get((org, ca, org, U, {(trusted, isTrusted)}), org)
  • 124. Assume C is the certificate (ca, any, peter, john, {(role, empl)}) and C′ is the certificate (org, any, org, cro, {(trusted, isTrusted)}). Notice that T1 will answer yes to a query C of the entity cro only if C is in the database of ca. On the other hand the rule T2 requires a trust negotiation between ca and org to get the certificate C′ before giving an answer to cro. That is, get(C′, org) returns true in T2 if there exists in the entity org a rule whose body is satisfied with an instance of the head put(C′, ca).
    Note also that given a certificate C and an attribute name a, if the condition C.a occurs in the body of a rule, an additional condition should be added, namely C.recp = self ∨ C.recp = any, to ensure that the attributes are readable. Conversely, for rules put(C, E) ← body, we assume that either
    • there is a condition get(C, t) or has(C, S) in the body,
    • or the issuer of the certificate is self, and the certifier is self or any.
    Let us now consider the access control rule:
        Permit(store(U, Doc)) ← has(X, CertifList) ∧ (X.recp = self ∨ X.recp = any) ∧ X.subj = U ∧ X.role = empl
    This rule returns true if CertifList contains a certificate X (readable by the entity or by anyone) such that the attribute role of this certificate has the value empl. The certificate C satisfies these conditions if U is instantiated with john. Thus the action store(john, Doc) is permitted if C is in CertifList, and there is no trust negotiation otherwise. 
    Now, if the access control rule is:
        Permit(store(U, Doc)) ← (get(X, ca) ∨ has(X, CertifList)) ∧ X.subj = U ∧ X.role = empl ∧ X.cert = ca ∧ X.iss = peter
    then a trust negotiation phase would begin if no matching certificate is found in the instance of CertifList.
    Discussion.
    In these rules we suppose that the entities know each other, and in particular that a given entity knows the entity with which the negotiation is to be performed. The certificates constitute the credentials needed to authenticate a user or a permission on which a decision is based. The certificates an entity needs in order to reach a decision are specified by the get(C, t) conditions in the deciding entity. On the other hand, a policy that determines what certificates to send is modeled in the entity possessing the certificates through put(C, t1). We assume that the communication of certificates is done on authentic and confidential channels. Further we assume that no certificate is kept when the state changes, that is, the computation of possible certificates is performed after each state change.
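    The local evaluation of the first store rule of this section, the one that relies on has(X, CertifList) only, can be sketched as follows. This is a minimal illustration under assumptions made here: certificates are encoded as Python dictionaries, `permit_store` and its parameter names are hypothetical, and no trust negotiation (get) is modeled.

```python
ANY = "any"  # assumed encoding of the special constant any


def permit_store(user, certif_list, self_id):
    """Sketch of the rule
         Permit(store(U, Doc)) <- has(X, CertifList)
                                  and (X.recp = self or X.recp = any)
                                  and X.subj = U and X.role = empl
    evaluated locally in the entity named self_id."""
    for x in certif_list:                      # has(X, CertifList)
        readable = x["recp"] in (self_id, ANY)
        if (readable
                and x["subj"] == user          # X.subj = U
                and x["attrs"].get("role") == "empl"):
            return True
    return False


# The certificate C of the running example, in the assumed encoding.
c = {"cert": "ca", "recp": ANY, "iss": "peter", "subj": "john",
     "attrs": {"role": "empl"}}
```

    Under these assumptions `permit_store("john", [c], "org")` holds, whereas the same query for another subject, or with an empty certificate list, fails without any negotiation taking place.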
  • 125. 7.3.3 Workflow
    What we have so far is a system of entities that can perform a predetermined set of tasks. The tasks are protected by the access control policy of an entity and the trust negotiation policy of this and the other entities. We assume that the trust negotiation is done outside the scope of local rule evaluation in an entity. As such, in the remainder of this discussion we assume that we are given a valid sequence α of certificate messages.
    We define processes in a language whose syntax is borrowed from existing process algebra languages. An action is possible in a process if there exists a reduction rule that consumes this action. We say a task τ(v1, ..., vn) is executable if it is both permitted by the access control policy and possible in the workflow. A reception is executable if there exists a matching message that is waiting to be received. Other possible actions are always executable. The workflow gives an order on the tasks performed by various agents within the entity to complete a given procedure in the environment.
    Atomic actions.
    We start by defining the atomic actions that will be used to define the workflow. The actions are defined with the following grammar:
        Action := τ(v1, ..., vn) | νx1, ..., xn | snd(v1 · ... · vn, vr, τ) | rcv(v1 · ... · vn, vs, τ) | add(v, S) | rmv(v, S) | modify(a, X, v)
    where v, vs, vr, v1, ..., vn are values, the xi are variables that have a value type, τ is an action name, X is a document or certificate and S is a set. Let us now describe the different actions.
    - An action τ(v1, ..., vn), whose execution consists in its replacement by a process Pσ provided that there exists a definition τ(x1, ..., xn) = P and σ is the substitution mapping the variables xi to the values vi;
    - νx1, ..., xn is defined with respect to the local state of the entity (i, ρi, σi, Wi) (see below) and extends the σi of the entity with new variables x1, ..., xn which are mapped to the ⊥ (undefined) value;
    - snd(v1 · ... · vn, vr, τ) sends a message with payload v1, ..., vn to an entity vr to access operation τ. Note that τ is the action name for an action to be performed on the entity vr;
    - rcv(v1 · ... · vn, vs, τ) is the reception in operation τ of a message with payload v1, ..., vn from the entity vs;
    - add(v, S) adds the value v to a set S in the local state of the entity;
    - rmv(v, S) removes the value v from the set S;
  • 126. 
    - modify(a, X, v) replaces the value of the attribute a in the certificate or document X by the atomic value v. If v = ⊥ it undefines the attribute. If the attribute a is not defined in X, it creates a new attribute and assigns the value v to the freshly created attribute.
    Processes and workflows.
    The state change is modeled using a transition system. The change is subject to the access control evaluation, the workflow constraints and the message exchange. Formally we define:
    Task: A task definition is the definition of a named process:
        T := τ(x1, ..., xn) = P
    where P is a process and the xi are variables.
    Process: Processes are defined by the usual combinations of atomic actions, as given by the following grammar:
        P := Action | P ; P | P! | P || P | P + P
    where ;, !, || and + stand respectively for the sequence, iteration, parallel composition and non-deterministic choice of processes.
    Workflow: A workflow of an application is specified by a set of task definitions τ(x1, ..., xn) = P and by a process.
    The operational semantics for the workflow will be presented in Sect. 7.5.
    7.3.4 Entities and states
    Entities. We define an entity by a 4-tuple (i, σi, ρi, Wi) where
    i is a unique identifier that denotes the entity's name.
    σi : param → values is a local substitution that evolves and is updated with state transitions.
    ρi is a set of access control rules that model the access control policy and the trust negotiation policy of the entity.
    Wi is a workflow that gives an order for the task execution.
    Entities and multisets of entities are denoted respectively E and E, and decorations thereof.
  • 127. Global states. We use multiset rewriting (see [52] for a presentation and for its relation with the π-calculus) to specify global states of the system under analysis. A state is a couple of:
    • A multiset M that represents messages that have been sent and not yet received. This multiset permits us to consider asynchronous communications between entities.
    • A multiset E of entities that represents the different service instances (with their multiplicity) at the current point of execution.
    We assume that in an initial state, the multiset M of messages is empty. We present the transition relation on the states in the next two sections. In Sect. 7.4 we present the semantics for trust negotiation, on which we rely in Sect. 7.5 to define one-step transitions.
    7.3.5 Example
    We extract from our running example the following workflow:
        store(X, Y) = modify(status, Y, ⊥); add(Y, DocList)
        W = νU, Doc; rcv(Doc, U, store_op); store(U, Doc)
    in the entity (i, ρi, {DocList → ∅}, W). The first executable action is νU, Doc, which creates new variables and results in the local state:
        (i, ρi, {DocList → ∅, U → ⊥, Doc → ⊥}, rcv(Doc, U, store_op); store(U, Doc))
    The action rcv(Doc, U, store_op) is now executable. Assuming a matching message msg(doc0, u0, i, store_op) is waiting to be received, this action can be executed, and will result in the entity state:
        (i, ρi, {DocList → ∅, U → u0, Doc → doc0}, store(U, Doc))
    This action is then replaced by the definition of store(X, Y) by substituting X with U and Y with Doc. This replacement is permitted if Permit(store(u0, doc0)) is derivable from the access control policy, and will result in the entity state:
        (i, ρi, {DocList → ∅, U → u0, Doc → doc0}, modify(status, Doc, ⊥); add(Doc, DocList))
    In Sect. 7.4 and 7.5 we formalize the transition rules on global states, and thereby the operational semantics for processes and entities.
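    The evolution of the local substitution in the example of Sect. 7.3.5 can be replayed with a few lines of code. This is only a sketch under assumptions made here: documents are encoded as dictionaries, ⊥ is encoded as `None`, the access control decision is abstracted into a `permit` callback, and the function name `run_store_workflow` is hypothetical.

```python
BOTTOM = None  # assumed encoding of the undefined value ⊥


def run_store_workflow(message, permit):
    """Replays W = νU, Doc; rcv(Doc, U, store_op); store(U, Doc)
    and returns the final local substitution of the entity."""
    sigma = {"DocList": frozenset()}

    # νU, Doc: create the new variables, mapped to ⊥.
    sigma.update({"U": BOTTOM, "Doc": BOTTOM})

    # rcv(Doc, U, store_op): consume a matching message msg(L, s, r, τ).
    payload, sender, _receiver, operation = message
    if operation != "store_op":
        return sigma                       # no matching message
    sigma["Doc"], sigma["U"] = payload, sender

    # store(U, Doc) is unfolded only if the policy permits it.
    if not permit(sigma["U"], sigma["Doc"]):
        return sigma

    # modify(status, Doc, ⊥): undefine the attribute status.
    doc = {a: v for a, v in sigma["Doc"].items() if a != "status"}
    sigma["Doc"] = doc

    # add(Doc, DocList): documents are frozen to be storable in a set.
    sigma["DocList"] = sigma["DocList"] | {frozenset(doc.items())}
    return sigma


msg0 = ({"status": "draft", "issuer": "citizen"}, "u0", "i", "store_op")
final = run_store_workflow(msg0, permit=lambda user, doc: True)
```

    When the callback refuses the task, the substitution stays in the state reached after the reception, which mirrors the fact that a task is executable only if it is both possible in the workflow and permitted by the policy.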
  • 128. 7.4 Semantics for access control
    7.4.1 Application of substitution in an entity
    We distinguish between three types of values, namely terms instantiated by constant values, certificates, documents, sets and lists. We assume that variables are of one of these types. We define in this subsection the application of a substitution σ in the context of an entity Ei = (i, ρi, σi, Wi). Assuming that all substitutions are well-typed, we define, when applying a substitution σ in ρi:
    - For a variable x ∈ V: [[x]]_i^σ = xσi if x ∈ dom(σi) and xσi ≠ ⊥, and [[x]]_i^σ = xσ otherwise;
    - For a constant c ∈ C: [[c]]_i^σ = c;
    - For self: [[self]]_i^σ = i, the identity name of the entity E;
    - For a certificate or document X: [[X.a]]_i^σ = v if [[a]]_i^σ = att and (att, v) ∈ [[X]]_i^σ, and [[X.a]]_i^σ = ⊥ if [[a]]_i^σ = att and (att, v) ∉ [[X]]_i^σ for all v;
    - For a task τ ∈ Act: [[τ(v1, ..., vn)]]_i^σ = τ([[v1]]_i^σ, ..., [[vn]]_i^σ).
    7.4.2 Predicate evaluation
    We start by giving meaning to the evaluation of predicates in order to define later the evaluation of rules of the form h ← body. We use the notation |=i to express that the predicate evaluation is local to the rules in entity E of identifier i but takes into account the global exchange of certificates. As such, let α0 be the set of communicated certificates, and let σ be a ground well-typed substitution.
    Recall that M represents the multiset of messages sent but not yet received and E represents the multiset of entities. The expression S + s represents the fact that there exists an element s in the multiset S. Subsequently, the notation S denotes that the element s was omitted from S.
    - M, E + (i, ρi, σi, Wi), α0, σ |=i ⊤;
    - M, E + (i, ρi, σi, Wi), α0, σ |=i get(v, t) if ([[t]]_i^σ, [[v]]_i^σ, i) ∈ α0;
    - M, E + (i, ρi, σi, Wi), α0, σ |=i has(v, S) if there exists a set [[S]]_i^σ in range(σi) such that [[v]]_i^σ ∈ [[S]]_i^σ;
    - M, E + (i, ρi, σi, Wi), α0, σ |=i x = y (resp. x ≠ y) if [[x]]_i^σ = [[y]]_i^σ (resp. [[x]]_i^σ ≠ [[y]]_i^σ).
    7.4.3 Rule evaluation
    Trust negotiation.
    Trust negotiation is a global mechanism and its result is evaluated in the global state. A certificate c can be sent by i to the requester r if, in entity Ei = (i, ρi, σi, Wi),
        M, E + (i, ρi, σi, Wi), α0 |=i put(c, r)
  • 129. is true, that is if there exists a rule h ← body in ρi and a ground well-typed substitution σ such that:
        [[h]]_i^σ = put(c, r)
        M, E + (i, ρi, σi, Wi), α0 |=i body
    A trust negotiation for a certificate (c, i, r), where i is the sender and r the receiver, is a success if the certificate is deducible from the previous sequence of already communicated certificates. Namely, given the current global state and a possibly empty initial sequence of certificates α0,
        M, E, α0 |= (c, i, r) iff M, E + (i, ρi, σi, Wi), α0 |=i put(c, r)
    A trust negotiation for a certificate sequence α is a success if for every certificate message in α we can check that the certificate is deducible from the previous sequence of already communicated certificates. Namely, given a global state with a set of already sent messages α0:
        M, E, α0 |= (c, i, r) · α iff M, E + (i, ρi, σi, Wi), α0 |= (c, i, r) and M, E + (i, ρi, σi, Wi), α0 · (c, i, r) |= α
    When the sequence of certificates is empty, we set M, E, α |= λ.
    Access control rules.
    We now present the evaluation of access control rules. We start with the semantics of the local evaluation: given an entity Ei = (i, ρi, σi, Wi) ∈ E we say that
        M, E + (i, ρi, σi, Wi) |=i Permit(τ(v1, ..., vn))
    is true if there exists a ground sequence of certificates α and a rule h ← body ∈ ρi such that
        [[h]]_i^σ = Permit(τ(v1, ..., vn))
        M, E + (i, ρi, σi, Wi), α |=i body
        M, E |= α
    7.5 Workflow operational semantics
    We present below the reduction rules for atomic actions that are responsible for the state evolution of the workflow. We shall first present the notion of evaluation context, that is, a context C[−] whose hole is under an iteration, an input or an output. We shall use this notion to restrict the process substitution to one given process outside the scope of parallelism. We assume that new variables can only be created by ν. In what follows we give the semantics for the transition relations. Recall that the local state of the entity is defined by the tuple (i, σi, ρ, W).
  • 130. 
    Variable creation
        M, E + (i, σi, ρ, C[νx1, ..., xn.P])
            ↓    provided {x1, ..., xn} ∩ dom(σi) = ∅
        M, E + (i, σi′, ρ, C[P])
    with σi′ = x → ⊥ if x ∈ {x1, ..., xn}, and x → xσi otherwise.
    Task invocation
    If there exists a sequence of certificate messages α such that M, E + (i, σi, ρ, W), α |=i Permit([[τ(x1, ..., xn)]]_i^σ):
        M, E + (i, σi, ρ, C[τ(x1, ..., xn).P])
            ↓ τ(x1, ..., xn)σi
        M, E + (i, σi′, ρ, C[pi(x1, ..., xn).P])
    where τ(x1, ..., xn) = pi(x1, ..., xn) is defined in the workflow and σi′ = x → [[x]]_i^σ for x ∈ dom(σi).
    Send action
        M, E + (i, σi, ρ, C[snd(v1 · ... · vn, vr, τ).P])
            ↓ snd(v1 · ... · vn, vr, τ)σi
        M + msg(v1 · ... · vn, i, vr, τ)σi, E + (i, σi, ρ, C[P])
    Receive action
        M + msg(t1 · ... · tn, s, i, τ), E + (i, σi, ρ, C[rcv(v1 · ... · vn, vs, τ).P])
            ↓ rcv(t1 · ... · tn, s, τ)    provided vi σ = ti and vs σ = s
        M, E + (i, next_rcv(σi, σ), ρ, C[P])
    with next_rcv(σi, σ) = x → xσ if x ∈ {v1, ..., vn, vs}, and x → xσi otherwise.
    Add action
        M, E + (i, σi, ρ, C[add(v, S).P])
            ↓ add(vσi, Sσi)
        M, E + (i, σi′, ρ, C[P])
    with σi′ = x → {[[v]]_i^σ} ∪ [[S]]_i^σ if x = S, and x → xσi otherwise.
    Remove action
        M, E + (i, σi, ρ, C[rmv(v, S).P])
            ↓ rmv(vσi, Sσi)
        M, E + (i, σi′, ρ, C[P])
    with σi′ = x → Sσi \ {vσi} if x = S, and x → xσi otherwise.
  • 131. 
    Modify action (attribute not yet defined)
        M, E + (i, σi, ρ, C[modify(a, X, v).P])
            ↓ modify(a, Xσi, vσi)    provided Xσi.a = ⊥
        M, E + (i, σi′, ρ, C[P])
    with σi′ = x → Xσi ∪ {(a, vσi)} if x = X, and x → xσi otherwise.
    Modify action (attribute already defined)
        M, E + (i, σi, ρ, C[modify(a, X, v).P])
            ↓ modify(a, Xσi, vσi)    provided (a, t) ∈ Xσi and t ≠ vσi
        M, E + (i, σi′, ρ, C[P])
    with σi′ = x → (Xσi \ {(a, t)}) ∪ {(a, vσi)} if x = X, and x → xσi otherwise.
    7.6 Conclusion
    We have defined a logical framework to express the dynamic evolution of an entity by defining, on one hand, a set of access control rules that take into account trust negotiation with other entities in the environment and, on the other hand, a workflow that describes the state evolution. The workflow processes the execution of permitted tasks within the entity and the communication of messages with other entities. The communication is asynchronous; however the communication of messages synchronizes the execution of the different workflows, since receptions act as guards on the execution of tasks. This framework can be seen as a generic model that mimics the work of a business process. Each entity represents the flow of a given service and the business process is represented by the global flow. Future work is in the direction of formalizing the notion of message communication. We also plan to explore the expressivity of this framework by examining the notions of delegation, separation of duties, and other features of access control. A complexity analysis is also necessary to study the efficiency of the framework.
  • 132. Part IV
    Results Achieved
  • 133. Chapter 8
    Cryptographic Protocols Refutation
    The work on the refutation of cryptographic protocols in the case of a finite number of messages exchanged by honest participants is at the core of my research. I consider in this chapter the classical part dealing with the refutation of trace-based security properties.
    8.1 Locality
    One could argue that all deduction systems for which it was proven that the satisfiability of a symbolic derivation is decidable have in common that the deduction system is local, i.e. is such that in the case of ground satisfiability it suffices to consider the ASDs in which only ground terms appearing in the HSD need to be deduced.
    We first define locality using the notations related to symbolic derivations. Then we present the definition of oracle deduction systems as given in [68] and later re-used in [69] and other papers. We give a short summary of the decidability proof in [68], with an emphasis on the common points with [69] and other works. Finally we discuss the actual importance of this notion.
    8.1.1 Locality
    The notion of locality was first defined in the first-order logic context by [118], and later refined for first-order entailment problems by [26, 25]. Before proceeding further let us recall this notion as it was originally introduced by [118], in the language of symbolic derivations.
    Definition 45. (Locality) A deduction system D is local if for every ground symbolic derivation Ch with Ch ≠ ∅ there exists (C, ϕ) ∈ Ch with Sub(Tr_{Ch ∘ϕ C}(C)) ⊆ Sub(Tr_{Ch ∘ϕ C}(Ch)).
  • 134. We note in the above definition that since Ch is ground there exists a ground substitution σ such that for every C ∈ Ch we have σ = Tr_{Ch ∘ϕ C}(Ch). The definition thus implies that there exists a finite set of terms T = Sub(σ) such that Ch ≠ ∅ implies that this set contains an ASD in which every state is instantiated by a term in T. This approach, i.e. locality w.r.t. a finite set of terms, is employed in [34] to provide new decision results for ground satisfiability problems. In parallel to that work and in collaboration with M. Kourjieh [134], I have also considered the notion of locality w.r.t. a well-founded simplification ordering, and proved that this notion implies the notion of locality as defined in [34]. Although our notion of locality is subsumed by the one of Bernat and Comon-Lundh, we believe it may be of practical interest given that it is often simpler to provide a well-founded simplification ordering on ground terms than to explicitly compute the finite set as in [34].
    8.1.2 Oracle Deduction Systems
    Let us now present an example usage of the notion of locality by giving the definition of oracle deduction systems given in [68]. At that time the analysis of cryptographic protocols was performed in the perfect cryptography model defined by Dolev and Yao in [106]. However we wanted to extend this model with additional deductions for two reasons:
    • First, and in collaboration with Laurent Vigneron, we had provided earlier a notion of oracle rules [77, 79] that turn the parallel executions of a protocol into additional deduction rules for the intruder. 
    We had a doubly-exponential time complexity of the analysis, but suspected that a singly-exponential algorithm existed;
    • Second, and in the context of the AVISS project, we had started to work on cryptographic protocols that relied on non-perfect cryptography by exploiting the properties of the exclusive-or or of the modular exponentiation.
    In collaboration with Ralf Küsters we have searched under which conditions it is possible to extend the deduction system modelling the attacker defined by Dolev and Yao to account for the oracle rules and the imperfect primitives. First let us describe the Dolev-Yao deduction system, and then we present the definition we ended up with.
    Dolev-Yao deduction system. The signature F_DY contains 3 symbols of arity 2, namely ⟨_, _⟩, encs(_, _) and decs(_, _), describing respectively the concatenation of two messages, the encryption of a message (its first argument) by a symmetric encryption algorithm where the key is the second message, and the converse operation of decryption. It also contains two projection symbols of arity 1, namely π1(_) and π2(_).
  • 135. All these symbols can be employed by any agent, and we have thus the following deduction rules F_D^p:
        Concatenation:  x, y → ⟨x, y⟩      x → π1(x)      x → π2(x)
        Encryption:     x, y → encs(x, y)   x, y → decs(x, y)
    The equational theory E_D contains the following relations:
        Concatenation:  π1(⟨x, y⟩) = x      π2(⟨x, y⟩) = y
        Encryption:     decs(encs(x, y), y) = x
    The deduction system DY = (F_D, F_D^p, E_D) describes the classical Dolev-Yao equational model with pairing and symmetric encryption.
    Oracle deduction systems. In [68] we have considered the extension of the Dolev-Yao deduction system DY with another deduction system Dg = (F_g, F_g^p, E_g) with F_g^p ∩ F_DY^p = ∅. We say that Dg is a guessing deduction system if the following condition holds: for every closed DY symbolic derivation C = (V, S, K, In, Out) with σ = Tr_C(C) a substitution in normal form, and for every deduction step i in Ind, with the corresponding equation V(i) =? f(V(i1), ..., V(ik)) in S, we say that i is a:
    • regular composition step if V(i)σ = f(V(i1)σ, ..., V(ik)σ) (the equality here is in the empty theory) and f ∈ P_D;
    • regular decomposition step if f ∈ P_D but V(i)σ ≠ f(V(i1)σ, ..., V(ik)σ);
    • guess decomposition step if V(i)σ is a strict subterm of one of the V(ij)σ for 1 ≤ j ≤ k;
    • guess composition step if every strict subterm of V(i)σ is a subterm of one of the V(ij)σ for 1 ≤ j ≤ k.
    An index i is a composition (resp. decomposition) step if it is either a regular composition (resp. regular decomposition) step or a guess composition (resp. guess decomposition) step. We finally say that the result of step ij is decomposed at step i if V(i) =? f(i1, ..., ik) is in S and V(i)σ is a strict subterm of V(ij)σ (footnote 1: see [68] for the exact definition, according to which ⟨a, b⟩ is not decomposed at step i if V(i) =? decs(V(j), V(k)) and σ maps V(j) to encs(a, ⟨a, b⟩) and V(k) to ⟨a, b⟩).
    Let ≺ be a well-founded simplification ordering on terms.
    Definition 46. (Oracle deduction systems) Let D be the union of DY with a guessing deduction system Dg. We say that Dg is an oracle deduction system if:
    1. D is local;
  • 136.
 2. Given t1, . . . , tn, t it is decidable whether t is deducible in one deduction step from t1, . . . , tn;
 3. If (C, ϕ) ∈ Ch with C = (V, S, K, In, Out) and σ = Tr_{C ◦ϕ Ch}(C) then there exists a couple (C′, ϕ) ∈ Ch with C′ = (V′, S′, K′, In′, Out′) and σ = Tr_{C′ ◦ϕ Ch}(C′) such that:
   • There exists a monotonically increasing mapping ψ from Ind to Ind′ such that V′(ψ(i))σ = V(i)σ;
   • In C′ the result of a guess composition step is never decomposed by a regular decomposition step;
 4. For every non atomic message u, there exists a normalized message ε(u) with ε(u) ≺ (u)↓ such that: For every ASD C = (V, S, K, In, Out) with (C, ϕ) ∈ Ch such that u is composed at step iu ∈ Ind, let J ⊂ Ind be the set of indices that correspond to oracle deduction steps. Then there exists (C′, ϕ) with C′ = (V′, S′, K′, In′, Out′) and (C″, ψ1) with C″ = (V″, S″, K″, In″, Out″) such that:
   • S′ = S \ S_{{iu}∪J} where S_{{iu}∪J} is the set of equations corresponding to deduction steps in {iu} ∪ J, In′ = In ∪ {iu} ∪ J and Ind′ = Ind, V′ = V, K′ = K, Out′ = Out;
   • C″ ◦ψ1 C′ ◦ϕ Ch is closed and S″ is satisfied by Tr_{C″ ◦ψ1 C′ ◦ϕ Ch}(C″);
   • Tr_{C″ ◦ψ1 C′ ◦ϕ Ch}(C′) = Tr_{C ◦ϕ Ch}(C)δ_{u,ε(u)}
Decidability result. Let us now sketch the proof of the decidability of the satisfiability problem for deduction systems which are the extension of DY by an oracle deduction system. Let Ch be an HSD and assume that Ch ≠ ∅. Our goal is to prove that there exists (C, ϕ) ∈ Ch such that σ = Tr_{Ch ◦ϕ C}(Ch) is bounded by a polynomial in the size of Ch. To obtain such a bound it suffices that every term in Sub(σ) is bound by σ in Sub(Ch), given that this implies that the number of terms in Sub(σ) is bounded (linearly) by the number of terms in Sub(Ch). The bound on σ shall be derived from this bound.
The proof proceeds as follows. Assuming that Ch ≠ ∅ we pick (C, ϕ) ∈ Ch and define σ = Tr_{C ◦ϕ Ch}(C ◦ϕ Ch).
Assuming that not every term in Sub(σ) is σ-bound in Sub(Ch), we let u ∈ Sub(σ) be a σ-free term in Sub(Ch). Our goal is to prove that there exists another couple (C′, ψ) ∈ Ch such that Tr_{C′ ◦ψ Ch}(Ch) = σδ_{u,ε(u)}. Since ε(u) ≺ u we also have σδ_{u,ε(u)} ≺ σ. Since the ordering ≺ is well-founded, every sequence of such replacements eventually terminates. The termination implies that the resulting trace τ must be such that every subterm t ∈ Sub(τ) is τ-bound in Sub(Ch).
Thus, let us prove that there exists another couple (C′, ψ) ∈ Ch such that Tr_{C′ ◦ψ Ch}(Ch) = σδ_{u,ε(u)}.
 • First some additional conditions are imposed on u to ensure that a variant of Lemma 4.24 is applicable in the considered equational theory. This
  • 137. ensures that replacing u with ε(u) yields a substitution σ′ that satisfies the unification system of Sh;
 • Then we prove that for every σ-free term u in Sub(σ) there exists a composition step iu in C in which u is deduced;
 • This permits us to employ the fourth point of the definition of oracle deduction systems to replace every oracle deduction step by a symbolic derivation also satisfied by σ′;
Keeping the notations of Definition 46, third point, it suffices to prove that the equations in S′ are also satisfied by σ′. To this end we note that the deductions remaining in C′ are regular deductions. Let us treat separately the equations corresponding to regular composition rules and those corresponding to regular decomposition rules:
Regular composition rules: By definition these equations are satisfied by σ in the empty theory. Assuming wlog that u is only deduced once, this term is σ-free in the set of equations corresponding to regular composition rules. Thus by Lemma 4.24 these equations are also satisfied by σδ_{u,ε(u)};
Regular decomposition rules: Since wlog we can assume that u is not the result of any decomposition rule, the only problematic case is when the equation associated to the regular decomposition step is of the form V(i) ≟ f(. . . , V(iu), . . .). One easily sees that for the equations in the Dolev-Yao deduction system, if u is not the decomposed term and the equation is satisfied by a substitution σ then it is satisfied by σδ_{u,ε(u)}. Thus it suffices to prove that one can assume that the result of a composition step is never decomposed in a subsequent regular decomposition step. This is ensured by the third point of the definition of oracle deduction systems if u is deduced by an oracle composition step, and a case analysis on the regular composition rules shows that decomposing the result of a composition always results in a stutter, and therefore can be eliminated.
Thus if Ch ≠ ∅ there exists an ASD C ∈ Ch such that every subterm of σ = Tr_{Ch ◦ϕ C}(Ch) is bounded by σ in Sub(Ch). It then suffices to prove:
 1. that it suffices to check a finite number of such substitutions;
 2. that, for a guessed substitution σ, one can decide whether (Ch σ) ≠ ∅. This latter problem is decidable because a) D is local by the first point of the definition of oracle deduction systems, and b) one-step ground deduction is decidable by the second point of the same definition.
8.1.3 On the importance of locality
As can be seen from the proof outlined in the above section, the only explicit use of locality is to prove that ground satisfiability problems are decidable. One
  • 138. can argue that the second point of the definition of oracle deduction systems is another locality condition or, more accurately, a saturation condition.
However we believe that such an argument is weak because a) the subterm relation employed is not the standard one, and b) the deduction system has been altered.
Changes in the subterm relation. When excluding the prefix oracle rules of [68], all other examples of oracle deduction systems rely on a re-definition of the subterm relation. The definition of subterms employed in [68, 69] is based on the factors w.r.t. the equational theory of Dg. In [68] this equational theory is the one of the bitwise exclusive-or ⊕ with equations:

\[
x \oplus y = y \oplus x \qquad x \oplus (y \oplus z) = (x \oplus y) \oplus z \qquad x \oplus x = 0 \qquad x \oplus 0 = x
\]

whereas in [69] the equational theory was the union of the one for multiplicative abelian groups:

\[
x \times y = y \times x \qquad x \times (y \times z) = (x \times y) \times z \qquad x \times \mathrm{inv}(x) = 1 \qquad x \times 1 = x
\]

and a simplified, decidable [130] set of equations modelling the modular exponentiation:

\[
\exp(x, 1) = x \qquad \exp(\exp(x, y), z) = \exp(x, y \times z)
\]

In both cases the terms whose root symbol belongs to the Dolev-Yao signature are free w.r.t. the considered equational theory.
Changes in the deduction system. Given that [68] defines a bitwise exclusive-or operation one would expect its deduction system to contain ⊕ and 0 as public symbols, and no other. However using this deduction system would not yield a local deduction system. For example if the attacker must deduce the term a1 ⊕ an after receiving the terms a1 ⊕ a2, a2 ⊕ a3, . . . , an−1 ⊕ an he has to compute all the intermediate sums, none of which are subterms of either a1 ⊕ an or of any of the ai ⊕ ai+1 for 1 ≤ i ≤ n − 1.
The trick employed in [68] consists in computing the transitive closure of the deduction system Dg.
That is, instead of denoting possible deductions with a public symbol we employ terms, and the equation associated to a step i in which a deduction using the term t is performed is V(i) ≟ tθ, where θ is a substitution mapping the variables of t to {V(1), . . . , V(i − 1)}. The computation of the transitive closure in practice implies that Dg contains an infinite number of public terms, which in turn implies that the second point of the definition of oracle deduction systems is not trivially met.
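The exclusive-or example above can be made concrete with a small sketch. It is an illustration only, not the construction of [68]: terms built with ⊕ over atoms are modelled as Python frozensets of atom names (so that symmetric difference implements ⊕ and the empty set plays the role of 0), and the hypothetical helper `xor_deducible` decides one-step deducibility in the closed system by Gaussian elimination over GF(2). All names below are assumptions introduced for the illustration.

```python
def xor_deducible(knowledge, goal):
    """True iff `goal` is the XOR of some subset of `knowledge`.

    Terms are frozensets of atom names; Gaussian elimination over GF(2)
    stands in for the (infinitely many) rules of the transitive closure.
    """
    atoms = sorted(set().union(goal, *knowledge))
    bit = {a: 1 << i for i, a in enumerate(atoms)}

    def encode(term):
        # Bit-vector encoding of a XOR term.
        return sum(bit[a] for a in term)

    basis = {}  # highest set bit -> basis vector with that leading bit

    def reduce_vec(v):
        # Cancel leading bits with the basis as long as possible.
        while v and (v.bit_length() - 1) in basis:
            v ^= basis[v.bit_length() - 1]
        return v

    for term in knowledge:
        v = reduce_vec(encode(term))
        if v:
            basis[v.bit_length() - 1] = v
    return reduce_vec(encode(goal)) == 0


n = 5
# The attacker receives a1 ⊕ a2, a2 ⊕ a3, ..., a4 ⊕ a5.
chain = [frozenset({f"a{i}", f"a{i + 1}"}) for i in range(1, n)]
goal = frozenset({"a1", f"a{n}"})      # a1 ⊕ a5
intermediate = chain[0] ^ chain[1]     # a1 ⊕ a3, needed with binary ⊕ only
```

With only the binary symbol ⊕ as a public symbol, the attacker has to go through terms such as `intermediate`, which is neither the goal nor one of the received messages; in the closed system the goal is obtained in a single deduction step, at the price of having to decide that step algorithmically.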
  • 139. Conclusion. The two changes, on the subterm relation and on the deduction system, that were performed to obtain decidability results are generic, and can be defined for every deduction system. In the next section we review how they can be applied to obtain combination algorithms for the modular resolution of D-satisfiability problems.
8.2 Combination of decision procedures
8.2.1 Presentation of the problem
As noted in the preceding section, the main ingredients of the extension of the Dolev-Yao deduction system are:
 1. the definition of a subterm relation based on the notion of factors;
 2. the computation of a transitive closure of the deduction system.
Besides these ingredients we needed the decidability of the ground satisfiability problems and a way (the last point of the definition of oracle rules) to reduce satisfiability problems to ground satisfiability ones. A natural question then arises:
   assuming the Dolev-Yao deduction system DY is extended with a deduction system Dg and that Dg-satisfiability problems are decidable, are (Dg ∪ DY)-satisfiability problems decidable?
Actually one could generalize, and wonder whether the Dolev-Yao deduction system plays a special rôle. This leads to the following problem:
   Symmetric combination problem: Assume that D1 and D2 are two deduction systems such that D1-satisfiability problems and D2-satisfiability problems are decidable. Are (D1 ∪ D2)-satisfiability problems decidable?
A second way to generalize is to investigate the conditions under which one can extend an arbitrary deduction system (instead of only the Dolev-Yao one) with another deduction system:
   Asymmetric combination problem: Assume that D1 and D2 are two deduction systems such that D1-satisfiability problems are decidable. Are (D1 ∪ D2)-satisfiability problems decidable?
I have considered these two problems in collaboration with M. 
Rusinowitch. We have given a solution to the symmetric combination problem in [70, 76], and a solution to the asymmetric combination problem in [71, 72]. We briefly present these results in the rest of this section.
  • 140. 8.2.2 Symmetric Combination problem
Background on the combination of equational theories
Background. There has been substantial work in the area of the combination of decision procedures for problems related to equational theories. But before describing the ones relevant to our work, let us first introduce some notations and definitions. We say that two equational theories are disjoint if they do not share any function symbol. A theory E is consistent if it has a model with more than one element or, equivalently, we do not have a =E b for two free constants a and b. Let E1 and E2 be two disjoint equational theories. We say that a term t is a pure E1-term (resp. E2-term) if it is built from function symbols in the signature of E1 (resp. E2) and variables. A term t is alien to E1 if its root symbol is a function symbol in E2 or a free constant. By definition of syntactic unification it is clear that terms alien to E1 are free (see the definition in Section 4.7.3, p. 71).
A result by Tidén [204] states that the combination of two disjoint consistent equational theories E1 and E2 is a conservative extension of both E1 and E2, i.e. for terms s, t built using the function symbols of the signature of E1 we have s =E1 t if, and only if, s =E1∪E2 t. This theorem justifies the purification procedure during which a (E1 ∪ E2)-unification system S is transformed into the union of two unification systems S1 and S2 in which Si is a Ei-unification system, for i ∈ {1, 2}. This procedure replaces each factor s of a term t by a variable xs and adds to S an equation xs ≟E1∪E2 s. It is clear that every unifier of S can be extended into a unifier of S1 ∪ S2. Conversely, the equations added impose that all the variables replacing a given term s have to be equal to the instance of s, which permits one to reconstruct a unifier of S from every unifier of S1 ∪ S2.
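The purification step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of the cited works: variables are Python strings, applications are tuples `(symbol, arg1, ..., argk)`, every symbol outside `sig1` is attributed to the second theory, and the fresh variable names `_v0`, `_v1`, ... are assumed not to occur in the input. The function name `purify` is hypothetical.

```python
import itertools


def purify(equations, sig1):
    """Split a list of equations (lhs, rhs) into two pure systems.

    Each alien subterm is replaced by a variable, and an equation
    'variable = alien subterm' is added to the system of the other theory.
    """
    fresh = (f"_v{i}" for i in itertools.count())
    names = {}               # alien subterm -> variable standing for it
    pure = {1: [], 2: []}

    def theory_of(term):
        if isinstance(term, str):        # variables belong to no theory
            return None
        return 1 if term[0] in sig1 else 2

    def abstract(term, home):
        # Return a pure `home`-term in which alien subterms are named.
        if isinstance(term, str):
            return term
        th = theory_of(term)
        if th != home:
            if term not in names:        # one variable per alien subterm
                names[term] = next(fresh)
                pure[th].append((names[term], abstract(term, th)))
            return names[term]
        return (term[0],) + tuple(abstract(a, home) for a in term[1:])

    for lhs, rhs in equations:
        home = theory_of(lhs) or theory_of(rhs) or 1
        pure[home].append((abstract(lhs, home), abstract(rhs, home)))
    return pure[1], pure[2]


# encs(a ⊕ b, k) ≟ y, with ⊕ in the first theory and encs in the second one.
s1, s2 = purify([(("encs", ("xor", "a", "b"), "k"), "y")], sig1={"xor"})
```

Here `s1` contains the single equation naming the factor a ⊕ b and `s2` contains the encryption equation in which that factor has been replaced by its name. Using one variable per alien subterm mirrors the requirement that all the variables replacing a given term s are equal.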
Given that E1 ∪ E2 is a conservative extension of each of the Ei one could expect that once S is split into S1 ∪ S2 it would suffice to compute unifiers modulo Ei of Si, for i ∈ {1, 2}, in order to compute unifiers of S. This logical step is however not sound for two reasons:
symbol clash: it may happen that the same variable x ∈ Var(S1) ∩ Var(S2) is instantiated differently by the unifiers σi of Si modulo Ei, for i ∈ {1, 2};
occur-check: it may happen that it is not possible to reconstruct a global solution from σ1 and σ2 because of a cycle. As a degenerate case consider the two unification systems {f(x) ≟ y} and {g(y) ≟ x} in the empty theory. Each has a solution but the union unification system {f(x) ≟ y, g(y) ≟ x} does not have one.
Deciding to compute a E1 (resp. E2) unifier σ1 (resp. σ2) of S1 ∪ S2 would be sound but incomplete, as each unifier would be computed assuming that the alien equations have to be true in the empty equational theory. For example when combining the equational theory of the bitwise exclusive-or ⊕ with another theory, every equation x ⊕ x ≟ 0 would appear as unsatisfiable (because of a root symbol clash) in the other equational theory.
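Both failure modes can be observed with a textbook syntactic unification procedure. The sketch below is a standard unification algorithm with occur-check in the empty theory, given here only as an illustration; variables are strings, applications are tuples `(symbol, arg1, ..., argk)`, constants are 1-tuples such as `("0",)`, and the names `walk`, `occurs` and `unify` are choices made for this example.

```python
def walk(term, subst):
    # Follow variable bindings until an unbound variable or an application.
    while isinstance(term, str) and term in subst:
        term = subst[term]
    return term


def occurs(var, term, subst):
    term = walk(term, subst)
    if isinstance(term, str):
        return term == var
    return any(occurs(var, arg, subst) for arg in term[1:])


def unify(equations):
    """Return a triangular unifier (dict) in the empty theory, or None."""
    subst, todo = {}, list(equations)
    while todo:
        s, t = todo.pop()
        s, t = walk(s, subst), walk(t, subst)
        if s == t:
            continue
        if isinstance(s, str):
            if occurs(s, t, subst):
                return None              # occur-check failure (cycle)
            subst[s] = t
        elif isinstance(t, str):
            todo.append((t, s))          # orient: variable on the left
        elif s[0] == t[0] and len(s) == len(t):
            todo.extend(zip(s[1:], t[1:]))
        else:
            return None                  # root symbol clash
    return subst


eq1 = (("f", "x"), "y")                  # f(x) ≟ y
eq2 = (("g", "y"), "x")                  # g(y) ≟ x
xor_eq = (("xor", "x", "x"), ("0",))     # x ⊕ x ≟ 0, read in the empty theory
```

Each of `eq1` and `eq2` is solvable alone, their union fails on the occur-check, and `xor_eq` fails on a root symbol clash although it is valid modulo the exclusive-or theory.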
  • 141. Combining unification or unifiability decision procedures for the disjoint union of equational theories means finding a way to compute a unifier of S1 ∪ S2 modulo E1 ∪ E2 from Ei-unifiers of Si, for i ∈ {1, 2}.
Difficulty of the combination of decision procedures. First, and in order to avoid symbol clashes, [191] introduces two non-deterministic steps:
 • first one non-deterministically identifies the variables that denote terms equal modulo E1 ∪ E2 once the (putative) unifier is applied;
 • then each variable x is assigned to one of the theories, say E1. When solving S2 modulo E2 this variable will be considered as a free constant.
These steps are justified as follows. Assuming the existence of a unifier σ in normal form of S1 ∪ S2, the algorithm chooses theory Ei for x if, and only if, the root symbol of xσ belongs to the functional signature of Ei. Whenever x, assigned to E1, occurs in S1 ∪ S2 as a variable of a E2-pure term t, we note that xσ is a subterm of tσ free in E2 and in normal form. Also all the factors of t are in normal form.
Thus when considering only the unification system S2 we can build from σ a pure unifier in E2 by applying Lemma 4.22, p. 72 to replace xσ in the terms of S2σ with a free constant c_{xσ}. The second step consists in applying this replacement before computing the unifier corresponding to σ in S2.
Finally one has to ensure that it is possible to reconstruct a unifier σ of S1 ∪ S2 from unifiers σ1 and σ2 of respectively S1 and S2 that have a disjoint domain (thanks to the assignment of each variable to a theory). Let us explain the solution on the example S1 = {f(x) ≟ y} and S2 = {g(y) ≟ x}. The first non-deterministic steps assign y to E1 and x to E2, and find two unifiers σ1 = {y → f(x)} and σ2 = {x → g(y)}.
Thus, in this example:
   the constant x occurs in the instance of the variable y while the constant y occurs in the instance of the variable x.
The differences in the combination methods proposed are differences in the treatment of this occur-check problem.
A solution for finitary equational theories. The first method was presented in the seminal work of Schmidt-Schauß [191] and relies on the existence of a constant elimination procedure. Such a procedure inputs a sequence of terms (ti)1≤i≤n and a sequence of free constants (cj)1≤j≤m and computes, whenever it exists, a most general set Σ of substitutions such that for all σ ∈ Σ, for all 1 ≤ i ≤ n, and for all 1 ≤ j ≤ m the term tiσ is equal to a term t′i in which the constant cj does not occur. The occur-check problem is avoided by choosing which variable occurs as a subterm of which other variable in a solution.
Assuming that each equational theory is finitary, one first computes a complete set of most general unifiers Σi for Si, for i ∈ {1, 2}. In order to respect the guessed ordering, a constant x cannot appear in the instance of a variable
  • 142. y. The constant elimination procedure is employed to eliminate all occurrences of constants that do not satisfy this requirement from the unifiers in Σi. The application of this procedure yields two sets of unifiers Σ′1 and Σ′2. For each couple (σ1, σ2) ∈ Σ′1 × Σ′2 one can reconstruct a unifier of S1 ∪ S2 by induction on the guessed ordering (see [191] for the complete proof). Thus we have the following theorem.
Theorem 8.1. (Schmidt-Schauß, [191]) Let E1 and E2 be two disjoint finitary equational theories that each has a constant elimination procedure. Then E1 ∪ E2 is a finitary equational theory that has a constant elimination procedure.
Extension to arbitrary equational theories. In order to employ the constant elimination procedure one needs first to compute a finite set of most general unifiers, which is not possible when the equational theory is infinitary or nullary. In the same chapter [191], Schmidt-Schauß has provided us with a way to handle such equational theories. The principle is simple, and consists in encoding the guessed subterm relation with extra equations in the empty theory. Instead of replacing a variable x assigned to the signature E1 by a constant in S2 one adds to S2 an equation x ≟ fx(y1, . . . , yk), where the yi are the variables assigned to E2 that shall be smaller than x in the guessed ordering, and fx is a newly introduced free function symbol. Lemma 4.22, p. 72 is again applicable, and the addition of these equations ensures that the unifiers of the extended unification systems can be combined.
Theorem 8.2. (Schmidt-Schauß, [191]) Let E1 and E2 be two disjoint equational theories that both have a decidable general unifiability problem. Then E1 ∪ E2 has a decidable general unifiability problem.
The presentation of Schmidt-Schauß’ results is heavily influenced by Baader and Schulz’s article [16], which has greatly simplified the presentation of [191]. They have also proposed another way to encode the guessed subterm relation, which consists in guessing a total (instead of partial) ordering on the variables of the problem. The linear constant restriction consists in restricting the admissible unifiers of a unification system to those in which a variable x is not instantiated by a term containing a constant y if x <lcr y.
Theorem 8.3. (Baader, Schulz, [16]) Let E1 and E2 be two disjoint equational theories that both have a decidable unifiability with linear constant restriction problem. Then E1 ∪ E2 has a decidable unifiability with linear constant restriction problem.
Combining disjoint deduction systems
Given that the satisfiability of a connection is defined w.r.t. the satisfiability of a unification system it seems at first glance that the results on the combination of decision procedures for unifiability are sufficient to obtain a procedure combining decision procedures for the satisfiability of symbolic derivations. There are
  • 143. however differences that need to be taken into account. First, if one abstracts the deductions of the attacker with contexts (terms in which all function symbols are public symbols), a procedure solving the satisfiability problem has to check whether there exist contexts such that a unification system is satisfiable. Since the HSD does not check whether the attacker performs the same actions at different times, this problem is a special case of second-order linear unification (see [109], p. 1043), which is decidable when the equational theory is empty ([109] refers to [108], but another available source is [143]).
In spite of the fact that the satisfiability of a symbolic derivation is akin to a linear second-order unification problem (as was presented by M. Baudet in his thesis [28]), an algorithm that combines decision procedures for second-order linear unification is not sufficient: applying one such algorithm to a (D1 ∪ D2)-satisfiability problem would not reduce it to D1- and D2-satisfiability problems but to D1- and D2-second-order linear unification problems. Such a transformation is not optimal since e.g. in the case of deduction systems for which the equational theory is convergent and subterm, the satisfiability and equivalence problems are decidable [27], but another special case of second-order linear unification is undecidable [12].
However we have successfully employed the recipes that are at the heart of the definition of oracle rules to derive a combination procedure for satisfiability problems. Let D1 = (F1, F1^p, E1) and D2 = (F2, F2^p, E2) be two disjoint deduction systems, i.e. such that F1 ∩ F2 = ∅. We also let ≺ be a simplification ordering on T(F1 ∪ F2, X), and assume that there exists a minimum term for ≺ which is a constant cmin ∈ Cnew.
First we redefine the subterm relation so that the maximal strict subterms of a term t whose root is a function symbol in Fi are its maximal subterms free in Ei, for i ∈ {1, 2}. Then we construct the transitive closures of the deduction systems D1 and D2. Unsurprisingly, the constructed deduction systems are local w.r.t. the redefined subterm relation. Assuming that the trace on the HSD is the substitution σ in normal form, Lemma 4.22 can be employed to replace σ-free subterms in Sub(Ch) with the constant cmin ∈ Cnew. By minimality of cmin every sequence of replacements of a free term by cmin terminates, and results in a substitution σ′ such that there exists a (D1 ∪ D2)-ASD C and a connection function ϕ such that (C, ϕ) ∈ Ch and σ′ = Tr_{Ch ◦ϕ C}(Ch).
Since every subterm of σ′ is bound by σ′ in Sub(Ch) we then partially guess a (D1 ∪ D2)-ASD with fewer than |Sub(Ch)| deduction steps as follows:
 • For each term t ∈ Sub(Ch) we guess to which signature the root symbol of (tσ′)↓ belongs;
 • For each deduction step we guess which term t ∈ Sub(Ch) binds the result of the deduction;
 • Also for each deduction step we guess which deduction system among D1 and D2 is employed to deduce t;
 • Finally we guess a connection ϕ between this ASD C and the HSD Ch, and let C′ = Ch ◦ϕ C.
  • 144. We check the soundness of the choices by turning the guessed deduction states (i.e. those that model the deductions of the attacker) of C′ into both input and output states, and by computing two HSDs C1 and C2 which are respectively D1- and D2-ASDs by deleting in Ci the deduction steps in C′ that originate from Ch but are not in Di.
The difficult part, detailed in [76], consists in proving that the equations induced by the choice of the binding term t in the second step are such that C1 and C2 are still HSDs (modulo the removal of some constants in Cnew). The separation of C′ into C1 and C2 requires a purification of the unification system of C′, which in turn requires either the addition of new function symbols if one wants to employ Theorem 8.2 or the guessing of a linear constant restriction constraint if one wants to employ Theorem 8.3. We have chosen the latter as it does not require one to change the signature. Using the notations of symbolic derivations, we have thus proven in [76] the following theorem.
Theorem 8.4. (Chevalier, Rusinowitch, [76]) If the ordered satisfiability problem is decidable for two deduction systems D1 = (F1, F1^p, E1) and D2 = (F2, F2^p, E2) then the ordered satisfiability problem is decidable for the deduction system D1 ∪ D2.
A version for extended deduction systems has also been proved in collaboration with D. Lugiez in [65].
Theorem 8.5. (Chevalier, Lugiez, Rusinowitch, [65]) If the ordered satisfiability problem is decidable for two extended deduction systems D1 = (F1, F1^p, E1) and D2 = (F2, F2^p, E2) then the ordered satisfiability problem is decidable for the extended deduction system D1 ∪ D2.
Note on the ground case. Let us assume Ch is a ground symbolic derivation. Then, reusing the notations of the above algorithm, for every term t ∈ Sub(Ch) we have tσ = t, and thus the first two steps of guessing can be performed deterministically.
Since every term of C′ is bound to a ground term so is every term in both C1 and C2. Thus ground reachability problems are also modular, a result not written but directly deducible from [70]. A more precise analysis performed in [11] actually shows that it is not necessary to guess the symbolic derivation C′: assuming the decidability of ground reachability in each of the deduction systems, the locality of the union of their transitive closures permits one to perform a least-fixpoint computation of the accessible subterms of Ch. This argument leads to the definition of a polynomial time combination procedure for the ground reachability problems.
Application: composition of cryptographic protocols. A secrecy goal of a cryptographic protocol can be encoded by adding an extra reception to the HSD representing this protocol in which it is tested whether the message received is the secret. Accordingly, a cryptographic protocol with secrecy goals can be represented by a finite set of HSDs, one of the secrecy goals being violated if, and only if, one of these HSDs is satisfiable.
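To give an idea of the least-fixpoint computation mentioned in the note on the ground case, here is a minimal sketch restricted to the Dolev-Yao system alone, in which a ground secrecy test amounts to asking whether the secret is deducible from the attacker's knowledge. It is the classical locality-based saturation over subterms, not the combination procedure of [11]; the term encoding (atoms as strings, `("pair", a, b)` and `("encs", message, key)` as tuples, all in normal form) and the names `subterms` and `dy_deducible` are assumptions made for the illustration.

```python
def subterms(term):
    # Enumerate a term together with all its subterms.
    yield term
    if isinstance(term, tuple):
        for arg in term[1:]:
            yield from subterms(arg)


def dy_deducible(knowledge, goal):
    """Least fixpoint of the Dolev-Yao rules, restricted to subterms."""
    universe = {s for t in list(knowledge) + [goal] for s in subterms(t)}
    known = set(knowledge)
    changed = True
    while changed:
        changed = False
        # Composition: build a subterm whose arguments are all known.
        for t in universe - known:
            if isinstance(t, tuple) and all(a in known for a in t[1:]):
                known.add(t)
                changed = True
        # Decomposition: projections, and decryption with a known key.
        for t in list(known):
            if not isinstance(t, tuple):
                continue
            if t[0] == "pair":
                new = set(t[1:])
            elif t[0] == "encs" and t[2] in known:
                new = {t[1]}
            else:
                new = set()
            if not new <= known:
                known |= new
                changed = True
    return goal in known


msg = ("encs", ("pair", "secret", "n"), ("pair", "ka", "kb"))
```

Every term added to `known` belongs to the finite set `universe`, so the loop terminates after at most as many productive rounds as there are subterms. In the example the key is itself a pair that the attacker has to compose before decrypting `msg`.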
  • 145. Assume that two finite sets of honest symbolic derivations each representing one cryptographic protocol with secrecy goals are defined over disjoint deduction systems D1 = (F1, F1^p, E1) and D2 = (F2, F2^p, E2). A composition with secrecy goal of these two protocols is defined by a set of connections between these symbolic derivations in which only one of the secrecy goals is selected. By Theorem 8.4, one of the compositions is satisfiable if, and only if, an HSD in the initial two sets of HSDs is satisfiable. In plain terms, there is a secrecy attack on the composition of the two cryptographic protocols if, and only if, there is a secrecy attack on one of these cryptographic protocols. This result was originally proved by Ciobaca and Cortier in [82] in the special case of HSDs in which the states are totally ordered. We note that the extension to extended deduction systems by using Theorem 8.5 is straightforward.
Note on the linear constant restrictions. Whether for any equational theory E the decidability of E-unifiability implies the decidability of E-unifiability with linear constant restriction is still an open problem. However we note that in our combination theorem we require more than the mere decidability of E-unifiability, and in some cases this extra assumption permits one to encode the linear constant restrictions into a satisfiability problem.
Let D = (F, F^p, E) be a deduction system. We say that D is complete if F^p = F. Let S be a E-unification system and x1 < . . . < xn be a linear constant restriction on the variables and constants of S. We note that S is satisfiable with the linear constant restriction < if, and only if, the D-HSD C_{S,<} constructed as follows is satisfiable:
 • First C_{S,<} consists in a sequence of length n of input and output states. The ith state in this sequence is either
   – both a knowledge state with associated equation V(i) ≟ xi and an output state if xi is a constant,
   – or an input state with the equation V(i) ≟ xi if xi is a variable;
 • Then C_{S,<} constructs all the terms occurring in S;
 • Finally we add, in addition to equations stemming from the knowledge and deduction steps, equations V(i) ≟ V(j) to model the equations in S.
Since the deduction system is complete the attacker can instantiate a variable xi by any ground term in which only the constants among {x1, . . . , xi−1} occur. It is then trivial that C_{S,<} is satisfiable if, and only if, S is satisfied by a substitution satisfying the linear constant restriction <.
Theorem 8.6. Let D be a complete deduction system with equational theory E. Then if D-satisfiability is decidable then E-unifiability with linear constant restrictions is decidable.
As a corollary we obtain the fact that for complete deduction systems one does not need to bother with linear constant restriction constraints.
  • 146. Corollary 8.1. Let D be a complete deduction system. If D-satisfiability problems are decidable then D-satisfiability with linear constant restriction problems are decidable.
In the future I plan to extend Theorem 8.6 to incomplete deduction systems. I believe that such a result would emphasize the relation existing between symbolic derivations and subterm ordering constraints.
8.2.3 Asymmetric Combination problem
Introduction
Let us recall the question we had concerning the extension of a deduction system that has a decidable satisfiability problem:
   Asymmetric combination problem: Assume that D1 and D2 are two deduction systems such that D1-satisfiability problems are decidable. Are (D1 ∪ D2)-satisfiability problems decidable?
Of course a consequence of the preceding section is that, when D2 and D1 are disjoint deduction systems, if the satisfiability problems with linear constant restrictions of both systems are decidable then the (D1 ∪ D2)-satisfiability problems are decidable. This means we shall examine the case in which the signatures of D1 and D2 are not disjoint, and thus without loss of generality the case in which:

\[
\left\{
\begin{array}{l}
D_1 = (F_1, F^p_1, E_1) \\
D_2 = (F_2, F^p_2, E_2) \\
F_1 \subseteq F_2 \\
E_1 \subseteq E_2 \\
F^p_1 \cap F^p_2 = \emptyset
\end{array}
\right.
\]

Hierarchical theories
This section summarizes the joint work with M. Rusinowitch presented in [71, 72]. The starting point is the observation, briefly mentioned in Section 8.1.2, that in the Dolev-Yao deduction system, composed terms never needed to be decomposed. In particular we had a distinction between “being decomposed” and “being employed in a regular decomposition step”. This distinction is justified by the fact that in the Dolev-Yao equational theory, the replacement of encs(b, c) by any term t′ in the term t = decs(encs(a, encs(b, c)), encs(b, c)) commutes with the normalization of t.
However we also note that encs(b, c) is not a free term in the Dolev-Yao equational theory, and thus Lemma 4.22 cannot be employed as is to obtain a pumping lemma authorizing the replacement of a free term with a smaller term.
The difficulty in that work consists in finding a criterion such that:
 • the possibility of replacing a subterm is dependent on its position in a larger term t;
  • 147.
 • in order to be able to use a variant of Lemma 4.22 we have to define normal forms, and therefore have to provide a criterion which is preserved when computing the o-completion of an equational theory E.
Let us look more closely at the symmetric encryption part of the Dolev-Yao equational theory to obtain more hints of what could or could not work. Besides two infinite sets of free constants and of variables we have two binary function symbols such that:

\[
\forall x, \forall y,\quad \mathrm{decs}(\mathrm{encs}(x, y), y) = x
\]

It is left to the reader to prove that this equational theory is convergent, and thus is equal to its o-completion. Let us explore the possibilities of defining a criterion that would ensure that a term t can be replaced in a term s. A first idea consists in looking at the equational theory, and in making the hypothesis that when a term t is:
 • in normal form, and
 • if t = encs(t′, t″) for some terms t′, t″ and t does not occur at a position p · 1 in the term s with s|p = decs(t, t″)
then t can be replaced by any term at the position p in s. This is however not correct, as demonstrated by the counter-example:

\[
t = \mathrm{encs}(t', t'') \qquad s = \mathrm{decs}(\mathrm{decs}(\mathrm{encs}(t, a), a), t'')
\]

This “decomposition from above” phenomenon cannot be discarded given that it is the essence of the application of deduction rules on terms. Let us label with 2 the positions p such that there may exist a context such that, after a sequence of ordered rewritings of the term, the replacement of the subterm at position p does not commute with the application of an ordered rewriting rule. Let us also label with 1 the positions for which this cannot occur. We have:
 • the “key” positions, i.e. 
those of the form p · 2 for some p, can safely be labelled with 1: the replacement of all the occurrences of a term t at a key position by the same term u commutes with any ordered rewriting step;
 • in a non-key position, the positions 1 · 1 and 1 · 1 · 1 in the term s above show that if the function employed is encs(_, _) or decs(_, _) a replacement of the term may not commute with an ordered rewriting step.
We formalize this notion of “bad position” with a notion of mode that aims at capturing the positions in which the addition of the equations in E2 \ E1 may lead to additional rewritings of the terms.
E2 is a conservative extension of E1: in order to impose that the equality relation between pure E1 terms is left unchanged by the addition of the equations in E2 \ E1 we impose that:

\[
\left\{
\begin{array}{l}
\text{all function symbols in } F_1 \text{ are of mode 1} \\
\text{all function symbols in } F_2 \text{ are of mode 2} \\
\text{all the equalities in } E_2 \setminus E_1 \text{ are among terms whose root is of mode 1}
\end{array}
\right.
\]
  • 148. Preservation by o-completion: in order to preserve the type discipline on the ordered completion of the theory:
• we extend the mode to variables, which can be of mode 1 or mode 2;
• we require that the arguments of function symbols also have a mode.
In the following we assume that there exists a mode function m(·, ·) such that m(f, i) is defined for every symbol f ∈ F2 of arity n and every integer i such that 1 ≤ i ≤ n. For all f, i we have m(f, i) ∈ {1, 2} and for all f ∈ F1 and for all i, m(f, i) = 1. We partition the set X into two denumerable sets X1 ∪ X2. For all f ∈ F2 ∪ X we define a function that gives the signature Sig(f) to which a symbol belongs:
Sig : F ∪ X ∪ C → {0, 1, 2}
Sig(f) = i if f ∈ Fi ∪ Xi for i ∈ {1, 2}
Sig(f) = 0 otherwise, i.e. when f is a free constant
The function Sig is extended to terms by taking Sig(t) = Sig(top(t)) where top(t) is the function symbol at the root of t.
A position p · i in a term t is well-moded if Sig(t|p·i) = m(top(t|p), i). In other words the position in a term is well-moded if the subterm at that position is of the expected type w.r.t. the function symbol immediately above it. If a non-root position of t is not well-moded we say it is ill-moded in t. Note also that by definition every free constant is in an ill-moded position. A term is well-moded if all its non-root positions are well-moded. An equational theory (F, E) is well-moded if for all equations u = v in E the terms u and v are well-moded and Sig(u) = Sig(v).
One can prove that if an equational theory is well-moded then its completion is also well-moded [72]. We have tailored the notion of mode so that, in a well-moded equational theory E, every ill-moded term in normal form can be replaced by an arbitrary term (Lemma 8 in [72]), thereby regaining a notion of free term in the equational theory.
The notion of local extension of the deduction system is more difficult to obtain. On the one hand Hypothesis 1, p.
366 in [72] permits one to obtain the locality of the deduction system on ground terms. In contrast with the result on the combination of disjoint deduction systems this result is not sufficient, given that one has to guess the attacker deductions in D2 before resolving the D1-satisfiability problems. Also we have to be able to solve the E2-specific equations before solving the pure E1-unification system. These considerations led us to the addition of several hypotheses (quoted here from [72]):
Hypothesis 1: If E →_{S2} E, r →_{S2} E, r, t and r ∉ Sub(E, t) ∪ C_spe then there is a set of terms F such that E →*_{S1} F →_{S2} F, t.
  • 149. Hypothesis 2: For all terms s ∈ S1, for all substitutions τ such that (X2 ∩ Var(s))τ is a set of ground terms, and for all ground terms t there is at most one ground substitution σ such that sτσ =_H t, and this substitution can be computed.
Hypothesis 3: The equational theory (F, E) is reducible to (F1, E1).
These hypotheses may not be optimal, but:
• first we assume that D2 contains only a finite number of symbols, and thus that a deduction of D2 can be guessed;
• second we assume that pattern-matching (hypothesis 2 in [72], employed when considering ground satisfiability problems) or unification (hypothesis 3 in [72], employed when considering generic satisfiability problems) can be reduced to pattern-matching or unification in E1.
We then obtain the following theorems. Since we allow the computation of a transitive closure, F^p (and decorations thereof) denotes in these theorems a set of terms.
Theorem 8.7. (Extension of ground satisfiability problems) If:
• F2^p is finite;
• the D1-ground satisfiability problem is decidable;
• the E2-word problem is decidable;
• Hypotheses 1 and 2 are satisfied.
Then the D2-ground satisfiability problem is decidable.
Theorem 8.8. (Extension of satisfiability problems) If:
• F2^p is finite;
• the D1-ordered satisfiability problem is decidable;
• Hypotheses 1 and 3 are satisfied.
Then the D2-ordered satisfiability problem is decidable.
Extension of the mode to extended deduction systems. Retaining the main ingredients of the reduction from the decidability of D2-satisfiability problems to the decidability of the D1-satisfiability problem we conjecture that the same reduction can be provided for extended deduction systems if:
• An extended deduction of (tσ)↓ from (t1σ)↓, . . . , (tnσ)↓ for every ground substitution σ in normal form must also satisfy that all the terms t, t1, . . . , tn are pure F1- or F2-terms, and:
  • 150. – either all the terms are pure F1-terms, and the rule is in F1^p;
– or t is a pure F2-term, and the rule is in F2^p.
• the equational theory satisfies hypothesis 3;
• the deduction system satisfies hypothesis 1;
• there is only a finite number of rules in F2^p.
Then D2-satisfiability problems can be reduced to D1-satisfiability problems.
We note that this conjecture is actually needed to obtain the decidability result obtained in [57]. Though I believe the proof does not contain any difficulty it can still be counted as a future research direction.
8.3 Saturation-based decision procedures
8.3.1 A special case of asymmetric combination
Let us consider the case in which F1 = ∅ and thus D1 is empty. Theorem 8.8 in this case gives a decidability criterion for satisfiability problems. We thus have the following theorem.
Theorem 8.9. (Decidable class of satisfiability problems) Let D = (F, F^p, E) be a deduction system such that:
• F^p is finite;
• D is local;
• E-unification is finitary.
Then the D-satisfiability problem is decidable.
However Theorem 8.9 is in most cases of little use given that it actually requires the locality w.r.t. a subterm relation such that Lemma 4.22, p. 72 can be applied on every free subterm of a given term. Thus, in the research direction that has eventually led to our interest in saturated sets of clauses in first-order logic, I have worked with Mounira Kourjieh on the practical definition of saturated deduction systems as well as on subclasses having a decidable satisfiability problem.
I present in Section 8.3.2 the original motivation of our analysis of saturated deduction systems. Then in Section 8.3.3 I present the decidability and undecidability results obtained for saturated deduction systems.
  • 151. 8.3.2 Motivation
When Mounira Kourjieh began her thesis work under my supervision, there was a lot of research focusing on the relation between concrete and symbolic models of cryptographic protocols. This research focused more precisely on the conditions to impose on the concrete cryptographic primitives that ensure the existence of a symbolic model so that a protocol valid in the symbolic model is valid in the concrete model. The techniques developed in this area are however of little help when one wants to prove that, under some additional constraints, a cryptographic protocol is flawed.
Furthermore, some now well-known flaws in existing cryptographic primitives were uncovered:
• There was a sequence of articles describing meaningful attacks on cryptographic protocols based on the collision attacks on MD5 described in [211, 142]: computation of forged X.509 certificates [199], of meaningful PostScript documents having the same MD5 image [93], etc.
• Also some theoretical works [212, 210] showed collision computations on the SHA-0 and SHA-1 hash functions, which were then thought to be robust.
A practical problem was thus, given an existing cryptographic protocol that employs one of these hash functions, to determine whether these attacks directly lead to secrecy, authentication, or any other high-level flaws.
Another similar vulnerability, but on digital signature algorithms, has been known since [37]. In a multi-user setting, even assuming the strongest (existential unforgeability) security on the signature algorithm, it is possible to create a key that appears to have been employed to create a known message/digital signature pair. This Duplicate Signature Key Selection attack was employed in [20] to construct an unknown key share attack on a cryptographic protocol.
This attack only relies on the fact that every agent creates his own signature keys, instead of having a trusted library generating and storing them, and therefore affects most of the standard signature schemes, including RSA, Rabin, ElGamal, DSA and ECDSA (see [37], Section 4, with a possible, though costly, mitigation for ECDSA presented in [127]).
We have stated earlier that relating a concrete cryptographic model to a symbolic one is difficult given that in the former the impossibility of a computation is assumed while the latter assumes the finite description of all possible computations. This difficulty turns into an advantage when one considers flaws in cryptographic primitives, as they are expressed by the existence, in the concrete setting, of a tractable function. Even when this function only has a non-negligible probability of computing the desired result, it can be modeled in a deduction system by an over-approximation that always yields the desired outcome. Thus, taking into account the flaws of existing cryptographic primitives during the refutation of cryptographic protocols is easy enough: it suffices to add new public symbols describing the concrete algorithms employed, and to relate the application of these functions to other messages by adding equations
  • 152. to the equational theory. In the next section we present how in collaboration with Mounira Kourjieh we have extended deduction systems to take into account cryptographic primitives’ vulnerabilities in a symbolic model.
8.3.3 Results obtained
Collisions. We have considered a slight overapproximation of the known techniques employed to compute collisions. Given that the MD5 algorithm computes online the hash of a message, if two messages m and m′ have the same hash value, then for every message m″ the messages m · m″ and m′ · m″ will have the same hash value. Accordingly the collision-finding algorithm starts from two arbitrary messages m1 and m2, and computes two prefixes p1 and p2 such that p1 · m1 and p2 · m2 have the same hash value. An attacker employing this algorithm can thus compute, given two messages m · m1 and m · m2, two messages m · p1 · m1 and m · p2 · m2 that have the same hash value. We have chosen, for more flexibility, to allow the two prefixes to differ. I.e., given two messages m1 · m1′ and m2 · m2′ the intruder can compute p1, p2 such that:
h(m1 · p1 · m1′) = h(m2 · p2 · m2′)
We let f1 (resp. f2) be the public function symbols modeling the computation of p1 (resp. p2) from m1, m1′, m2, m2′.
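The extension property used above (two colliding messages still collide after a common suffix is appended) can be observed on a toy iterated hash. The sketch below is an illustration under strong assumptions: `compress` truncates MD5 to 16 bits so that a collision can be found by brute force, and `toy_hash` omits the length padding of the real MD5, so the property is only shown for messages made of whole blocks. None of these names come from the text.

```python
import hashlib
from itertools import count

BLOCK = 4  # bytes per block (toy value)

def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression function: MD5 truncated to 2 bytes (insecure on purpose)."""
    return hashlib.md5(state + block).digest()[:2]

def toy_hash(msg: bytes) -> bytes:
    """Iterated hash without length padding; len(msg) must be a multiple of BLOCK."""
    assert len(msg) % BLOCK == 0
    state = b"\x00\x00"
    for i in range(0, len(msg), BLOCK):
        state = compress(state, msg[i:i + BLOCK])
    return state

def find_collision():
    """Brute-force two distinct one-block messages with the same toy hash."""
    seen = {}
    for n in count():
        m = n.to_bytes(BLOCK, "big")
        h = toy_hash(m)
        if h in seen:
            return seen[h], m
        seen[h] = m

m, m_prime = find_collision()
suffix = b"any suffix!!"  # 12 bytes, i.e. three whole blocks

print(m != m_prime, toy_hash(m) == toy_hash(m_prime))
print(toy_hash(m + suffix) == toy_hash(m_prime + suffix))
```

Because the state after m and after m′ is identical, every later block is processed from the same state, which is why the symbolic model can treat the collision as stable under concatenation on the right.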
The collision is modeled by the equation:
∀m1, m1′, m2, m2′, h(m1 · f1(m1, m1′, m2, m2′) · m1′) = h(m2 · f2(m1, m1′, m2, m2′) · m2′)
This equation depends upon the properties of the concatenation ·, which is associative and has the neutral element ε (the empty word):
x · (y · z) = (x · y) · z
x · ε = x
ε · x = x
The operations available to the attacker are modeled by making public h, denoting the application of a hash function, and the concatenation symbol ·, and by the two extended deductions:
x · y → x
x · y → y
We then employ the generalization of the hierarchical combination to extended deduction systems to reduce the whole satisfiability problem to one in which the equation:
h(m1 · f1(m1, m1′, m2, m2′) · m1′) = h(m2 · f2(m1, m1′, m2, m2′) · m2′)
is removed. Then since f1, f2 are free symbols w.r.t. the equational theory of the concatenation we employ the combination result on disjoint deduction systems to reduce the problem to the satisfiability problems of the free f1 and f2 symbols on
  • 153. the one hand, and of the concatenation on the other hand. The decidability of the former is trivial. The decidability of the latter is a consequence of the fact that it suffices to guess which free constants occur in the instance of a variable, and thus of the fact that unifiability with linear constant restrictions is decidable for the associative equational theory [193].
Duplicate Signature Key Selection. The subsequent work on the modelling of the Duplicate Signature Key Selection (DSKS) property was along the same line. The computation of a digital signature key pair is modeled by two public function symbols v′ and s′ (standing respectively for the computation of the validation and the signature keys) and with the addition of an equation:
valid(x, sign(x)_{s(y)}, v′(x, sign(x)_y)) = true
to the equations modeling that v, s and v′, s′ model validation/signature key pairs:
valid(x, sign(x)_{s(y)}, v(y)) = true
valid(x, sign(x)_{s′(y1, y2)}, v′(y1, y2)) = true
All the function symbols but s, v are public. The decidability of satisfiability problems for this deduction system was presented in [58] and relies on the computation of a saturated deduction system, i.e. a deduction system in which deductions are modeled by terms instead of symbols, and such that the result of a composition (i.e. a deduction whose result is not a subterm of the messages in the input) is never decomposed (we refer to [58] for the exact definitions and proofs). This work has in our view emphasized the importance of the notion of saturation, given that finite saturated deduction systems automatically satisfy the first two points of Theorem 8.9 but w.r.t. the standard subterm relation, and the last point is normally a pre-requisite for the saturation.
Saturated Deduction Systems. As is the case of ground entailment in first-order logic, saturated deduction systems always have a decidable ground satisfiability problem [134].
The natural question is then whether this result can be lifted to satisfiability problems, i.e. to determine whether satisfiability problems are decidable for saturated deduction systems and, when this is not the case, to give minimal restrictions entailing the decidability of satisfiability problems.
It turned out that the answer to the first question is negative: we have provided the encoding of the runs of a deterministic Turing machine such that the attacker can compute a message m (encoding the halt in an accepting state of the Turing machine) if, and only if, he can compute an accepting run of the Turing machine. Applying this result on the encoding of a universal Turing machine thus yields the undecidability of the satisfiability problem for saturated deduction systems.
We have nonetheless provided a criterion that ensures decidability, which is based on the structure of the terms in the saturated deduction system. It is in nature similar to the definition of S⁺ (Definition 3.17 in [18]):
  • 154. Definition 47. (Class S⁺, [18], p. 1807) A clause set S belongs to S⁺ if for all clauses C in S and all literals L in C:
1. if t is a functional term occurring in L then Var(t) = Var(C);
2. |Var(L)| ≤ 1 or Var(L) = Var(C).
While our criterion lacks the simplicity of the class S⁺ it is tailored to ensure that every sequence of unification between literals of the clauses in a local derivation eventually terminates. This guarantee is provided by imposing, intuitively, that guessing the application of a saturated deduction rule will either strictly decrease the number of variables in the unification system of a symbolic derivation representing partially the deductions of the intruder, or will not instantiate the terms in this unification system prior to the guess of the deduction. Accordingly we call the saturated deduction systems meeting these restrictions contracting. We refer the reader to [134] for the exact definition and proofs.
8.4 Research Directions
My work on the refutation of cryptographic protocols led me to two different research directions:
• first, the importance of saturation leads to the analysis of saturated deduction systems in the more general setting of sets of clauses, instead of just sets of Horn clauses, which would be the natural generalization of deductions. We have already presented some preliminary results in Section 5.2, p. 81;
• second, there is a more complex asymmetry issue related to deduction systems. While the saturation of deduction systems enables us to derive decidability results, they are unsatisfactory since these results are consequences of the decidability of more complex problems, and thus saturation does not permit one to obtain fine decidability results for the satisfiability problems.
In order to make the second point clear, let us consider subterm deduction systems, i.e. deduction systems such that the equational theory is subterm convergent.
It is known that:
• a variant of saturation [134] always terminates on subterm deduction systems, but the resulting deduction systems are not contracting;
• the decidability of satisfiability for subterm deduction systems relies heavily on the fact that initially, all the terms in the knowledge of the intruder are ground;
• general constraints, i.e. those for which the initial knowledge is not ground, are undecidable in general for subterm deduction systems.
  • 155. Thus, while saturation may help one in deriving new decidability results for the satisfiability problem, we believe that more attention should be paid to the structure of these problems.
Example 28. In particular I think the combination result of [70] gives us a more abstract characterization of satisfiability problems as the natural generalization of reachability problems for infinite state transition systems. To establish this assume one is given an infinite-state transition system as follows:
• a fixed initial state, modeled by a term t0;
• a finite set of transitions of the form τ : s → s′, such that there exists a transition from a state t to a state t′ if there exists a ground substitution σ such that sσ = t and s′σ = t′;
• the set of goal states is the set of all ground instances s_f σ of a term s_f.
The combination result of [70] implies that to modularly decide reachability for such transition systems one needs to solve ordered satisfiability problems for the deduction system defined with:
• the unary public symbols f_τ;
• the (convergent) equational theory f_τ(s) = s′ for every transition τ.
A similar remark was also described in [48], where instead of reachability problems the authors consider proofs with holes, i.e. proofs in which parts have been erased. That remark may be more natural, given that the erasure of some deductions is exactly what happens when one tries to modularly prove a theorem.
Example 29. Consider a set of clauses S = {C1, . . . , Cn}. By turning the predicate symbols into function symbols, introducing a multiset operator + that has the following properties:
x + (y + z) = (x + y) + z
x + y = y + x
x + 0 = x
and one unary function symbol neg, one can encode the clauses C1, . . . , Cn as terms t1, . . . , tn, the empty clause being encoded with the term 0.
Let us add twopublic function symbols f and r of respective arity 1 and 2, with the equations: f (x + x + y) = f (x + y) r(x + y, neg(x) + z) = y+zFinally, consider the equational theory ES constructed as follows, with a newconstant : n ES = ti = i=1
  • 156. The completeness and correctness of resolution imply that the set S is unsatisfiable if, and only if, for the following symbolic derivation:
C = ({1, 2}, {1 → x, 2 → y}, {x =? , y =? 0}, { , 0}, {2}, {1})
we have C = ∅.
This encoding may seem unnecessary given that we have merely moved the difficulty of deciding whether a given set of clauses is unsatisfiable into the equational theory. However having a uniform framework to reason on terms, atoms, clauses and deductions provides in my view a theoretical basis for “demodulation across argument and literal boundaries,” a research problem posed by [217].
8.5 Conclusion
I have summarized in this chapter a large part of my research since I started a Ph.D. In particular I have tried to emphasize the connections between the different problems I have considered, sometimes sacrificing the “unimportant” details that would have helped the reader not familiar with this work. In this form, however, this summary outlines the extent to which the results obtained are closely tied to basic or standard results in first-order logic.
While reachability or proof finding problems can be analyzed in isolation, it seems more rewarding to obtain composable decidability results. I believe that to obtain this modularity, decidability results have to be obtained on the (ground) satisfiability problems for deduction systems, and not only on reachability problems or proof finding problems. As a consequence I believe that the satisfiability problems we have considered hitherto only in the context of cryptographic protocol refutation should actually be considered as interesting objects of analysis in themselves, instead of just by-products of cryptographic protocol refutation.
  • 157. Chapter 9
Web Services Orchestration Choreography
I present in this chapter my work on the synthesis of Web Services that was made in collaboration with Tigran Avanesov, M. Anis Mekki, M. Rusinowitch, and M. Turuani. Instead of presenting a series of articles, I have taken the summary of these works written in Deliverable D3.1 of the Avantssar project.
9.1 Trace-based Synthesis of an Orchestration
This section is a summary of the work done in collaboration with M. Anis Mekki and M. Rusinowitch on the synthesis of services.
9.1.1 Introduction
Automatic composition of web services is a challenging task. Many works have considered simplified automata models that abstract away from the structure of the messages exchanged by the services. For the domain of security services (such as digital signing or time stamping), we propose in this section an approach to automated composition of services based on their security policies. The approach amounts to collecting the constraints on messages, parameters and control flow from the component services and the goal service requirements. A constraint solver checks the feasibility of the composition, possibly adapting the message structure while preserving the semantics, and displays the service composition as a message sequence chart (MSC). From the resulting MSC, we automatically extract the resulting composed service and translate it back to ASLan (using Trace2ASLan, one of the modules of the Avantssar platform). The composed service can then be verified automatically, using the Avantssar platform, to ensure that it cannot be subject to active attacks from intruders. The approach is fully automatic and we show on an Avantssar case study, the
  • 158. Digital Contract Signing (DCS) [14], how it succeeds within seconds in deriving a composed service that is currently proposed as a product by the OpenTrust Company.
Furthermore we propose to automatically generate a ready-to-deploy web archive, corresponding to a prudent implementation of the newly composed web service. (Footnote 1: Currently we really generate these implementations in terms of ready-to-deploy web applications, invoking real services, but there is still some work to do before claiming we generate them in high compliance with Web Services Standards.)
Figure 9.1: Time stamping and archiving a digital signature. Message sequence chart between Client and Goal; the messages, in order, are:
signatureRequest(session(sid),certificate(name,ckey),contract(data))
signaturePolicy(session(sid),policy(footer))
signature(session(sid),SIGNATURE)
  SIGNATURE = signature(crypt(inv(ckey),apply(sha1,pair(data,footer))))
signatureResponse(session(sid),TIMESTAMP,ASSERTIONS)
  TIMESTAMP = timestamp(time,PROOF,#2,crypt(inv(#2),PROOF))
  PROOF = apply(md5,pair(time,apply(md5,SIGNATURE)))
Introductory example
Figure 9.1 illustrates a composition problem corresponding to the creation of a new service (described here by Goal) for appending a time stamp to a digital signature performed by a given partner (described here by Client) over some data (described here by data) and then submitting it together with the signed data and some other proofs for long-term conservation by an archiving third party. More precisely Goal should expect a first message from Client containing a session identifier sid, the Client’s certificate containing his identity and his public key ckey and finally the data he wishes to digitally sign. Goal should answer with a message containing the same session identifier and a footer value to be appended to the data before the client’s signature. This value aims to capture the fact that the Client acknowledges a certain charter (known by Goal)
  • 159. before using the service Goal. Indeed this is what Client is expected to send back to Goal. Goal should then append to the received digital signature (described by SIGNATURE) a time stamp (described by TIMESTAMP). The time stamp consists of a time value which is bound to the Client’s signature (through the use of an md5 hash) and signed by a trusted time stamper’s private key #2.
Goal should also include a certain number of assertions or proofs about its response message. ASSERTIONS is described below and consists of 4 assertions or judgements.
ASSERTIONS = ASSRT0,ASSRT1,ASSRT2,ASSRT3
ASSRT0 = assertion(cOCSPR,#0,crypt(inv(#0),cOCSPR))
cOCSPR = ocspr(name,ckey,time)
ASSRT1 = assertion(tsOCSPR,#0,crypt(inv(#0),tsOCSPR))
tsOCSPR = ocspr(#1,#2,time)
ASSRT2 = assertion(arcOCSPR,#0,crypt(inv(#0),arcOCSPR))
arcOCSPR = ocspr(#3,#4,time)
ASSRT3 = assertion(ARCH,#4,crypt(inv(#4),ARCH))
ARCH = archived(session(sid),certificate(name,ckey),contract(data),SIGNATURE,TIMESTAMP,ASSRT0,ASSRT1)
#0 in trustedCAKeys
pair(#1,#2) in trustedTSs
pair(#3,#4) in trustedARs
For example ASSRT0 is a judgement made about the validity of the Client’s certificate at the time time and signed by a certification authority trusted by Client. This trust relation is modelled by the fact that the public key of the certification authority is in the set trustedCAKeys representing the public keys of the certification authorities trusted by Client. ASSRT1 and ASSRT2 represent similar judgements made about the certificates of the time stamper and archiving service used, and signed by the same trusted certification authority.
On the other hand ASSRT3 models the fact that the data to be signed by Client, its digital signature together with a time stamp and all the proofs obtained for the different involved certificates have been successfully archived by an archiving third party which is in addition trusted by Client for this task: here also this trust relationship is modelled by the constraint pair(#3,#4) in trustedARs.
Finally the use of dotted communication lines in Figure 9.1 refers to additional constraints on the communication channels used by Client and Goal: in our example this turns out to be a transport constraint requiring the use of SSL. We can express this constraint in our model by requiring that the concerned messages are ciphered by a symmetric key previously shared between both participants (the key establishment phase is not handled by the composed service).
In order to satisfy the requests of Client, Goal relies on a community of available services ranging from time stampers and archiving third parties to certification authorities.
These services are also given by their interface, i.e. the description of the precise message patterns they accept and those they provide in response. For
  • 160. instance Figure 9.2 describes a certification authority CA capable of providing two sorts of answers when asked about the validity of a certificate: one is OCSP-based (i.e. based on the Online Certificate Status Protocol) and returns a proof containing a real-time time-bound for the validity of a given certificate; while the second only provides the classical Certificate Revocation List CRL. Intuitively by inspecting the composition problem one can think that to satisfy the Client request the first (OCSP-based) mode should always be employed with CA (provided it is also trusted by the Client). One can also deduce that some adaptation should be employed over the Client’s messages to obtain the right message patterns (possibly containing assertions) from the community (for example the use of the flag OCSP with CA).
Figure 9.2: Available services: Certification Authority. Message sequence chart between Any Service and CA; inside a loop, the messages, in order, are:
CVRequest(mode)
alt [mode = OCSP]
  certificate(name,key)
  assertion(OCSPR,cakey,crypt(inv(cakey),OCSPR))
    OCSPR = ocspr(name,key,time)
alt [mode = CRL]
  currentCRL(crl)
The solution we propose computes, whenever it is possible, the sequence of calls to the service community, possibly interleaved with adaptations over the already received messages, that permits one to satisfy the Client’s requests as specified in the composition problem.
The remainder of this chapter is organised as follows: in Section 9.1.2, we present our model for web services and we formally state the composition problem and its solution. In Section 9.1.3, we present our ongoing work on the
  • 161. synthesis of a ready-to-deploy prudent implementation of the newly obtained composed service. In Section 9.1.4, we present our work on translating the formal description of the mediator of the obtained composed service to ASLan in order to permit its validation against regular security properties. We conclude in Section 9.1.5.
9.1.2 Mediator synthesis
A web service is in a standard way described in terms of the interface it presents to the outside world (the possible clients) using the WSDL [187] language. This description is structured into ports, each proposing a set of available operations. An operation is then defined by its in-bound and out-bound message patterns; these patterns are usually described using the XSD [203] language and reflect the XML message structure. Security constraints can then be defined on top of the service interface description using WS-Security [172] annotations. Such annotations can occur at any level in the WSDL, binding the level at which they occur to the security constraints they carry. They range from the service to the message level and typical examples are an SSL transport requirement for the whole service or the need to cipher or digitally sign a certain part inside a message pattern (in-bound or out-bound to some operation). We note that the use of XSD for the description of message patterns permits the use of the XPATH [215] language to write the queries identifying parts inside these message patterns, which simplifies the writing of message-level security constraints. We put the focus on SOAP-based (as opposed to RESTful) web services. These services rely on the SOAP [87] protocol that encapsulates the messages described in the WSDL specification of the service.
We claim that after (automated) analysis we can collect from the different specification files the descriptions of the different message patterns in-bound and out-bound to all the operations of the service and corresponding to the messages really exchanged by the service (SOAP encapsulation included). These descriptions are discussed below.
Representation of messages and security constraints
We aim to represent a significant fragment of XML messages as described by the XSD language using first-order terms defined over a signature given below. The fragment we address corresponds to XML elements described by sequential complex types, i.e. elements having an ordered and fixed-cardinality set of children. We also abstract away the attributes in XML messages. To represent XML messages we define the following signature:
F = {node_a^n, child_{i,a}^n | i ≤ a ∈ ℕ, n ∈ C} ∪ {scrypt, sdcrypt, crypt, dcrypt, sign, verif, inv, invtest, ⊤}
where the symbol node_a^n represents an XML node named n (ranging over a set of constants C) and having a children. For each symbol node_a^n we define the set of
  • 162. symbols child_{1,a}^n, . . . , child_{a,a}^n permitting to extract its children. In order to model security constraints holding over exchanged XML messages, we also represent the usual cryptographic primitives through the use of symbols: scrypt/sdcrypt for symmetric encryption and decryption, crypt/dcrypt for asymmetric encryption and decryption, sign/verif for digital signature and its verification, inv to denote key inverses and invtest permitting to test whether a pair of terms {t, t′} verifies t′ = inv(t). The constant ⊤ is the result of a successful test. We denote by Fp the set of public symbols and assume in the remainder of this chapter that Fp = F \ {inv}.
Some of the symbols represent the possible operations on the messages. Their semantics is defined with the following equational theory E_XML:
sdcrypt(scrypt(x, y), y) = x    (D_s)
dcrypt(crypt(x, y), inv(y)) = x    (D_as)
verif(x, sign(x, inv(y)), y) = ⊤    (S_v)
child_{i,a}^n(node_a^n(x1, . . . , xa)) = xi    (P_i^a)
invtest(x, inv(x)) = ⊤    (I_v)
Representation of services
We note that the WSDL specification of a web service does not specify any order of invocation for its operations but only gives their exhaustive list. Moreover this specification does not mention how the input parameters are related to the output parameters for a given operation. The BPEL [171] language allows reasoning about such properties by permitting first to specify a certain workflow logic for the service, and second to specify all the manipulations needed to construct the sent messages given the received ones. In this sense BPEL describes business processes which are structured workflows of activities ranging over invocation of web service operations, provision of web service operations or manipulation of messages.
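The equational theory for XML messages and cryptographic primitives given above can be read as a set of left-to-right rewrite rules. The sketch below is one possible encoding, assuming terms are nested Python tuples, that node names and arities are carried by the tuple itself rather than by indexed symbols, and that the success constant is represented by the hypothetical name `TOP`; it is an illustration, not the implementation used in the Avantssar platform.

```python
TOP = "top"  # hypothetical name for the constant returned by a successful test

def node(name, *children):
    return ("node", name, *children)

def child(i, t):
    return ("child", i, t)

def is_app(t, head):
    return isinstance(t, tuple) and t[0] == head

def normalize(t):
    """Innermost normalisation with the five rules of the theory."""
    if not isinstance(t, tuple):
        return t
    head, *args = t
    args = [normalize(a) for a in args]
    if head == "sdcrypt":                       # sdcrypt(scrypt(x, y), y) -> x
        c, k = args
        if is_app(c, "scrypt") and c[2] == k:
            return c[1]
    elif head == "dcrypt":                      # dcrypt(crypt(x, y), inv(y)) -> x
        c, k = args
        if is_app(c, "crypt") and k == ("inv", c[2]):
            return c[1]
    elif head == "verif":                       # verif(x, sign(x, inv(y)), y) -> TOP
        x, s, y = args
        if is_app(s, "sign") and s[1] == x and s[2] == ("inv", y):
            return TOP
    elif head == "child":                       # child_i(node(x1..xa)) -> xi
        i, n = args
        if is_app(n, "node") and 1 <= i <= len(n) - 2:
            return n[1 + i]
    elif head == "invtest":                     # invtest(x, inv(x)) -> TOP
        x, y = args
        if y == ("inv", x):
            return TOP
    return (head, *args)

# A two-child XML node carrying some data and its signature under inv(k).
msg = node("contract", "data", ("sign", "data", ("inv", "k")))
check = ("verif", child(1, msg), child(2, msg), "k")
print(normalize(check))
```

Extracting both children and verifying the signature normalises to the success constant, which is how a security constraint such as "this part must be signed" becomes a term equation.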
We assume that all the services we consider are also described by their respective BPEL specifications, and we focus only on services described by linear processes, i.e. sequences of activities. Therefore a service $S$ will be considered as a sequence of in-bound and out-bound messages, denoted respectively $RCV(m)$ and $SND(m)$, as described by the following grammar:

    P, Q :=           services
      0               null service
      RCV(m) · P      input message
      SND(m) · P      output message
      P ∥ Q           AC parallel composition

Parallel composition of services $S_1$ and $S_2$ is denoted by $S_1 \parallel S_2$. It is associative and commutative, and has a unit element 0, the null process. We consider a community to be the parallel composition of all its available services.
  • 163. Transition semantics. We introduce a transition semantics to define how services are executed in interaction with their environment, and in particular with clients. The state of a service $S$ can be viewed as the list of remaining operations it has to perform to end properly. For instance, the service in state $RCV(r) \cdot S$ waits for a message matching $r$ with a substitution $\sigma$ and then proceeds with $S\sigma$. The global configuration is a pair $(S, E)$ whose first component is the set of service states and whose second component is the set of messages that have been sent so far. The evolution of the global configuration is given by the transition rules:

$$\begin{array}{ll}
(RCV(r) \cdot S \parallel \dots,\ E \cup \{m\}) \to (S\sigma \parallel \dots,\ E \cup \{m\}) & \text{if } \exists \sigma,\ r\sigma = m \\
(SND(s) \cdot S \parallel \dots,\ E) \to (S \parallel \dots,\ E \cup \{s\}) & \\
(S, E) \to (S, E \cup \{m\}) & \text{if } E \vdash m
\end{array}$$

The reception of a message instantiates the variables in the receive pattern. This instantiation is applied to the variables remaining in the process that describes the service. A derivation is a sequence of transitions. We say that a service $T$ has ended in a derivation if it is reduced to the null process.

Web services composition problem

Composition goal. To answer the request of a client $C$ we often need a new service $T$, obtained as a composition of some of the services available in the community. We define the composition goal as the ordered list of messages that $C$ should receive from $T$ and that $T$ should receive from $C$. Hence the composition goal is also a service that can be specified with the service grammar given above.

Composition mediator. We exploit a derivation as follows to generate a composition compiler. The messages sent by the services are dispatched by the mediator, and they can be adapted before being assigned to the proper recipient. In order to express this adaptation capability of the mediator, we simply add another transition rule denoted by $\xrightarrow{adapt}$.
The $\xrightarrow{adapt}$ relation is defined with respect to a deduction relation $\vdash$ on messages that expresses which manipulations can be performed:

$$(P, E) \xrightarrow{adapt} (P, E \cup \{m\}) \quad \text{where } E \vdash m.$$

The problem we are interested in is to check whether a client $C$ can be satisfied by a composition of services from the community. More formally, we can state it as:

Service Composition Problem
  Input: A community of services $S = \{S_1, \dots, S_n\}$ and a composition goal $C$ (specified by the client requests).
  Output: True iff there exists a sequence of transitions from the initial state $(S \cup \{C\}, \emptyset)$ to a state where $C$ has ended, and each service in $S$ has either ended or is in its initial state.
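The transition rules and the composition problem stated above can be explored on small examples with the following Python sketch. It is deliberately simplified: messages are matched syntactically (no reasoning modulo $E_{XML}$) and the mediator's $\xrightarrow{adapt}$ rule is omitted, so it only decides whether the client and the services can reach the goal by exchanging messages as they are. The encoding of terms as nested tuples, the `?X` convention for variables, and the names `match`, `substitute`, `successors` and `composable` are all assumptions of this illustration.

```python
def is_var(t):
    # Hypothetical convention: variables are strings starting with "?"
    return isinstance(t, str) and t.startswith("?")


def substitute(t, sigma):
    """Apply the substitution sigma to a term (string or nested tuple)."""
    if is_var(t):
        return sigma.get(t, t)
    if isinstance(t, tuple):
        return (t[0],) + tuple(substitute(a, sigma) for a in t[1:])
    return t


def match(pattern, msg, sigma):
    """Return an extension of sigma such that pattern*sigma == msg, or None."""
    if is_var(pattern):
        if pattern in sigma:
            return sigma if sigma[pattern] == msg else None
        return {**sigma, pattern: msg}
    if isinstance(pattern, tuple):
        if not (isinstance(msg, tuple) and len(msg) == len(pattern)
                and msg[0] == pattern[0]):
            return None
        for p, m in zip(pattern[1:], msg[1:]):
            sigma = match(p, m, sigma)
            if sigma is None:
                return None
        return sigma
    return sigma if pattern == msg else None


def successors(services, env):
    """Yield every configuration reachable in one RCV or SND transition."""
    for i, svc in enumerate(services):
        if not svc:
            continue  # null service: nothing left to do
        (kind, term), rest = svc[0], svc[1:]
        if kind == "snd":
            yield services[:i] + (rest,) + services[i + 1:], env | {term}
        else:
            for m in env:
                sigma = match(term, m, {})
                if sigma is not None:
                    rest_sigma = tuple((k, substitute(t, sigma)) for k, t in rest)
                    yield services[:i] + (rest_sigma,) + services[i + 1:], env


def composable(client, community):
    """Search for a state where the client has ended and every service
    has either ended or is still in its initial state."""
    initial = (tuple(client),) + tuple(tuple(s) for s in community)
    stack, seen = [(initial, frozenset())], set()
    while stack:
        services, env = stack.pop()
        if (services, env) in seen:
            continue
        seen.add((services, env))
        if not services[0] and all(
                s == () or s == s0 for s, s0 in zip(services[1:], initial[1:])):
            return True
        stack.extend(successors(services, env))
    return False


# A client sending req(a) and expecting resp(a), and an echo-like service.
CLIENT = (("snd", ("req", "a")), ("rcv", ("resp", "a")))
ECHO = (("rcv", ("req", "?X")), ("snd", ("resp", "?X")))
```

With these definitions, `composable(CLIENT, [ECHO])` explores the four transitions of the exchange. The search terminates because each service is a finite sequence and the message set only grows through the finitely many send actions.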
  • 164. In other words, we have to check for the existence of a derivation (applying the transition rules) from an initial state $(S, \emptyset)$ with $S = \Pi_1 \mid \cdots \mid \Pi_n$ to a state where all requests from the client have been satisfied ($C$ has ended) and the services from the community that have been initiated have properly terminated.

Solving the composition problem

Theorem 9.1. The Service Composition Problem is NP-complete.

Sketch of proof: We reduce the Service Composition Problem to showing the existence of an attack on a protocol built from the services and the client (given the $E_{XML}$ theory). To ensure proper termination of the services that are involved in an interaction with the client, we guess at the beginning whether a service $S_i$ will be employed or not. Let $\{S'_1, \dots, S'_m\}$ be the subset of services that are actually employed. After this guessing step the composition problem is reduced to the reachability of a configuration $(0, E)$ from a configuration $(C \parallel S'_1 \parallel \dots \parallel S'_m, \emptyset)$ with $\{S'_1, \dots, S'_m\} \subseteq \{S_1, \dots, S_n\}$.

For each service $S$ in $\{C, S'_1, \dots, S'_m\}$ we introduce a new constant $c_S$ and transform the service $S$ into a service $\hat{S} = S \cdot SND(c_S)$. It is clear that a service $S$ reduces to the null process if, and only if, $\hat{S}$ sends $c_S$. Finally we add a monitor service $M$ to the community that checks that all constants are sent. We let

$$M = RCV(c_C) \cdot RCV(c_{S'_1}) \cdots RCV(c_{S'_m}) \cdot SND(secret)$$

It is also clear that $M$ sends $secret$ if and only if all the services $C, S'_1, \dots, S'_m$ reduce to the null process. Thus we have transformed the problem of the reachability of a configuration $(0, E)$ from a configuration $(C \parallel S'_1 \parallel \dots \parallel S'_m, \emptyset)$ into the problem of the reachability of a configuration $(P, E')$ with $secret \in E'$ from the initial configuration $(M \parallel \hat{C} \parallel \hat{S}'_1 \parallel \dots \parallel \hat{S}'_m, \emptyset)$. This latter problem is a classic problem for cryptographic protocols and is called the protocol insecurity problem.
Since the existence of an attack on a protocol is a problem known to be in NP [190], we can conclude.

The protocol insecurity problem corresponding to our composition problem can then be submitted to any state-of-the-art protocol verification tool capable of checking reachability properties. If the composition problem admits a solution, we obtain an attack trace describing how the intruder (or the mediator, from a composition point of view) succeeded in satisfying the client's requests by applying its adaptation skills to messages exchanged with some services in the community.

For instance, Figure 9.3 illustrates the solution for the composition problem stated in the introductory example. The mediator obtains a time stamp from a time stamper (denoted by TS) trusted by the Client, then obtains an assertion from the certification authority CA stating the validity of the time stamper's certificate. It also calls CA to obtain similar assertions about the certificates of an archiving third-party service (denoted by ARC) and of the Client. Finally it calls
  • 165. the archiving third-party service to obtain the last needed assertion before successfully answering the last request of the Client.

At this point we have decided the feasibility of the composition given the Client's requests and the community of available services. We propose to take the study further in order, first, to obtain an operational implementation of the new feature provided by the composed service (or mediator) and, second, to validate this implementation against regular security properties (and in the presence of all other partner services). We have already reached the second objective and enabled it in the Avantssar validation platform: the description of the mediator is automatically extracted from the attack trace and then translated to ASLan using the Trace2ASLan module. The mediator's ASLan specification, together with the specifications of the Client and of the involved services from the community, can then be submitted to the Avantssar platform for validation. Details about Trace2ASLan are given in Section 9.1.4, while we present in Section 9.1.3 our ongoing work on the first objective.

9.1.3 Mediator prudent implementation

We present in this section our approach for generating a prudent implementation of the mediator obtained after solving a web service composition problem as explained in Section 9.1.2. The remainder of this section is organised as follows. First we define a target for web service implementation and one of its important desired properties: prudence. Informally speaking, this notion requires that the implementation checks its input messages as thoroughly as possible (for example by checking all the correlations that may exist between received messages, or by performing all the possible verifications of digital signatures).
Finally we present our linear-time procedure to generate a prudent implementation for a given web service described using the web services model we introduced in Section 9.1.2, which we apply to generate prudent implementations for composition mediators.

Implementation for web services

We first present some extensions to our web services model before introducing the notion of implementation. Terms are manipulated by applying operations on them. These operations are defined by a subset $\mathcal{F}_p$ of the signature $\mathcal{F}$ called the set of public symbols. A context $C[x_1, \dots, x_n]$ is a term in which all symbols are public and such that its nullary symbols are the variables $x_1, \dots, x_n$. $C[x_1, \dots, x_n]$ is also denoted $C$ when there is no ambiguity, and $n$ is called its length.

Definition 48. A strand $s$ is a finite sequence of messages, each with a label $!$ or $?$. Messages with label $!$ (respectively, $?$) are said to be "sent" (respectively, "received"). A strand is positive if and only if all its labels are $?$. The length of a strand $s = ({}^!_?m_1, \dots, {}^!_?m_n)$ is $n$, and its input is denoted by $\mathrm{input}(s)$ and is the strand $(?r_1, \dots, ?r_n)$ where $r_1, \dots, r_n$ is the ordered sub-sequence of messages labelled by $?$ in $s$.
  • 166. We denote by $s^i$ (respectively, by $s_i$) the prefix $({}^!_?m_1, \dots, {}^!_?m_i)$ (respectively, the labelled message ${}^!_?m_i$). We also define $\sigma_s$ as the ground substitution $\{x_i \mapsto m_i\}_{1 \le i \le n}$ and $\sigma_s^{input}$ as the restriction of $\sigma_s$ to the set $\{x_i \mid s_i = ?m_i\}$. To model the initial knowledge $IK(s)$ of the web service represented by the strand $s$, we prefix $s$ with a reception $?t$ for every term $t$ in $IK(s)$. We assume in the following that $\top \in IK(s)$ for all strands $s$.

Definition 49. Given a strand $s$, a context $C$ and a ground term $t$, we say that $C$ evaluates to $t$ on $s$ if and only if $\mathrm{Var}(C) \subseteq \mathrm{Supp}(\sigma_s^{input})$ and $C\sigma_s^{input} =_{E_{XML}} t$.

Next we give an operational semantics to the send and receive activities defined by a strand.

Definition 50. A unification system $S$ is a finite set of equations denoted by $(u_i \stackrel{?}{=} v_i)_{i \in \{1, \dots, n\}}$ with terms $u_i, v_i \in T(\mathcal{F}, \mathcal{X})$. It is satisfied by a substitution $\sigma$, and we write $\sigma \models S$, if for all $i \in \{1, \dots, n\}$ we have $u_i\sigma =_{E_{XML}} v_i\sigma$.

Active frames. Strands are given an operational semantics with active frames, a simple process model in which the computation of the messages to send and the verifications on the received messages are specified. The notation $?r_i$ (respectively, $!e_i$) refers to a message stored in variable $r_i$ (respectively, $e_i$) which is received (respectively, sent). Let us recall the definition of active frames.

Definition 31, p. 100. An active frame is a sequence $(T_i)_{1 \le i \le k}$ where

$$T_i = \begin{cases}
!e_i \text{ with } e_i \stackrel{?}{=} C_i[r_1, \dots, r_{i-1}] & \text{(send)} \\
?r_i \text{ with } S_i(r_1, \dots, r_i) & \text{(receive)}
\end{cases}$$

where $C_i[r_1, \dots, r_{i-1}]$ denotes a context and $S_i$ a unification system over the variables $(r_j)_{1 \le j \le i}$. A variable $r_i$ (respectively, $e_i$) is called an input variable (respectively, an output variable) of the active frame.

Definition 32, p. 101. Let $\varphi = (T_i)_{1 \le i \le k}$ be an active frame as in Definition 31, where the input variables are $r_1, \dots, r_n$. Let $s$ be a positive strand $!M_1, \dots, !M_n$, let $\sigma_{\varphi,s}$ be the substitution $\{r_i \mapsto M_i\}$ and let $S$ be the union of the unification systems in $\varphi$. The evaluation of $\varphi$ on $s$ is denoted $\varphi \cdot s$ and is the strand $(m_i)_{1 \le i \le k}$ where:

$$m_i = \begin{cases}
!C_i[m_1, \dots, m_{i-1}] & \text{if } T_i \text{ is } !e_i \\
?r_i\sigma_{\varphi,s} & \text{if } T_i \text{ is } ?r_i
\end{cases}$$

We say that $\varphi$ accepts $s$ if $S\sigma_{\varphi,s}$ is satisfiable.

Definition 33, p. 101. An active frame $\varphi$ is an implementation of a strand $s$ if $\varphi$ accepts $\mathrm{input}(s)$ and $\varphi \cdot \mathrm{input}(s) =_E s$. If a strand $s$ admits an implementation, we say that this strand is executable.
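As an informal illustration of Definitions 31 to 33, the following Python sketch evaluates an active frame on a sequence of received messages. It rests on several assumptions that are not in the original text: terms are nested tuples, contexts are Python functions of the list of messages received so far, a unification system is a list of pairs of such functions, and equality is decided by normalising with rule $(D_s)$ only rather than with the whole theory $E_{XML}$. The names `normalize`, `evaluate` and `FRAME` are hypothetical.

```python
def normalize(t):
    """Normalise a nested-tuple term with rule (Ds) only:
    sdcrypt(scrypt(x, y), y) = x."""
    if not isinstance(t, tuple):
        return t
    t = (t[0],) + tuple(normalize(a) for a in t[1:])
    if (t[0] == "sdcrypt" and len(t) == 3 and isinstance(t[1], tuple)
            and len(t[1]) == 3 and t[1][0] == "scrypt" and t[1][2] == t[2]):
        return t[1][1]
    return t


def evaluate(frame, inputs):
    """Evaluate an active frame on the received messages `inputs`.

    Returns the resulting strand, as a list of (label, message) pairs,
    and a flag telling whether every check of the frame was satisfied."""
    received, strand, accepted = [], [], True
    pending = iter(inputs)
    for kind, payload in frame:
        if kind == "recv":
            received.append(next(pending))
            strand.append(("?", received[-1]))
            # payload is the unification system attached to this reception
            for left, right in payload:
                if normalize(left(received)) != normalize(right(received)):
                    accepted = False
        else:
            # payload is the context computing the message to send
            strand.append(("!", normalize(payload(received))))
    return strand, accepted


# A frame for the strand ?scrypt(m, k) ?k !m: after receiving the key,
# check that re-encrypting the decrypted payload gives back the first
# message, then send the decrypted payload.
FRAME = [
    ("recv", []),
    ("recv", [(lambda r: r[0],
               lambda r: ("scrypt", ("sdcrypt", r[0], r[1]), r[1]))]),
    ("send", lambda r: ("sdcrypt", r[0], r[1])),
]
```

Evaluating `FRAME` on the inputs `scrypt(m, k)` and `k` reproduces the strand and accepts it, whereas a first message that is not an encryption under the second one is rejected by the check.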
  • 167. Compilation of web services into prudent implementations

Given a strand $s$, a first requirement is that if, up to a step in which a message is sent, the messages received are those specified in $s$, then the sent message must also be equal modulo $E_{XML}$ to the response defined in $s$. To meet this requirement it suffices to compute, for every sent message $m$, a context $C_m$ that evaluates to $m$ when applied to the messages received so far.

Definition 51. A reachability algorithm $A_r$ computes, given a strand $s$ of length $n$ and a ground term $t$, a context $A_r(s, t)$ that evaluates to $t$ on $s$ if there exists such a context (we then say that $t$ is reachable from $s$), and $\bot$ otherwise. We denote by $RST_i(s)$ the set of all subterms of $s$ reachable from $s^i$, and by $RST_i^{new}(s)$ the set $RST_i(s) \setminus RST_{i-1}(s)$. We also use the shorthand $RST(s)$ to denote $RST_n(s)$.

Computing an active frame is not enough, since one also wants to impose that received messages are checked as thoroughly as possible. Let us first formalise this by a refinement relation on sequences of messages. We say that a strand $s$ refines a strand $s'$ if any observable equality of messages in $s'$ can be observed in $s$ using the same tests. To put it formally:

Definition 35, p. 103. Given a strand $s$, we denote by $P_s$ the set of all the pairs of contexts $\{C_1, C_2\}$ such that $C_1 \cdot s =_{E_{XML}} C_2 \cdot s$. We say that $s$ refines a strand $s'$ if $P_{s'} \subseteq P_s$.

Example 30. Consider the following strands:

$$s = \ ?\langle a, b\rangle \ \ !a \ \ ?\langle a, b\rangle \qquad s' = \ ?\langle a, b\rangle \ \ ?\langle a, c\rangle \ \ !b$$

Since every equality valid on $\mathrm{input}(s')$ is also valid on $\mathrm{input}(s)$, we have that $s$ refines $s'$.

We employ the refinement notion to define in which sense an implementation can check its input as thoroughly as possible.

Definition 52. Let $s$ be a strand and $\varphi$ be an implementation of $s$. We say that $\varphi$ is prudent if any strand $s'$ accepted by $\varphi$ is a refinement of $s$.

Definition 53. Given a strand $s$, a unification system $P_s^f$ is a finite basis of $s$ if for each strand $s'$: $\sigma_{s'}^{input} \models P_s^f$ if and only if $s'$ is a refinement of $s$.

Assume there exists an algorithm $A_b(s)$ that takes a strand $s$ as input and computes a finite basis $P_s^f$ of $s$. Together with $A_r(s, t)$ given above, $A_b(s)$ will be a black-box oracle for our compilation algorithm $A_c$, described below.

Algorithm $A_c$. Let $s = ({}^!_?m_1, \dots, {}^!_?m_n)$ be a strand. Compute the active frame $\varphi_s = (T_i)_{1 \le i \le n}$ with, for $1 \le i \le n$:

$$T_i = \begin{cases}
!x_i \text{ with } x_i \stackrel{?}{=} A_r(s^{i-1}, m_i) & \text{if } s_i = \ !m_i \\
?x_i \text{ with } A_b(s^i) & \text{if } s_i = \ ?m_i
\end{cases}$$
  • 168. and return the active frame $\varphi_s = (T_i)_{1 \le i \le n}$. By construction we have the following consequence, which we state with the above notations:

Theorem 9.2. Given Algorithms $A_r$ and $A_b$, and an executable strand $s$ such that $A_r(s^{i-1}, m_i)$ never outputs $\bot$ whenever $s_i = \ !m_i$, Algorithm $A_c$ computes a prudent implementation of $s$.

Solving the compilation problem

We present in the following the theoretical justification of the solution we propose for solving the reachability problem and for computing a finite basis for a given strand $s$.

In order to compute a prudent implementation of a strand $s$ we need to consider all the contexts that yield the same term $t$ when applied on $s$. In principle we have to consider the infinite set of possibilities for $t$, and thus the explicit computation of this set is impossible. Moreover, when $t$ is fixed there is still an infinite number of contexts to consider, even if we restrict the study to those in normal form, as explained in Example 31.

Example 31. Assume $s = \ ?k \ ?\mathrm{scrypt}(k, k)$. We have $\mathrm{sdcrypt}(x_2, x_1) \cdot s =_{E_{XML}} x_1 \cdot s$, and thus we can build an infinite sequence of contexts in normal form that evaluate to $k$ when applied on $s$ by iteratively replacing the occurrence of the context $x_1$ in $\mathrm{sdcrypt}(x_2, x_1)$ by $\mathrm{sdcrypt}(x_2, x_1)$:

$$\mathrm{sdcrypt}(x_2, \mathrm{sdcrypt}(x_2, \dots)) \cdot s =_{E_{XML}} x_1 \cdot s$$

The key idea of our solution is to consider only the set of relations of the form $t = f(t_1, \dots, t_k)$ modulo $E_{XML}$ verified by all the reachable subterms $t, t_1, \dots, t_k$ of a given strand $s$, where $f$ is a public symbol. We first compute a super-set of these relations by relaxing the condition so as to consider all the subterms of $s$. This super-set is computed by applying the adequate equations of $E_{XML}$ to the subterms of $s$. Then we select from this super-set the relations that involve only the reachable subterms. The latter operation is performed in linear time as follows. A relation $t = f(t_1, \dots, t_k)$ computed by Algorithm
9.1 is used to infer the reachability of the term $t$ provided the reachability of all the $t_1, \dots, t_k$. Indeed, if $C_1, \dots, C_k$ are extraction contexts for $t_1, \dots, t_k$, then $f(C_1, \dots, C_k)$ is an extraction context for $t$. The set $RST_i(s)$ is then computed as follows. Assuming that $s_i = \ ?m_i$, we start the computation with the set $R = RST_{i-1}(s) \cup \{m_i\}$. All terms in this set are trivially reachable from $s^i$, since those in $RST_{i-1}(s)$ are reachable from $s^{i-1}$ and since $m_i$ is reachable with the extraction context $x_i$. Then we visit all the relations $t = f(t_1, \dots, t_k)$ where $\{t_1, \dots, t_k\} \subseteq R$. For each such relation the term $t$ is then reachable from $R$ and can be used iteratively to discover new reachable subterms in $RST_i(s)$ or new extraction contexts for subterms already known to be reachable. Finally we extract from all the computed extraction contexts the set of all the pairs of contexts evaluating to the same subterm $t$ on $s$, and prove that it is a finite basis of $s$. Note that this approach also provides extraction contexts for the sent messages in $s$ if they are reachable from $s$, which permits us to use Theorem 9.2 to derive a prudent implementation of $s$. In
  • 169. the following the relations $t = f(t_1, \dots, t_k)$ defined above are represented by sequents that are true on a strand $s$.

Definition 54. Given a strand $s$ of length $n$, we define the sequents $t_1, \dots, t_k \vdash_f t$ where $t$ is in $ST(s)$, $t_1, \dots, t_k$ is a possibly empty sequence of elements in $ST(s)$ and $f$ is either a public symbol of arity $k$ or a variable in $\{x_1, \dots, x_n\}$. Let $\gamma$ denote the sequent $t_1, \dots, t_k \vdash_f t$. We call $t$ the right-hand side of $\gamma$, $f$ its symbol and the sequence $t_1, \dots, t_k$ its left-hand side, and respectively denote them by $\mathrm{rhs}(\gamma)$, $\mathrm{symbol}(\gamma)$ and $\mathrm{lhs}(\gamma)$. The sequent $\gamma$ is true if

  a. either $f$ is a public symbol of arity $k$ and $t =_{E_{XML}} f(t_1, \dots, t_k)$,
  b. or the sequence $t_1, \dots, t_k$ is empty and $f = x_i \in \mathrm{Supp}(\sigma_s^{input})$.

We denote in the following by $S(s)$ the set of all the true sequents of $s$ and by $R(s)$ the subset of $S(s)$ containing the sequents $t_1, \dots, t_k \vdash_f t$ where $t, t_1, \dots, t_k$ are in $RST(s)$.

Let $s$ be a strand of length $n$. For each step $i$ in $\{1, \dots, n\}$ and for each term $t$ in $RST_i(s)$ we let $R_i(s, t)$ be the set containing $\vdash_{x_i} t$ if $s_i = \ ?t$, and all sequents $t_1, \dots, t_k \vdash_f t$ such that:

$$\{t_1, \dots, t_k\} \subseteq RST_i(s) \qquad \{t_1, \dots, t_k\} \cap RST_i^{new}(s) \ne \emptyset$$

and let $R_i(s) = \bigcup_{t \in RST_i^{new}(s)} R_i(s, t)$.

Let $Y_{RST(s)} = \{y_t \mid t \in RST(s)\}$ be a set of variables (footnote 2) and $\gamma$ be the sequent $t_1, \dots, t_k \vdash_f t$ (respectively, $\vdash_{x_j} t$) in $R_i(s, t)$. The context of $\gamma$, denoted by $\mathrm{context}(\gamma)$, is the term $f(y_{t_1}, \dots, y_{t_k})$ (respectively, $x_j$). We let $C_i(s, t) = \mathrm{context}(R_i(s, t))$, $C_i(s) = \mathrm{context}(R_i(s))$ and $C(s) = \mathrm{context}(R(s))$.

Let $\prec_{R(s)}$ be a total order over $R(s)$ and let, for all $t$ in $RST(s)$,

$$\gamma_{min}(s, t) = \min\{\gamma \in R(s) \mid t \in \mathrm{rhs}(\gamma) \cup \mathrm{lhs}(\gamma)\}$$

Assume (footnote 3) in addition that $\prec_{R(s)}$ enjoys the following properties for all $t$ in $RST(s)$:

P1: $t = \mathrm{rhs}(\gamma_{min}(s, t))$;
P2: $\gamma_{min}(s, t') \prec_{R(s)} \gamma_{min}(s, t)$ for all $t'$ in $\mathrm{lhs}(\gamma_{min}(s, t))$;
P3: $\vdash_{x_i} t \ \prec_{R(s)} \ \vdash_{x_j} t$ if and only if $i$ is strictly smaller than $j$.

2 We assume in the following that $\mathcal{X} \cap Y_{RST(s)} = \emptyset$.
3 The existence of such an order is proved in Section 9.1.3.
  • 170. We let, for all $t$ in $RST(s)$, $C_{min}(s, t) = \mathrm{context}(\gamma_{min}(s, t))$ and define for all $i$ in $\{1, \dots, n\}$ the following unification system over the variables $\{x_1, \dots, x_i\} \cup \{y_t \mid t \in RST_i(s)\}$:

$$U_i(s) = \bigcup_{t \in RST_i(s)} \{C_{min}(s, t) \stackrel{?}{=} C \mid C \in C_i(s, t) \setminus \{C_{min}(s, t)\}\}$$

In the remainder, $U_n(s)$, where $n$ is the length of $s$, is also denoted by $U(s)$.

Theorem 9.3. Let $s$ be a strand of length $n$. For each step $1 \le i \le n$ let $t_1, \dots, t_{k(i)}$ be the enumeration of the elements in $RST_i^{new}(s)$ such that:

$$C_{min}(s, t_1) \prec_{R(s)} \dots \prec_{R(s)} C_{min}(s, t_{k(i)})$$

We define:
  • $\tau_{s,i} = \{y_{t_1} \mapsto C_{min}(s, t_1)\} \circ \dots \circ \{y_{t_{k(i)}} \mapsto C_{min}(s, t_{k(i)})\}$
  • $\bar{\tau}_{s,i} = \tau_{s,1} \circ \dots \circ \tau_{s,i}$

For each step $i$ in $\{1, \dots, n\}$ we have:
  1. the context $C_{min}(s, t)\bar{\tau}_{s,i}$ evaluates to $t$ on $s^i$ for all $t$ in $RST_i(s)$;
  2. $U_i(s)\bar{\tau}_{s,i}$ is a finite basis of $s^i$.

The main argument in the proof of Theorem 9.3 is the locality property [118] of the $E_{XML}$ theory. It permits us to solve the general reachability problem by considering only its restriction to the subterms of a given strand. In the remainder we present algorithms that compute the unification systems $\{U_i(s)\}_{1 \le i \le n}$ and the mappings $\{\bar{\tau}_{s,i}\}_{1 \le i \le n}$ given a strand $s$ of length $n$, which permits us to compute the finite bases for $\{s^i\}_{1 \le i \le n}$ as stated in Theorem 9.3. Moreover, our algorithms provide for all $t$ in $RST_i(s)$ the contexts $C_{min}(s, t)$. Together with $\{\bar{\tau}_{s,i}\}_{1 \le i \le n}$, these contexts provide extraction contexts from $s$ for all $t$ in $RST(s)$. Therefore, if all $s_{i+1}$ labelled with $!$ in $s$ are reachable from $s^i$, we can provide a prudent implementation of $s$ as stated in Theorem 9.2.

Concrete algorithms

Let us first introduce the data structures for terms (including the special case of contexts, and thereby unification systems), sequents and strands. Then we will present the principle of Algorithms 9.1 and 9.2.

Arrays and queues. We use FIFO queues and arrays to hold term and sequent objects. We employ an object-oriented notation.
Given an array object A, A.add(t) adds the element t to the array and returns its index, A.nbelements() returns the number of elements in the array A, and A[i] returns the element stored at index i in A if i ≤ A.nbelements(). Given a FIFO queue Q, Q.pop() consumes and returns the first element in Q, while Q.push(o)
  • 171. appends o to its end and Q.nbelements() returns the number of elements in the queue Q. We note that all the operations described above can be implemented in constant time. Given a queue or an array O, we let O.size() be the sum of the sizes of all the objects held by O.

Representation of terms. A set of terms S is stored in an array A of term objects. Each term t ∈ S is represented by a term object with the following fields:

  id: integer identifying t. We require that A[i].id = i for all 1 ≤ i ≤ A.nbelements()
  symbol: element of $\mathcal{F}$ representing the head symbol of t
  dst: array of the ids of its ordered maximal strict subterms
  context: integer identifying the context $C_{min}(s, t)$
  sequents: queue holding the identifiers of the sequents where t appears in the left-hand side
  inv: identifier of inv(t) in A if inv(t) is a subterm of s.

In Algorithm 9.1 a test of the form $t = f(t_1, \dots, t_n)$ amounts to testing whether t.symbol = f; if the test is positive, each $t_i$ is bound to the term identified by t.dst[i]. We define the size of a term t to be the size of the term object holding t, i.e. the sum of the sizes of all its fields enumerated above.

Representation of contexts and unification systems. Similarly, a set of contexts is stored in an array C of context objects, where each context is represented by a context object, which is the sub-record of the term object having only the symbol and dst fields. An equation $C \stackrel{?}{=} C'$ is then represented by a pair of integers $(id_C, id_{C'})$ where $id_C, id_{C'}$ are the indexes in C of the context objects representing the contexts $C, C'$, and a unification system U is represented by a queue holding the representations of all the equations in U.

Representation of strands. A strand $s = ({}^!_?m_i)_{1 \le i \le n}$ is represented by the couple (A, IO) where A is the representation of $ST(s)$ and IO is an array holding the couples $(m_i.id, {}^!_?)_{1 \le i \le n}$ in order. The size of s, denoted by |s|, is defined as A.size() + IO.size().

Representation of sequents.
A sequent γ is represented by a record having the following fields:

  id: integer identifying γ
  rhs: integer identifying the right-hand side of the sequent
  symbol: element of $\mathcal{F}_p$ representing the head symbol of the context of γ
  lhs: array of the term identifiers (id) in the left-hand side of γ
  • 172.   ready: integer representing the number of occurrences of terms in the left-hand side of γ that are not yet reachable, initially set to the arity of the head symbol of the context

In the following, we also use the notation $t_1.id, \dots, t_n.id \vdash_f t.id$ as a shortcut for the structure holding the sequent $t_1, \dots, t_n \vdash_f t$.

Computation of S(s). Given a representation (A, IO) of a strand s, our goal is to compute an array S holding a representation of each sequent in S(s) and to update the sequents queue of all the elements in A. The update is performed on the global arrays A and S by the register method:

    method register(id_1, ..., id_n ⊢_f id)
        cr ← S.add(id_1, ..., id_n ⊢_f id)
        for all k ∈ {1, ..., n} do
            A[id_k].sequents.push(cr)
        end for
        return cr
    end method

Algorithm 9.1: Computation of S(s)

    S ← ∅
    for all t ∈ A do
        switch t do
            case t = scrypt(m, k)
                S.register(m.id, k.id ⊢_scrypt t.id)
                S.register(t.id, k.id ⊢_sdcrypt m.id)
            case t = crypt(m, k)
                S.register(m.id, k.id ⊢_crypt t.id)
                S.register(t.id, k.inv ⊢_dcrypt m.id)
            case t = sign(m, inv(k))
                S.register(m.id, inv(k).id ⊢_sign t.id)
                S.register(m.id, t.id, k.id ⊢_verif ⊤.id)
            case t = inv(t′)
                S.register(t′.id, t.id ⊢_invtest ⊤.id)
            case t = node^n_a(t_1, ..., t_a)
                S.register(t_1.id, ..., t_a.id ⊢_{node^n_a} t.id)
                for all i ∈ {1, ..., a} do
                    S.register(t.id ⊢_{child^n_{i,a}} t_i.id)
                end for
        end switch
    end for
    return S

Principle of Algorithm 9.1. Given a strand s in normal form, for each term t ∈ ST(s) we perform a case analysis on its structure to compute the
  • 173. sequents; we then insert these sequents into S using the register method above. Note that each subterm t of s contributes to S(s) a number of sequents that depends only on its head symbol; therefore the value S.nbelements() can be computed beforehand and is linear in the size of the input (A, IO). In fact, S does not yet contain the sequents of S(s) with an empty left-hand side. These sequents are finally added to S by Algorithm 9.2.

Complexity of Algorithm 9.1. The outermost loop runs through the subterms of s stored in A. Algorithm 9.1 processes each subterm t of s in a number of constant-time instructions that is linear w.r.t. the size of t, which permits us to state its time-linearity w.r.t. the size of s.

Computation of the $U_i(s)$. Given the representations (A, IO) of a strand s of length n and S of S(s), we compute an array C representing the contexts in C(s) and arrays I, U representing the prudent implementation of s, such that for all 1 ≤ i ≤ n:

  1. if $s_i = \ !m_i$ then I[i] is the index in C of the context object $C_{min}(s, m_i)\bar{\tau}_{s,i}$ (footnote 4);
  2. if $s_i = \ ?m_i$ then U[i] is a queue representing the unification system $U_i(s)\bar{\tau}_{s,i}$.

Algorithm 9.2 relies on the register2 procedure, which updates the global array C:

    method register2(f[id_1, ..., id_n])
        cr ← C.add(f[A[id_1].context, ..., A[id_n].context])
        return cr
    end method

Principle of Algorithm 9.2. From the array of sequents S output by Algorithm 9.1, Algorithm 9.2 iteratively computes the terms that are reachable in the strand s, for each reception step. If a labelled message $s_i = \ !m_i$ is such that $m_i$ is reachable in s, then an extraction context of $m_i$ in s is stored in I. Hence the computation of I permits us to simulate the call to an oracle $A_r$ by taking $A_r(s^{i-1}, m_i) = I[i]$ for $s_i = \ !m_i$.
Similarly, the array U stores the extraction contexts of the reachable subterms in s (at each step) and can be employed to build a finite basis for s and its prefixes by taking $A_b(s^i) = U[i]$.

Correctness of Algorithm 9.2. The correctness of Algorithm 9.2 is based on the fact that the order in which it inserts contexts satisfies the properties P1–P3 imposed on $\prec_{R(s)}$.

4 The minimum here is taken with respect to the order $\prec_Q$ introduced in the paragraph on the correctness of Algorithm 9.2.
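The worklist principle described above (sequents produced as in Algorithm 9.1, each with a `ready` counter that triggers it once all its left-hand side terms are reachable) can be sketched in Python as follows. This is an illustrative approximation under explicit assumptions, not the algorithm of the text: only scrypt and crypt are handled, terms are nested tuples, and extraction contexts are stored inlined as trees instead of being shared through the variables $y_t$, so the linear bound of the original data structures is not preserved. The names `collect_subterms`, `sequents_of` and `compile_strand` are hypothetical.

```python
from collections import deque


def collect_subterms(t, acc):
    """Record t and all its subterms as keys of the dict acc (insertion-ordered)."""
    if t not in acc:
        acc[t] = None
        if isinstance(t, tuple):
            for arg in t[1:]:
                collect_subterms(arg, acc)


def sequents_of(terms):
    """Sequents (lhs, symbol, rhs) for the scrypt and crypt subterms only."""
    seqs = []
    for t in terms:
        if isinstance(t, tuple) and t[0] == "scrypt":
            _, m, k = t
            seqs.append(((m, k), "scrypt", t))
            seqs.append(((t, k), "sdcrypt", m))
        elif isinstance(t, tuple) and t[0] == "crypt":
            _, m, k = t
            seqs.append(((m, k), "crypt", t))
            if ("inv", k) in terms:  # decryption needs inv(k) as a subterm
                seqs.append(((t, ("inv", k)), "dcrypt", m))
    return seqs


def compile_strand(strand):
    """strand: list of (label, term) with label "?" (received) or "!" (sent).

    Returns (sends, checks): sends[i] is an extraction context for the
    message sent at step i (None if it is not reachable), and checks[i]
    lists the pairs of contexts to be tested for equality at reception i."""
    terms = {}
    for _, message in strand:
        collect_subterms(message, terms)
    seqs = sequents_of(terms)
    # ready[j]: occurrences of lhs terms of sequent j not yet reachable
    ready = [len(lhs) for lhs, _, _ in seqs]
    watchers = {}
    for j, (lhs, _, _) in enumerate(seqs):
        for t in lhs:
            watchers.setdefault(t, []).append(j)
    context, sends, checks = {}, {}, {}
    for step, (label, message) in enumerate(strand, start=1):
        if label == "!":
            sends[step] = context.get(message)
            continue
        checks[step] = []
        queue = deque([(message, f"x{step}")])
        while queue:
            t, ctx = queue.popleft()
            if t in context:
                # t was already reachable: the new context yields an equation
                checks[step].append((context[t], ctx))
                continue
            context[t] = ctx
            for j in watchers.get(t, []):
                ready[j] -= 1
                if ready[j] == 0:
                    lhs, f, rhs = seqs[j]
                    queue.append((rhs, (f,) + tuple(context[u] for u in lhs)))
    return sends, checks
```

On the strand `?scrypt(m, k) ?k !m`, the second reception makes `m` reachable through `sdcrypt(x1, x2)`, and re-encrypting it yields a second context for the first message, hence one equality check at step 2.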
  • 174. Algorithm 9.2: Computation of the $U_i(s)\bar{\tau}_{s,i}$

    S ← output of Algorithm 9.1
    C, Q, step ← ∅, ∅, 0
    for all m_i ∈ IO do
        step++
        if m_i = (id_i, ?) then
            Q.push(S.add(⊢_{x_i} id_i))
            while Q ≠ ∅ do
                seq ← S[Q.pop()]
                t ← A[seq.rhs]
                ind ← register2(seq.symbol[seq.lhs])
                if t.context = null then
                    t.context ← ind
                    while t.sequents ≠ ∅ do
                        seq′ ← S[t.sequents.pop()]
                        seq′.ready−−
                        if seq′.ready = 0 then
                            Q.push(seq′.id)
                        end if
                    end while
                else
                    U[step].push((t.context, ind))
                end if
            end while
        else if m_i = (id_i, !) then
            I[step] ← A[id_i].context
        end if
    end for
    return I, U, C

Complexity of Algorithm 9.2. Given a strand s, each sequent γ in S(s) is pushed at most once onto the queue Q (only when γ.ready = 0). Moreover, each time such a sequent is processed, the algorithm also runs through all the elements in rhs(γ).sequents and the elements in lhs(γ). As previously explained for the complexity of Algorithm 9.1, the first processing is linear-time w.r.t. the size of the strand s, whereas the second processing is linear w.r.t. the size of the strand s. Therefore Algorithm 9.2 runs in linear time w.r.t. the DAG size of its input.

Experiments

The compilation procedure presented above has been tested on several web service composition problems. As preliminary work, we succeeded in generating from a composition problem the prudent implementation for its corresponding
  • 175. mediator and for all the involved services from the community. These implementations have been realised in Java and deployed as Java Servlets performing the communications corresponding to each service, thus enabling the Client to successfully interact with the mediator. This permitted us to verify our compilation procedure in a real setting and to obtain a first realisation of the new feature brought by the composed service. We note that the need to also generate the services involved in the composition (which are supposed to be already implemented and running) is due to the choice of the Servlet architecture: we bound the message format and the communication between services to a setting different from the web services standards. We are currently extending this work in order to generate web-services-compliant realisations of the mediators: in this setting the generated mediator communicates directly with the already existing web services in a standard way.

9.1.4 Mediator validation

In this section we show how we obtain an executable specification of the mediator in terms of the Avantssar Specification Language (ASLan) [13]. ASLan is a formal language for specifying security-sensitive service-oriented architectures, the associated security policies, as well as their trust and security properties. ASLan specifications can be validated (in the Dolev-Yao intruder model) using the back-ends of the Avantssar Platform [15]. Hence our translation allows us to verify several security properties of the mediator, such as confidentiality and authentication.

Modelling Web Services in ASLan

We translate strands into ASLan roles. An ASLan role is defined by a transition system and an initial state. States are sets of facts, where facts can be thought of as first-order terms over a given signature. The transition rules are of the form l ⇒ r where l and r are states.
There is a transition from a state $s$ to a state $s'$ whenever there exist a transition rule $l \Rightarrow r$ and a substitution $\sigma$ such that $l\sigma \subseteq s$ and $s' = (s \setminus l\sigma) \cup r\sigma$. The facts in a state $s$ can encode the reception or the emission of a message (e.g. iknows(scrypt(m, k))). The state of the web service is encoded with a fact state_wrap($x_1, \dots, x_n$) where each $x_i$ is associated with a reachable subterm of the strand we translate. The language also allows transitions to be guarded by conditions such as equality or disequality between first-order terms.

Generating an ASLan specification for the mediator

The approach proposed in this section has been implemented in Java. The resulting component, called Trace2ASLan, takes as input a strand representation of a web service and outputs in linear time the specification of the corresponding ASLan role.
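The rewriting step described above ($l\sigma \subseteq s$ and $s' = (s \setminus l\sigma) \cup r\sigma$) can be illustrated by the following Python sketch. It applies the definition literally on sets of ground facts and ignores guards, typing and any special treatment that the actual ASLan back-ends may give to particular facts. Facts are encoded as nested tuples and, as an assumption of this sketch, variables are strings starting with an upper-case letter; the names `match_all` and `apply_rule` are hypothetical.

```python
def is_var(t):
    # Assumed convention for this sketch: variables start with an upper-case letter
    return isinstance(t, str) and t[:1].isupper()


def substitute(t, sigma):
    if is_var(t):
        return sigma.get(t, t)
    if isinstance(t, tuple):
        return (t[0],) + tuple(substitute(a, sigma) for a in t[1:])
    return t


def match(pattern, fact, sigma):
    """Return an extension of sigma with pattern*sigma == fact, or None."""
    if is_var(pattern):
        if pattern in sigma:
            return sigma if sigma[pattern] == fact else None
        return {**sigma, pattern: fact}
    if isinstance(pattern, tuple):
        if not (isinstance(fact, tuple) and len(fact) == len(pattern)
                and fact[0] == pattern[0]):
            return None
        for p, f in zip(pattern[1:], fact[1:]):
            sigma = match(p, f, sigma)
            if sigma is None:
                return None
        return sigma
    return sigma if pattern == fact else None


def match_all(patterns, state, sigma):
    """Yield every sigma such that each instantiated pattern is a fact of state."""
    if not patterns:
        yield sigma
        return
    for fact in state:
        extended = match(patterns[0], fact, sigma)
        if extended is not None:
            yield from match_all(patterns[1:], state, extended)


def apply_rule(lhs, rhs, state):
    """Return the state (s minus l*sigma) union r*sigma for the first
    matching substitution, or None if the rule is not applicable."""
    for sigma in match_all(lhs, state, {}):
        consumed = {substitute(p, sigma) for p in lhs}
        produced = {substitute(p, sigma) for p in rhs}
        return frozenset(state - consumed) | produced
    return None


# A rule shaped like a reception step: store the received message in the
# first component of the service state and advance the step counter.
LHS = [("state_wrap", "Y_T", "Y_K", "Y_M", 1), ("iknows", "X_T")]
RHS = [("state_wrap", "X_T", "Y_K", "Y_M", 3)]
```

Applying this rule to a state containing `state_wrap(t, k, m, 1)` and `iknows(scrypt(k, m))` binds `X_T` to the known message and replaces both facts by the updated `state_wrap` fact.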
  • 176. 180 CHAPTER 9. WEB SERVICES ORCHESTRATION CHOREOGRAPHY
Handling Knowledge. A strand of even length s = [?s1 !s2 ... ?sn−1 !sn] is translated into a set of rules. We assume the existence of an injective function name mapping each term in RST(s) to a unique string.
We assume that each reception is followed by a response, and compile each sub-sequence ?s2j−1 !s2j of s into a transition rule. We reuse the notations Si and Ci of Definition 31. The internal state of the agent executing the mediator is modelled by a term state_wrap of arity k, where k is the number of terms in RST(s). At each step i a variable val(i, t) that represents the current value of t ∈ RST(s) in the state is computed as follows:

$$\mathrm{val}(k, t) = \begin{cases} X\_\mathit{name}(t) & \text{if } t \in \mathrm{RST}_k(s) \\ Y\_\mathit{name}(t) & \text{otherwise} \end{cases}$$

We translate each couple ?si−1 !si in the strand with the generic pattern:

$$\begin{array}{l} \mathit{state\_wrap}(\mathrm{val}(i-2, t_1), \dots, \mathrm{val}(i-2, t_m), i-1).\mathit{iknows}(\mathrm{val}(i-1, s_{i-1})) \;\&\; \bigwedge_{t \stackrel{?}{=} t' \in S_{i-1}} \mathit{equal}(t, t') \\ \Rightarrow\; \mathit{state\_wrap}(\mathrm{val}(i, t_1), \dots, \mathrm{val}(i, t_m), i+1).\mathit{iknows}(C_i) \end{array}$$

Initial knowledge and nonces. We have a special translation for the initial sequence of values received in the strand that correspond to the parameters for the execution and the nonces. We create an initial state that contains a state_wrap term for each instance of a strand. The value of t ∈ RST(s) in this term is either ⊥ if t is not a nonce or a parameter, or the ground term actually used as a parameter.

Example 32. The ASLan specification corresponding to the web service described by the strand ?scrypt(m, k) ?k !m is:

section signature:
  state_wrap: nat * msg * symmetric_key * msg -> fact
section types:
  t,Y_T,X_T,m,Y_M,X_M: message
  k,Y_K,X_K: symmetric_key
section inits:
  initial_state init := state_wrap(t,k,m,1)
section rules:
  step s1_(Y_T,Y_K,Y_M,X_T) :=
    state_wrap(Y_T,Y_K,Y_M,1). iknows(X_T)
    => state_wrap(X_T,Y_K,Y_M,3)
  step s3s4(X_T,Y_K,Y_M,X_K,X_M) :=
    state_wrap(X_T,Y_K,Y_M,3). iknows(X_K) & equal(X_T,crypt(X_K,X_M))
    => state_wrap(X_T,X_K,X_M,5). iknows(X_M)

  • 177. 9.2. TRACE-BASED SYNTHESIS OF A CHOREOGRAPHY 181
9.1.5 Conclusion
Relying on methods for the analysis of cryptographic protocols, we succeeded in solving the web services composition problem. The solution we propose extends the analysis to the generation of an operational realisation of the newly obtained composed service, which makes its new computation feature usable. This realisation is prudent in the sense that it checks its input messages as thoroughly as possible, and it is validated against regular security properties using the Avantssar validation platform.

9.2 Trace-Based synthesis of a choreography
This section is a summary of the work done in collaboration with Tigran Avanesov, M. Turuani, and M. Rusinowitch on the synthesis of services.

9.2.1 Agent cooperation
In this section, we discuss the problem of constructing agent cooperation protocols in the presence of security policies. Whereas service synthesis methods usually focus on orchestration, i.e. the synthesis of a new service that communicates with existing ones to provide new functionalities to the users, we consider the problem of the synthesis of a choreography, i.e. of a complex multi-party protocol between service providers.
We consider a set of agents who have to cooperate in order to achieve some given goals. We assume that the agents can exchange messages through asynchronous communication channels. We need to build a communication scenario such that all the agents attain their goals. Such a scenario defines a service choreography: each agent performs actions in accordance with the behaviour of the other ones, in such a way that all the participants are satisfied. In contrast to service orchestration, we do not single out any of them as a central entity: there is neither client nor mediator.
Moreover, for each agent we want to define a conforming role, i.e. one that the agent is able to play given restrictions such as the agent's knowledge, the security policy and the network topology. Note that we do not fix the possible operations of each participant, but give them carte blanche in using their knowledge. Conversely, once a choreography is defined, one can
  • 178.
extract the operations that were used, and each agent can deploy a corresponding service (with fixed operations).
Similar cooperation problems have often been addressed in previous work [32, 33, 45, 164, 178] and solved by methods ranging from automata synthesis to AI planning or logic programming. Our objective here is to contribute to the state of the art by solving some cases, not considered before, where the structure of messages matters and where the security policy of each agent is an additional constraint. Finding a cooperation scheme is a non-trivial task. Since some agents may not trust each other, they may have their own requirements for communicating, and some intermediaries may be required to intervene (e.g. to provide certificates).
We represent the communicating agents abstractly by specifying them solely by their initial knowledge (what an agent knows at the beginning of the interaction) and their goals (what he wants to obtain). The agent may create new knowledge from what he knows at some point: at each point of the execution, the agent's knowledge is closed under pairing, encryption, decryption (if he knows the key), signing, etc. The agent's ability to cooperate takes the form of sending and receiving messages. But some restrictions are to be imposed:
    • agents may not accept any message, but only those with some pre-defined pattern (this expresses their policy);
    • agents can only send the messages they can create from their knowledge;
    • an agent cannot communicate directly with another agent if the two do not share a communication channel.
Note that we can parametrise the initial knowledge of the agents, e.g. we can say that an agent knows something encrypted with a given key but without specifying what exactly is encrypted.
In this case the problem would be to find values that instantiate the initial knowledge of every agent together with a communication that satisfies all the goals.

9.2.2 Book publishing
We give an instance of the problem (see Figure 9.4): a writer (Agent A1) wants to publish his new book (t). There is an enterprise that, among other services, has a Publishing (Printing) Service (Agent A4). This service only agrees to print books approved by a Writing Style Authority (Agent A3). Anyone outside this enterprise is forbidden to access the Printing Service directly. To get access one has to contact the "Reception" (Agent A2) of this enterprise. The Reception can communicate with the Printing Service: they share a key and the Printing Service accepts only messages encrypted with that key.
In this case, the network topology is as follows: A1, A2, A3 are pairwise connected (as they represent public entities); A2 and A4 also have a communication channel (as they belong to the same enterprise).
  • 179.
Agent A2 only accepts orders encrypted with his public key. Agents A1 and A3 can accept everything (trivial policies are omitted in Figure 9.4). The question is: how should the agents cooperate to print the book (A4 should obtain t)?

9.2.3 Formal specification of the problem
Terms, deduction system and constraints
To formalise the problem of agent cooperation, we introduce some notation and definitions. Let A be a set of atoms, representing elementary pieces of data: the text of a book, a public or private key, the name of an agent, etc. Let X be the set of variables, representing data (possibly composed) to be found. Let T(F, X) be the set of terms over the set of function symbols F, the set of variables X and the set of atoms A (considered as function symbols of arity 0). Let t be a term. We define Var(t) to be the set of all the variables in t. We call t a ground term if Var(t) = ∅. The set of all ground terms is denoted by T(F). Some function symbols may have algebraic properties (such as commutativity, associativity, etc.), and every term t is supposed to have a unique normal form denoted by (t)↓.

Definition 55. A term t is normalised if t = (t)↓. Two terms p and q are equivalent if (p)↓ = (q)↓. Given a set of terms T we define (T)↓ = {(t)↓ : t ∈ T}.

We define a substitution σ = {x1 → t1, ..., xk → tk} (where xi ∈ X and ti ∈ T(F, X)) to be the mapping σ : T(F, X) → T(F, X) such that tσ is the term obtained by replacing, for all i, each occurrence of the variable xi by the corresponding term ti. The set of variables {x1, ..., xk} is called the domain of σ and is denoted by Dom(σ). If T ⊆ T(F, X), then by definition Tσ = {tσ : t ∈ T}. A substitution σ is ground if for any i ∈ {1, ..., k}, ti is ground. We say that the substitution σ is normalised if xσ is normalised for all x ∈ Dom(σ).

Definition 56. A rule is a tuple of terms written as s1, ..., sk → s, where s1, ..., sk, s are terms.
A deduction system D is a set of rules.
From now until the end of this section, rules are assumed to belong to a fixed deduction system D.

Definition 57. A ground instance of a rule d = s1, ..., sk → s is a rule l = l1, ..., lk → r where l1, ..., lk, r are ground terms and there exists a ground substitution σ such that li = siσ for all i = 1, ..., k and r = sσ. We will also call a ground instance of a rule a ground rule when there is no ambiguity.
Given two sets of ground terms E, F and a rule l → r, we write $E \to_{l \to r} F$ iff F = E ∪ {r} and l ⊆ E, where l is a (multi)set of terms. We write E → F iff there exists a rule l → r such that $E \to_{l \to r} F$.

Definition 58. A derivation D of length n ≥ 0 is a sequence of finite sets of ground terms E0, E1, ..., En such that E0 → E1 → ··· → En, where Ei = Ei−1 ∪ {ti} for all i ∈ {1, ..., n}. A term t is derivable from a set of terms E
  • 180.
iff there exists a derivation D = E0, ..., En such that E0 = E and t ∈ En. A set of terms T is derivable from E iff every t ∈ T is derivable from E. We write Der(E) to denote the set of terms derivable from E.

Definition 59. Let E be a set of terms and t be a term. We define the couple (E, t), denoted E ▷ t, to be a constraint. A constraint system is a set S = {Ei ▷ ti}i=1,...,n where n is an integer and Ei ▷ ti is a constraint for all i ∈ {1, ..., n}.
We extend the definition of Var(·) to a constraint system S in the natural way. We say that S is normalised if every term occurring in S is normalised. We write (S)↓ to denote the constraint system {(Ei)↓ ▷ (ti)↓}i=1,...,n.

Definition 60. A ground substitution σ is a model of a constraint E ▷ t (or σ satisfies this constraint) if (tσ)↓ ∈ Der((Eσ)↓). A ground substitution σ is a model of a constraint system S if it satisfies all the constraints of S and Dom(σ) = Var(S).
Now we can formally specify the agent cooperation problem.

Agents cooperation model
We define an agent community as a pair composed of a set of agents {Ai}i=1,...,m and a network topology T. Each agent A has an initial state, where states are triples of the form ⟨EA, PA, GA⟩, where
    • EA is A's knowledge (a finite set of ground terms he initially knows),
    • PA is A's policy (a finite set of terms specifying the authorised patterns of incoming messages),
    • GA are A's goals (a finite set of ground terms he wants to obtain).
We denote an agent A in state ⟨EA, PA, GA⟩ by A(⟨EA, PA, GA⟩).
We assume that the internal capabilities of every agent are modelled by a deduction system D, which we suppose to be the same for all agents. We also suppose that an agent's policy and goals cannot be modified, while an agent's knowledge can change.
The intuition is as follows: the agents form a community and cooperate to achieve their goals. Goals are represented by finite sets of ground terms that agents want to know.
Every agent A has his own initial knowledge EA (also represented by a finite set of ground terms). An agent can apply arbitrarily many rules from D to his current knowledge in order to derive new data.
An agent will reject any message that is not allowed by his policy. For example, if agent Ai has policy $P_{A_i} = \{\mathrm{enc}_s(x, a_i)\}$, where ai represents a public key of Ai and x is a variable, then he will only accept messages encrypted with his public key and nothing else. A trivial policy where an agent accepts everything is expressed by a variable pattern P = {x}.
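The acceptance test induced by a policy amounts to syntactic matching of an incoming message against one of the patterns. The Python sketch below illustrates it under assumptions of our own: terms are nested tuples, variables are strings prefixed with "?", and the function names match and accepts are hypothetical. It covers only the empty equational theory.

```python
def is_var(term):
    """Variables are encoded as strings starting with '?' (a convention of this sketch)."""
    return isinstance(term, str) and term.startswith("?")


def match(pattern, message, subst=None):
    """Return a substitution s with pattern.s == message, or None if there is none."""
    subst = dict(subst or {})
    if is_var(pattern):
        if pattern in subst and subst[pattern] != message:
            return None  # the variable is already bound to a different term
        subst[pattern] = message
        return subst
    if isinstance(pattern, str) or isinstance(message, str):
        return subst if pattern == message else None
    if pattern[0] != message[0] or len(pattern) != len(message):
        return None  # different head symbol or arity
    for p_arg, m_arg in zip(pattern[1:], message[1:]):
        subst = match(p_arg, m_arg, subst)
        if subst is None:
            return None
    return subst


def accepts(policy, message):
    """An agent accepts a message iff it matches at least one pattern of its policy."""
    return any(match(p, message) is not None for p in policy)


# Policy from the text: only messages of the form enc_s(x, a_i) are accepted.
policy = [("encs", "?x", "a_i")]
trivial_policy = ["?x"]
binding = match(("encs", "?x", "a_i"), ("encs", "t", "a_i"))
```

With these definitions, accepts(policy, ("encs", "t", "a_i")) holds, accepts(policy, "t") does not, and the trivial policy accepts both messages.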
  • 181.
Agent communication is limited by the network topology T. We define T as a set of communication channels, where a communication channel from agent F to agent T is represented by a pair (F, T). Thus, T = {(Fi, Ti)}i=1,...,k, where Fi, Ti ∈ {A1, ..., Am}. If the pair (F, T) belongs to the topology then agent F can send messages to agent T. Note that the presence of (F, T) in the topology does not imply that of (T, F), i.e. there can exist one-way channels.
Agents may send messages to each other on the network defined by T. After agent A receives a message (consistent with his policy), his current knowledge is expanded with this message. The goal of this "game" is that after some rounds of sending and receiving messages, every agent Ai is able to deduce any term of $G_{A_i}$ from his final knowledge (the knowledge after executing the "cooperation").
We present a formal semantics by specifying a transition system. A configuration of an agent community {Ai}i=1,...,m is the union of all its agents in their current state. Thus, the initial configuration is $\{A_i(\langle E^0_{A_i}, P_{A_i}, G_{A_i}\rangle)\}_{i=1,\dots,m}$, where $\langle E^0_{A_i}, P_{A_i}, G_{A_i}\rangle$ is the initial state of agent Ai (recall that we consider a case where the agents' policies and goals are not mutable). We define a unique configuration transition that reflects the intuition described above (agent F can send a message m to agent T if F can derive m from his current knowledge and this message matches some pattern from the policy of agent T; message m then becomes a part of agent T's knowledge):

$$\{T(\langle E_T, P_T, G_T\rangle)\} \cup \{A(\langle E_A, P_A, G_A\rangle)\}_{A \in \{A_1,\dots,A_m\}\setminus\{T\}} \;\xrightarrow{\;(F,T),\,m\;}\; \{T(\langle E_T \cup \{m\}, P_T, G_T\rangle)\} \cup \{A(\langle E_A, P_A, G_A\rangle)\}_{A \in \{A_1,\dots,A_m\}\setminus\{T\}}$$
$$\text{if } F \in \{A_1,\dots,A_m\}\setminus\{T\} \;\wedge\; m \in \mathrm{Der}(E_F) \;\wedge\; \exists p \in P_T,\ \exists\sigma : p\sigma = m$$

The aim is to reach a configuration $\{A_i(\langle E_{A_i}, P_{A_i}, G_{A_i}\rangle)\}_{i=1,\dots,m}$ such that $\forall i \in \{1,\dots,m\},\ \forall g \in G_{A_i},\ g \in \mathrm{Der}(E_{A_i})$.

9.2.4 Solving the problem
Given a community of agents in their initial states (Ai)i=1,...,m with $A_i = A_i(\langle E_{A_i}, P_{A_i}, G_{A_i}\rangle)$ for i = 1, ..., m and a network topology T, we show how to solve the cooperation problem, assuming a bound on the number of interactions.
Let us first define the notion of dataflow. A dataflow is a list of tuples {⟨(Fi, Ti), mi⟩}i=1,...,l, where Fi is the agent who sends a message, Ti is the agent to whom the message is sent, and mi is the message sent; we call Fi and Ti the endpoints of step i. Informally, agent F1 sends message m1 to agent T1, then agent F2 sends message m2 to agent T2, etc.
Let l be the maximal number of interactions that we allow. If the problem has a solution within the bound, then given a network topology T, we can guess (as we have a bounded number of cases) the order of the endpoints of a dataflow: {(Fi, Ti)}i=1,...,l, where (Fi, Ti) ∈ T. Then, for every i, we can guess which pattern from the policy $P_{T_i}$ is used, since a policy is specified as a finite set of terms. Thus, we have a list {⟨(Fi, Ti), pi⟩}i=1,...,l, where (Fi, Ti) ∈ T and pi is a pattern from the policy $P_{T_i}$.
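The two guessing steps range over a finite set, so a deterministic procedure can simply enumerate it. The Python sketch below does so under assumptions of our own: the function name candidate_skeletons, the encoding of the topology as a list of pairs and of policies as a dictionary, and the two-agent instance are all hypothetical.

```python
from itertools import product


def candidate_skeletons(topology, policies, bound):
    """Enumerate every list of (channel, pattern) choices of length `bound`.

    topology: list of (sender, receiver) pairs.
    policies: dict mapping each agent to the list of patterns it accepts.
    A step pairs a channel (F, T) with one pattern taken from the policy of T.
    """
    steps = [((f, t), p) for (f, t) in topology for p in policies[t]]
    return product(steps, repeat=bound)


# Hypothetical two-agent instance with one channel in each direction.
topology = [("A1", "A2"), ("A2", "A1")]
policies = {"A1": ["?x"], "A2": [("encp", "?x", "kA2")]}
skeletons = list(candidate_skeletons(topology, policies, 2))
```

The number of candidates is the number of (channel, pattern) pairs raised to the power l, which is why the guess is bounded but exponential in the bound.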
  • 182.
To distinguish the values of variables of the same pattern used anew, or of different patterns using the same variable name, we introduce substitutions σi which rename the variables:
    • Dom(σi) = Var(pi) for all i,
    • Dom(σi)σi ⊆ X,
    • i ≠ j ⇒ Dom(σi)σi ∩ Dom(σj)σj = ∅.
Then we can build a constraint system that models our cooperation problem:

$$S = \{E_{F_i} \cup \{p_j\sigma_j\}_{\{j \,:\, j<i,\ T_j = F_i\}} \rhd p_i\sigma_i\}_{i=1,\dots,l} \;\cup\; \{E_{A_i} \cup \{p_j\sigma_j\}_{\{j \,:\, T_j = A_i\}} \rhd g\}_{i=1,\dots,m;\ g \in G_{A_i}}$$

(where $\mathrm{Var}(S) = \bigcup_{i=1}^{l} \mathrm{Var}(p_i\sigma_i)$).

Lemma 9.1. If the cooperation problem has a solution with l > 0 interactions, then it has a solution for l + k interactions, for all k ≥ 0.
Proof. The idea is to repeat the last message exchange k times. Thus, given a solution {⟨(F1, T1), m1⟩, ..., ⟨(Fl, Tl), ml⟩}, i.e. a dataflow that leads an initial configuration of an agent community to a configuration where all goals are satisfied, the dataflow

$$\{\langle (F_1,T_1), m_1\rangle, \dots, \langle (F_l,T_l), m_l\rangle, \underbrace{\langle (F_l,T_l), m_l\rangle, \dots, \langle (F_l,T_l), m_l\rangle}_{k}\}$$

is also a solution, since it leads to the same configuration as the initial dataflow.

By Lemma 9.1 it suffices to consider communications of maximal length. Summing up the process of finding a satisfactory communication for the agent cooperation problem, we present Algorithm 9.3, based on the fact that the satisfiability of constraint systems within the deduction system D is decidable.
We can show a constraint system built by Algorithm 9.3 for the example presented above, where terms admit symmetric and asymmetric encryption, signing and pairing, and the deduction system used is Dolev-Yao (see § 9.2.5 for details).
After guessing the endpoints ({(A1, A3); (A3, A1); (A1, A2); (A2, A4)}) of the dataflow and guessing the message patterns (there is only one choice for every agent in this example), assuming a bound of four on interactions, we have:

$$\left\{\begin{array}{l}
\{t, k_{A_2}\} \rhd x_1; \\
\{k_{A_3}, \mathrm{priv}(k_{A_3}), x_1\} \rhd x_2; \\
\{t, k_{A_2}, x_2\} \rhd \mathrm{enc}_p(x_3, k_{A_2}); \\
\{k_{A_2}, k_{A_2A_4}, \mathrm{priv}(k_{A_4}), \mathrm{enc}_p(x_3, k_{A_2})\} \rhd \mathrm{enc}_s(\langle x_4, \mathrm{sign}(x_4)_{\mathrm{priv}(k_{A_3})}\rangle, k_{A_2A_4}); \\
\{k_{A_2}, k_{A_3}, k_{A_2A_4}, \mathrm{enc}_s(\langle x_4, \mathrm{sign}(x_4)_{\mathrm{priv}(k_{A_3})}\rangle, k_{A_2A_4})\} \rhd t.
\end{array}\right.$$
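For a ground substitution, checking that it is a model of such a system reduces to derivability tests t ∈ Der(E). The Python sketch below implements the usual "decompose, then compose" test for the Dolev-Yao rules of § 9.2.5 and applies it to the instance of the system above under the substitution x1 → t, x2 → sign(t)priv(kA3), x3 → ⟨t, sign(t)priv(kA3)⟩, x4 → t reported for this example. The tuple encoding and the function names are assumptions of this sketch. We also assume, as our own reading, that the Reception A2 holds priv(kA2) in the fourth constraint; with priv(kA4) as printed, the decryption step would not go through in this sketch.

```python
def synth(t, known):
    """Can t be built from `known` using the composition rules only?"""
    if t in known:
        return True
    if isinstance(t, tuple):
        head, *args = t
        if head in ("pair", "encs", "encp"):
            return all(synth(a, known) for a in args)
        if head == "sign":  # ("sign", m, k) stands for sign(m)_priv(k)
            return synth(args[0], known) and synth(("priv", args[1]), known)
    return False  # atoms and priv(k) cannot be composed


def analyse(knowledge):
    """Close a set of ground terms under the decomposition rules."""
    known = set(knowledge)
    changed = True
    while changed:
        changed = False
        for t in list(known):
            parts = []
            if isinstance(t, tuple):
                if t[0] == "pair":
                    parts = [t[1], t[2]]
                elif t[0] == "encs" and synth(t[2], known):
                    parts = [t[1]]
                elif t[0] == "encp" and synth(("priv", t[2]), known):
                    parts = [t[1]]
            for p in parts:
                if p not in known:
                    known.add(p)
                    changed = True
    return known


def derivable(t, knowledge):
    return synth(t, analyse(knowledge))


sig = ("sign", "t", "kA3")   # sign(t)_priv(kA3)
x3 = ("pair", "t", sig)      # value of x3; also <x4, sign(x4)> for x4 = t
constraints = [
    ({"t", "kA2"}, "t"),
    ({"kA3", ("priv", "kA3"), "t"}, sig),
    ({"t", "kA2", sig}, ("encp", x3, "kA2")),
    # Assumption of this sketch: A2 knows priv(kA2).
    ({"kA2", "kA2A4", ("priv", "kA2"), ("encp", x3, "kA2")}, ("encs", x3, "kA2A4")),
    ({"kA2", "kA3", "kA2A4", ("encs", x3, "kA2A4")}, "t"),
]
results = [derivable(goal, knowledge) for knowledge, goal in constraints]
```

Under these assumptions every entry of results is True, i.e. each message of the dataflow can be built by its sender and the final goal t is derivable by A4.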
  • 183.
Algorithm 9.3: Decidability of the cooperation problem
    Input: $\{A_i(\langle E_{A_i}, P_{A_i}, G_{A_i}\rangle)\}_{i=1,\dots,m}$, T, l ∈ N
    Output: a dataflow leading to a state where all goals are achieved, if one exists, otherwise ⊥
    Guess the endpoints of the dataflow and the policy patterns to be used: {⟨(Fi, Ti), pi⟩}i=1,...,l, where (Fi, Ti) ∈ T and $p_i \in P_{T_i}$
    Build the substitutions σi, i = 1, ..., l, for renaming variables
    Build the constraint system S:
        $S = \{E_{F_i} \cup \{p_j\sigma_j\}_{\{j \,:\, j<i,\ T_j = F_i\}} \rhd p_i\sigma_i\}_{i=1,\dots,l} \cup \{E_{A_i} \cup \{p_j\sigma_j\}_{\{j \,:\, T_j = A_i\}} \rhd g\}_{i=1,\dots,m;\ g \in G_{A_i}}$
    if there exists a model σ of S then
        return {⟨(Fi, Ti), (piσi)σ⟩}i=1,...,l
    else
        return ⊥

A solution of this constraint system is the substitution:

$$\{x_1 \to t;\ x_2 \to \mathrm{sign}(t)_{\mathrm{priv}(k_{A_3})};\ x_3 \to \langle t, \mathrm{sign}(t)_{\mathrm{priv}(k_{A_3})}\rangle;\ x_4 \to t\}$$

We can easily extend the agent's policy by adding patterns for the output messages, i.e. the policy would be a pair of sets of terms PA = ⟨RA, SA⟩, where RA is a finite set of terms defining patterns for input messages and SA is a finite set of terms defining patterns for output messages. In other words, whereas the presented model restricts the form of the messages that can be received, this extension also restricts the form of the messages that can be sent by an agent (e.g. an agent can send only messages signed with his private key). To make our algorithm work with this definition of a policy, we only need to add a phase guessing the output message patterns and to perform a unification between the guessed output pattern of the agent who sends a message and the guessed input pattern of the agent who receives it.

9.2.5 Signature and deduction systems
Here we list two deduction systems (and the two corresponding term signatures) for which the satisfiability of constraint systems is decidable.
  • 184.
    Composition rules                                              Decomposition rules
    $t_1, t_2 \to \mathrm{enc}_s(t_1, t_2)$                        $\mathrm{enc}_s(t_1, t_2), t_2 \to t_1$
    $t_1, t_2 \to \mathrm{enc}_p(t_1, t_2)$                        $\mathrm{enc}_p(t_1, t_2), \mathrm{priv}(t_2) \to t_1$
    $t_1, t_2 \to \langle t_1, t_2\rangle$                         $\langle t_1, t_2\rangle \to t_1$
    $t_1, \mathrm{priv}(t_2) \to \mathrm{sign}(t_1)_{\mathrm{priv}(t_2)}$    $\langle t_1, t_2\rangle \to t_2$
    Table 9.1: DY deduction system rules

Dolev-Yao
We define a term as follows:

    term ::= variable | atom | ⟨term, term⟩ | $\mathrm{enc}_s$(term, term) | priv(Keys) | $\mathrm{enc}_p$(term, Keys) | $\mathrm{sign}(term)_{\mathrm{priv}(Keys)}$

where atom ∈ A, variable ∈ X, Keys ∈ A ∪ X. Here $\mathrm{enc}_s(m, k)$ corresponds to a message m encrypted with a symmetric key k, priv(k) corresponds to the private key used to decrypt messages encrypted with the public key k or to sign messages, $\mathrm{enc}_p(m, k)$ corresponds to a message m encrypted with a public key k, $\mathrm{sign}(m)_{\mathrm{priv}(k)}$ corresponds to a digital signature of message m using the private key priv(k), and ⟨m1, m2⟩ corresponds to a pair of messages m1 and m2. For asymmetric encryption ($\mathrm{enc}_p(\cdot,\cdot)$), only atomic keys are allowed. By $\mathrm{sign}(p)_{\mathrm{priv}(a)}$ we mean a signature of message p with the private key priv(a); p is not deducible from the signature.
The first deduction system is Dolev-Yao with the empty equational theory. Its rules are shown in Table 9.1.

Dolev-Yao extended with an ACI symbol
The second decidable deduction system is Dolev-Yao extended with an associative-commutative-idempotent (ACI) symbol used to model sets. We extend the previous definition of term with an ACI symbol:

    term ::= variable | atom | ⟨term, term⟩ | $\mathrm{enc}_s$(term, term) | ·(tlist) | priv(Keys) | $\mathrm{enc}_p$(term, Keys) | $\mathrm{sign}(term)_{\mathrm{priv}(Keys)}$
    tlist ::= term | term, tlist

where atom ∈ A, variable ∈ X, Keys ∈ A ∪ X.
The rules of this deduction system are given in Table 9.2, where (t)↓ is the normal form of a term modulo ACI.
It is defined by a strict total order on T(F, X) and a normalisation function that works bottom-up by flattening nested ·-lists (·(a, ·(c, d, e), c) becomes ·(a, c, d, e, c)), sorting the children of ·-nodes and removing duplicates (·(a, c, d, e, c) becomes ·(a, c, d, e)). When the set is reduced to a singleton the ACI symbol is removed (·(a) becomes a). For example, for the term t = ·({a, ·({b, a, ⟨a, b⟩}), ⟨·({b, b}), a⟩}) we have (t)↓ = ·({a, b, ⟨a, b⟩, ⟨b, a⟩}).
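The normalisation just described can be sketched in Python as follows. The tuple encoding of terms, the particular total order (atoms before compound terms, then lexicographic comparison) and the names order_key and normalise are choices made for this illustration only; the text merely requires some strict total order.

```python
def order_key(term):
    """Sort key inducing a total order: atoms first, then compound terms."""
    if isinstance(term, str):
        return (0, term)
    return (1, term[0], tuple(order_key(arg) for arg in term[1:]))


def normalise(term):
    """Bottom-up normal form modulo ACI of the set symbol, encoded with head 'aci'."""
    if isinstance(term, str):
        return term
    head = term[0]
    args = [normalise(arg) for arg in term[1:]]
    if head != "aci":
        return (head, *args)
    flat = []
    for arg in args:
        if isinstance(arg, tuple) and arg[0] == "aci":
            flat.extend(arg[1:])  # flatten a nested, already normalised, ACI list
        else:
            flat.append(arg)
    items = sorted(set(flat), key=order_key)  # remove duplicates, then sort
    if len(items) == 1:
        return items[0]  # a singleton set collapses to its element
    return ("aci", *items)


# The example of the text: t = .({a, .({b, a, <a, b>}), <.({b, b}), a>})
t = ("aci", "a", ("aci", "b", "a", ("pair", "a", "b")), ("pair", ("aci", "b", "b"), "a"))
normal_form = normalise(t)
```

With the order chosen here, normal_form is ("aci", "a", "b", ("pair", "a", "b"), ("pair", "b", "a")), which corresponds to the normal form given in the example above.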
  • 185. 9.3. CONCLUSION 189
    Composition rules                                                        Decomposition rules
    $t_1, t_2 \to (\mathrm{enc}_s(t_1, t_2)){\downarrow}$                    $\mathrm{enc}_s(t_1, t_2), (t_2){\downarrow} \to (t_1){\downarrow}$
    $t_1, t_2 \to (\mathrm{enc}_p(t_1, t_2)){\downarrow}$                    $\mathrm{enc}_p(t_1, t_2), (\mathrm{priv}(t_2)){\downarrow} \to (t_1){\downarrow}$
    $t_1, t_2 \to (\langle t_1, t_2\rangle){\downarrow}$                     $\langle t_1, t_2\rangle \to (t_1){\downarrow}$
    $t_1, \mathrm{priv}(t_2) \to (\mathrm{sign}(t_1)_{\mathrm{priv}(t_2)}){\downarrow}$    $\langle t_1, t_2\rangle \to (t_2){\downarrow}$
    $t_1, \dots, t_m \to (\cdot(t_1, \dots, t_m)){\downarrow}$               $\cdot(t_1, \dots, t_m) \to (t_i){\downarrow}$ for all i
    Table 9.2: DY+ACI deduction system rules

Decidability
Theorem 9.4. Satisfiability of a constraint system within DY+ACI is decidable and is in NPTIME.
Proof sketch. First we can show that it suffices to consider normalised constraint systems and normalised models. Then we prove the existence of a conservative solution of a satisfiable constraint system: it can be built using only quasi-subterms (some subset of the subterms) of the constraint system. This gives us a bound on the size of such a solution and, therefore, decidability. Due to the polynomial complexity of the normalisation algorithm and to the polynomial complexity of the check t ∈ Der(E), where t and E are ground and normalised, we obtain NP as the complexity class of the initial problem.

Theorem 9.5. Satisfiability of a constraint system within DY is decidable and is in NPTIME.
Proof. The main idea is to build a solution within the DY+ACI deduction system (as the DY signature is strictly included in the DY+ACI signature, and the DY deduction system is strictly included in the DY+ACI one), and then to replace the ACI lists in the solution with nested pairs: ·({t1, ..., tn}) is replaced by ⟨t1, ⟨..., tn⟩...⟩. The resulting substitution will still be a model of the initial constraint system. Thus we have the same complexity as in the DY+ACI case.
Full proofs of these theorems are given in [12].

9.3 Conclusion
The work described in this chapter is still in progress. We currently focus on the automated deployment of synthesized services as Web Services.
A preliminary version written by Mohammed Anis Mekki deploys the existing services as well as the newly generated one on a Tomcat server. These services then communicate by relying on the Tomcat server for service-to-service communications, and implement an instance manager that forwards the messages to the correct instance of the service. Our choice of communication means that we are independent of the SOAP security layer, which we believe is a drawback for interoperability. Future work will concentrate on a deeper integration into the standard SOAP Web Service Architecture.
  • 186.
In order to assess whether the work on the synthesis of choreographies can be extended to other equational theories in spite of the negative result on subterm deduction systems, we are currently working on its extension to the bitwise exclusive-or. The future of this line of research depends on whether we manage to prove the (conjectured) decidability of constraint systems in this case.
  • 187.
[Figure 9.3: Solution for the composition problem in the introductory example. Message sequence chart between Client, Goal, CA, TS and ARC; the messages exchanged include signatureRequest, signaturePolicy, signature, CVRequest(OCSP), certificate, assertion, timeStampRequest, timeStampResponse, archiveRequest, archiveResponse and signatureResponse.]
  • 188.
Figure 9.4: Illustration for agent cooperation example
  • 189.
Chapter 10
Equivalence of Cryptographic Protocols

My first published article on the equivalence of cryptographic protocols was written in collaboration with M. Rusinowitch [75] and consisted in a reformulation of Mathieu Baudet's proof of decidability of trace equivalence for subterm deduction systems. In this chapter I present a criterion that encompasses saturation deduction systems ?? as well as subterm deduction systems. That work was also presented at the Secret 2010 workshop. The notion introduced is that of finitary deduction systems. It intuitively corresponds to deduction systems for which there exists a lazy solving algorithm in the spirit of [8]. We prove that the equivalence of symbolic derivations is decidable for finitary deduction systems.

10.1 Introduction
Context. Security protocols are designed to provide communication means between several parties in a way that ensures that some information is protected. Well-known stories about flaw discoveries [147] have revealed that protocols may be subject to unexpected and undesirable behaviours under the actions of malevolent attackers. Formal analysis of protocols is therefore mandatory for gaining the level of confidence required in critical applications. Formal methods and related tools have proved to be successful to some extent for this task. But they are limited in expressiveness since in most cases authors have focused on the resolution of reachability problems, and as a consequence very few effective procedures consider the more general case of equivalence properties.

Motivations. Observational equivalence is a crucial notion for specifying security properties such as anonymity or the secrecy of a ballot in voting protocols [96].
  • 190. 194 CHAPTER 10. EQUIVALENCE OF CRYPTOGRAPHIC PROTOCOLS
For instance, observational equivalence can establish that no action of an attacker makes two protocol executions with different identities or vote values distinguishable.
To be of effective use, the notion of observational equivalence should be considered on processes modelling cryptographic protocols. We consider in this chapter a setting in which the actions of the honest agents are represented by one HSD and those of a unique intruder by one ASD (see Chapter 6 for more details). Symbolic derivations can be seen as standing between symbolic traces [27] and the simple cryptographic processes of [89].
The only decidability result on the equivalence of symbolic traces (called S-equivalence) we are aware of is for the class of subterm deduction systems and was given by M. Baudet [27, 28]. We have recently given another proof of this result [73] on which this chapter elaborates. A more efficient procedure is presented in [54] when one considers only the Dolev-Yao deduction system. In spite of the relevance of this problem for the analysis of e.g. voting protocols, we are not aware of any extension of Baudet's decidability results to other classes of deduction systems.

Applications. The equivalence notion we consider in this chapter has two straightforward applications, one related to the symbolic validation of cryptographic properties and one related to the search for on-line guessing attacks.
An on-line attack is one in which the attacker interacts with honest agents to achieve his goals, which usually are the acquisition of a previously unknown piece of data or the impersonation of an honest agent. In these cases the achievability of a goal can be reduced to a reachability problem. However, one may consider goals for which this reduction does not hold.
For example, the dictionary attacks introduced by Schneier [192] consist in guessing a piece of data (usually a password) and interacting with the honest agents with this piece of data. Depending on the resulting communication, the attacker knows whether the guess was correct. It is often the case that such attacks can be detected by the honest agents involved. For example, sending a wrong password will be detected by an authentication system that, after a small number of failures, may invalidate the account and ask for a new password. To take into account this possible response by honest agents, Ding and Horster [105] have introduced the concept of undetectable on-line guessing attacks. They consider that a protocol is vulnerable to this kind of attack whenever (i) the honest agents cannot distinguish between a session with the right piece of data and one involving a wrong guess, whereas (ii) the intruder can distinguish the two executions. We model the first point by stating that the tests performed by the honest agents succeed in both cases, and the second point by saying that the two executions are not equivalent.
Recent works initiated by Abadi and Rogaway in 2000 [7] have shown that computational proofs of indistinguishability ensuring the security of a protocol can be derived from symbolic proofs, under some natural hypotheses on the cryptographic primitives. This has opened the path to the automation of computational
  • 191. 10.2. FINITARY DEDUCTION SYSTEMS 195
proofs. It was shown by [86] that in the presence of an active attacker, observational equivalence of the symbolic processes can be transferred to the computational level.

Related works. Many works have been dedicated to proving correctness properties of cryptographic protocols using equivalences on process calculi. In particular, framed bisimilarity has been introduced by Abadi and Gordon [6] for this purpose, for the spi-calculus. Another approach that circumvents the context quantification problem is presented in [42], where labelled transition systems are constrained by the knowledge the environment has of names and keys. This approach allows for more direct proofs of equivalence.
To the best of our knowledge, the first tool capable of verifying equivalence-based secrecy is the resolution-based algorithm of ProVerif [39], which has been extended to handle equivalences of processes that differ only in the choice of some terms in the context of the applied π-calculus [40]. This makes it possible to add some equational theories for modelling properties of the underlying cryptographic primitives. The more recent YAPA tool [29] also permits one to evaluate the indistinguishability of two constraint systems that are essentially equivalent to symbolic derivations, but it still lacks an associated decision procedure.
Few decidability results are available. In the article [125] Hüttel proves decidability of framed bisimilarity for a fragment of the spi-calculus without recursion. In [89] the authors show how to apply the result by Baudet on S-equivalence to derive a decision procedure for observational equivalence for subterm convergent theories for simple processes. Since [89] relies on the proof of Baudet's result, which is long and difficult [28], we believe that a direct self-contained approach such as the one presented below might be valuable too.

Organization of this chapter.
We reuse in this chapter the notions and notations for terms, equational theories, deduction systems, and symbolic derivations introduced in earlier chapters. We assume that the equational theory considered is consistent, i.e. has a model with more than one element¹. The main result of the chapter is proved in Section 10.3, namely that equivalence of symbolic derivations is decidable for finitary deduction systems.

10.2 Finitary Deduction Systems

An equational theory E is finitary whenever every E-unification system has a finite set of most general unifiers. We define in this subsection an analog for deduction systems w.r.t. symbolic derivations rather than just equational theories w.r.t. unification systems. In order to guide the reader, we introduce each concept we define by relating it to the analogous concept for equational theories.

¹ Note that in an inconsistent equational theory all terms are equal, all unification systems are satisfied by any substitution, and two symbolic derivations are equivalent if, and only if, they have the same structure on their input and output states.
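To make the unification side of this analogy concrete, here is a minimal sketch (an illustration, not taken from the text) of syntactic unification, i.e. unification in the empty theory, where a solvable system has a single most general unifier. The term encoding (capitalised strings for variables, lowercase strings for constants, tuples for function applications) and all function names are assumptions chosen for the example.

```python
# Assumed encoding: "X" (capitalised) is a variable, "a" a constant,
# ("pair", t1, t2) the application of symbol "pair" to t1 and t2.

def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def walk(t, subst):
    """Follow variable bindings until reaching a non-bound term."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def occurs(v, t, subst):
    """Occurs check: does variable v appear in t under subst?"""
    t = walk(t, subst)
    if t == v:
        return True
    if isinstance(t, tuple):
        return any(occurs(v, a, subst) for a in t[1:])
    return False

def unify(equations):
    """Return a most general unifier (triangular form) or None."""
    subst = {}
    stack = list(equations)
    while stack:
        s, t = stack.pop()
        s, t = walk(s, subst), walk(t, subst)
        if s == t:
            continue
        if is_var(s):
            if occurs(s, t, subst):
                return None          # x =? f(..x..) has no solution
            subst[s] = t
        elif is_var(t):
            stack.append((t, s))     # orient the equation
        elif (isinstance(s, tuple) and isinstance(t, tuple)
              and s[0] == t[0] and len(s) == len(t)):
            stack.extend(zip(s[1:], t[1:]))   # decompose
        else:
            return None              # clash between symbols or constants
    return subst

def resolve(t, subst):
    """Apply the triangular substitution exhaustively to t."""
    t = walk(t, subst)
    if isinstance(t, tuple):
        return (t[0],) + tuple(resolve(a, subst) for a in t[1:])
    return t

example = unify([(("pair", "X", "b"), ("pair", "a", "Y"))])
```

Here `example` binds `X` to `a` and `Y` to `b`, while a system such as `enc(X, k) =? pair(X, k)` has no unifier and yields `None`. A finitary theory generalises this situation from one most general unifier to a finite set of them.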
10.2.1 Aware and stutter-free ASDs

Observing an HSD is limited to the search of the (sequences of) messages this HSD accepts and to the analysis of the responses of the HSD. Our procedure follows this dichotomy by splitting each ASD which is a solution of an HSD into a stutter-free ASD that builds the acceptable messages and a testing ASD that observes the responses.

Definition 61. (Stutter-free ASD) Let CI = (VI, SI, KI, InI, OutI) ∈ Ch^∗ be an ASD. We say that CI is stutter-free if:
  • There exists a most general unifier θ of SI in the empty theory;
  • Given i, j two non-reuse states, i ≠ j implies VI(i)θ ≠E VI(j)θ;
  • Remove? For every deduction state i there does not exist j < i such that V(j)σ = V(i)σ, where σ = Tr_{CI◦Ch}(CI).

The conditions in the definition are given so that every instance of a message received by the ASD will be accepted by the intruder (see Prop. 10.1). A notion dual to the one of stutter-free derivation is the one of testing ASD.

Definition 62. (Testing ASDs) An ASD is testing iff K is empty.

Definition 63. (Aware ASD) Remove? Let Ch be a HSD and assume that (CI, ϕ) ∈ Ch^∗ and that σ = Tr_{Ch◦CI}(CI) is a ground substitution in normal form. We say that CI = (VI, SI, KI, InI, OutI) is aware iff for all i, j ∈ IndI the equality VI(i)σ = VI(j)σ implies either:
  • VI(i) = VI(j), i.e. one of the states is a re-use of the other;
  • VI(i) =? VI(j) is an equation in SI.

Intuitively, aware ASDs in Ch^∗ correspond to a full remembering by the intruder of the equalities that occur in the connection with Ch.

Example 33. Remove? Consider a HSD that has one input state and one deduction state in Out which builds a pair of copies of its input. An ASD that sends a constant a ∈ nonces(), inputs the result of the HSD, and builds a pair of a is stutter-free.
However it will not be aware, as the building of a pair of a will create in the connection with the HSD a message equal to the received one.

Proposition 10.1. Let CI = (VI, SI, KI, InI, OutI) ∈ Ch^∗ be a stutter-free ASD. Then for any ground substitution σ of domain InI the unification system SI σ is satisfiable in the empty theory.

Proof. We recall that a unification system S is in solved form in the empty theory if and only if there exists an ordering <_u on variables such that S contains, for each variable x, at most one equation x =? t, and for every y ∈ Var(t) we have y <_u x. First let us notice that since CI is stutter-free, SI does not contain any equation VI(i) =? VI(j) with VI(i) ≠ VI(j), for the second condition would otherwise be impossible to satisfy for any unifier of SI. Assume there exist two equations in S, VI(i) =? f(VI(i1), . . . , VI(in)) and VI(i) =? g(VI(j1), . . . , VI(jm)). Since S has a mgu θ in the empty theory we must have f = g, and consequently n = m. By definition of θ we thus have VI(ik)θ = VI(jk)θ for 1 ≤ k ≤ n. Thus by the second point of the definition of stutter-free derivations we must have VI(ik) = VI(jk) for 1 ≤ k ≤ n, and thus the equations are identical. Accordingly we can assume that for every deduction state i there is exactly one equation VI(i) =? f(VI(i1), . . . , VI(in)) in SI.

Thus SI contains exactly one equation VI(i) =? t if i is not an input or the re-use of an input state, and none otherwise. In the former case we can assume that for a mgu θ of S we have V(i)θ = V(i). Given the condition on the deduction equations, SI is in solved form; adding to SI equations VI(i) =? ti, for i ∈ InI and ti a ground term, thus leads to a unification system also in solved form.

10.2.2 Sets of solutions

Outline. We prove in this section that ASDs have the property that, when replacing a constant in Cnew by the result of a sequence of compositions (this operation is called opening), we obtain another ASD which can be connected to all the HSDs the original ASD could be connected to (Lemma 10.1). Thus given any set S of ASDs and a HSD Ch one can test whether S ⊂ Ch^∗ by testing whether the minimal ASDs in S are also in Ch^∗. We then define the minimal ASDs to be the ones which, by this opening operation, generate all ASDs in Ch^sf; it is then trivial to check the inclusion Ch^sf ⊆ C′h^∗: it suffices to check whether min(Ch^sf) ⊆ C′h^∗ (Lemma 10.2).

Opening of symbolic derivations.
If C = (V, S, K, In, Out) is a symbolic derivation and C ⊆ Cnew ∩ K is a set of constants such that C ∩ Sub(K \ C) = ∅, we open the derivation C on the set C, and denote the operation open_C(C), when for each c ∈ C:
  • if i ∈ Ind is the first knowledge state with V(i) =? c ∈ S, we remove this equation from S and add i to the input states;
  • we replace all occurrences of c in the derivation by V(i).

We note that the set K′ obtained from K after the replacement is still a set of ground terms since C ∩ Sub(K \ C) = ∅, and thus the result of the operation is still a symbolic derivation. Also, if the derivation is an ASD, then so is open_C(C).

Lemma 10.1. Let CI ∈ Ch^∗ with CI = (VI, SI, KI, InI, OutI), let C ⊆ KI and let Cc ∈ Ch^sf for some HSD Ch. If a connection Cc ◦ Ch ◦ open_C(CI) is closed then it is satisfiable.
Proof. By Proposition 10.1 the substitution Tr_{Cc◦Ch◦open_{c}(CI)}(Cc) satisfies Sc. Since CI is an ASD we have C ∩ Sub(K \ C) = ∅, and thus C ∩ Sub(Sh) = ∅. Let us denote S′I the unification system SI in which the equations x =? c with c ∈ C are removed. For any substitution σ and any constant c ∈ C, Lemma 4.23 and σ |= Sh ◦ S′I imply σδ_{c,t} |= Sh ◦ S′I.

Let σ′ = Tr_{Cc◦Ch◦open_C(CI)}(CI). For each memory state i ∈ IndI that contains a constant c ∈ C we let tc = VI(i)σ′. We define δ as the replacement of each constant c ∈ C by the term tc.

By induction on the indexes of the connection Cc ◦ Ch ◦ open_C(CI) we have:

  Tr_{Cc◦Ch◦open_C(CI)}(Cc ◦ Ch ◦ open_C(CI)) = Tr_{Ch◦CI}(Ch ◦ CI)δ

Thus every equation in Sh ∪ SI (minus the removed memory equations) is satisfied by the composition with Cc. Since every equation in its unification system is satisfied, the connection Cc ◦ Ch ◦ open_C(CI) is satisfiable.

Ordering on symbolic derivations. Given two symbolic derivations CI = (VI, SI, KI, InI, OutI) and C′I = (V′I, S′I, K′I, In′I, Out′I), we say that CI ≤ C′I if:
  • there exists C ⊆ KI, a stutter-free symbolic derivation CC and a connection ϕ such that CC ◦ϕ open_C(CI) = C′I modulo a renaming of variables;
  • or there exists a set of memory states I ⊆ Ind′I such that CI = (VI, SI, K′I, In′I, Out′I) where:
      – VI is the restriction of V′I to the domain Ind′I \ I;
      – and SI = S′I \ {V′I(i) =? ci}i∈I.

We also introduce an equivalence notion that we call renaming of nonces and denote CI ≡ C′I whenever there exists C ⊆ KI, a stutter-free symbolic derivation CC with only memory states and a connection ϕ such that CC ◦ϕ open_C(CI) = C′I modulo a renaming of variables. Given a set S of ASDs we denote min(S) the set of ASDs in S that are minimal in S modulo renaming of nonces.

Since CI is a symbolic derivation, we note that the memory states of C′I that are removed are never re-used nor employed in any deduction.
We also note that C ≤ C′ implies that either:
  • C has strictly fewer deduction states than C′, and fewer states;
  • C has strictly fewer states than C′;
  • or C and C′ are equivalent modulo a renaming of nonces.

Modulo this renaming it is thus clear that the relation is a well-founded ordering relation.

Lemma 10.2. Let S be a set of ASDs and Ch be a HSD. If min(S) ⊆ Ch^∗ then S ⊆ Ch^∗.
Proof. Assume min(S) ⊆ Ch^∗ and let CI be in S. By definition of the ordering there exists a derivation C′I ∈ min(S) and a stutter-free derivation Cc such that Cc ◦ C′I = CI. By hypothesis we have C′I ∈ Ch^∗. By Lemma 10.1 this implies that CI is also in Ch^∗.

Complete sets of solutions. The ordering plays the same role w.r.t. the solutions of a HSD as the instantiation ordering on substitutions w.r.t. the solutions of a unification system. In particular the traditional notion of most general unifier is translated into a notion of minimal solution.

Definition 64. (Complete set of solutions) A set Σ of ASDs is a complete set of solutions of an HSD Ch whenever:
  • Σ ⊆ Ch^∗;
  • for every ASD CI ∈ Ch^sf there exists an ASD Cm ∈ Σ and a stutter-free ASD Cc such that Cm ≤ CI ◦ Cc.

We have departed from our line of translating terms from the unification framework to the symbolic derivation framework by introducing a symbolic derivation Cc. It permits us to consider cases in which the computation of a complete set of unifiers introduces unnecessary deduction steps in individual ASDs. A common example of such an addition is the normalisation of messages ⟨t, t′⟩, i.e. the automatic deduction of the two messages t and t′ even when they are not useful to the attacker.

10.2.3 Finitary deduction systems

We have already noted that an NP decision procedure for the satisfiability of HSDs for the Dolev-Yao deduction system has been known since [190]. While this procedure is based on the guessing of an attack of minimal size, other procedures have been proposed [8, 161] that instead cover all possible stutter-free derivations [66], i.e. compute a complete set of solutions. We define deduction systems for which such a procedure exists to be finitary.

Definition 65. (Finitary Deduction Systems) Let I be a deduction system.
If there exists a procedure that computes for every I-HSD Ch a finite complete set of solutions, we say that I is a finitary deduction system.

10.3 Decidability of Symbolic Equivalence for Finitary Deduction Systems

This section is devoted to the proof of the main theorem of this chapter.

Theorem 10.1. Symbolic equivalence is decidable for finitary deduction systems.
We first prove that every ASD can be written as the connection between a stutter-free ASD and a testing ASD in which no new term is deduced (Lemma 10.3). This implies the reduction of the inclusion problem to the one of checking whether, for any stutter-free ASD in Ch^∗, the connections of this ASD with Ch and C′h result in closed symbolic derivations C1 and C2 such that C1^∗ ⊆ C2^∗ (Lemma 10.4). Given a stutter-free ASD in Ch^∗ this latter test is simple since it suffices to consider the connection with ASDs that have at most one deduction (Prop. 10.2, ??).

Lemma 10.3. Let Ch be a HSD. Then for every aware CI in Ch^∗ there exist two ASDs C′ = (V′, S′, K′, In′, Out′) and Ct = (Vt, St, Kt, Int, Outt) such that:
  • C′ is aware and in Ch^sf and Ct is testing;
  • {Vt(i)Tr_{Ct◦C′◦Ch}(Ct)}i∈Indt ⊆ {V′(i)Tr_{C′◦Ch}(C′)}i∈Ind′;
  • for every HSD C′h, C′ ◦ Ct ∈ C′h^∗ iff CI ∈ C′h^∗.

Proof. Let σ = Tr_{Ch◦Ct}(CI). We define ψ : IndI → IndI an application such that for all deduction states i ∈ IndI, ψ(i) = min{j < i | V(j)σ = V(i)σ} if this set is not empty and ψ(i) = i in all other cases. Let θ : VI(i) → VI(ψ(i)). Let us construct C′ and Ct:

Internal states: Ind′ = ψ(IndI), Indt = IndI;
Variables: Vt = VI and V′ = VI|Ind′;
Unification systems: Let S0 be the set of equations that are deductions in CI for some state i ∈ Ind′. Then we define S′ = S0θ and St = SI \ S0;
Knowledge: K′ = KI and Kt = ∅;
Input states: Any state in Ind′ ⊆ IndI which is not a deduction state in Ct is an input state of Ct. Input states of C′ are the same as the ones in CI;
Output states: Outt = ∅ and Out′ = OutI ∪ Ind′.

We define the connection φ to be the identity mapping from Int to Out′. This construction deletes redundant deductions of a term in C′ and records these deductions by adding the deduction equations in Ct. The properties are direct consequences of the construction.

Lemma 10.4. Let Ch and C′h be two HSDs.
We have Ch^∗ ⊆ C′h^∗ if, and only if:
  • Ch^sf ⊆ C′h^∗;
  • and for each aware ASD CI ∈ Ch^sf and for all testing ASD Ct ∈ (CI ◦ Ch)^∗ we have Ct ∈ (CI ◦ C′h)^∗.
Proof. Let us first prove the direct implication. Let us assume that Ch^∗ ⊆ C′h^∗. By definition we then have Ch^sf ⊆ C′h^∗. By contradiction let us assume that there exists C ∈ Ch^sf such that C1 = C ◦ Ch and C2 = C ◦ C′h are such that there exists a testing ASD Ct in C1^∗ \ C2^∗. By construction C ◦ Ct is an ASD in Ch^∗ \ C′h^∗.

Let us prove the converse direction by contrapositive reasoning. Assume w.l.o.g. that Ch^∗ \ C′h^∗ ≠ ∅ and thus contains an ASD CI, and let C′, Ct be the ASDs obtained by applying Lemma 10.3 on CI w.r.t. Ch. Since CI ◦ C′h = (C′h ◦ C′) ◦ Ct is not satisfiable, then either C′h ◦ C′ is not satisfiable, or it is satisfiable, but (C′h ◦ C′) ◦ Ct is not. In the first case we have by definition of C′ that Ch^sf ⊈ C′h^∗. In the second case we have found an ASD C′ in Ch^sf such that C′ ◦ Ch and C′ ◦ C′h are satisfiable closed derivations and (C′ ◦ Ch)^∗ ⊈ (C′ ◦ C′h)^∗.

Lemma 10.5. Assume CI ∈ Ch^sf and Ct ∈ (CI ◦ Ch)^∗. Then CI ∈ (Ct ◦ Ch)^sf.

Proof. We let CI, Ch, and Ct be as in the statement of the lemma, and denote them as follows:

  CI = (VI, SI, KI, InI, OutI)
  Ch = (Vh, Sh, Kh, Inh, Outh)
  Ct = (Vt, St, Kt, Int, Outt)

Since CI ∈ Ch^sf there exists a one-to-one² mapping ϕ : InI ∪ Inh → OutI ∪ Outh such that C′h = CI ◦ϕ Ch is closed and satisfiable. Let us denote C′h = (V′h, S′h, K′h, In′h, Out′h).

Also by hypothesis there exists a one-to-one mapping ψ : In′h ∪ Int → Out′h ∪ Outt such that Ct ◦ψ C′h is closed and satisfiable. Since C′h is closed the function ψ is actually a mapping from Int to Out′h ∪ Outt. Let D be the subset of the domain of ψ of indices i such that ψ(i) ∈ OutI, and D̄ be its complement in the domain of ψ. Let us define from ψ and D two functions:

  ψ′ = ψ|D
  ϕ′ = ψ|D̄ ∪ ϕ

Let C″h = Ch ◦ψ′ Ct. Since by construction CI ◦ϕ′ (Ch ◦ψ′ Ct) = Ct ◦ψ (Ch ◦ϕ CI) and Ct ∈ (Ch ◦ϕ CI)^∗, the connection between CI and C″h is also closed and satisfiable, and thus CI ∈ (C″h)^∗.
Since CI ∈ Ch^sf the first two points of the definition of stutter-free derivations are satisfied by CI. Given that:

  ϕ′|_{Inh ∪ InI} = ϕ|_{Inh ∪ InI}

it is easy to see that:

  Tr_{CI ◦ϕ′ (Ch ◦ψ′ Ct)}(CI) = Tr_{CI ◦ϕ Ch}(CI)

As a consequence the hypothesis CI ∈ Ch^sf implies CI ∈ (C″h)^sf.

² Since the connection is closed the mapping is total.
Let us assume that we are given two HSDs Ch and C′h such that Ch^sf ⊆ C′h^∗. Our goal is to show that Ch^∗ ⊆ C′h^∗. Given an ASD CI ∈ Ch^sf we define

  χ(CI) = {Ct testing ASD | Ct ◦ CI ∈ Ch^∗ \ C′h^∗}

Intuitively this is the set of testing ASDs that permit one to distinguish Ch from C′h. By Lemma 10.4, Ch^∗ ⊈ C′h^∗ if, and only if, there exists an ASD CI such that χ(CI) ≠ ∅.

Proposition 10.2. Ch^∗ ⊈ C′h^∗ if, and only if, there exists CI ∈ Ch^sf such that χ(CI) contains an ASD Ct with at most one deduction and one equality test.

Proof. The converse direction is trivial.

First let us note that if C′ ∈ Ch^∗ \ C′h^∗ then adding test equations to C′ which are satisfied by Tr_{C′◦Ch}(C′) yields another symbolic derivation C″ ∈ Ch^∗ \ C′h^∗. Thus and wlog we let C′ ∈ Ch^∗ \ C′h^∗ be an aware ASD. According to Lemma 10.3, C′ can be split into one stutter-free derivation CI = (VI, SI, KI, InI, OutI) and one test derivation Ct = (Vt, St, Kt, Int, Outt). We also define a partition St^d ∪ St^t of St such that St^d contains only deduction equations and St^t contains only test equations. Let Ct^d = (Vt, St^d, Kt, Int, Outt). Let us define the following substitutions:

  σI = Tr_{CI◦Ch}(CI)        σ′I = Tr_{CI◦C′h}(CI)
  σt = Tr_{Ct◦CI◦Ch}(Ct)     σ′t = Tr_{Ct◦CI◦C′h}(Ct)

where the ASD Ct is constructed from Ct as follows. We note that, if Vt(i) = Vt(j) for two distinct states i, j which are not reuse states, we can introduce a new variable x, change Vt(j) to x, and introduce in St a new test equation Vt(i) =? x. In other words we can assume wlog that Vt is injective on states which are not reuse states. This permits one to ensure that the subset St^d of equations which are not test equations is satisfiable in any closed connection with another symbolic derivation. We define σt^d = Tr_{Ct^d◦CI◦Ch}(Ct^d).

By the second point of Lemma 10.3 there exists a mapping ψ : Indt → IndI such that for every i ∈ Indt we have Vt(i)σt = VI(ψ(i))σI.
Wlog we assume that ψ is defined as an extension of the connection between CI and Ct, thereby ensuring that for input states i of Ct we also have Vt(i)σ′t = VI(ψ(i))σ′I.

Claim 6. Wlog we can assume that for any deduction state i ∈ Indt we have Vt(i)σ′t ≠ VI(ψ(i))σ′I.

Proof of the claim. Let i ∈ Indt be a deduction state such that Vt(i)σ′t = VI(ψ(i))σ′I. Adding a reuse state if necessary, we can change i into an input state that is connected to ψ(i) (or a state which is a reuse of ψ(i)). This construction does not change σt nor σ′t and thus the fact that Ct ◦ CI ◦ Ch or Ct ◦ CI ◦ C′h is satisfiable. When repeatedly applying it, we obtain a symbolic derivation Ct that satisfies the claim. ♦

We now split the analysis in two cases depending on whether the set It ⊆ Indt of indices i such that Vt(i)σ′t ≠ VI(ψ(i))σ′I is empty or not. If it is empty, the claim implies that we can assume there is no deduction state in Ct, and thus that St = St^t. Since Ct ◦ CI ◦ Ch is satisfiable but not Ct ◦ CI ◦ C′h, there exist two input states i, j and one equation Vt(i) =? Vt(j) in St which is satisfied by σt but not by σ′t. Thus χ(CI) contains one symbolic derivation

  (V : i ∈ {1, 2} → xi, {x1 =? x2}, ∅, {1, 2}, ∅)

where 1 is connected to ψ(i) and 2 is connected to ψ(j).

On the other hand, if It is not empty, let i0 be minimal in this set, and let Vt(i0) =? f(Vt(i1), . . . , Vt(in)) be the equation corresponding to this deduction state in St^d. Given the claim we can assume that it is the first deduction state, and thus that all preceding states are input states. Thus there exists an ordering on the set Ind0 = {t, 0, . . . , n} such that the following symbolic derivation is in χ(CI) and satisfies the proposition:

  (V : i ∈ Ind0 → xi, {x0 =? f(x1, . . . , xn), x0 =? xt}, {t, 1, . . . , n}, ∅)

Proposition 10.3. Given two HSDs Ch and C′h we have Ch^∗ ⊈ C′h^∗ if, and only if, there exists a symbolic testing derivation Ct with at most one deduction state and one equality and a connection ϕ such that (Ch ◦ϕ Ct)^sf ⊈ (C′h ◦ϕ Ct)^∗.

Proof. Let us first prove the contrapositive of the direct direction. Let CI be an ASD in (Ch ◦ϕ Ct)^sf \ (C′h ◦ϕ Ct)^∗, and ψ be a connection such that:

  CI ◦ψ (Ch ◦ϕ Ct) is closed and satisfiable
  CI ◦ψ (C′h ◦ϕ Ct) is closed and not satisfiable

From ϕ and ψ we easily define two connections ϕ′ and ψ′ such that CI ◦ϕ′ Ct is an ASD C′I such that C′I ◦ψ′ Ch is closed and satisfiable whereas C′I ◦ψ′ C′h is closed but not satisfiable. Hence:

  (Ch ◦ϕ Ct)^sf \ (C′h ◦ϕ Ct)^∗ ≠ ∅

implies Ch^∗ ⊈ C′h^∗.

Let us now prove the contrapositive of the converse implication and assume Ch^∗ ⊈ C′h^∗.
By Proposition 10.2 there exists a symbolic derivation CI ∈ Ch^sf, a testing ASD Ct and a connection ψ such that:

  Ct ◦ψ CI ∈ Ch^∗
  Ct ◦ψ CI ∉ C′h^∗
  Ct contains at most one deduction and one equality test

By Lemma 10.5 this implies that there exists a connection ϕ such that CI ∈ (Ch ◦ϕ Ct)^sf. Given the construction it is clear that CI ∉ (C′h ◦ϕ Ct)^∗.

We are now equipped for proving the main result of this chapter.
Theorem 10.2. (Inclusion of Ch^∗ into C′h^∗) Let D be a finitary deduction system. The inclusion Ch^∗ ⊆ C′h^∗ is decidable for any two honest D-symbolic derivations Ch, C′h.

Proof. By Prop. 10.3 the inclusion does not hold if, and only if, there exists an ASD Ct of bounded length and a connection function ϕ such that:

  ∆ = (Ch ◦ϕ Ct)^sf \ (C′h ◦ϕ Ct)^∗ ≠ ∅

Let Cτ be an ASD in ∆. By definition of finitary deduction systems one can compute from Ch ◦ϕ Ct a finite set Σ of ASDs such that there exists Cσ ∈ Σ and Cc stutter-free such that Cσ ≤ Cτ ◦ Cc. By definition of the ordering there exists a stutter-free derivation Cθ and a set of constants C such that:

  open_C(Cσ) ◦ Cθ = Cτ ◦ Cc

By hypothesis there exists a connection function ψ such that Cτ ◦ψ (Ch ◦ϕ Ct) is closed and satisfiable whereas Cτ ◦ψ (C′h ◦ϕ Ct) is closed but not satisfiable. By Lemma 10.1 (employed with C = ∅) Cc ◦ (Cτ ◦ψ (Ch ◦ϕ Ct)) is satisfiable whereas, since Cτ ◦ψ (C′h ◦ϕ Ct) is closed, Cc ◦ (Cτ ◦ψ (C′h ◦ϕ Ct)) is not. By Lemma 10.1 if Cσ ∈ Ch then so is Cc ◦ (Cτ ◦ψ (Ch ◦ϕ Ct)). Since Cσ ∈ Σ implies Cσ ∈ (Ch ◦ϕ Ct)^∗ we thus have Cσ ∈ (Ch ◦ϕ Ct)^∗ \ (C′h ◦ϕ Ct)^∗.

In conclusion, if Ch^∗ ⊈ C′h^∗ one can guess (in bounded time) a symbolic derivation Ct and compute a finite set Σ of symbolic derivations that contains one which is not in (C′h ◦ Ct)^∗. Conversely it is clear that if one such derivation is found then Ch^∗ ⊈ C′h^∗.

As a trivial consequence we obtain the announced theorem.

Theorem 10.1 (restated). Symbolic equivalence is decidable for finitary deduction systems.

10.4 Research directions

I believe this criterion is still too syntactic to be applicable to a wide class of deduction systems. Further work is needed to make it a truly generic criterion for the reduction of equivalence to satisfiability.
Part V

Epilogue
Chapter 11

Research project

  • to work on the potential applications to safety analysis;
  • to explore further the relation between reachability analysis and first-order automated reasoning techniques;
  • to obtain a comprehensive framework for service composition that also takes into account trust negotiation, and as a consequence to relate more formally the models for protocols and Web Services presented in this document;
  • to extend the modularity results obtained to address the modular verification of aspect-based programs.

The third point is a straightforward continuation of the research I have presented in this document. I accordingly focus this chapter on the remaining points.

11.1 From security to safety

It has been advocated in [145] that security should not be an additional layer around the protected system, but that every system should instead be built with its security in mind. A striking example is the case of malware: it is futile to try to detect the malware that users install, whether knowingly or not, on a system. Sooner or later, a user will try to install some malware, and sooner or later, one of the installed malware programs will not be detected in time. Accordingly, the problem is not to detect or define what malware is, but to ensure that no user-installed software can alter in any way the proper functioning of the operating system.

This paper has launched a series of works, both academic and industrial. First, an operating system with security in mind was devised [?]. Then, in order to reach a larger public, mandatory access control was implemented within the Linux kernel to provide anyone interested with a Security-Enhanced Linux, i.e. a free operating system that could really be secured.
In parallel, the concepts of spatial and temporal segregation, initially formalized by John Rushby in [?], were reintroduced in modern computing environments through virtualization. One can run each piece of software in a virtualized operating system, i.e. an operating system standard in every aspect except that it runs not on the machine's hardware, but on an abstraction of it. A host operating system orchestrates the different applications, and ensures when possible the time segregation between the guest OSes. The advantage of this architecture is that a flaw in one application is contained in the virtual OS in which it is run.

The security provided by such systems is not optimal given that the host operating system can be almost any off-the-shelf one, and thus is itself prone to suffer from a large number of security issues. A decisive step towards secure operating systems was the proposal of the Multiple Independent Levels of Security (MILS) architecture. There, the virtualization part is kept, but the host operating system is merely a scheduler whose primary role is to ensure that no information passes from one application to another. The first OS to be certified at Common Criteria EAL-7¹ abides by this architecture. An important point is that the security evaluation was aimed at proving safety objectives. Though one can argue that the modularity achieved by this system is specific to aircraft systems regulation², I have chosen to view this as an indicator of a long-term trend in safety analysis, in which the safety objectives to be validated will be the same as the standard security objectives.

These developments raise questions on the research in security:

  If industrials know enough to produce high-quality and certified operating systems, what is left to researchers?

One could argue that researchers can focus on securing the casual users' operating systems instead of highly critical ones.
However, good ideas tend to spread³, e.g. Google's Chrome browser also implements some spatial segregation under the name of sandboxing, and it seems more promising to assume that the kernel is secure, and to focus on the problems left by this assumption:
  • First, the communications of the machine with its environment also have to be secured, and thus the protocols securing these communications also have to be validated;
  • Second, the above description was over-simplified and has omitted the communications between the applications running in the guest operating systems. These cannot be disregarded as, even though they violate the spatial separation principle, they are often mandatory for the proper functioning of the system. Accordingly, in addition to being a scheduler, the host OS also has to ensure that all these communications adhere to the policy defined.

In such systems, the problem left is the one of evaluating the access control policies to ensure that the rules implemented satisfy the high-level security needs.

¹ The target was the implementation of the ARINC 653 1-2 scheduler and the segregation recommended in the RTCA DO-178B at level A.
² In particular the reusability of off-the-shelf components introduced by the RTCA DO-297.
³ Who would have bet, 10 years ago, that 74% of the computers (a.k.a. smartphones) sold in September 2010 were either running Linux or FreeBSD (actually a variant of. . . )?

Research direction. My work on the access control policy of Web Services, which are themselves independent communicating applications with an access control policy, can be seen as a first step with a low entry cost towards the more general security analysis of access control policies in highly critical systems. However, the move towards these industrial systems first requires some proof-of-concept of our approach, and hence, at least at first, a focus of my research on the implementation of our modeling of Web Services by entities, and of tools that can validate the properties of sets of entities. Only once enough experience has been gained on this topic will it be possible to address the problem of validating the safety of critical systems.

11.2 Reachability analysis and automated deduction

My work on the refutation of cryptographic protocols started 10 years ago in a very simple setting: a fixed set of Horn clauses modelling the Dolev-Yao intruder was given, and I had to find a decision procedure for this set of clauses. Since then, a lot of progress has been accomplished, and one now considers classes of sets of Horn clauses modulo an equational theory.

Since automated deduction is the area of computer science concerned with finding decision procedures for classes of theories, it is natural to try to extend the techniques we have developed to this more general setting. The preliminary step, presented in Chapter 5, lacks a proof-of-concept for the advantages (or lack thereof) of the saturation method employed. Thus, an implementation to test its potential is needed.
Also, in order to achieve the same level of efficiency as we did in cryptographic protocol refutation, we also need a translation of the concept of solved form.

Implementing our saturation procedure and devising a more efficient representation of potential solutions are areas of automated reasoning in which I intend to work in the coming years.

11.3 Validation of aspect-oriented programs

Programming with aspects consists in first building a skeleton of an application that contains its basic functionalities. Then one adds aspects to enrich this application. For instance, a Web Service interface is an aspect added to a Java class by Axis2. Then access control and security policy are aspects that can be added to the service description to make it more precise.
A natural question for aspect-oriented programs is whether they can be validated modularly. In addition to the combination results I have obtained, there has been a lot of work on the combination of rewriting systems since the seminal termination counter-example presented by Toyama [205]. Given that in e.g. the Avantssar project we have given a rewriting-based semantics to some aspect-based programs, namely Web Services, I believe it will be very interesting to relate the modularity techniques developed for rewriting logics to the usual ways an aspect is woven into an existing program. The benefit of this approach is clear, as it would suffice to validate programs incrementally as aspects are added to enrich them.
Bibliography

[1] 14th IEEE Computer Security Foundations Workshop (CSFW-14 2001), 11-13 June 2001, Cape Breton, Nova Scotia, Canada. IEEE Computer Society, 2001.
[2] Proceedings of the 22nd IEEE Computer Security Foundations Symposium, CSF 2009, Port Jefferson, New York, USA, July 8-10, 2009. IEEE Computer Society, 2009.
[3] J. A. Robinson. A machine-oriented logic based on the resolution principle. J. Assoc. Comput. Mach., 12:23–41, 1965.
[4] Martín Abadi and Véronique Cortier. Deciding knowledge in security protocols under equational theories. In Josep Díaz, Juhani Karhumäki, Arto Lepistö, and Donald Sannella, editors, ICALP, volume 3142 of Lecture Notes in Computer Science, pages 46–58. Springer, 2004.
[5] Martín Abadi and Cédric Fournet. Mobile values, new names, and secure communication. In Proceedings of the Principle of Programming Languages Conference, pages 104–115, 2001.
[6] Martín Abadi and Andrew D. Gordon. A calculus for cryptographic protocols: The spi calculus. In ACM Conference on Computer and Communications Security, pages 36–47, 1997.
[7] Martín Abadi and Phillip Rogaway. Reconciling two views of cryptography (the computational soundness of formal encryption). J. Cryptol., 20(3):395–395, 2007.
[8] Roberto M. Amadio and Denis Lugiez. On the reachability problem in cryptographic protocols. In Catuscia Palamidessi, editor, CONCUR, volume 1877 of Lecture Notes in Computer Science, pages 380–394. Springer, 2000.
[9] Anne Anderson. Web services profile of XACML (WS-XACML) version 1.0. Available at http://www.oasis-open.org/committees/download.php/24951/xacml-3.0-profile-webservices-spec-v1-wd-10-en.pdf, 2007.
  • 207. [10] S. Andova, C.J.F. Cremers, K. Gjøsteen, S. Mauw, S.F. Mjølsnes, and S. Radomirović. A framework for compositional verification of security protocols. Information and Computation, 206:425–459, February 2008.
[11] Mathilde Arnaud, Véronique Cortier, and Stéphanie Delaune. Combining algorithms for deciding knowledge in security protocols. In Boris Konev and Frank Wolter, editors, FroCos, volume 4720 of Lecture Notes in Computer Science, pages 103–117. Springer, 2007.
[12] Tigran Avanesov, Yannick Chevalier, Michaël Rusinowitch, and Mathieu Turuani. Satisfiability of General Intruder Constraints with and without a Set Constructor. Research Report RR-7276, INRIA, 05 2010. http://hal.inria.fr/inria-00480632/en/.
[13] AVANTSSAR. Deliverable 2.1: Requirements for modelling and ASLan v.1. Available at http://www.avantssar.eu, 2008.
[14] AVANTSSAR. Deliverable 5.1: Problem cases and their trust and security requirements. Available at http://www.avantssar.eu, 2008.
[15] AVANTSSAR. Deliverable 4.1: AVANTSSAR Validation Platform v.1. Available at http://www.avantssar.eu, 2009.
[16] Franz Baader and Klaus U. Schulz. Unification in the union of disjoint equational theories: Combining decision procedures. J. Symb. Comput., 21(2):211–243, 1996.
[17] Leo Bachmair and Harald Ganzinger. Non-clausal resolution and superposition with selection and redundancy criteria. In Andrei Voronkov, editor, LPAR, volume 624 of Lecture Notes in Computer Science, pages 273–284. Springer, 1992.
[18] Leo Bachmair and Harald Ganzinger. Resolution theorem proving. In Robinson and Voronkov [188], pages 19–99.
[19] Michael Backes, Markus Dürmuth, Dennis Hofheinz, and Ralf Küsters. Conditional reactive simulatability. Int. J. Inf. Sec., 7(2):155–169, 2008.
[20] J. Baek, K. Kim, and T. Matsumoto. On the significance of unknown key-share attacks: How to cope with them? In Proc. of Symposium on Cryptography and Information Security (SCIS 2000), 2000.
[21] Philippe Balbiani, Yannick Chevalier, and Marwa El Houri. A logical approach to dynamic role-based access control. In Danail Dochev, Marco Pistore, and Paolo Traverso, editors, Artificial Intelligence: Methodology, Systems, and Applications, 13th International Conference, AIMSA 2008, Varna, Bulgaria, September 4-6, 2008. Proceedings, volume 5253 of Lecture Notes in Computer Science, pages 194–208. Springer, 2008.
  • 208. [22] Philippe Balbiani, Yannick Chevalier, and Marwa El Houri. A logical framework for reasoning about policies with trust negotiations and workflows in a distributed environment. In Anas Abou El Kalam, Yves Deswarte, and Mahmoud Mostafa, editors, CRiSIS 2009, Post-Proceedings of the Fourth International Conference on Risks and Security of Internet and Systems, Toulouse, France, October 19-22, 2009, pages 3–11. IEEE, 2009.
[23] Gergei Bana, Koji Hasebe, and Mitsuhiro Okada. Computational semantics for basic protocol logic - a stochastic approach. In Iliano Cervesato, editor, ASIAN, volume 4846 of Lecture Notes in Computer Science, pages 86–94. Springer, 2007.
[24] Gilles Barthe, Marion Daubignard, Bruce Kapron, Yassine Lakhnech, and Vincent Laporte. On the equality of probabilistic terms. In Proceedings of the 17th LPAR conference, page (to appear). Voronkov editions, 2009.
[25] David Basin and Harald Ganzinger. Automated complexity analysis based on ordered resolution. J. ACM, 48(1):70–109, 2001.
[26] David A. Basin and Harald Ganzinger. Complexity analysis based on ordered resolution. In LICS, pages 456–465, 1996.
[27] Mathieu Baudet. Deciding security of protocols against off-line guessing attacks. In Vijay Atluri, Catherine Meadows, and Ari Juels, editors, ACM Conference on Computer and Communications Security, pages 16–25. ACM, 2005.
[28] Mathieu Baudet. Sécurité des protocoles cryptographiques : aspects logiques et calculatoires. Thèse de doctorat, Laboratoire Spécification et Vérification, ENS Cachan, France, January 2007.
[29] Mathieu Baudet, Véronique Cortier, and Stéphanie Delaune. Yapa: A generic tool for computing intruder knowledge. In Ralf Treinen, editor, Rewriting Techniques and Applications, 20th International Conference, RTA 2009, Brasília, Brazil, June 29 - July 1, 2009, Proceedings, volume 5595 of Lecture Notes in Computer Science, pages 148–163. Springer, 2009.
[30] Moritz Y.
Becker, Cédric Fournet, and Andrew D. Gordon. SecPAL: Design and semantics of a decentralized authorization language. Technical Report MSR-TR-2006-120, Microsoft Research, September 2006.
[31] Mihir Bellare and Phillip Rogaway. Optimal asymmetric encryption. In EUROCRYPT, pages 92–111, 1994.
[32] D. Berardi, D. Calvanese, G. De Giacomo, R. Hull, and M. Mecella. Automatic Composition of Transition-based semantic Web Services with Messaging. In Proc. 31st Int. Conf. Very Large Data Bases, VLDB 2005, pages 613–624, 2005.
  • 209. [33] D. Berardi, D. Calvanese, G. De Giacomo, M. Lenzerini, and M. Mecella. Automatic Composition of e-Services that export their Behavior. In Proc. 1st Int. Conf. on Service Oriented Computing, ICSOC 2003, volume 2910, 2003.
[34] Vincent Bernat and Hubert Comon-Lundh. Normal proofs in intruder theories. In Okada and Satoh [174], pages 151–166.
[35] Elisa Bertino, Jason Crampton, and Federica Paci. Access control and authorization constraints for ws-bpel. In ICWS, pages 275–284. IEEE Computer Society, 2006.
[36] Pierre Bieber. A logic of communication in hostile environments. In Proceedings of the Computer Security Foundations Workshop, pages 14–22, 1990.
[37] Simon Blake-Wilson and Alfred Menezes. Unknown key-share attacks on the station-to-station (sts) protocol. In Hideki Imai and Yuliang Zheng, editors, Public Key Cryptography, volume 1560 of Lecture Notes in Computer Science, pages 154–170. Springer, 1999.
[38] Bruno Blanchet. An efficient cryptographic protocol verifier based on prolog rules. In CSFW [1], pages 82–96.
[39] Bruno Blanchet. Automatic proof of strong secrecy for security protocols. In IEEE Symposium on Security and Privacy, pages 86–. IEEE Computer Society, 2004.
[40] Bruno Blanchet, Martín Abadi, and Cédric Fournet. Automated verification of selected equivalences for security protocols. In LICS, pages 331–340. IEEE Computer Society, 2005.
[41] Bruno Blanchet and Andreas Podelski. Verification of cryptographic protocols: Tagging enforces termination. In Andrew D. Gordon, editor, FoSSaCS, volume 2620 of Lecture Notes in Computer Science, pages 136–152. Springer, 2003.
[42] Michele Boreale, Rocco De Nicola, and Rosario Pugliese. Proof techniques for cryptographic processes. In LICS, pages 157–166, 1999.
[43] Francois Bronsard and Uday S. Reddy. Conditional rewriting in focus. In M.
Okada, editor, Proceedings of the Second International Workshop on Conditional and Typed Rewriting Systems, volume 516 of Lecture Notes in Computer Science. Springer-Verlag, 1991.
[44] T. Brown. A Structured Design Method for Specialized Proof Procedures. PhD thesis, California Institute of Technology, 1974.
[45] Tevfik Bultan, Xiang Fu, Richard Hull, and Jianwen Su. Conversation specification: a new approach to design and analysis of e-service composition. In WWW, pages 403–410, 2003.
  • 210. [46] Alan Bundy, editor. Automated Deduction - CADE-12, 12th International Conference on Automated Deduction, Nancy, France, June 26 - July 1, 1994, Proceedings, volume 814 of Lecture Notes in Computer Science. Springer, 1994.
[47] Sergiu Bursuc and Hubert Comon-Lundh. Protocol security and algebraic properties: decision results for a bounded number of sessions. In Ralf Treinen, editor, Proceedings of the 20th International Conference on Rewriting Techniques and Applications (RTA’09), volume 5595 of Lecture Notes in Computer Science, pages 133–147, Brasília, Brazil, 2009. Springer.
[48] Sergiu Bursuc, Hubert Comon-Lundh, and Stéphanie Delaune. Deducibility constraints. Presentation at the 2010 Secret Workshop, 2010.
[49] Carlos Caleiro, Luca Viganò, and David A. Basin. On the semantics of Alice&Bob specifications of security protocols. Theor. Comput. Sci., 367(1-2):88–122, 2006.
[50] Ran Canetti. Universally composable security: A new paradigm for cryptographic protocols. In Proceedings of the 42nd Foundations Of Computer Science conference, pages 136–145, 2001.
[51] Ulf Carlsen. Generating formal cryptographic protocol specifications. Security and Privacy, IEEE Symposium on, 0:137, 1994.
[52] Iliano Cervesato. The logical meeting point of multiset rewriting and process algebra. Technical report, University of Stanford, 2004. Unpublished manuscript. Available electronically from http://theory.stanford.edu/~iliano/forthcoming.html.
[53] Chin-Liang Chang and Richard Char-Tung Lee. Symbolic Logic and Mechanical Theorem Proving. Academic Press, 1973.
[54] Vincent Cheval, Hubert Comon-Lundh, and Stéphanie Delaune. A decision procedure for proving observational equivalence. In Michele Boreale and Steve Kremer, editors, Preliminary Proceedings of the 7th International Workshop on Security Issues in Coordination Models, Languages and Systems (SecCo’09), Bologna, Italy, October 2009.
Accepted to IJCAR 2010.
[55] Yannick Chevalier. Résolution de Problèmes d’Accessibilité pour la Compilation et la Vérification de Protocoles Cryptographiques. PhD thesis, Université Henri Poincaré Nancy I, LORIA, December 2003.
[56] Yannick Chevalier. A simple constraint solving procedure for protocols with exclusive or. In Workshop on Unification (in conjunction with IJCAR 2004), 2004.
  • 211. [57] Yannick Chevalier and Mounira Kourjieh. A symbolic intruder model for hash-collision attacks. In Okada and Satoh [174], pages 13–27.
[58] Yannick Chevalier and Mounira Kourjieh. Key substitution in the symbolic analysis of cryptographic protocols. In Vikraman Arvind and Sanjiva Prasad, editors, FSTTCS 2007: Foundations of Software Technology and Theoretical Computer Science, 27th International Conference, New Delhi, India, December 12-14, 2007, Proceedings, volume 4855 of Lecture Notes in Computer Science, pages 121–132. Springer, 2007.
[59] Yannick Chevalier and Mounira Kourjieh. On the decidability of (ground) reachability problems for cryptographic protocols (extended version). CoRR, abs/0906.1199, 2009.
[60] Yannick Chevalier, Ralf Küsters, Michaël Rusinowitch, and Mathieu Turuani. Deciding the security of protocols with commuting public key encryption. Electr. Notes Theor. Comput. Sci., 125(1):55–66, 2005.
[61] Yannick Chevalier, Ralf Küsters, Michaël Rusinowitch, and Mathieu Turuani. An np decision procedure for protocol insecurity with xor. Theor. Comput. Sci., 338(1-3):247–274, 2005.
[62] Yannick Chevalier, Ralf Küsters, Michaël Rusinowitch, and Mathieu Turuani. Complexity results for security protocols with diffie-hellman exponentiation and commuting public key encryption. ACM Trans. Comput. Log., 9(4), 2008.
[63] Yannick Chevalier, Ralf Küsters, Michaël Rusinowitch, Mathieu Turuani, and Laurent Vigneron. Extending the dolev-yao intruder for analyzing an unbounded number of sessions. In Matthias Baaz and Johann A. Makowsky, editors, CSL, volume 2803 of Lecture Notes in Computer Science, pages 128–141. Springer, 2003.
[64] Yannick Chevalier, Luca Compagna, Jorge Cuellar, Paul Hankes Drielsma, Jacopo Mantovani, Sebastian Mödersheim, and Laurent Vigneron. A High-Level Protocol Specification Language for Industrial Security-Sensitive Protocols. September 2004.
Presented at the SAPS’04 Workshop, co-located with ASE 2004.
[65] Yannick Chevalier, Denis Lugiez, and Michaël Rusinowitch. Towards an automatic analysis of web service security. In Boris Konev and Frank Wolter, editors, Frontiers of Combining Systems, 6th International Symposium, FroCoS 2007, Liverpool, UK, September 10-12, 2007, Proceedings, volume 4720 of Lecture Notes in Computer Science, pages 133–147. Springer, 2007.
[66] Yannick Chevalier, Denis Lugiez, and Michaël Rusinowitch. Verifying cryptographic protocols with subterms constraints. In Nachum Dershowitz and Andrei Voronkov, editors, LPAR, volume 4790 of Lecture Notes in Computer Science, pages 181–195. Springer, 2007.
  • 212. [67] Yannick Chevalier and Michaël Rusinowitch. Combining Intruder Theories. In Luís Caires, Giuseppe F. Italiano, Luís Monteiro, Catuscia Palamidessi, and Moti Yung, editors, Automata, Languages and Programming, 32nd International Colloquium, ICALP 2005, Lisbon, Portugal, July 11-15, 2005, Proceedings, volume 3580 of Lecture Notes in Computer Science, pages 639–651. Springer, 2005.
[68] Yannick Chevalier, Ralf Küsters, Michaël Rusinowitch, and Mathieu Turuani. An NP Decision Procedure for Protocol Insecurity with XOR. In 18th IEEE Symposium on Logic in Computer Science (LICS 2003), 22-25 June 2003, Ottawa, Canada, Proceedings, pages 261–270. IEEE Computer Society, 2003.
[69] Yannick Chevalier, Ralf Küsters, Michaël Rusinowitch, and Mathieu Turuani. Deciding the Security of Protocols with Diffie-Hellman Exponentiation and Products in Exponents. In Paritosh K. Pandya and Jaikumar Radhakrishnan, editors, FST TCS 2003: Foundations of Software Technology and Theoretical Computer Science, 23rd Conference, Mumbai, India, December 15-17, 2003, Proceedings, volume 2914 of Lecture Notes in Computer Science, pages 124–135. Springer, 2003.
[70] Yannick Chevalier and Michaël Rusinowitch. Combining intruder theories. In Luís Caires, Giuseppe F. Italiano, Luís Monteiro, Catuscia Palamidessi, and Moti Yung, editors, ICALP, volume 3580 of Lecture Notes in Computer Science, pages 639–651. Springer, 2005.
[71] Yannick Chevalier and Michaël Rusinowitch. Hierarchical combination of intruder theories. In Pfenning [176], pages 108–122.
[72] Yannick Chevalier and Michaël Rusinowitch. Hierarchical combination of intruder theories. Information and Computation, 206:352–377, 2008.
[73] Yannick Chevalier and Michaël Rusinowitch. Decidability of equivalence of symbolic derivations. Submitted to the Journal of Automated Reasoning, 2009.
[74] Yannick Chevalier and Michaël Rusinowitch. Compiling and securing cryptographic protocols.
Inf. Process. Lett., 110(3):116–122, 2010.
[75] Yannick Chevalier and Michaël Rusinowitch. Decidability of the equivalence of symbolic derivations. Journal of Automated Reasoning, page (to appear), August 2010.
[76] Yannick Chevalier and Michaël Rusinowitch. Symbolic protocol analysis in the union of disjoint intruder theories: Combining decision procedures. Theor. Comput. Sci., 411(10):1261–1282, 2010.
[77] Yannick Chevalier and Laurent Vigneron. A tool for lazy verification of security protocols. In ASE, pages 373–376. IEEE Computer Society, 2001.
  • 213. [78] Yannick Chevalier and Laurent Vigneron. Towards efficient automated verification of security protocols. In Proceedings of the Verification Workshop (VERIFY’01) (in connection with IJCAR’01), Università degli studi di Siena, TR DII 08/01, pages 19–33, 2001.
[79] Yannick Chevalier and Laurent Vigneron. Automated unbounded verification of security protocols. In Ed Brinksma and Kim Guldstrand Larsen, editors, CAV, volume 2404 of Lecture Notes in Computer Science, pages 324–337. Springer, 2002.
[80] Najah Chridi, Mathieu Turuani, and Michaël Rusinowitch. Decidable analysis for a class of cryptographic group protocols with unbounded lists. In CSF [2], pages 277–289.
[81] Erik Christensen, Francisco Curbera, Greg Meredith, and Sanjiva Weerawarana. Web services description language (wsdl) 1.1. Available at http://www.w3.org/TR/wsdl11/, 2001.
[82] Stefan Ciobâca and Véronique Cortier. Protocol composition for arbitrary primitives. In Proceedings of the 23rd IEEE Computer Security Foundations Symposium, CSF 2010, Edinburgh, United Kingdom, July 17-19, 2010, pages 322–336. IEEE Computer Society, 2010.
[83] Michael R. Clarkson and Fred B. Schneider. Hyperproperties. In Datta [92], pages 51–65.
[84] Hubert Comon-Lundh and Véronique Cortier. New decidability results for fragments of first-order logic and application to cryptographic protocols. In Robert Nieuwenhuis, editor, RTA, volume 2706 of Lecture Notes in Computer Science, pages 148–164. Springer, 2003.
[85] Hubert Comon-Lundh and Véronique Cortier. Security properties: Two agents are sufficient. In Pierpaolo Degano, editor, ESOP, volume 2618 of Lecture Notes in Computer Science, pages 99–113. Springer, 2003.
[86] Hubert Comon-Lundh and Véronique Cortier. Computational soundness of observational equivalence. In ACM Conference on Computer and Communications Security, pages 109–118, 2008.
[87] The World Wide Web Consortium. Simple Object Access Protocol 1.2.
http://www.w3.org/TR/soap12-part1, Apr 2007.
[88] Véronique Cortier, Jérémie Delaitre, and Stéphanie Delaune. Safely composing security protocols. In Vikraman Arvind and Sanjiva Prasad, editors, FSTTCS, volume 4855 of Lecture Notes in Computer Science, pages 352–363. Springer, 2007.
[89] Véronique Cortier and Stéphanie Delaune. A method for proving observational equivalence. In Proceedings of the 22nd IEEE Computer Security Foundations Symposium (CSF’09), pages 266–276. IEEE Computer Society Press, 2009.
  • 214. [90] Véronique Cortier, Michaël Rusinowitch, and Eugen Zalinescu. A resolution strategy for verifying cryptographic protocols with cbc encryption and blind signatures. In Pedro Barahona and Amy P. Felty, editors, PPDP, pages 12–22. ACM, 2005.
[91] C.J.F. Cremers. Feasibility of multi-protocol attacks. In Proc. of The First International Conference on Availability, Reliability and Security (ARES), pages 287–294, Vienna, Austria, April 2006. IEEE Computer Society.
[92] Anupam Datta, editor. Proceedings of the 21st IEEE Computer Security Foundations Symposium, CSF 2008, Pittsburgh, Pennsylvania, 23-25 June 2008. IEEE Computer Society, 2008.
[93] Magnus Daum and Stefan Lucks. Hash collisions (the poisoned message attack). http://th.informatik.uni-mannheim.de/people/lucks/HashCollisions/, 2005.
[94] Hans de Nivelle. Chapter 3: Logic Preliminaries. University of Delft, 1996.
[95] Hans de Nivelle. Chapter 4: How to Obtain Resolution Calculi, Section 5, Refinements. University of Delft, 1996.
[96] Stéphanie Delaune, Steve Kremer, and Mark Ryan. Verifying privacy-type properties of electronic voting protocols. Journal of Computer Security, 17(4):435–487, 2009.
[97] Stéphanie Delaune, Steve Kremer, and Graham Steel. Formal analysis of PKCS#11. In Proceedings of the 21st IEEE Computer Security Foundations Symposium (CSF’08), pages 331–344, Pittsburgh, PA, USA, June 2008. IEEE Computer Society Press.
[98] Grit Denker and Jon Millen. Capsl and cil language design - a common authentication protocol specification language and its intermediate language, 1999.
[99] Grit Denker and Jonathan K. Millen. Modeling group communication protocols using multiset term rewriting. Electr. Notes Theor. Comput. Sci., Proceedings of the 2002 Workshop on Rewriting Logic and its Applications, 71, 2002.
[100] Nachum Dershowitz and Jean-Pierre Jouannaud. Rewrite systems.
In Handbook of Theoretical Computer Science, Volume B: Formal Models and Semantics (B), pages 243–320. Elsevier and MIT Press, 1990.
[101] Nachum Dershowitz and Ralf Treinen. Rta list of open problems, problem 37. http://rtaloop.mancoosi.univ-paris-diderot.fr/problems/summary.html, 1998.
  • 215. [102] T. Dierks and C. Allen. The tls protocol version 1.0. Technical Report RFC 2246, Internet Engineering Task Force (IETF), January 1999.
[103] T. Dierks and E. Rescorla. The transport layer security (tls) protocol version 1.1. Technical Report RFC 4346, Internet Engineering Task Force (IETF), April 2006.
[104] Whitfield Diffie and Martin E. Hellman. Multiuser cryptographic techniques. In AFIPS National Computer Conference, volume 45 of AFIPS Conference Proceedings, pages 109–112. AFIPS Press, 1976.
[105] Yun Ding and Patrick Horster. Undetectable on-line password guessing attacks. Operating Systems Review, 29(4):77–86, 1995.
[106] D. Dolev and A. Yao. On the Security of Public-Key Protocols. IEEE Transactions on Information Theory, 29(2), 1983.
[107] Daniel J. Dougherty, Kathi Fisler, and Shriram Krishnamurthi. Specifying and reasoning about dynamic access-control policies. In of Lecture Notes in Computer Science, pages 632–646. Springer, 2006.
[108] Gilles Dowek. A unification algorithm for second order linear terms. Unpublished manuscript, 1993.
[109] Gilles Dowek. Higher-order unification and matching. In Robinson and Voronkov [188], pages 1009–1062.
[110] Marwa El Houri. A formal model to express dynamic policies for access control and trust negotiation in a distributed environment. Thèse de doctorat, Université Paul Sabatier, Toulouse, France, mai 2010.
[111] F. Javier Thayer Fábrega, Jonathan C. Herzog, and Joshua D. Guttman. Strand spaces: Proving security protocols correct. Journal of Computer Security, 7:191–230, 1999.
[112] Christian G. Fermüller, Alexander Leitsch, Ullrich Hustadt, and Tanel Tammet. Resolution decision procedures. In Robinson and Voronkov [188], pages 1791–1849.
[113] David Ferraiolo and Richard Kuhn. Role-based access control. In 15th NIST-NCSC National Computer Security Conference, pages 554–563, 1992.
[114] R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, and T. Berners-Lee.
Hypertext transfer protocol – http/1.1. Technical Report RFC 2616, Internet Engineering Task Force (IETF), June 1999.
[115] Zvi Galil, Stuart Haber, and Moti Yung. Symmetric public-key encryption. In Hugh C. Williams, editor, CRYPTO, volume 218 of Lecture Notes in Computer Science, pages 128–137. Springer, 1985.
  • 216. [116] Taher El Gamal. A public key cryptosystem and a signature scheme based on discrete logarithms. In CRYPTO, pages 10–18, 1984.
[117] Dimitrios Georgakopoulos, Mark F. Hornick, and Amit P. Sheth. An overview of workflow management: From process modeling to workflow automation infrastructure. Distributed and Parallel Databases, 3(2):119–153, 1995.
[118] Robert Givan and David A. McAllester. New results on local inference relations. In KR, pages 403–412, 1992.
[119] Shafi Goldwasser and Silvio Micali. Probabilistic encryption and how to play mental poker keeping secret all partial information. In STOC, pages 365–377. ACM, 1982.
[120] W3C XML Protocol Working Group. Soap version 1.2, part1: Messaging framework, April 2007.
[121] Yuri Gurevich and Itay Neeman. Dkal: Distributed-knowledge authorization language. In CSF ’08: Proceedings of the 2008 21st IEEE Computer Security Foundations Symposium, pages 149–162, Washington, DC, USA, 2008. IEEE Computer Society.
[122] Sebastian Hinz, Karsten Schmidt, and Christian Stahl. Transforming bpel to petri nets. In Wil M. P. van der Aalst, Boualem Benatallah, Fabio Casati, and Francisco Curbera, editors, Business Process Management, volume 3649, pages 220–235, 2005.
[123] Jieh Hsiang and Michaël Rusinowitch. On word problems in equational theories. In Thomas Ottmann, editor, ICALP, volume 267 of Lecture Notes in Computer Science, pages 54–71. Springer, 1987.
[124] Gérard Huet. Constrained Resolution: A Complete Method for Higher Order Logic. PhD thesis, Case Western Reserve University, 1972.
[125] Hans Hüttel. Deciding framed bisimilarity. Presented at the INFINITY’02 workshop, June 2002.
[126] Florent Jacquemard, Michaël Rusinowitch, and Laurent Vigneron. Compiling and verifying security protocols. In Michel Parigot and Andrei Voronkov, editors, LPAR, volume 1955 of Lecture Notes in Computer Science, pages 131–160. Springer, 2000.
[127] Don Johnson, Alfred Menezes, and Scott Vanstone.
The elliptic curve digital signature algorithm (ecdsa). International Journal of Information Security, 1:36–63, 2001. 10.1007/s102070100002.
[128] Diane Jordan and John Evdemon et al. Web services business process execution language version 2.0. Available at http://docs.oasis-open.org/wsbpel/2.0/OS/wsbpel-v2.0-OS.html, 2007.
  • 217. [129] Anas Abou El Kalam, Salem Benferhat, Alexandre Miège, Rania El Baida, Frédéric Cuppens, Claire Saurel, Philippe Balbiani, Yves Deswarte, and Gilles Trouessin. Organization based access control. In POLICY, pages 120–. IEEE Computer Society, 2003.
[130] Deepak Kapur, Paliath Narendran, and Linda Wang. An e-unification algorithm for analyzing protocols that use modular exponentiation. In Robert Nieuwenhuis, editor, Rewriting Techniques and Applications, 14th International Conference, RTA 2003, Valencia, Spain, June 9-11, 2003, Proceedings, volume 2706 of Lecture Notes in Computer Science, pages 165–179. Springer, 2003.
[131] Nickolas Kavantzas, David Burdett, Gregory Ritzinger, Tony Fletcher, Yves Lafon, and Charlton Barreto. Web Services Choreography Description Language Version 1.0. Available at http://www.w3.org/TR/ws-cdl-10/, 2005.
[132] John Kelsey, Bruce Schneier, and David Wagner. Protocol interactions and the chosen protocol attack. In Proceedings of the 5th International Workshop on Security Protocols, pages 91–104, London, UK, 1998. Springer-Verlag.
[133] Hristo Koshutanski and Fabio Massacci. An access control framework for business processes for web services. In Sushil Jajodia and Michiharu Kudo, editors, XML Security, pages 15–24. ACM, 2003.
[134] Mounira Kourjieh. Logical Analysis and Verification of Cryptographic Protocols. Thèse de doctorat, Université Paul Sabatier, Toulouse, France, décembre 2009.
[135] Robert Kowalski and Patrick J. Hayes. Semantic trees in automated theorem proving. Machine Intelligence, 4, 1969.
[136] Steve Kremer, Antoine Mercier, and Ralf Treinen. Reducing equational theories for the decision of static equivalence. In Anupam Datta, editor, Proceedings of the 13th Asian Computing Science Conference (ASIAN’09), volume 5913 of Lecture Notes in Computer Science, pages 94–108, Seoul, Korea, December 2009. Springer.
[137] Ralf Küsters and Tomasz Truderung.
Using proverif to analyze protocols with diffie-hellman exponentiation. In CSF [2], pages 157–171.
[138] Ralf Küsters and Max Tuengerthal. Joint state theorems for public-key encryption and digital signature functionalities with local computation. In Datta [92], pages 270–284.
[139] Ralf Küsters and Max Tuengerthal. Computational soundness for key exchange protocols with symmetric encryption. In Ehab Al-Shaer, Somesh Jha, and Angelos D. Keromytis, editors, ACM Conference on Computer and Communications Security, pages 91–100. ACM, 2009.
  • 218. [140] Ralf Küsters and Thomas Wilke. Transducer-based analysis of cryptographic protocols. Inf. Comput., 205(12):1741–1776, 2007.
[141] D.S. Lankford. Canonical inference. Technical Report ATP-32, University of Texas at Austin, 1975.
[142] Arjen K. Lenstra and Benne de Weger. On the possibility of constructing meaningful hash collisions for public keys. In Colin Boyd and Juan Manuel González Nieto, editors, ACISP, volume 3574 of Lecture Notes in Computer Science, pages 267–279. Springer, 2005.
[143] Jordi Levy. Linear second-order unification. In Harald Ganzinger, editor, RTA, volume 1103 of Lecture Notes in Computer Science, pages 332–346. Springer, 1996.
[144] Zhiyao Liang and Rakesh M. Verma. Correcting and improving the np proof for cryptographic protocol insecurity. In Atul Prakash and Indranil Gupta, editors, ICISS, volume 5905 of Lecture Notes in Computer Science, pages 101–116. Springer, 2009.
[145] Peter A. Loscocco, Stephen D. Smalley, Patrick A. Muckelbauer, Ruth C. Taylor, S. Jeff Turner, and John F. Farrell. The inevitability of failure: The flawed assumption of security in modern computing environments. In Proceedings of the 21st National Information Systems Security Conference, pages 303–314, 1998.
[146] Donald W. Loveland. Automated theorem proving: a logical basis. Number 6 in Fundamental studies in computer science. North-Holland Pub. Co., Elsevier, 1978.
[147] Gavin Lowe. Breaking and fixing the needham-schroeder public-key protocol using fdr. In Tiziana Margaria and Bernhard Steffen, editors, TACAS, volume 1055 of Lecture Notes in Computer Science, pages 147–166. Springer, 1996.
[148] Gavin Lowe. Casper: A compiler for the analysis of security protocols. Journal of Computer Security, 6(1-2):53–84, 1998.
[149] Roberto Lucchi and Manuel Mazzara. A pi-calculus based semantics for ws-bpel. J. Log. Algebr. Program., 70(1):96–118, 2007.
[150] Christopher Lynch. Personal communication.
Toulouse, December 2009, 2009.
[151] Pierre Marchand. Cours de logique de dea. Unpublished manuscript, 1986.
[152] Alberto Martelli and Ugo Montanari. Theorem proving with structure sharing and efficient unification. In IJCAI, page 543, 1977.
[153] S.J. Maslov. An inverse method for establishing deducibility in the classical predicate calculus. Dokl. Akad. Nau. SSSR, 159:1420–1424, 1964.
  • 219. [154] S.J. Maslov. An inverse method for establishing deducibility for logical calculi. Trudy Mat. Inst. Steklov, 98:26–87, 1968.
[155] Jay A. McCarthy and Shriram Krishnamurthi. Cryptographic protocol explication and end-point projection. In Sushil Jajodia and Javier López, editors, Computer Security - ESORICS 2008, 13th European Symposium on Research in Computer Security, Málaga, Spain, October 6-8, 2008. Proceedings, volume 5283 of Lecture Notes in Computer Science, pages 533–547. Springer, 2008.
[156] Jay A. McCarthy, Shriram Krishnamurthi, Joshua D. Guttman, and John D. Ramsdell. Compiling cryptographic protocols for deployment on the web. In Carey L. Williamson, Mary Ellen Zurko, Peter F. Patel-Schneider, and Prashant J. Shenoy, editors, Proceedings of the 16th International Conference on World Wide Web, WWW 2007, Banff, Alberta, Canada, pages 687–696. ACM, 2007.
[157] Antoine Mercier. Contributions à l’analyse automatique des protocoles cryptographiques en présence de propriétés algébriques : protocoles de groupe, équivalence statique. Thèse de doctorat, Laboratoire Spécification et Vérification, ENS Cachan, France, December 2009.
[158] Ralph C. Merkle. Secure communications over insecure channels. Commun. ACM, 21(4):294–299, 1978.
[159] Middleware and Related Services PTF. Common object request broker architecture (corba/iiop) v 3.1. Technical report, Object Management Group, January 2008. Available at http://www.omg.org/spec/CORBA/3.1/.
[160] Jonathan K. Millen. A necessarily parallel attack. In Workshop on Formal Methods and Security Protocols, 1999.
[161] Jonathan K. Millen and Vitaly Shmatikov. Constraint solving for bounded-process cryptographic protocol analysis. In ACM Conference on Computer and Communications Security, pages 166–175, 2001.
[162] Sebastian Mödersheim. Algebraic properties in alice and bob notation.
In Proceedings of the Fourth International Conference on Availability, Reliability and Security, ARES 2009, March 16-19, 2009, Fukuoka, Japan, pages 433–440. IEEE Computer Society, 2009.
[163] Sebastian Mödersheim and Luca Viganò. Secure pseudonymous channels. In Michael Backes and Peng Ning, editors, ESORICS, volume 5789 of Lecture Notes in Computer Science, pages 337–354. Springer, 2009.
[164] S. Narayanan and S. McIlraith. Simulation, verification and automated composition of web services. In Proceedings of the Eleventh International World Wide Web Conference (WWW-11), pages 77–88, Honolulu, Hawaii, USA, May 7-11 2002.
  • 220. [165] NBS. Federal information processing standard (fips) for the data encryption standard. Technical Report FIPS-46, National Bureau of Standards (NBS), May 1975.
[166] Roger M. Needham and Michael D. Schroeder. Using encryption for authentication in large networks of computers. Commun. ACM, 21(12):993–999, 1978.
[167] Robert Nieuwenhuis and Fernando Orejas. Clausal rewriting. In Stéphane Kaplan and Mitsuhiro Okada, editors, CTRS, volume 516 of Lecture Notes in Computer Science, pages 246–258. Springer, 1990.
[168] Robert Nieuwenhuis and Albert Rubio. Ac-superposition with constraints: No ac-unifiers needed. In Bundy [46], pages 545–559.
[169] NIST. Federal information processing standard (fips) for the data encryption standard. Technical Report FIPS-46.3, National Institute of Standards and Technology (NIST), October 1999.
[170] NIST. Federal information processing standard (fips) for the advanced encryption standard. Technical Report FIPS-197, National Institute of Standards and Technology (NIST), November 2001.
[171] Oasis Consortium. Web Services Business Process Execution Language Version 2.0. http://www.oasis-open.org/committees/documents.php?wg_abbrev=wsbpel, 23 January, 2006.
[172] Oasis Technical Committee on Secure Exchange. Ws-securitypolicy 1.2. http://doc.oasis-open.org/ws-sx/ws-securitypolicy/200702/ws-securitypolicy-1.2-spec-cd-02.pdf, 2007.
[173] OASIS XACML TC. Xacml 2.0 core: extensible access control markup. Available at http://docs.oasis-open.org/xacml/2.0/access_control-xacml-2.0-core-spec-os.pdf, 2005.
[174] Mitsu Okada and Ichiro Satoh, editors. Advances in Computer Science - ASIAN 2006. Secure Software and Related Issues, 11th Asian Computing Science Conference, Tokyo, Japan, December 6-8, 2006, Revised Selected Papers, volume 4435 of Lecture Notes in Computer Science. Springer, 2008.
[175] Federica Paci, Elisa Bertino, and Jason Crampton. An access-control framework for ws-bpel. Int. J.
Web Service Res., 5(3):20–43, 2008.[176] Frank Pfenning, editor. Term Rewriting and Applications, 17th Inter- national Conference, RTA 2006, Seattle, WA, USA, August 12-14, 2006, Proceedings, volume 4098 of Lecture Notes in Computer Science. Springer, 2006.
  • 221. [177] Birgit Pfitzmann, Matthias Schunter, and Michael Waidner. Cryptographic security of reactive systems. Electr. Notes Theor. Comput. Sci., 32, 2000.
[178] M. Pistore, A. Marconi, P. Bertoli, and P. Traverso. Automated composition of Web Services by Planning at the knowledge Level. In Proc. Int. Joint Conf. on Artificial Intelligence, IJCAI 2005, pages 1252–1259, 2005.
[179] PKCS Editor. Pkcs #1 v1.5: Rsa cryptography standard. Technical Report PKCS #1, RSA Laboratories, 1993.
[180] PKCS Editor. Pkcs #1 v2.1: Rsa cryptography standard. Technical Report PKCS #1, RSA Laboratories, 2002. OAEP description in Section 7.1.
[181] Gordon D. Plotkin. Building-in equational theories. Machine Intelligence, 7:73–90, 1972. Also available at http://homepages.inf.ed.ac.uk/gdp/publications/building_in_equational_theories.pdf.
[182] J. M. Pollard. A monte carlo method for factorization. Nordisk Tidskrift for Informationsbehandlung (BIT), 15:331–334, 1975.
[183] W. V. Quine. A proof procedure for quantification theory. Journal of Symbolic Logic, 20:141–149, June 1955.
[184] Charles Rackoff and Daniel R. Simon. Non-interactive zero-knowledge proof of knowledge and chosen ciphertext attack. In Joan Feigenbaum, editor, CRYPTO, volume 576 of Lecture Notes in Computer Science, pages 433–444. Springer, 1991.
[185] Ramaswamy Ramanujam and S. P. Suresh. Tagging makes secrecy decidable with unbounded nonces as well. In Paritosh K. Pandya and Jaikumar Radhakrishnan, editors, FSTTCS, volume 2914 of Lecture Notes in Computer Science, pages 363–374. Springer, 2003.
[186] Ronald L. Rivest, Adi Shamir, and Leonard M. Adleman. A method for obtaining digital signatures and public-key cryptosystems. Commun. ACM, 21(2):120–126, 1978.
[187] Roberto Chinnici and Jean-Jacques Moreau and Arthur Ryman and Sanjiva Weerawarana. Web Services Description Language (WSDL) 2.0. http://www.w3.org/TR/wsdl20/, June 2007.
[188] John Alan Robinson and Andrei Voronkov, editors. Handbook of Automated Reasoning (in 2 volumes). Elsevier and MIT Press, 2001.
[189] Michael Rusinowitch. Démonstration automatique: techniques de réécriture. InterEditions, 1989.
[190] Michaël Rusinowitch and Mathieu Turuani. Protocol insecurity with finite number of sessions is NP-complete. In CSFW [1], pages 174–.
  • 222. [191] Manfred Schmidt-Schauß. Unification in a combination of arbitrary disjoint equational theories. In Claude Kirchner, editor, Unification, pages 217–265. Academic Press, 1986.
[192] Bruce Schneier. Applied cryptography. Addison-Wesley, 1996.
[193] Klaus U. Schulz. Makanin’s algorithm for word equations - two improvements and a generalization. In Klaus U. Schulz, editor, IWWERT, volume 572 of Lecture Notes in Computer Science, pages 85–150. Springer, 1990.
[194] Helmut Seidl and Kumar Neeraj Verma. Flat and one-variable clauses: Complexity of verifying cryptographic protocols with single blind copying. In Franz Baader and Andrei Voronkov, editors, LPAR, volume 3452 of Lecture Notes in Computer Science, pages 79–94. Springer, 2004.
[195] Helmut Seidl and Kumar Neeraj Verma. Flat and one-variable clauses: Complexity of verifying cryptographic protocols with single blind copying. ACM Trans. Comput. Log., 9(4), 2008.
[196] Helmut Seidl and Kumar Neeraj Verma. Flat and one-variable clauses for single blind copying protocols: The xor case. In Ralf Treinen, editor, RTA, volume 5595 of Lecture Notes in Computer Science, pages 118–132. Springer, 2009.
[197] Victor Shoup, editor. Advances in Cryptology - CRYPTO 2005: 25th Annual International Cryptology Conference, Santa Barbara, California, USA, August 14-18, 2005, Proceedings, volume 3621 of Lecture Notes in Computer Science. Springer, 2005.
[198] Thoralf Skolem. Logisch-kombinatorische untersuchungen über die erfüllbarkeit oder beweisbarkeit mathematischer sätze nebst einem theoreme über dichte mengen. Skrifter utgit av Videnskapsselskapet i Kristiani, I. Matematisk-naturvidenskabelig klasse, 4:1–36, 1920.
[199] Marc Stevens, Arjen K. Lenstra, and Benne de Weger. Chosen-prefix collisions for md5 and colliding x.509 certificates for different identities. In Moni Naor, editor, EUROCRYPT, volume 4515 of Lecture Notes in Computer Science, pages 1–22. Springer, 2007.
[200] Scott D. Stoller. A reduction for automated verification of authentication protocols. Technical Report 520, Computer Science Dept., Indiana University, December 1998.
[201] Scott D. Stoller. A reduction for automated analysis of authentication protocols. In Workshop on Formal Methods and Security Protocols, July 1999. Also appeared as Indiana University, Computer Science Dept., Technical Report 520, Dec. 1998.
  • 223. [202] The Avantssar Project. Problem cases and their trust and security requirements. Deliverable D5.1, Automated VAlidatioN of Trust and Security of Service-oriented ARchitectures (AVANTSSAR), http://www.avantssar.eu/, 2008.
[203] The World Wide Web Consortium. XML Schema Definition (XSD). http://www.w3.org/XML/Schema, March 2005.
[204] Erik Tidén. Unification in combinations of collapse-free theories with disjoint sets of function symbols. In Jörg H. Siekmann, editor, 8th International Conference on Automated Deduction, Oxford, England, July 27 - August 1, 1986, Proceedings, volume 230 of Lecture Notes in Computer Science, pages 431–449. Springer, 1986.
[205] Yoshihito Toyama. Counterexamples to termination for the direct sum of term rewriting systems. Inf. Process. Lett., 25(3):141–143, 1987.
[206] Tomasz Truderung. Regular protocols and attacks with regular knowledge. In Robert Nieuwenhuis, editor, CADE, volume 3632 of Lecture Notes in Computer Science, pages 377–391. Springer, 2005.
[207] Max Tuengerthal, Ralf Küsters, and Mathieu Turuani. Implementing a unification algorithm for protocol analysis with xor. CoRR, abs/cs/0610014, 2006.
[208] Mathieu Turuani. The cl-atse protocol analyser. In Pfenning [176], pages 277–286.
[209] Laurent Vigneron. Associative-commutative deduction with constraints. In Bundy [46], pages 530–544.
[210] Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu. Finding collisions in the full sha-1. In Shoup [197], pages 17–36.
[211] Xiaoyun Wang and Hongbo Yu. How to break md5 and other hash functions. In Ronald Cramer, editor, EUROCRYPT, volume 3494 of Lecture Notes in Computer Science, pages 19–35. Springer, 2005.
[212] Xiaoyun Wang, Hongbo Yu, and Yiqun Lisa Yin. Efficient collision search attacks on sha-0. In Shoup [197], pages 1–16.
[213] Stephen A. White and Derek Miers. BPMN Modeling and Reference Guide. Future Strategies Inc, 2008.
[214] Wikipedia. The enigma machine. Available at http://en.wikipedia.org/wiki/Enigma_machine, 2010.
[215] World Wide Web Consortium. XML Path Language (XPath) 2.0. http://www.w3.org/TR/xpath20/, 23 January, 2007.
  • 224. [216] L. Wos and G. Robinson. Paramodulation and set of support. In Symposium of the INRIA Symposium on Automatic Demonstration, volume 125 of Lecture Notes in Computer Science, pages 276–310. Springer, 1970.
[217] Larry Wos. Automated reasoning: 33 BASIC research problems. Prentice-Hall, Inc., Upper Saddle River, NJ, USA, 1988.
