ILIC Dejan - MSc: Secure Business Computation by using Garbled Circuits in a Web Environment



This thesis introduces a web-based system for the secure evaluation of economic functions, named Secure Business Computation (SBC), in the manner suggested by Yao (1982).

Published in: Technology, Business

  1. University of Trieste, Italy. Master of Science in Computer Engineering. Master Thesis: Secure Business Computation by using Garbled Circuits in a Web Environment, by ILIĆ Dejan. UNITS Supervisor: Dr. BARTOLI Alberto, Piazzale Europa 1, 34127 Trieste, Italy. SAP Supervisor: M.Sc. SCHRÖPFER Axel, Vincenz-Prießnitz-Straße 1, 76131 Karlsruhe, Germany. March 5, 2010
  2. Acknowledgments It is an honor for me to give my biggest thanks to my professor and supervisor Alberto Bartoli. He has made his support available in a number of ways. His professional, positive, informed and encouraging nature has been an inspiration throughout, from the first day at the University of Trieste to the very end of my studies. I am indebted to many of my colleagues at SAP Research in Karlsruhe for supporting me, and to all who helped and encouraged me during my time within their department. Firstly, I am heartily thankful to my second supervisor Axel Schröpfer, whose encouragement, supervision and support from the preliminary to the concluding level enabled me to strive even harder for my future career development. It is an honor for me to acknowledge the project leader Florian Kerschbaum for his willing support, advice and hard work in many aspects of the project. I am also grateful to the many students who participated in the Secure SCM project, in particular Daniel Funke, Vishaal Kumar and Piter Kohl. Their tolerance, team work, good humour and insight added much to this work experience. Hopefully, all my colleagues felt as rewarded as I did for their time and effort. It is a pleasure to thank those who made this thesis possible. Firstly, I owe my deepest gratitude to my parents and my brother, who gave me the moral support I have required from the very first day. I am particularly indebted to my girlfriend and my friends, who have turned a blind eye to the dereliction of duty that occurred over the past few years. Also, many thanks go to my friends and colleagues at the University of Trieste who supported and encouraged me to pursue this degree. Lastly, I offer my regards and blessings to all of those who supported me in any respect during the studies. Sincerely yours, ILIĆ Dejan
  3. Abstract English. Collaboration among partners in a supply chain has been proven beneficial. The full potential of a supply chain is not achievable by locally optimal strategies, but rather requires the cooperation of all participating parties. Yet partners are reluctant to share private data for fear of exploitation by suppliers and competitors. Secure Computation (SC) enables effective decision making upon the partners' comprehensive data while assuring the secrecy of private data. SC is an interesting topic in modern cryptography. Rapid growth in processing and communication speeds has made web-based two-party SC a realistic paradigm. This thesis introduces a web-based system for the secure evaluation of economic functions, named Secure Business Computation (SBC), in the manner suggested by Yao in [29]. A function is described with the high-level Secure Business Computation Language (SBCL), which is used to generate a one-pass Binary Circuit Description Language (BCDL) object. Indispensable libraries for cryptographic tools, like Oblivious Transfer and Yao's secure function evaluation protocol, were developed. These libraries are used by a pair of Supplier/Buyer web-based SBC applications. This work is particularly focused on the Joint Economic Lot Size (JELS) function.
Italiano (translated from the Italian). Collaboration among partners can improve the operation of a supply chain compared with a scenario in which each component of the supply chain adopts a locally optimal strategy. Achieving this collaboration in practice is very complicated, because the components are reluctant to share their data with other organizations, be they suppliers or potential competitors. Secure Computation (SC) is a powerful tool in this context, because it allows decisions to be made on global data while guaranteeing the secrecy of the data themselves. The computing and communication speed of modern technologies makes the SC paradigm usable in practice. The thesis describes a web system based on the SC paradigm that realizes forms of Secure Business Computation according to the protocol proposed by Yao in [29]. The function to be computed is described in the high-level Secure Business Computation Language, and from this description a Binary Circuit Description Language object is derived that realizes the function by means of a combinatorial circuit. Libraries were developed for the essential cryptographic components, namely the Oblivious Transfer component and the component that implements the secure evaluation protocol proposed by Yao. These libraries are used in a web application for Supplier and Buyer in the SC paradigm. The work focuses in particular on the Joint Economic Lot Size function.
  4. Contents
     1 Introduction
     2 State of the Art
     3 Boolean Circuit Construction
       3.1 Formula Tree Object
         3.1.1 In-Memory Tree Structure
         3.1.2 SBCL Parser
       3.2 Arithmetic Operation Blocks
         3.2.1 Addition
         3.2.2 Subtraction
         3.2.3 Multiplication
         3.2.4 Division
       3.3 Formula Tree To BCDL Compiler
         3.3.1 Boolean Circuit Description Language
         3.3.2 Formula Circuit Composition
         3.3.3 Circuit Correctness
     4 Web Based Secure Business Computation
       4.1 Browser To Browser Message Exchange
         4.1.1 Exchange Session Manager
         4.1.2 Browser Message Box
       4.2 Cryptographic Library
         4.2.1 Secure Hash Algorithm SHA-1
         4.2.2 Oblivious Transfer
         4.2.3 Garbled Circuits
       4.3 Web Yao Application Structure
       4.4 Graphic User Interface Design
     5 Evaluation
       5.1 Joint Economic Lot Size
  5.   5.2 Setup
       5.3 Results
     6 Conclusion and Future Work
     A Cryptographically Secure Random Number
  6. Chapter 1 Introduction This thesis work is part of the IT research at the SAP AG Research CEC (Campus-based Engineering Center) in Karlsruhe, Germany. The corresponding project belongs to the Security&Trust research field and focuses on improving Supply Chain Management (SCM) by using secure multi-party computation. [Footnote 1: Security&Trust is a research program of SAP Research. The program's approach emphasizes users and their individual protection needs, including confidence in the system's security and awareness of its security status. It relates to security properties including authenticity, authorization, integrity, confidentiality, privacy, anonymity, pseudonymity and non-repudiation.] The project is therefore named the "Secure SCM" project. The key parts of the title are "Secure Business Computation", "Garbled Circuits" and "Web Environment". Secure Business Computation (SBC) was born by adapting the Secure Function Evaluation (SFE) technique to the case where the evaluated function is a simple economic (or business) function constructed from basic arithmetic operations. The technique described by Yao in [29], named Garbled Circuits, is used to develop the secure business computation paradigm. Finally, the entire system is adapted to run on the internet, in other words in a "Web Environment". The result of this work is an internet-based service that performs a secure business computation between a supplier and a buyer in a web browser. A dedicated high-level language for describing custom business functions, named Secure Business Computation Language (SBCL), has been developed. A description in this language is parsed into an in-memory formula object, which is then compiled to a one-pass boolean circuit described in the Binary Circuit Description Language (BCDL). The final result is a fully operational web-based product that performs a secure business computation behind a friendly Graphical User Interface (GUI). As such, this product is ready to be released to the market.
  7. Business data exchange between business partners is always critical for each partner, so they keep their sensitive information private. Yet the partners would like to achieve collaborative business improvements, e.g. to identify a globally optimal production plan (Funke [8]). A good example is an economic function named Joint Economic Lot Size (JELS), which is used in the collaborative supply chain and has to be computed by both partners. This function can be evaluated securely with the secure function evaluation technique. In this case, the function is described as an encrypted (garbled) boolean circuit. SBC is therefore a secure function evaluation technique, represented through a garbled boolean circuit. With this method each partner can compute the output of an economic function without revealing his input to the other partner. The final application is a secure computation service meant for business partners who are willing to execute the computation in a web environment. The entire work is divided into two parts. In the first part, the SBCL language was developed. Then an SBCL parser was developed to generate an in-memory formula object. A compiler then compiles the in-memory formula to a BCDL object. The parser and the compiler are written in the Java programming language. The second part is the development of a web-based SBC application (Yao protocol) between two browsers. These browsers communicate over an artificial channel through a web server message exchange subsystem, developed with Java Servlets and deployed on the Apache Tomcat server. The server application logic uses JavaServer Pages (JSP), the server-side Java technology, while the client application workflow is developed in the JavaScript scripting language, together with the Asynchronous JavaScript And XML (AJAX) technology used for client/server communication. The cryptographic libraries for the Yao evaluation were also developed in JavaScript.
The AJAX technology was also used to build a rich GUI for the client application. The State of the Art chapter compares this thesis work with similar practical privacy-preserving web-based information systems available on the market. Chapter 3 describes how a secure business computation is composed (from the four arithmetic blocks) using dedicated languages for its construction. Chapter 4 explains how the application itself is constructed. It describes how messages are exchanged between two browsers and explains the indispensable cryptographic libraries for evaluating the Yao and OT protocols. The Evaluation chapter describes the final web-based application system, which runs a secure computation of the Joint Economic Lot Size function. Chapter 6 gives the overall research conclusion and outlines desired future work on web-based privacy-preserving secure business computation systems.
  8. Chapter 2 State of the Art In cryptography, secure multi-party computation (SMC) is a problem that was initially suggested by Andrew C. Yao in a 1982 paper [28]. In that publication, Yao raised the problem of two millionaires, Alice and Bob, who want to find out who is richer without revealing the precise amount of their wealth. An SMC solution was provided that satisfies Alice's and Bob's curiosity while respecting the constraints. This example makes clear that, without SMC, the parties would be forced to reveal the amount of their wealth to a (trusted) third party. The example shows the type of problem that multi-party computation (as in [11]), or SMC protocols, generalizes. In an SMC there are N participants $p_1, p_2, \ldots, p_N$, where each participant $p_i$ holds private data $d_i$. The participants want to compute the value of a public function on their private inputs, e.g. $F(d_1, d_2, \ldots, d_N)$. By definition, an SMC protocol computation is privacy preserving, i.e. each player learns nothing beyond what can be inferred from his own private input and the outcome of the function. These functions should be securely computed, or evaluated, and this process is called Secure Function Evaluation (SFE). The accepted way to construct an SFE is Yao's encrypted (garbled) binary circuit approach [29]. In this approach one of the participants, Bob, constructs the garbled circuit by assigning to every wire two random (garbled) bit strings, which represent 0 and 1 respectively, and sends the circuit to the other participant, Alice. Before Alice can evaluate the circuit she needs to learn her corresponding garbled input string. To do so, they use a cryptographic technique called 1-out-of-2 oblivious transfer ($OT_1^2$), through which Alice receives her input in garbled form.
With this technique Alice learns only one message $m_A$ of the two messages $m_0$ and $m_1$ offered by Bob, while Bob cannot learn the value $A$ and therefore does not know
  9. which message Alice has chosen. Alice then evaluates the garbled circuit and sends Bob's output back. They both translate the garbled output strings of their output wires into output bits, i.e. the computation result. As has been shown in practice, various real-life problems require an SMC solution. Such problems are distributed voting, private bidding and auctions, private information retrieval, etc. Some problems, however, rely only on a sub-problem of SMC. This sub-problem, which is closely related to many cryptographic tasks, is referred to as secure two-party computation (2PC). As has been proven by Yao in [28], an SFE can be achieved with the help of garbled circuits. Until now, Yao's garbled circuits have been implemented only as non-web-based solutions. Therefore, the main goal of the thesis is to build a web-based two-party application that runs SFE with the Yao protocol. For a long time this concept was mostly theoretical, but modern cryptography coupled with rapid growth in processing and communication speeds has made secure two-party computation a realistic paradigm. The first system that runs the Yao protocol was presented in the Fairplay project [7]. It comprises two applications that are activated by the two players who want to engage in two-party SFE. This system presents the first evaluation of an overall SFE in real settings. The two-party Fairplay system includes a high-level language, SFDL, for specifying a distributed protocol; a compiler that compiles the high-level definition into a low-level sequence of primitive operations (a boolean circuit) described in SHDL; and cryptographic protocols for securely executing this sequence of operations. It is important to note that the two-party Fairplay system is not a web-based solution. Nowadays an essential business requirement is a system that can run in a web environment.
Fairplay can evaluate simple SFDL programs, but business partners require a system that is transparent and independent while still being able to evaluate simple custom economic functions. [Footnote 1: A simple economic function is expected to be one that can be built with the four basic arithmetic operators.] As also mentioned in the Fairplay project paper [7], a secure two-party computation system can be extended in many ways. One important direction is the development of new applications driven by advances in the communication infrastructure (such as the ubiquity of the Internet or the emergence of web services). As described in chapter 1, this work presents a cutting-edge web-based SFE application in the field of security information systems. The final system is able to demonstrate the concept of SFE in a web environment, with the protocol proposed by Yao. This system protects the partners and their confidentiality by letting them learn the SFE result without revealing their secret inputs. It will be shown in chapter 4 that this system provides a service to business partners for a custom
  10. economic function computation. The key task is to demonstrate the evaluation of the JELS function defined by a BCDL object, which is generated with the help of SBCL, the high-level language used to define an economic function. This example makes clear that the final circuit can be an arbitrary economic function. The thesis research therefore goes to the cutting edge of the security research field and delivers a state-of-the-art system. The resulting application provides an opportunity for the future engineering of practical privacy-preserving web-based information systems. It pushes research in supply chain optimization even further, making secure computation possible for all participating parties. At the very beginning it was mentioned that the thesis result is a pair of internet applications ready to be released to the market. Indeed, as has been shown, two browsers are capable of evaluating a secure two-party computation over the internet. Figure 2.1 shows the entry web page of the final SBC application, and more pictures are shown in section 4.4.
     [Figure 2.1: The final web SBC application look]
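The garbled-circuit idea outlined in this chapter can be illustrated on a single AND gate. The following is a minimal sketch under stated assumptions, not the thesis implementation: the class name `GarbledAndGate`, the 16-byte labels and the zero-tag row detection are all choices made for this example, and SHA-256 is swapped in (the thesis library is built around SHA-1) only because its 32-byte digest leaves room for a 16-byte check tag. The oblivious transfer step, through which Alice would obtain her input label, is left out.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical one-gate illustration of garbling; names and sizes are assumptions. */
final class GarbledAndGate {
    static final int LABEL = 16; // label length in bytes (assumed)
    private static final SecureRandom RNG = new SecureRandom();

    // Index 0 holds the label for bit 0, index 1 the label for bit 1.
    // In the protocol only the garbler (Bob) knows this mapping.
    final byte[][] a = newWire(), b = newWire(), out = newWire();
    // The four encrypted rows, in random order; this is what Alice receives.
    final List<byte[]> table = new ArrayList<>();

    GarbledAndGate() {
        for (int x = 0; x <= 1; x++) {
            for (int y = 0; y <= 1; y++) {
                // Plaintext row: output label followed by 16 zero bytes (check tag).
                byte[] plain = Arrays.copyOf(out[x & y], 2 * LABEL);
                table.add(xor(hash(a[x], b[y]), plain));
            }
        }
        Collections.shuffle(table, RNG); // hide which row belongs to which inputs
    }

    /** Evaluator side: one label per input wire opens exactly one row. */
    static byte[] evaluate(List<byte[]> table, byte[] labelA, byte[] labelB) {
        byte[] pad = hash(labelA, labelB);
        for (byte[] row : table) {
            byte[] plain = xor(pad, row);
            boolean tagOk = true;
            for (int i = LABEL; i < plain.length; i++) tagOk &= plain[i] == 0;
            if (tagOk) return Arrays.copyOf(plain, LABEL);
        }
        throw new IllegalStateException("no row could be decrypted");
    }

    /** Garbler side: translate an output label back into a bit. */
    int decode(byte[] label) {
        if (Arrays.equals(label, out[0])) return 0;
        if (Arrays.equals(label, out[1])) return 1;
        throw new IllegalArgumentException("unknown output label");
    }

    private static byte[][] newWire() {
        byte[][] w = new byte[2][LABEL];
        RNG.nextBytes(w[0]);
        RNG.nextBytes(w[1]);
        return w;
    }

    private static byte[] hash(byte[] x, byte[] y) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            md.update(x);
            md.update(y);
            return md.digest(); // 32 bytes = label + tag
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    private static byte[] xor(byte[] p, byte[] q) {
        byte[] r = new byte[p.length];
        for (int i = 0; i < p.length; i++) r[i] = (byte) (p[i] ^ q[i]);
        return r;
    }
}
```

The evaluator only ever sees random-looking labels: it can open the one row that matches the two labels it holds, but it cannot tell which bit an input or output label stands for. A full circuit would chain such gates, feeding output labels into later gates.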
  11. Chapter 3 Boolean Circuit Construction A boolean circuit is a mathematical model of computation used in studying computational complexity theory. These circuits are also the main object of study in circuit complexity [27]. In this study, a boolean circuit with n input bits is a directed acyclic graph in which every node (usually called a gate in this context) is either an input node of in-degree 0 labeled by one of the n input bits, an AND gate, an OR gate or a NOT gate. One of these gates is designated as the output gate. Such a circuit naturally computes a function of its n inputs. The size of a circuit is directly related to the number of gates it contains. Several important complexity measures can be defined on boolean circuits, including circuit depth, circuit size, and number of alternations. [Footnote 1: The depth of a circuit denotes the maximum distance from an input to an output.] This thesis considers the size complexity: for a circuit family, it is the function of n that gives the size of the circuit that decides inputs of length n. A boolean circuit is defined in terms of the gates it contains. As mentioned, a basic circuit might contain binary AND and OR gates and unary NOT gates. These basic gates can be used to derive the binary eXclusive OR (XOR) gate. Using the XOR gate can help to decrease the circuit size, i.e. the total number of gates. Each gate corresponds to some Boolean function, meaning a mathematical function that takes k bits as input and outputs a single bit. One important feature of one-pass (or combinatorial) boolean circuits is that they are oblivious: they perform the same sequence of operations independently of the input (i.e. they compute the values of the gates one after the other). A combinatorial circuit is a circuit whose output is uniquely defined by its inputs. Such circuits have no memory; previous inputs do not affect their outputs. This structure differs from a sequential circuit structure, where the circuit gates can be reused, obviating the
  12. circuit memoryless property. The combinatorial circuit structure therefore makes these circuits oblivious, as required by Yao in [29]. This obliviousness is the key reason why one-pass boolean circuits are used as the computation model for secure function evaluation protocols (rather than, e.g., a Turing machine). The concept of obliviousness is important since the BCDL object (section 3.3.1) will need to respect this rule. As said, these circuits can also compute a general function where n inputs are assigned to k players. In this scenario a boolean circuit contains n inputs belonging to k = 2 players. Alice and Bob are the players, with inputs a and b respectively. Together they wish to compute some general function f(a, b) using a properly constructed boolean circuit. This is a general problem of secure multi-party computation (SMC). In this work, circuit evaluation is a protocol that accomplishes three things: 1. Alice can enter her input a without Bob being able to learn it. 2. Bob can enter his input b without Alice being able to learn it. 3. Both Bob and Alice can agree on an arbitrary boolean circuit used to calculate the output. This way both parties are sure that the output is correct and that neither party has tampered with it. The theory behind the Yao protocol evaluation is explained in detail in section 4.2.3. As Yao proposed in [29], the SMC circuit should differ from the circuits produced by compilers for real hardware. He specified that a boolean circuit should be a purely combinatorial circuit, with no sequential logic. Compilers for real hardware are mostly designed to use (and re-use) circuit components, e.g. registers. For instance, look at a compound statement like sum = sum + a[i], where i = 1, ..., 32.
A real hardware compiler would produce a circuit with a single (sum) register and a single addition circuit, where in each of the 32 clock cycles one value a[i] is added to the sum register. Following Yao's proposal, a compiler should instead produce a circuit that has 32 copies of the addition circuit. Looking at an arbitrary economic function, it is obvious that the repetition of an arithmetic operation will be common. Still, the circuit construction blocks described in section 3.2 cannot contain any sequential logic and cannot be reused. It follows that each arithmetic operation inside the formula requires one arithmetic block of its own. In this work an economic formula is considered to be a really simple mathematical
  13. formula that involves only the four basic arithmetic operations. Of course, an economic formula can also include interest rates, percentages, etc., but all of these must then be expressed through the four arithmetic operations. This definition is valid for the entire document. 3.1 Formula Tree Object An arbitrary economic formula is represented as an in-memory object. Because of the restriction on sequential logic mentioned in the introduction of chapter 3, the chosen approach is to represent the formula as a binary tree data structure. A binary tree is a tree data structure in which each node has at most two children. In this structure a parent is an arithmetic operation and its children are its operands. It will also be shown that a formula described by the binary tree satisfies Yao's restriction on sequential logic. A typical business request is the computation of a custom formula between n business partners. In other words, the request is to construct the circuit from an economic formula that the business partners define in a high-level programming language. For that reason, as part of this work, a high-level programming language named Secure Business Computation Language (SBCL) has been developed. This led to the development of an SBCL parser, which parses the SBCL description into an in-memory formula object. 3.1.1 In-Memory Tree Structure Before describing the structure of the in-memory formula tree, some important definitions for a rooted binary tree are needed:
     • A directed edge refers to the link from the parent to the child (the arrows in the picture of the tree).
     • The root node of a tree is the node with no parents. There is at most one root node in a rooted tree.
     • A binary tree is a tree data structure in which each node has at most two children.
     • A leaf node has no children.
     • The depth of a node n is the length of the path from the root to the node n.
The set of all nodes at a given depth is sometimes called a level of the tree. The root node is at depth zero.
  14. • Siblings are nodes that share the same parent node. In a rooted binary tree each node has at most one sibling; the two children of a node are called left and right.
     An in-memory tree object is an economic formula represented in the data structure of a rooted binary tree. To represent a formula inside this structure, the root node has to be an arithmetic operation. Inner child nodes are also arithmetic operations, while a leaf is either an input variable or a constant. To understand this better, an arbitrary economic function is shown in figure 3.1. In the figure, the root node performs the subtraction of its child nodes. Its left child is a node performing the division of its own children, where the left child is another arithmetic operation and the right one (a leaf) is an input variable, and so on.
     [Figure 3.1: A formula tree example representing the formula (3 ∗ d)/c − (a + b).]
     Thus, a node (or a child) can perform one of the arithmetic operations used in economic calculations: addition, subtraction, multiplication and division. Operand values can come from another node, as the result of an arithmetic operation on a lower level of the binary tree. The first node is known as the root node and the child nodes are called left and right. This is important because two of the four basic arithmetic operations depend on the position of their operands. In the representation of subtraction the minuend is the left child, while the subtrahend is the right one. In the representation of division the dividend is the left child, while the divisor is the right child. Given the structure of a binary tree starting from the root node, a recursive function has to be provided. This function should run through the
  15. entire tree, over all nodes and leaves. This shows the need for an abstract class that is extended by the arithmetic operation, input variable and constant classes. As shown in figure 3.2, each tree node extends the abstract class Operand. Inside the tree, a leaf input variable is an instance of the FormulaVariable class and a constant is an instance of the FormulaConstant class. For arithmetic operations, a class named Operation is meant to be extended further by four classes, one for each of the four arithmetic operations. The Operation class extends the Operand class with two attributes. These attributes were mentioned before as left child and right child, but here the term child is replaced by the term operand.
     [Figure 3.2: The formula tree classes from the tree.arithmetic package]
     The recursion through a rooted binary tree can now be executed with the help of predefined abstract functions in the Operand class. There are two main functions:
     • getValue(inputs) returns the formula's calculated output value for the specific input values. This function is called recursively for all operands from the tree root to the leaves. For instance, if the binary tree root computes a function $f_0(a, b) = a + b$, then a and b are the values returned by the two Operand nodes. If a is itself an arithmetic operation, $a \equiv f_1(c, d) = c - d$, then the operand's getValue(inputs) takes the recursion to the first level of the binary tree, and so on.
     • getSize() returns the calculated output bit size of the tree. This function is also called recursively for all operands from the tree root to the leaves. For example,
  16. relating to the previous example, if the binary tree root computes the function $f_0(a, b) = a + b$, where a and b are node operands, then getSize() will return
     $$g(i, j) = \begin{cases} i + 1 & \text{if } i \ge j \\ j + 1 & \text{otherwise} \end{cases}$$
     where i and j are the output bit sizes returned by getSize() invoked on the operands a and b respectively. Of course, if getValue(inputs) is called, the result is computed according to the arithmetic operations of the nodes. Likewise, the result of getSize() depends on the operation and on the binary representation of its operands. If i and j are the bit sizes of the left and right operands, then:
     • Addition returns a bit size of max(i, j) + 1
     • Subtraction returns i, the bit size of the minuend (footnote 2)
     • Multiplication returns a bit size of i + j
     • Division returns i, the bit size of the dividend
     Later in the text it will be shown that an in-memory formula tree object is only a transition object between SBCL and BCDL (footnote 3). This is important because, in the final application design, a business partner should just provide the formula written in SBCL, and the BCDL object is generated from it. 3.1.2 SBCL Parser The secure function evaluation protocol requires that the evaluated function is given as a Boolean circuit, preferably described in BCDL. Business partners, however, will want a more convenient high-level form for a given economic function. In the context of secure protocols, this is even more important than the usual strong reasons for writing in high-level programming languages. The starting point of any attempt at security is a clear, formal, and easily understandable definition of the requirements. Such clarity of definition is nearly impossible for humans to achieve using low-level formalisms such as Boolean circuits. Clear high-level domain-specific languages, such as SBCL, are therefore required.
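The formula tree of section 3.1.1 can be sketched in Java as follows. The names `Operand`, `FormulaVariable`, `FormulaConstant`, `Operation`, `getValue(inputs)` and `getSize()` come from the text; the subclass names (`Addition`, `Subtraction`, `Multiplication`, `Division`), the use of `long` values with a `Map` of inputs, the bit size of a constant and the helper `FormulaTreeDemo` are assumptions made for this illustration, and the real tree.arithmetic package may differ.

```java
import java.util.Map;

/** Common base of all tree nodes (name from the thesis, body assumed). */
abstract class Operand {
    /** Value of the sub-formula rooted here for the given variable assignment. */
    abstract long getValue(Map<String, Long> inputs);
    /** Bit size of the value produced by the sub-formula rooted here. */
    abstract int getSize();
}

/** Leaf: a named input variable with a declared bit size. */
final class FormulaVariable extends Operand {
    private final String name;
    private final int bits;
    FormulaVariable(String name, int bits) { this.name = name; this.bits = bits; }
    long getValue(Map<String, Long> inputs) { return inputs.get(name); }
    int getSize() { return bits; }
}

/** Leaf: a non-negative constant; its size is its binary length (assumption). */
final class FormulaConstant extends Operand {
    private final long value;
    FormulaConstant(long value) { this.value = value; }
    long getValue(Map<String, Long> inputs) { return value; }
    int getSize() { return Math.max(1, 64 - Long.numberOfLeadingZeros(value)); }
}

/** Inner node: an arithmetic operation with a left and a right operand. */
abstract class Operation extends Operand {
    protected final Operand left, right;
    Operation(Operand left, Operand right) { this.left = left; this.right = right; }
}

final class Addition extends Operation {
    Addition(Operand l, Operand r) { super(l, r); }
    long getValue(Map<String, Long> in) { return left.getValue(in) + right.getValue(in); }
    int getSize() { return Math.max(left.getSize(), right.getSize()) + 1; } // max(i, j) + 1
}

final class Subtraction extends Operation {
    Subtraction(Operand l, Operand r) { super(l, r); }
    long getValue(Map<String, Long> in) { return left.getValue(in) - right.getValue(in); }
    int getSize() { return left.getSize(); } // minuend size i
}

final class Multiplication extends Operation {
    Multiplication(Operand l, Operand r) { super(l, r); }
    long getValue(Map<String, Long> in) { return left.getValue(in) * right.getValue(in); }
    int getSize() { return left.getSize() + right.getSize(); } // i + j
}

final class Division extends Operation {
    Division(Operand l, Operand r) { super(l, r); }
    long getValue(Map<String, Long> in) { return left.getValue(in) / right.getValue(in); } // integer division
    int getSize() { return left.getSize(); } // dividend size i
}

final class FormulaTreeDemo {
    /** Builds the tree of figure 3.1, (3 * d) / c - (a + b), with 12-bit inputs. */
    static Operand example() {
        Operand a = new FormulaVariable("a", 12), b = new FormulaVariable("b", 12);
        Operand c = new FormulaVariable("c", 12), d = new FormulaVariable("d", 12);
        Operand quotient = new Division(new Multiplication(new FormulaConstant(3), d), c);
        return new Subtraction(quotient, new Addition(a, b));
    }

    public static void main(String[] args) {
        Operand f = example();
        System.out.println(f.getValue(Map.of("a", 5L, "b", 7L, "c", 4L, "d", 100L)));
        System.out.println(f.getSize());
    }
}
```

With a = 5, b = 7, c = 4 and d = 100 the sketch prints 63 (300 / 4 − 12) and a bit size of 14: the constant 3 needs 2 bits, the product 2 + 12 = 14, the division keeps the dividend size, and the subtraction keeps the minuend size.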
[Footnote 2: Subtraction returns the minuend's bit size because the arithmetic block is limited to positive integers.] [Footnote 3: For a detailed description refer to section 3.3.] To meet this user requirement, a high-level definition language called Secure Business Computation Language (SBCL) has been provided, used for a clear
  17. computation overview. SBCL is tailored to describing economic formulas. Expressions combine the standard notation of constants, variables, operators and, optionally, parentheses. The allowed operators are arithmetic addition, subtraction, multiplication and division. The language can also handle input data of different bit sizes. Once such a specification is given, a parser generates an intermediate-level specification of the computation in the form of an in-memory formula tree object. The main reason for parsing an SBCL object into an in-memory binary tree is the actual BCDL construction; the SBCL language serves only to ease the development effort for constructing BCDL objects. Let's demonstrate an economic type of formula written in SBCL. This is a simple two-player example with multiple variables and arithmetic operations:
         default-bits:12
         a:Alice
         b:Bob
         c:Bob
         d:Alice
         x:Alice,Bob
         x = d/c - (a+b)
     The listing gives the input size of the variables, the variable names and owners, and the function to compute. Going through the code from top to bottom, a few important expressions have to be explained:
     • default-bits:12 is the default bit size of all input variables. It does not apply to the output bit size, which is calculated from the given formula;
     • a:Alice is an input variable named a whose owner is Alice. The input variables are a, b, c and d, where Bob owns b and c, while Alice owns a and d;
     • x:Alice,Bob is an output variable. It can be seen that the variable is owned by both players, Bob and Alice;
     • x = d/c - (a+b) is a custom formula suggested by the business partners. Here it can also be seen that x is an output variable.
     The input and output variables, together with their owners, have to be defined for every formula. If the variables and the owners are not defined, then the parser
  18. 18. can’t create an in-memory tree object meant to be compiled to BCDL. In order to fulfill requests for future development, variable can have multiple owners. This rule is valid for input and output variables. Therefore, if a business partners provide all necessary data, the parser will thus accept an economic function written in a high-level programming language and parse it into a in-memory object that representing the same function. In our case the compiler compiles an SBCL program into an rooted binary tree object. It is also important to notice that the overall circuit size directly effect on eval- uation performance. For example, due to their cost, multiplication and division should be used with great caution. The multiplication would increase circuit size really fast, therefore it is recommended to be used as least operation. The division operator should be avoided if possible, or at least used with great restrictions. This restrictions are just a hint to decrease circuit size, because final circuit have to be purely combinatorial, in order to maintain obliviousness. For further details on the arithmetic blocks characteristics refer to the table 3.1 Regarding to previously mentioned performance, the very first step in circuit optimization is actually recomposition of the given economic formula. Making an automatic optimizer for economic formula will make a life of the business partners easier. This is one of the future development points. Another focus in future de- velopment would be developing a SBCL-to-BCDL compiler. The a SBCL-to-BCDL compiler is a novel endeavor in itself, because unlike common hardware compilers, our compiler may use no registers, no loops, and moreover, may use every gate only once. Still, the development of a direct SBCL-to-BCDL compiler, at this point of time, is in a phase of development. 
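The declaration part of the SBCL example above can be read with a few lines of Java. The following is only a minimal sketch under stated assumptions: the class `SbclHeader`, its fields and its `parse` method are hypothetical names introduced for illustration and are not the thesis parser; the sketch also stores the formula as a plain string instead of building the formula tree.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical reader for SBCL declarations (illustration only, not the thesis parser). */
class SbclHeader {
    int defaultBits = -1;                                            // value of "default-bits:"
    final Map<String, List<String>> owners = new LinkedHashMap<>();  // variable name -> owners
    String formula;                                                  // e.g. "x = d/c - (a+b)"

    static SbclHeader parse(String source) {
        SbclHeader h = new SbclHeader();
        for (String raw : source.split("\n")) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (line.startsWith("default-bits:")) {
                h.defaultBits = Integer.parseInt(line.substring("default-bits:".length()).trim());
            } else if (line.contains("=")) {
                h.formula = line;                                    // the function to compute
            } else if (line.contains(":")) {
                String[] parts = line.split(":", 2);                 // "x:Alice,Bob"
                List<String> list = new ArrayList<>();
                for (String owner : parts[1].split(",")) {
                    list.add(owner.trim());
                }
                h.owners.put(parts[0].trim(), list);
            } else {
                throw new IllegalArgumentException("Unrecognized SBCL line: " + line);
            }
        }
        // Mirrors the rule that variables, owners and formula must all be defined.
        if (h.formula == null || h.owners.isEmpty()) {
            throw new IllegalArgumentException("Variables, owners and formula must be defined");
        }
        return h;
    }
}
```

Calling `SbclHeader.parse` on the seven lines of the example yields `defaultBits == 12`, the owner list `[Alice, Bob]` for `x`, and the formula string unchanged.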
3.2 Arithmetic Operation Blocks

This section shows the construction of the arithmetic building blocks used in the above circuits, given the one-pass constraint imposed by Yao's protocol. Building blocks are combinations of the previously mentioned logic gates. The building blocks necessary for an economic function are addition, subtraction, multiplication and division. Note that circuits exist with better asymptotic complexity than the ones presented here, but many of them cannot be used, since they either do not adhere to the one-pass constraint (Oberman and Flynn, 1997 [18]) or their complexity hides very high constants in the "big O" notation (e.g. Karatsuba and Ofman, 1962 [12]). The goal is to optimize the number of gates for a realistic domain D, e.g. 32 bit. The domain D needs to be chosen such that it includes all possible inputs, outputs and intermediate values of $j(X_A, X_B)$, so that no over- or underflow occurs during the computation. It is emphasized that the size of D is independent of any security parameter chosen to protect the privacy of the data sets. Circuits with the same (or even lower) asymptotic complexity may noticeably differ in the absolute number of gates. The absolute number of gates is relevant because computation and communication costs increase linearly with the number of gates.

Table 3.1 gives, for the three arithmetic building blocks, the number of gates $g(l)$ required for inputs of length $l$ bits and the asymptotic complexity of $g(l)$.

3.2.1 Addition

The addition building block takes as input two positive numbers, a and b, both having $l$ bits. We denote $a = a_{l-1} \dots a_0$ and $b = b_{l-1} \dots b_0$. The output is a number $o = a + b$ having $l + 1$ bits, $o = o_l \dots o_0$. The circuit is composed of one half adder, which is constructed from two gates, and $l - 1$ full adders, which are constructed from five gates each. A half adder is a logic circuit that adds two one-bit binary numbers. The half adder takes as input the bits $a_0$ and $b_0$ and outputs the sum bit $o_0$ and the carry bit $c_0$. A full adder is a logic circuit that adds three one-bit binary numbers. For $0 < i \le l - 1$ a full adder takes as inputs the bits $a_i$, $b_i$ and the carry bit $c_{i-1}$ and outputs the sum bit $o_i$ and the carry bit $c_i$. It can be combined with other full adders or work on its own. The final bit $o_l$ is $c_{l-1}$. The total number of gates is then $2 + (l - 1) \cdot 5 = 5l - 3$, which has been proved to be the theoretically optimal lower bound for adding two $l$-bit numbers (Redkin, 1981 [22]).

Figure 3.3: Half adder circuit diagram

Figure 3.4: Full adder circuit diagram

3.2.2 Subtraction

The subtraction building block takes as input two numbers, a and b, both having $l$ bits. We denote $a = a_{l-1} \dots a_0$ and $b = b_{l-1} \dots b_0$. The output is a number $o = a - b$ having $l$ bits, $o = o_{l-1} \dots o_0$. The circuit is composed of one complement, or negation $\neg x$ (NOT), circuit and two addition circuits as explained in subsection 3.2.1, i.e. the difference is computed as $a + (\neg b + 1)$. The complement circuit takes as input the bits $b_{l-1} \dots b_0$, and its output $c = c_{l-1} \dots c_0$ has the same length as its input. The second part, the first addition circuit, takes the bits $c_{l-1} \dots c_0$ as one operand and a constant equal to one as the other operand, giving the output $d = d_l \dots d_0$. The third and last part, again an addition circuit from 3.2.1, takes $a = a_{l-1} \dots a_0$ and $d = d_{l-1} \dots d_0$, where $d_l$ is discarded, and gives the output $o = o_l \dots o_0$, where again the most significant bit $o_l$ is discarded. The total number of gates is then $l + 2 \cdot (2 + (l - 1) \cdot 5) = 11l - 6$.

3.2.3 Multiplication

The multiplication building block described in (Wegener, 1996) takes as input two positive numbers, a and b, both having $l$ bits. We denote $a = a_{l-1} \dots a_0$ and $b = b_{l-1} \dots b_0$. The output is a number $o = a \cdot b$ having $2l$ bits, $o = o_{2l-1} \dots o_0$. First, $l$ intermediate products $s_i$ for $0 \le i < l$ are computed, each having $l$ bits. Let $s_{i,j}$ denote bit $j$ of intermediate product $s_i$; then $s_{i,j}$ is computed as $a_j \wedge b_i$ for $0 \le i, j < l$, which requires $l^2$ gates. We then calculate, by employing the addition building block, $l - 1$ intermediate sums $t_k$ for $0 \le k \le l - 2$, each adding two $l$-bit strings and having $l + 1$ output bits. Let $t_{k,m}$ denote bit $m$ of intermediate sum $t_k$. $t_0$ is computed by adding the two $l$-bit strings $s_{0,l-1} \dots s_{0,1}$ and $s_1$. For $k > 0$, $t_k$ is computed by adding the two $l$-bit strings $s_{k+1}$ and $t_{k-1,l} \dots t_{k-1,1}$. The final output $o$ is then the concatenation of the bits $t_{l-2,l} \dots t_{l-2,1}$, the lowest bits $t_{k,0}$ for $k = l-2, \dots, 0$, and $s_{0,0}$. The total number of gates is $l^2 + (l - 1)(5l - 3) = 6l^2 - 8l + 3$.

3.2.4 Division

The division building block takes as input two positive numbers, a and b, both having $l$ bits. We denote $a = a_{l-1} \dots a_0$ and $b = b_{l-1} \dots b_0$. The output is a number $o = a / b$ having $l$ bits, $o = o_{l-1} \dots o_0$.
In the planning phase two solutions were proposed. The first circuit, as proposed by Oberman and Flynn (1997) in [18], first calculates the reciprocal of b, $b^{-1}$, with the iterative Newton-Raphson approximation and then performs a multiplication to obtain $o = a \cdot b^{-1}$. This circuit has a total gate number of

$$(\log_2 l - 1)\,\bigl(2(6l^2 - 8l + 3) + 5l - 3\bigr) + 6l^2 - 8l + 3 = \log_2 l\,(12l^2 - 11l + 3) - 6l^2 + 3l.$$

For $l = 32$ the circuit size is 53647 gates.

A second building block for division, proposed by Kerschbaum and Schröpfer, reduces the size of the first proposal by a factor of three for an input size of $l = 32$⁴. At the moment this circuit is still under development, so the reader is referred to [19] for a detailed explanation.

               Addition    Subtraction    Multiplication
    g(l)       5l − 3      11l − 6        6l² − 8l + 3
    O(g(l))    O(l)        O(l)           O(l²)
    g(32)      157         346            5891

Table 3.1: Gate number and asymptotic complexity, with respect to the input size l, for each arithmetic building block

Figure 3.5: Number of gates per building block, with respect to input size

⁴ A detailed explanation of the circuit will be published in the WP2 deliverable of the Secure SCM project.
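The gate counts of this section can be reproduced from the way the blocks are composed. The following is a minimal sketch, assuming a hypothetical helper class `GateCount` that is not part of the thesis code; the division estimate covers only the Newton-Raphson variant and assumes that $l$ is a power of two so that $\log_2 l$ is an integer.

```java
/** Hypothetical helper reproducing the gate-count formulas g(l) of section 3.2. */
class GateCount {
    /** One half adder (2 gates) plus l - 1 full adders (5 gates each): 5l - 3. */
    static long addition(int l) {
        return 2 + (l - 1) * 5L;
    }

    /** One l-gate complement circuit plus two addition circuits: 11l - 6. */
    static long subtraction(int l) {
        return l + 2 * addition(l);
    }

    /** l^2 AND gates plus l - 1 addition circuits: 6l^2 - 8l + 3. */
    static long multiplication(int l) {
        return (long) l * l + (l - 1) * addition(l);
    }

    /** Newton-Raphson division: (log2 l - 1) rounds of two multiplications and one addition, then one multiplication. */
    static long divisionNewtonRaphson(int l) {
        if (l < 2 || Integer.bitCount(l) != 1) {
            throw new IllegalArgumentException("l must be a power of two, at least 2");
        }
        int log2 = 31 - Integer.numberOfLeadingZeros(l);   // exact for powers of two
        return (log2 - 1) * (2 * multiplication(l) + addition(l)) + multiplication(l);
    }
}
```

For $l = 32$ these methods return 157, 346, 5891 and 53647, matching Table 3.1 and the division estimate in the text.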
3.3 Formula Tree To BCDL Compiler

The in-memory formula object represents an arithmetic function chosen by the business partners. Before the evaluation, however, this formula has to be represented as a boolean circuit object. A language has therefore been developed for describing this boolean circuit, named Boolean Circuit Description Language (BCDL). The in-memory formula is compiled into a BCDL object. This step provides the necessary input object, i.e. a BCDL object, to the web-based application (described in chapter 4).

3.3.1 Boolean Circuit Description Language

The data required for the evaluation of a boolean circuit are the circuit gates, their truth tables and the circuit output wires. While this circuit data is enough for a mathematical formula, some additional information is required for an economic formula: to calculate an economic formula, the input and output variables have to be owned by at least one of the business partners.

After a structural analysis of boolean circuits it was decided to develop a specific boolean circuit description language. This language describes a one-pass boolean circuit together with all data required to represent an economic function, e.g. the owner of an input. A BCDL object contains all necessary information about the gates of a boolean circuit, its truth tables, its input/output wires and their owners. A BCDL object can be divided into the following descriptive parts:

Inputs. There are $n$ inputs, one for each function variable $v_l$ where $l \in \{1, \dots, n\}$. As said before, each input (or variable) has its own size $s_l$ described with SBCL. This input size $s_l$ is the size of the array representing the input $v_l$, where each element of the array is a circuit wire $w_u$. Inputs in BCDL are therefore represented as an array of arrays, i.e. the inputs and their sizes.

Gates. The circuit connections are represented by gates $g_s$ where $s \in \{1, \dots, m\}$ and circuit wires $w_u$ where $u \in \{1, \dots, k\}$. All gates are represented as an array of length $m$, and $k$ is the total number of unique wires in the circuit. Each element $g_s$ of this array is a gate composed of multiple inputs $w_i^{g_s}$ and one output $w_o^{g_s}$, where $i \neq o$ and $i, o \in \{1, \dots, k\}$. The gate $g_s$ is represented as an array of circuit wires $w_u$, where the first wires are the input wires (e.g. two inputs $w_0^{g_s}$ and $w_1^{g_s}$) and the last wire ($w_2^{g_s}$) is the output wire. In this thesis work each gate is a two-input gate⁵.

Truth tables. Each gate inside the circuit has its own truth table $t^{g_s}$. The tables are represented as an array in which a gate $g_s$ is paired with a truth table $t^{g_s}$. The output value $t_h^{g_s}$ of a gate $g_s$ depends on the binary representation of the input $h$ of fixed bit size $r$, where $h \in \{0, \dots, 2^r - 1\}$. For example, if the input size is $r = 2$ and the input is $h = 1$, then its binary representation is $h = 01_2$. In this case the result for an AND gate is $t_{01}^{g_s} = 0$. There is thus an obvious link between the gate input size and the truth table size, while the output values $t_h^{g_s}$ have to be filled in according to the required gate.

Outputs. The format of the outputs of a BCDL object is exactly the same as the format of its inputs. Each output (or result) has a specific output size that depends on the formula described with SBCL. The only difference with respect to the inputs is that an output is an array of the output wires $w_o^{g_s}$ of specific gates.

Owners. A circuit described in a BCDL object also requires that all circuit variables (input and output) have owners, as previously shown with the example in section 3.1.2, where all the variables have owners.

⁵ The security of the Yao protocol has not yet been proven for more than two inputs per gate.

Looking at the descriptive parts of the gates and the truth tables, it can be concluded that a gate can be written as $G_s(g_s, t^{g_s})$.

A compiler has been developed to generate the BCDL object from the in-memory formula. This BCDL object contains all the mentioned information for the circuit that computes an arbitrary business function from the formula tree object. For example, a BCDL object exported in JSON⁶ format looks like the following:

    var circuit = {
    "input":[[{"wires":[0,...,31],"name":"2*d*fa"}...
    "tt":[[0,1,1,0],[0,0,0,1],[0,1,1,0],...
    "output":[[{"wires":[24009,24010,24011,24012,24013,...
    "gates":[[0,32,128],[0,32,129],[1,33,130],[130,129,131],...
    }

Once the circuit is created it is required to check the circuit's correctness.
In this case the correctness check supports only integer numbers, because the operations are limited to unsigned integers to maintain technical simplicity. In addition, the in-memory formula tree to BCDL compiler includes many novel tricks for reducing the number of resulting gates in the circuit.

⁶ JSON is a lightweight data-interchange format suitable for data representation in the JavaScript scripting language.
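The gate and truth-table arrays of a BCDL object are enough to evaluate the plain (ungarbled) circuit. The following minimal sketch illustrates this under stated assumptions: the class `BcdlEval` and its method are hypothetical, the gates are assumed to be listed in an order in which every input wire is already set when its gate is reached, and the first input wire of a gate is assumed to be the more significant bit of the truth-table index $h$.

```java
/** Hypothetical plain evaluator for BCDL-style gate and truth-table arrays. */
class BcdlEval {
    /**
     * gates[s] = {in0, in1, out}; tt[s] = outputs for the input patterns 00, 01, 10, 11.
     * inputWires[i] receives the bit inputBits[i]; the bits on outputWires are returned.
     */
    static int[] evaluate(int[][] gates, int[][] tt, int wireCount,
                          int[] inputWires, int[] inputBits, int[] outputWires) {
        int[] wire = new int[wireCount];
        for (int i = 0; i < inputWires.length; i++) {
            wire[inputWires[i]] = inputBits[i];
        }
        // One pass over the gates: each gate is looked up exactly once.
        for (int s = 0; s < gates.length; s++) {
            int h = 2 * wire[gates[s][0]] + wire[gates[s][1]];   // index into the truth table
            wire[gates[s][2]] = tt[s][h];
        }
        int[] out = new int[outputWires.length];
        for (int i = 0; i < outputWires.length; i++) {
            out[i] = wire[outputWires[i]];
        }
        return out;
    }
}
```

With `gates = {{0,1,2},{0,1,3}}` and `tt = {{0,1,1,0},{0,0,0,1}}`, i.e. an XOR and an AND gate on the same two input wires, the sketch behaves as a half adder: the inputs 1 and 1 give the sum bit 0 on wire 2 and the carry bit 1 on wire 3.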
The capability to compile a formula tree together with constant values affects the evaluation performance of the Yao protocol, especially in a web environment, because each additional operation increases the final circuit size. It was therefore decided to precalculate all public information necessary for the secure function evaluation, e.g. an input value that is shared by both sides. This precalculation is done before the evaluation of the circuit and yields a circuit that is as small as possible, which directly improves the evaluation performance of the protocol. In this software version the "formula-to-circuit-compiler" cannot export constant values into the circuit. Consequently, compiling a formula tree that contains constant values throws an exception informing the user about the unsupported functionality.

3.3.2 Formula Circuit Composition

Composing a binary circuit requires some underlying logic once the circuit size starts to increase. A good way to represent an economic formula through a boolean circuit is to provide the basic arithmetic operations as prepared building blocks. Each of these building blocks is stored inside a Java class with internal logic for a generic construction of the block, independent of the size of the inputs. These basic arithmetic blocks are prepared with the help of a base Gate class, which is extended by each derived gate class inside the circuit.

Classes have been prepared for the basic logical operation gates and for gates that represent an artificial boolean value inside the circuit. Besides the basic logical operations there is also a class for a derived logical operation. All these classes extend the base Gate class, whose attributes, illustrated in figure 3.6, are the following:

win is an array of Wire instances $w_i$ where $i \in \{1, \dots, n\}$ and $n$ is the number of input bits of the specific gate;

tt is a truth table, an array of Integer elements $t_g$ where $g \in \{0, \dots, 2^n - 1\}$. A truth table output is represented as an Integer and is paired with an input $g$, where $g$ is read as a binary value;

wout is a Wire element holding the information about the gate's output wire $w_o$ where $o \neq i$. This makes it clear that each Gate element can have only one output wire.

With the help of these three attributes the basic logic operations, such as AND, OR and NOT, can be represented. The same structure is appropriate for representing derived logic operations, e.g. the XOR operation. Even more complex gates with different input sizes can be constructed, but always with only one output.
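The truth table of an $n$-input gate has $2^n$ entries, one per binary input pattern. The following minimal sketch shows how such tables could be generated from the input size alone; the class `TruthTables` and its methods are hypothetical and only illustrate the idea behind the gate constructors, whose actual implementation is not shown in this text.

```java
/** Hypothetical generator for truth tables indexed by the binary input pattern h. */
class TruthTables {
    /** AND is 1 only when all n input bits are 1, i.e. for the last pattern. */
    static int[] and(int n) {
        int[] tt = new int[1 << n];
        tt[tt.length - 1] = 1;
        return tt;
    }

    /** OR is 1 for every pattern except all zeros. */
    static int[] or(int n) {
        int[] tt = new int[1 << n];
        for (int h = 1; h < tt.length; h++) {
            tt[h] = 1;
        }
        return tt;
    }

    /** XOR is the parity of the input bits. */
    static int[] xor(int n) {
        int[] tt = new int[1 << n];
        for (int h = 0; h < tt.length; h++) {
            tt[h] = Integer.bitCount(h) & 1;
        }
        return tt;
    }

    /** A constant gate returns the same artificial value for every input pattern. */
    static int[] constant(int n, int value) {
        int[] tt = new int[1 << n];
        java.util.Arrays.fill(tt, value);
        return tt;
    }
}
```

For two inputs this gives `{0,0,0,1}` for AND and `{0,1,1,0}` for XOR, the same tables that appear in the "tt" array of the JSON example in section 3.3.1.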
Figure 3.6: The basic gate classes of a circuit from the binary.circuit.gate package

The classes provided for the basic and derived logic operations are illustrated on the right side of figure 3.6. The classes GateAnd, GateOr, GateNot and GateXor extend the base Gate class, and their constructors require the input size as argument. A class constructor automatically generates the truth table for the specific binary operation gate.

The GateZero class also extends the Gate class, but with a slight difference. Looking at the GateZero class in the UML diagram in figure 3.6, the difference between this class and the other gate classes can be noticed. The static function getZeroGate() returns the unique GateZero object for the overall circuit. This is because the output of the gate $g_z$ is always an artificial zero, independently of the binary input value. Later in the text it is explained why this gate is used multiple times inside a BCDL object.

In this phase of development each arithmetic building block used to compose a BCDL object requires operands of equal size. This matters at the transition from SBCL to the BCDL object, because SBCL allows the construction of an in-memory formula with a specific input size for each input variable. Thus, if $x$ and $y$ are the sizes of the inputs and $x \neq y$, the GateZero output wire has to be used for the $|x - y|$ most significant bits of the input of size $\min(x, y)$. As a result both inputs have the size $\max(x, y)$, and all provided arithmetic building blocks can be used for the construction of a BCDL object. An interesting optimization point is related to this problem⁷.

⁷ Discussed as a future development in chapter 6.

A GateOne class extending Gate has also been provided. After the previous example it is obvious that this gate returns the artificial value $t_h^{g_o} = 1$ independently of the binary input value $h$.

Figure 3.7: The InputVariable and OutputVariable classes of a circuit from the binary.circuit.variable package, extended from the abstract class Variable

All wires in the circuit can be divided into two groups, the variable wires and the wires in the middle. The variable wires can be separated into input (in-degree 0) and output variables. That is why some wires have to be remembered inside the Variable class. On the right side of figure 3.7 the base Variable class is shown with the name and wires attributes. The attribute name is used only if the variable is of importance for the final BCDL object; e.g. a circuit output variable that is used as an input variable of another circuit may have no importance. There is also a method getSize(), which returns the number of Wire elements inside the wires array. As expected, the method isNamed() indicates the importance of the variable for the final BCDL object, i.e. if a variable is named then it should be part of the final BCDL object.

On the left side of the figure the classes for the input and output variables are presented, named InputVariable and OutputVariable respectively. These classes extend the Variable class and implement its abstract method addOwner(owner). In the first version of the BCDL specification these methods differ in the number of owners: an InputVariable can have only one owner, while an OutputVariable can have multiple owners.

While the name attribute is not obligatory, the wires attribute is obligatory independently of the type of the variable, i.e. input or output. A BCDL object can obviously be evaluated even without owners, but in the secure two-party computation protocol proposed by Yao [28] all variables must have owners.

As has been shown, an economic formula circuit has to be constructed out of the arithmetic building blocks. Figure 3.8 shows all classes for the builders of the four arithmetic blocks. The base (parent) class of each building block class is the Operator class. The Operator class, from the binary.arithmetic package, is an abstract class with a protected abstract function createCircuit(vina, vinb, vout), which every arithmetic block overrides. This abstract method is

    protected Circuit createCircuit(Variable vina, Variable vinb,
                                    OutputVariable vout)

and it is used to create a specific arithmetic circuit block from the input parameters vina, vinb and vout. These parameters hold the arrays of wires of the input variables (vina, vinb) and the array of wires intended for the output of the circuit (vout). The function requires two Variable arguments for the left and right operands and an OutputVariable argument for the output. While this function is protected and dedicated to the internal circuit construction logic, another function has been provided for public calls. Figure 3.8 shows this public function, getCircuit, in two forms. Both forms require two Variable arguments for the left and right operands, while only one of them expects a voutName argument, which is used to name the OutputVariable of the returned Circuit. Naming the output variable is important only if this output is required in the final BCDL object.

The Circuit class stores all the information needed to build a BCDL object. Its attributes are inputs, gates and outputs, stored as arrays of Variable, Gate and OutputVariable elements respectively. These attributes are presented in figure 3.8 together with their types. All elements inside these arrays are generated together with the Circuit instance returned from the getCircuit(vina, vinb, vout) function. Looking at figure 3.7, one might expect the inputs attribute to be an array of InputVariable elements. It is an array of Variable elements because an input of a Circuit instance can also be an output of another Circuit element.
Thus, referring again to figure 3.8 and the Circuit class, the function prepareCircuit() removes the unnamed OutputVariable elements from outputs and all OutputVariable elements from inputs. Afterward the BCDL object is prepared for future evaluation.

In a final BCDL object every wire $w_i$ has its unique identification number. The uniqueness of each wire is required for the correct evaluation of the circuit, explained later in the text. The composition of the overall circuit may require multiple arithmetic blocks or, in other words, circuits added to the main circuit. As the circuit becomes more complex, it gets more difficult to manage unique wire identification numbers. Keeping uniqueness is even harder because static BCDL objects such as GateZero can be reused for each building block of the final circuit.
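The reuse of the single GateZero output wire for equalizing operand sizes, described earlier in this section, can be sketched as follows. This is an illustration under stated assumptions: wires are reduced to plain integer identifiers stored least significant bit first, and the class `OperandPadding` with its methods is hypothetical rather than part of the described packages.

```java
import java.util.Arrays;

/** Hypothetical sketch: pad the shorter operand with the shared zero wire. */
class OperandPadding {
    /** Returns a copy of wires extended to size, using zeroWire for the most significant positions. */
    static int[] pad(int[] wires, int size, int zeroWire) {
        if (wires.length > size) {
            throw new IllegalArgumentException("operand is larger than the target size");
        }
        int[] padded = Arrays.copyOf(wires, size);          // least significant bits are kept
        Arrays.fill(padded, wires.length, size, zeroWire);  // the |x - y| top bits read the zero wire
        return padded;
    }

    /** Brings both operands to max(x, y) wires so that any arithmetic block can be applied. */
    static int[][] equalize(int[] a, int[] b, int zeroWire) {
        int size = Math.max(a.length, b.length);
        return new int[][] { pad(a, size, zeroWire), pad(b, size, zeroWire) };
    }
}
```

Because every padded position refers to the same wire identifier, the zero wire appears in several building blocks at once, which is one reason why unique wire numbering needs central management.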
Figure 3.8: The arithmetic block classes of a circuit from the binary.arithmetic package

Figure 3.9: The WireManager class for managing all the Wire objects in a circuit

The WireManager class has been developed to facilitate the management of unique wire identification in a final BCDL object. In the WireManager class the function getWire() has to be called for every Wire instance. As expected, the unique WireManager instance stores every produced Wire instance inside its wires attribute. The idea behind this is to use one static global wire identification attribute gwid inside the Wire class. Once a final BCDL object is constructed, the function setWiresId() needs to be invoked on the unique WireManager instance. This function invokes the function setWid() for all instances produced by the WireManager. Afterwards each Wire instance is identified by its position inside the wires array.

Looking at figure 3.8, particularly at the class Circuit, it is clear that a BCDL object can also be generated from an in-memory formula tree. Composing the circuit out of the in-memory formula object requires a compiler that goes through the entire rooted tree. A method named getSize() inside the Operand class, from the tree.arithmetic package⁸, helps in the BCDL object construction. Figure 3.10 shows how the circuit is constructed from the tree root Operation.

⁸ Refer to figure 3.2 for a detailed explanation of the tree.arithmetic package.

Figure 3.10: Flow diagram for the generation of the BCDL from the in-memory formula rooted tree, starting from the root Operation element

The algorithm for circuit construction starts from the root of the binary tree, which is an Operation instance. Every Operation instance contains a left and a right Operand instance, as shown in figure 3.2. As shown in the circuit generation flow diagram, the first Operand checked is the left one. If this instance is an operation (e.g. MulOperation), the algorithm invokes generateCircuit(formula) again for that particular Operation. Otherwise, if the Operand is a FormulaVariable, an InputVariable instance from the binary.circuit.variable package is used. The function is called recursively through the entire tree until it reaches the tree leaves, i.e. FormulaVariable instances for both operands. The function returns a specific operation building block, which is added to the main circuit of the instance that called the method. Afterward the algorithm proceeds to the right Operand instance and repeats the same process. As a very final step the algorithm processes the building block of the current instance in the same manner. For the arithmetic block it will use as ...

3.3.3 Circuit Correctness

As has been shown, four arithmetic building blocks have been provided for circuit construction. In the future development of an arithmetic block there is always room for improvement. Therefore, a system for checking the correctness of the circuit had to be developed. Further in the text an improvement of the Division block and the testing of the resulting circuit are shown.

Some building blocks, particularly the Division block, have evolved during the planning and development process of the JELS circuit⁹. In the development process of this thesis work two versions of the Division building block were proposed.

The division building block takes as input two positive numbers, a and b, both having $l$ bits. The output is a number $o = a / b$ having $l$ bits. The first circuit, as proposed by Oberman and Flynn (1997) in [18], first calculates the reciprocal of b, $b^{-1}$, with the iterative Newton-Raphson approximation and then performs a multiplication to obtain $o = a \cdot b^{-1}$. This circuit has a total gate number of

$$(\log_2 l - 1)\,\bigl(2(6l^2 - 8l + 3) + 5l - 3\bigr) + 6l^2 - 8l + 3 = \log_2 l\,(12l^2 - 11l + 3) - 6l^2 + 3l.$$

For $l = 32$ the circuit size is 53647 gates.

A second building block for division, proposed by Kerschbaum and Schröpfer, reduces the size of the first proposal by a factor of three for an input size of $l = 32$. At the moment this circuit is still under development, so the reader is referred to [19] for a detailed explanation.

In this section the correctness check focuses on the overall BCDL object and not just on a specific block. If the request is to test a specific building block, the input formula can simply be a single arithmetic operator. The formula tree object can be used to compute the formula output through a predefined function inside the abstract Operand class:

    public abstract Integer getValue(HashMap<String, Integer> inputs);

The classes Constant and Variable, which extend the Operand class, return number values from this function, while the Operation class calls the same function on its left and right Operand objects. If one of the operands is another Operation object, the call becomes recursive through the entire tree. Input variable objects look up their values in the provided parameter HashMap<String, Integer> inputs, which contains all values needed inside the tree. In the parameter inputs the String keys identify the input variables and the Integer values are the values used for these input variables.
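The recursive evaluation through getValue can be sketched as follows. Only the signature of getValue and the class names Operand, Constant, Variable and Operation come from the text; the wrapper class `FormulaTree`, the constructors, the single Operation class with an operator character and the `example` method are assumptions made for illustration.

```java
import java.util.HashMap;

/** Hypothetical, simplified formula tree illustrating the recursive getValue call. */
class FormulaTree {
    static abstract class Operand {
        public abstract Integer getValue(HashMap<String, Integer> inputs);
    }

    static class Constant extends Operand {
        private final int value;
        Constant(int value) { this.value = value; }
        public Integer getValue(HashMap<String, Integer> inputs) { return value; }
    }

    static class Variable extends Operand {
        private final String name;
        Variable(String name) { this.name = name; }
        public Integer getValue(HashMap<String, Integer> inputs) {
            Integer v = inputs.get(name);                   // key: variable name
            if (v == null) {
                throw new IllegalArgumentException("no value for variable " + name);
            }
            return v;
        }
    }

    static class Operation extends Operand {
        private final char op;
        private final Operand left;
        private final Operand right;
        Operation(char op, Operand left, Operand right) {
            this.op = op; this.left = left; this.right = right;
        }
        public Integer getValue(HashMap<String, Integer> inputs) {
            int l = left.getValue(inputs);                  // recursion into the left subtree
            int r = right.getValue(inputs);                 // recursion into the right subtree
            switch (op) {
                case '+': return l + r;
                case '-': return l - r;
                case '*': return l * r;
                case '/': return l / r;                     // integer division
                default: throw new IllegalStateException("unknown operator " + op);
            }
        }
    }

    /** Tree for the SBCL example x = d/c - (a+b). */
    static Operand example() {
        return new Operation('-',
                new Operation('/', new Variable("d"), new Variable("c")),
                new Operation('+', new Variable("a"), new Variable("b")));
    }
}
```

With the inputs d = 100, c = 5, a = 3 and b = 4 the example tree evaluates to 20 − 7 = 13.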
For checking circuit correctness the following static function has been provided inside the Evaluator class:

    public static int evaluate(Circuit c, HashMap<String, Integer> inputs);

It requires the Circuit object to evaluate and a HashMap with the values of the input variables.

⁹ JELS stands for Joint Economic Lot Size, explained later in chapter 5.

In the end it is enough to create random Integer values of the proper input size for each input variable inside a HashMap, to provide this instance to both previously mentioned functions, and to compare the returned results. This is a good approach because the overall circuit input size often makes it impossible to test all possible inputs.
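The random comparison described above can be illustrated on the addition block alone. The following minimal sketch simulates the ripple-carry adder of section 3.2.1 bit by bit and compares it with ordinary integer addition on random inputs; the class `AdderCheck` and its methods are hypothetical and stand in for the Evaluator and formula tree, which are not reproduced here. The sketch assumes $l \le 62$ so that all values fit into a long.

```java
import java.util.Random;

/** Hypothetical random correctness check for the addition building block. */
class AdderCheck {
    /** Adds two l-bit numbers with one half adder and l - 1 full adders; returns l + 1 bits. */
    static long addBits(long a, long b, int l) {
        long result = 0;
        int carry = 0;
        for (int i = 0; i < l; i++) {
            int ai = (int) ((a >> i) & 1);
            int bi = (int) ((b >> i) & 1);
            int x = ai ^ bi;
            int sum;
            if (i == 0) {
                sum = x;                                 // half adder: XOR and AND
                carry = ai & bi;
            } else {
                sum = x ^ carry;                         // full adder: 2 XOR, 2 AND, 1 OR
                carry = (ai & bi) | (x & carry);
            }
            result |= (long) sum << i;
        }
        return result | ((long) carry << l);             // final bit o_l is the last carry
    }

    /** Compares the simulated circuit with a + b on random l-bit inputs. */
    static boolean randomCheck(int l, int trials, long seed) {
        Random rnd = new Random(seed);
        long mask = (1L << l) - 1;
        for (int t = 0; t < trials; t++) {
            long a = rnd.nextLong() & mask;
            long b = rnd.nextLong() & mask;
            if (addBits(a, b, l) != a + b) {
                return false;
            }
        }
        return true;
    }
}
```

A call such as `AdderCheck.randomCheck(32, 1000, 42L)` samples 1000 input pairs instead of the $2^{64}$ possible ones, which is the trade-off the paragraph above refers to.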
Chapter 4

Web Based Secure Business Computation

Secure business computation is a new concept in the business world, but a system that provides SFE for an economic function is still missing. The FairPlay project [7] has shown that SFE can be done using the Yao protocol, but that system is not suitable for an economic function. Besides that, the methods of doing business have evolved rapidly and tend to move business onto the internet, which calls for a web oriented SBC application. Developing a web based application for secure evaluation, in the manner suggested by Yao [28], is a real challenge.

This chapter gives an introduction to the libraries developed for the web based secure business computation application. A presentation of the subsystem developed for browser to browser message exchange follows. The chapter then explains the cryptographic libraries for secure computation, which are required by the original solution of Yao [28]. Once the message exchange system and the libraries have been described, an overall description of the web application (work flow) libraries and their structure is presented. The figures give detailed formal descriptions of the libraries and protocols, while the text explains their details supported by some examples. All the libraries and the message exchange subsystem were developed as a part of this work.

In the end, the main target is to use this functionality on a real business case. Therefore, the task was not only to create the system itself, but also an application demo to be demonstrated in public. The final web application is presented as a pair of client focused sub-applications which run in an internet browser. The SBC progress is displayed to the business partners in a user friendly GUI. The application requires the business partners to insert their private inputs and to run the evaluation, while at the end they should verify the result and the confidentiality¹ and integrity² provided by this system.

4.1 Browser To Browser Message Exchange

In everyday business each partner needs to be transparent to and independent of the other partners. To make the partners independent, an independent working environment has to be developed for each of them. In today's business the internet browser has become an interesting basis for business applications. An internet browser can interpret, in this case, the JavaScript code required to run a business application.

In order to evaluate an n-party protocol, each party is obliged to exchange messages with the other parties until the protocol is evaluated. Since this work focuses on Yao's protocol to evaluate the JELS economic function, where n = 2, the requirement is a secure two-party computation system. This specification made it possible to develop a message exchange subsystem that supports only n = 2 parties. A "theoretical" system such as this one is expected to be less complex, and therefore the scenario with n = 2 was accepted to maintain the simplicity of the developed code.

The analysis of a general service in a web environment led to designing this subsystem as a provider of multiple message exchange sessions. The subsystem lets the user choose between hosting and joining an exchange session. Hence, another exchange session can be created on the same web server without any influence on the other exchange sessions.

It is well known that an internet browser is designed to open a TCP/IP channel with a web server in order to exchange application data. This data exchange allows a client/server application to run. But in order to run the SBC in a web environment, at least two clients have to process the data in their internet browsers. This is why a subsystem had to be developed that is dedicated to the message exchange between the browsers through a web server.
Data between a client and a server is usually exchanged by means of the HyperText Transfer Protocol (HTTP). HTTP satisfies the subsystem requirements, and it also allows the subsystem to benefit from the Asynchronous JavaScript And XML (AJAX) technology. With AJAX, a GET/POST HTTP request/response is executed in the application background, while the user keeps complete control over the application. The HTTP GET method is used to download data and to invoke basic functions for data manipulation on the web server, while the HTTP POST method is used exclusively to upload data onto the web server. This technology opens the door to completely new business logic on a web page and gives the user a friendlier Graphical User Interface (GUI).

¹ Confidentiality ensures privacy; it prevents unauthorized disclosure of the secret data.
² Integrity ensures correctness; the original data cannot be changed or substituted.

While almost every browser on today's market provides an AJAX object in its local library, the way the object is created differs between some browsers. The difference comes from the class name each browser uses for the AJAX object. Therefore, a JavaScript function has to be provided that discovers the AJAX class name inside the most popular web browsers. Afterward, the resulting JavaScript object is used for all HTTP requests. The specially developed library app/appAjaxReq.js contains a function called getXMLHttpReq() which tests for and returns a valid AJAX object for the specific browser. For example, in IE7+, Firefox, Chrome, Opera and Safari the object is created with the XMLHttpRequest() constructor, whereas in IE5 and IE6 it has to be created with ActiveXObject("Microsoft.XMLHTTP").

After the creation of an XML HTTP Request (XHR) instance, an HTTP request can be made directly from JavaScript. The first step is to call open(type, url, async), which initializes the object. This function takes three arguments:

Request Type: request method type; only GET or POST is used here
URL: web server Uniform Resource Locator (URL) string
Async: Boolean value that indicates whether the request should be dispatched in asynchronous mode

The async argument is really important in this case. If the async value is true, the request is executed in asynchronous mode and the rest of the JavaScript code continues to run. In this case, an event manager is necessary to discover the origin of each event. If the async value is false, the request is executed in synchronous mode and the application freezes until the web server responds.
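The browser test and the open(type, url, async) call described above can be sketched roughly as follows. This is a minimal illustration and not the code of app/appAjaxReq.js: the env parameter (a stand-in for the browser's global object, so that the sketch also runs outside a browser) and the helper syncGet are assumptions introduced here.

```javascript
// Hypothetical sketch of a cross-browser XHR factory.
// `env` stands in for the browser's global object (e.g. window); it is an
// assumption of this sketch so that the code also runs outside a browser.
function getXMLHttpReq(env) {
  if (typeof env.XMLHttpRequest === "function") {
    // IE7+, Firefox, Chrome, Opera, Safari
    return new env.XMLHttpRequest();
  }
  if (typeof env.ActiveXObject === "function") {
    // IE5, IE6
    return new env.ActiveXObject("Microsoft.XMLHTTP");
  }
  return null; // no AJAX support found
}

// Hypothetical helper: a blocking GET that returns the response body.
function syncGet(env, url) {
  const xhr = getXMLHttpReq(env);
  if (xhr === null) {
    throw new Error("AJAX is not supported by this browser");
  }
  xhr.open("GET", url, false); // async = false: wait for the response
  xhr.send(null);
  return xhr.responseText;
}
```

In a browser the call would be syncGet(window, url). Passing false as the third argument of open is what makes the request block until the web server responds.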
In the final version of the application the synchronous mode, instead of the asynchronous one, is used for every AJAX call. This removes the need for an event manager that discovers the origin of each event and prevents the web server responses from being confused with one another.

Restricting the AJAX calls to GET and POST requests definitely helped to maintain simplicity. For each HTTP parameter used in a client exchange session, the dedicated AJAX library contains a function that issues one of the two possible requests. The following JavaScript functions in the app/appAjaxReq.js library were provided for these operations:

• ajaxDoGet(url, param, destObject) request with the HTTP GET method
• ajaxDoPost(url, data, destObject) request with the HTTP POST method

They simplify a GET or POST call through the open(type, url, async) function predefined in the browser's AJAX library. The arguments url and destObject stand for the URL of a web server that provides some service and an object where the function stores the return value, respectively. A GET request has a param argument that holds all required parameters of the HTTP GET request, while a POST request has a data argument, an instance of a byte array to upload to the web server.

In an ideal application the AJAX technology would be combined with the Comet technology. Comet offers a server push³ methodology, instead of the pull⁴ methodology used in this work. With Comet, the SBC application would use TCP/IP packets more efficiently.

As has been said, two browsers can exchange data only if there is a web server in between. An exchange session manager class, ManagerME, was developed to manage multiple parallel exchange sessions. Beneath this exchange mechanism, each browser (or client) has an instance of a dedicated message box class, Browser.MessageBox. A client has to administer his message box in order to keep it up to date. These concepts are described in detail in the following sections.

4.1.1 Exchange Session Manager

A browser exchange session manager on a web server shall provide a robust exchange service that securely stores the data directed to a client. It has to support multiple independent exchange sessions, together with user functions to host a session and to join an existing session.
While secure storage is an important issue, another feature is also of great importance: latency. All these issues are "unimportant" in a test environment (as in this work), but on a real web server with hundreds or thousands of exchange sessions they can become an enormous concern.

³ Push technology, or server push, describes a style of Internet-based communication where the request for a given transaction is initiated by the publisher or central server.
⁴ Pull technology, or client pull, is a style of network communication where the initial request for data originates from the client and is then responded to by the server.
Figure 4.1: Ideal session activation flow in two-party message exchange system

The message exchange system has to be deployed on a web server reachable by all SBC players. The system has to support running multiple parallel and independent SBC sessions. Running in parallel means that multiple n-party SBCs can be evaluated at the same time. Independence guarantees that each client has one and only one exchange session on the web server, referred to by the current exchange session ID stored in his browser. If the client wants to host/join another session, the old exchange session ID is overwritten.

Any client can host/join a message exchange session at any point in time, even in the middle of an SBC evaluation. Once a client hosts/joins a new session, his internet browser stores information about the currently hosted/joined exchange session. This information is kept until the client chooses to host/join another session or to delete the information. The reader is referred to figure 4.1 for the host/join exchange session step.

If an exchange session is created (hosted), the host client automatically becomes a part of this session. His internet browser should store the current exchange session ID as a browser cookie. The created exchange session stays inactive until all the players join. That is why the host must invite (e.g. by email, SMS or letter) a guest to join the exchange session, as represented by the gray arrow in figure 4.1. Afterward, once all the required clients have joined, the session becomes active (figure 4.1). Once a session is active, no additional players can join it. If some other player tries to join the session, an exception is thrown to inform the player about the exchange session status.
Designing the subsystem to provide only n = 2 exchange sessions also has an advantage. In this case, the web server can automatically determine the destination player from the exchange session ID of the player who is sending a message. Otherwise, additional message destination parameters would be required in the HTTP request. The n > 2 case requires further development: the system would have to carry information about the destination inside each HTTP request, while an n = 2 system transmits messages directly to the opposite party.

Dynamic server responses were inevitable, since all message exchange requests go over the HTTP request methods (GET, POST). The Java Servlet technology with its HttpServletRequest and HttpServletResponse classes entirely fulfilled these requirements. An extended Servlet class, ServletME, runs on the web server and listens for incoming requests through

• doGet(HttpServletRequest request, HttpServletResponse response)
• doPost(HttpServletRequest request, HttpServletResponse response)

A doGet call responds to HTTP GET requests, while doPost responds to POST requests. The personal message exchange service (read, delete, ...) for clients is administered through the GET method, while the POST method is used exclusively for sending messages, i.e. uploading data to a message box.

The decision on which method to use for which purpose is directly related to the practical limits of HTTP requests. For example, Microsoft Internet Explorer has a maximum uniform resource locator (URL) length of 2,083 characters. It also has a maximum path length of 2,048 characters. This limit applies to both POST request and GET request URLs⁵. That is, if the GET method is used, there is a limit of at most 2,048 characters, minus the number of characters in the actual path. This is more than enough to control a web server via parameters. However, the POST method is not limited by the size of the URL for submitting name/value pairs.
These pairs are transferred in the request body and not in the URL. Passing the circuit as one of these "unlimited" POST parameters was still not suitable, since the application expects only one (huge) BCDL object. Instead, it was decided to transfer the BCDL object as the data section of the HTTP POST request.

Every HTTP GET request should contain one and only one parameter, because only the first parameter is processed by ServletME and all the others are discarded⁶. The single-parameter rule has one exception: the host/join HTTP GET request, which requires two parameters. This specific GET request contains the message exchange session ID string used to host/join the message exchange session. The difference was introduced because some additional information is needed, e.g. the file that contains the BCDL object to be evaluated in the final web application.

⁵ RFC 2616, "Hypertext Transfer Protocol -- HTTP/1.1," does not specify any requirement for URL length.

Once a client hosts an exchange session on ServletME, his browser receives a unique JSESSION cookie from the HttpSession class. The browser uses this cookie for identification in future requests. The Servlet in fact uses the HttpSession instance to store additional parameters for a specific client. Similarly to the Servlet's browser recognition, the message exchange system uses the JSESSION value to find the client's message exchange session. The JSESSION value is stored in the bwId variable of the BrowserME class. Since every client has to provide a unique exchange session ID parameter to host/join the session, the Servlet stores this parameter as the esId attribute in the browser's HttpSession instance. After this step, the esId value is linked to the JSESSION value (which is equal to bwId). The bwId value is then used to restore the message exchange session instance through the esId value.

The working environment is managed through the ManagerME class, which is designed to simplify the creation of an exchange session. This is done with the help of the MessageExchange class, where the esId and bwId values are stored in SessionME and BrowserME instances respectively. Basically, ManagerME keeps multiple MessageExchange instances stored in an array. The manager has an openSession(esId, bwId) function dedicated to the creation of a MessageExchange instance. This function is used to host/join an exchange session with the provided parameters.
A player invokes this function to register his bwId for a specific esId inside ManagerME. Once a player tries to host/join a session, the manager checks whether an exchange session with the same esId value already exists. If there is no such esId, the player is considered the host and the function hostSession(esId, bwId) is invoked. If the esId already exists, the player is considered a guest and the function joinSession(esId, bwId) is invoked.

ManagerME automates all of this; a user only needs the available management requests, which are designed as HTTP request parameters. Most of these HTTP management commands are used without a parameter value.

⁶ The one-parameter rule is present for subproject simplicity.
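The host/join dispatch of openSession(esId, bwId) can be sketched as follows. The thesis implements ManagerME in Java on the server; this JavaScript version only illustrates the decision logic. The Map (instead of an array of MessageExchange instances), the return values "host" and "guest", the isActive helper and the exception messages are assumptions of the sketch.

```javascript
// Hypothetical JavaScript sketch of the host/join dispatch. A Map from
// esId to the list of joined bwId values replaces the array of
// MessageExchange instances for brevity.
class ManagerME {
  constructor(players = 2) {
    this.players = players;    // n = 2 in this work
    this.sessions = new Map(); // esId -> list of bwId values
  }

  // Host the session if esId is unknown, otherwise join it.
  openSession(esId, bwId) {
    return this.sessions.has(esId)
      ? this.joinSession(esId, bwId)
      : this.hostSession(esId, bwId);
  }

  hostSession(esId, bwId) {
    this.sessions.set(esId, [bwId]);
    return "host";
  }

  joinSession(esId, bwId) {
    const browsers = this.sessions.get(esId);
    if (browsers.includes(bwId)) {
      throw new Error("browser has already joined session " + esId);
    }
    if (browsers.length >= this.players) {
      throw new Error("session " + esId + " is already active");
    }
    browsers.push(bwId);
    return "guest";
  }

  // A session is active once all required players have joined.
  isActive(esId) {
    const browsers = this.sessions.get(esId);
    return browsers !== undefined && browsers.length === this.players;
  }
}
```

With n = 2, the first caller for a given esId becomes the host, the second becomes the guest and activates the session, and any further caller receives an exception.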
For example, the following HTTP request sends a valueless getMsg parameter:

http://10.55.145.XXX:8080/WebYao/ServletME?getMsg

This command calls the deployed ServletME Servlet in order to read a message from the browser's message box. On reception of the HTTP request, the Servlet restores the exchange session (esId) attribute stored in the HttpSession through bwId. As said, the attribute values can be restored by the Servlet with the assistance of the JSESSION cookie. This means that a client has to provide the esId value only once, when he hosts/joins the session. The following HTTP parameters are dedicated to the management of an exchange session on the web server:

• GET: esId - host or join the specified exchange session
• GET: esActive - valueless parameter used to check if the exchange session is active

The parameters are transferred through the HTTP GET method, and only the esId parameter is expected to carry a value, the ID of the exchange session. Both responses are the string value "true" or "false". The esId request returns true if the player has successfully hosted/joined the session, and esActive returns true if all players have joined the session. A false value can sometimes be expressed through an exception message, e.g. if a player tries to join an exchange session multiple times or tries to join an active session. However, only a response with the data content "true" indicates a successful/positive operation. The previously mentioned parameters are sent by the following functions, with the "es" prefix, from the app/appAjaxReq.js library:

• esOpen(url, esId, destObject) host/join exchange session
• esActive(url, destObject) check if exchange session is active

where url is the URL of the ServletME instance and destObject is an object used to store the response value. The esOpen function requires the esId of the exchange session that the client wants to host/join. All the other commands are related to message box operations and are explained in detail in the following section 4.1.2.
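A rough sketch of how the two "es" functions could build their GET requests is given below. It is not the code of app/appAjaxReq.js: the extra xhrFactory argument (which supplies the XHR object and keeps the sketch runnable outside a browser), the value field of destObject, the boolean return value and the URL encoding of esId are all assumptions made for illustration.

```javascript
// Hypothetical sketch; `xhrFactory` and the `value` field of destObject are
// assumptions of this example, not part of the original signatures.
function ajaxDoGet(url, param, destObject, xhrFactory) {
  const xhr = xhrFactory();
  xhr.open("GET", url + "?" + param, false); // synchronous request
  xhr.send(null);
  destObject.value = xhr.responseText;       // keep the raw response
  return destObject.value;
}

// Host or join the exchange session identified by esId.
function esOpen(url, esId, destObject, xhrFactory) {
  const param = "esId=" + encodeURIComponent(esId);
  return ajaxDoGet(url, param, destObject, xhrFactory) === "true";
}

// Check whether all players have joined the current exchange session.
function esActive(url, destObject, xhrFactory) {
  // esActive is a valueless parameter
  return ajaxDoGet(url, "esActive", destObject, xhrFactory) === "true";
}
```

Only the literal response "true" is treated as success, so an exception message returned by the server is handled like "false".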
4.1.2 Browser Message Box

The class hierarchy starts at the ManagerME class, which contains a vector of MessageExchange instances; each of them holds a SessionME and an array of BrowserME instances linked to one exchange session, and each BrowserME in turn leads to a MessageBox. As has been said, every browser has its own message box on the web server. This box is represented by the Browser.MessageBox class. The owner of a message box can receive messages in it and manage it through HTTP requests.

Since each client's browser instance can be involved in only one message exchange session, a browser instance can have only one active message box. A client can in fact open multiple sessions, and therefore multiple message boxes, but only if the browsers do not share memory space. If a client tries to host/join multiple exchange sessions with the same browser, the JSESSION value remains the same and refers only to the last hosted/joined session. It is important to note that a client cannot go back and reuse an existing message box after he has hosted/joined a different session.

The message box class has to be as simple as possible, therefore it can hold one and only one message. This message cannot be overwritten by another incoming message; instead, the message box owner has to read the message first and then execute the command that deletes it. The message box has to be managed, and messages have to be sent, by the client, hence a few management commands are provided. The management commands are the following:

• Received message
1. READ: Read a message from the box. If there is no message in the box, return the "false" value.
2. DELETE: Delete a message from the box. If there was a message in the box, return the "true" value, otherwise return "false".
• Intended message
1. REQUEST TO SEND: Ask whether the receiving box is able to receive a message. The response is the boolean "true" or "false" value.
2. SEND: Send a message. The response is the "true" value if the message is sent successfully, or "false" otherwise.

All the commands are self-explanatory except the "request to send" command. If client $p_i$ wants to send a message to $p_j$, where $i \neq j$, he should first send a "request to send" to the web server. In this way $p_i$ can check whether there is already a message in the message box of $p_j$. If there is, $p_i$ keeps sending "small" packets to the web server until $p_j$ is able to receive the message. Without this command $p_i$ would be forced to send "large" packets, which can be highly inefficient if the message box of $p_j$ is unable to receive them.

A limit $l$ on the number of messages inside the message box has been predefined. The limit has to be sufficient to exchange all the necessary data between two browsers while the subsystem complexity is reduced to a minimum. This limit is set to $\lim_{x \to \infty} l = 1$, where $x$ is the number of messages on the wire directed to one of the message boxes. In other words, a box holds at most one message no matter how many are in transit. Of course, each client is responsible for maintaining his message box so that new messages can be received. Therefore, each client has to know the subsystem's predefined maintenance commands used for read, delete, send etc.

Some protocols can require more than one message to be sent (or received) at the same time, which makes this exchange scheme inefficient. The inefficiency can be noticed e.g. in the OT part of Yao's protocol, where multiple binary inputs have to be exchanged in order to evaluate the circuit. Since this system relies on Yao's protocol, the exchanged messages can be adapted so that the subsystem performs with maximum efficiency, e.g. by parallelization of the OT⁷ step.

An advantage of the accepted n = 2 exchange scenario is that each client $p_i$ can send a message to the opposite party $p_j$ without any information on the message destination. This feature is provided by the background logic of the subsystem, where $i, j \in \{0, 1\}$ and $i \neq j$ are used to determine which party is $p_j$. Therefore, $p_i$ sends a message to ServletME and the message is stored directly in the message box of $p_j$. All HTTP requests from clients are sent with the help of the AJAX technology.
These requests are predefined on ServletME and manage the message box through a few basic functions to read, send and delete a message. Requests to the web server, followed by responses to the browser, should contain the following HTTP parameters:

• GET, getMsg - get message from personal message box
• GET, delMsg - delete message from personal message box
• GET, reqSend - request to send message to other client
• POST, without a parameter - send message to other client

⁷ Oblivious Transfer (OT) is a cryptographic technique described later in section 4.2.2.
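The behaviour behind these four requests can be modelled with a single-slot message box, sketched below. The server side of the thesis is written in Java; the class and method names here (apart from MessageBox and the parameter names getMsg, delMsg and reqSend) are hypothetical, and the string responses "true" and "false" follow the description above.

```javascript
// Hypothetical model of the one-message box and the n = 2 routing.
class MessageBox {
  constructor() {
    this.message = null; // the box holds at most one message
  }
  // READ: return the stored message, or "false" if the box is empty.
  getMsg() {
    return this.message === null ? "false" : this.message;
  }
  // DELETE: empty the box; "true" only if there was a message.
  delMsg() {
    if (this.message === null) return "false";
    this.message = null;
    return "true";
  }
  // REQUEST TO SEND: can the box accept a message right now?
  reqSend() {
    return this.message === null ? "true" : "false";
  }
  // SEND: store the message unless an undeleted one is present.
  store(message) {
    if (this.message !== null) return "false"; // incoming message is discarded
    this.message = message;
    return "true";
  }
}

// With i, j in {0, 1} and i != j, the receiver is always 1 - sender.
class TwoPartySession {
  constructor() {
    this.boxes = [new MessageBox(), new MessageBox()];
  }
  send(sender, message) {
    return this.boxes[1 - sender].store(message);
  }
}
```

A sender would typically call reqSend on the opposite box first and upload the (possibly large) message only after receiving "true".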
Figure 4.2: Ideal message exchange flow in an active session of two-party message exchange system

In order to exchange messages, the clients have to send one of these HTTP requests to the web server over a previously opened exchange session (figure 4.2). The session with the other browser also has to be active before a message is sent. As has been shown, the clients also need some additional parameters (commands) to execute a few basic controls over their message boxes, e.g. to delete a message before receiving another one.

There are two possible responses to each HTTP upload (POST) request to the server, "true" or "false". If there is no message inside the box, the message is stored and the server returns the value "true", as in figure 4.2. The response is "false" if there is an undeleted message in the box, in which case the sent message is discarded. For that reason, the GET parameter request reqSend is intended to spare unnecessary network traffic, i.e. sending messages that cannot be stored. All these methods have been designed to provide message exchange between the players' browsers for the SBC, with the help of the following cryptographic libraries.

4.2 Cryptographic Library

A generic type of secure computation (SC) was suggested by Yao [28] in his seminal work, where he introduced the notion of secure function evaluation (SFE). In that work Yao raised the issue of two millionaires, Alice and Bob, who are eager to determine which of them is richer but reluctant to reveal their actual wealth. Assume that Bob's and Alice's wealth is $i$ and $j$ respectively, where $i, j \in [0, N]$ and $N \in \mathbb{N}$.
In order to determine who is richer, they should evaluate the following function

\[ f(i, j) = \begin{cases} 1 & i \geq j \\ 0 & \text{otherwise} \end{cases} \]

where $f = 1$ claims that Bob is richer. In secure multi-party computation (SMC) the millionaires' problem represents only a special case of a broader issue. In SMC [11], a set of $n$ players $P_i$, where $i \in \{1, \ldots, n\}$, wish to jointly compute an arbitrary function of their private inputs, $(y_1, \ldots, y_n) = f(x_1, \ldots, x_n)$, where each party $P_i$ holds the private input $x_i$. No $P_i$ is willing to reveal his private input $x_i$ to $P_j$, where $i \neq j$. This rule is also valid for the outputs: each party $P_i$ receives only its output $y_i$ without learning or inferring anything else, except what can be inferred from $x_i$ and $y_i$ [10].⁸

In the general definition, SMC allows the players $P_1, \ldots, P_n$ to compute a public function $f(x_1, \ldots, x_n) = (y_1, \ldots, y_n)$. The computation is privacy preserving, i.e. nothing is revealed to a player beyond what is inferable from his private input and the outcome of the function.

This work focuses on the following two-party SFE problem. Alice has an input $x = x_1, \ldots, x_s$ and Bob has an input $y = y_1, \ldots, y_r$, where $s$ and $r$ are their respective input sizes. Both want to learn $f(x, y)$ for an agreed public function $f$, without revealing any information about their inputs.

In order to evaluate the SFE with the help of Yao's protocol, the JavaScript libraries indispensable for the final web application have been developed. As a part of this work, the following libraries were developed:

• crypto/digest/sha1.js Secure Hash Algorithm SHA-1 JavaScript library
• crypto/ot Oblivious Transfer JavaScript package
• crypto/yao Yao's garbled circuits JavaScript package

where crypto/ot and crypto/yao are composed of multiple objects. The libraries are listed in order of dependence, and each of them is explained in detail in the following sections.
⁸ In a two-party setting with the function $y = f(x_1, x_2) = x_1 + x_2$, $y$ is the output for each $P_i$. The party $P_i$ cannot compute $y$ without the input $x_j$ of party $P_j$. Moreover, each player can obviously recover the other player's input from the output by a simple subtraction. In order to keep both inputs private, a secret sharing scheme as introduced by Shamir [25] and Blakley [2] needs to be applied.
bit size   Output   Internal state   Block   Max message      Word   Rounds
           160      160              512     $2^{64} - 1$     32     80

Table 4.1: Characteristics of Secure Hash Algorithm SHA-1

4.2.1 Secure Hash Algorithm SHA-1

Theory. A hash function is a deterministic procedure that takes an arbitrary block of data and returns a fixed-size bit string, the hash value, such that any change to the data will ideally change the hash value. A message digest is a secure one-way hash function that takes arbitrary-size data and outputs a fixed-length hash value. SHA is called secure because it is designed to be computationally infeasible to find a message that corresponds to a given message digest, or to find two different messages that produce the same message digest.

In this work a library has been developed that produces a 160-bit message digest. This is the Secure Hash Algorithm SHA-1, derived from its predecessor SHA-0. SHA-1 is a cryptographic hash function that is used in the Yao protocol for the actual circuit garbling described in section 4.2.3. It was chosen as a convenient alternative to a block cipher⁹ because it supports a variable input length. Some of its main characteristics are presented in table 4.1¹⁰.

Practice. This function is the most frequently used function in the entire application, i.e. it is called multiple times for every wire in the circuit. Because of that, the key objective is to obtain a hash function that performs as fast as possible while being written as a JavaScript library. Consequently, a JavaScript library sha1.js in the crypto/digest package has been developed exclusively for this application. The whole (mostly binary) background processing is hidden behind a dedicated function which requires a message as an argument. This function is defined as function sha1(message) and returns the message hash value as a BigInteger¹¹ object.
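The interface of sha1(message), an integer in and a 160-bit integer out, can be illustrated with the short stand-in below. It is not the sha1.js library developed in this work: the sketch swaps in Node's built-in crypto module and native BigInt values in place of the hand-written digest and the jsbn BigInteger class, and the big-endian byte encoding of the message is an assumption.

```javascript
const crypto = require("crypto");

// Hypothetical stand-in for sha1(message): takes a non-negative BigInt
// and returns the 160-bit digest as a BigInt.
function sha1(message) {
  let hex = message.toString(16);
  if (hex.length % 2 === 1) hex = "0" + hex; // encode whole bytes only
  const bytes = Buffer.from(hex, "hex");     // big-endian byte string
  const digest = crypto.createHash("sha1").update(bytes).digest("hex");
  return BigInt("0x" + digest);
}
```

For instance, the three bytes of the ASCII string "abc" form the integer 0x616263, so sha1(0x616263n) yields the standard SHA-1 test vector for "abc" as a BigInt.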
The sha1 function is the key function of the library, while all the other functions implement the background logic used by the key function. It is important to notice that the message argument is also a BigInteger object. This was useful since all the random values used in the circuit were already represented as BigInteger objects. Performance was the most important point, since the overall time increases linearly with the time necessary to compute the hash function. This time cannot be neglected when the circuit size reaches and exceeds 60 000 gates. In a performance comparison with other well-known open source SHA-1 libraries available on the Internet, this library had the best performance.

⁹ In cryptography, a block cipher is a symmetric key cipher operating on fixed-length groups of bits, called blocks.
¹⁰ Internal state means the internal hash sum after each compression of a data block.
¹¹ The jsbn is a pure JavaScript implementation of arbitrary-precision integer arithmetic. The basic library (jsbn.js) contains a BigInteger implementation which is just enough for RSA encryption and not much more. The rest of the library (jsbn2.js) includes most public BigInteger methods.

4.2.2 Oblivious Transfer

Theory. An important primitive in SMC is Oblivious Transfer (OT), introduced by Michael O. Rabin in 1981.¹² OT is widely used in many secure computation protocols, for example in Yao's garbled circuits (section 4.2.3), in the Fairplay system by Malkhi [7], and in the multi-party version of Fairplay, FairplayMP [1]. Both Fairplay systems enhance Yao's original construction.¹³ In this case the most basic set-up is used. In general terms, Bob has two bits (messages) $m_0$ and $m_1$, while Alice has a choice bit $b$. Evaluating the protocol reveals $m_b$ to Alice, who learns nothing about $m_{\neg b}$, while Bob does not learn Alice's secret input $b$. The oblivious transfer scheme used is based on the RSA cryptosystem, as can be concluded from the following paragraph.

Consider a party $P_1$ that wants to send $k$ of its $n$ inputs to party $P_2$ while resting assured that $P_2$ cannot learn anything about the other $n - k$ values. Furthermore, $P_2$ is unwilling to reveal to $P_1$ which $k$ of the $n$ inputs he has chosen to receive. The first 1-out-of-2 oblivious transfer scheme¹⁴ was presented in [23]; it can be generalized to 1-out-of-n and k-out-of-n OT schemes, [16] and [9] respectively. Plenty of research has been conducted on efficient and secure OT protocols.
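The basic set-up described above, with Bob holding $m_0$ and $m_1$ and Alice holding the choice bit $b$, can be illustrated with the textbook RSA-based 1-out-of-2 construction of Even, Goldreich and Lempel. Whether the crypto/ot package follows exactly these steps is not stated in this section, so the code below is a sketch under assumptions: the function names, the toy RSA key (N = 61 · 53) and the use of Math.random, which is not cryptographically secure, are all chosen for illustration only.

```javascript
// Modular exponentiation on BigInt values.
function modPow(base, exp, mod) {
  let result = 1n;
  base = ((base % mod) + mod) % mod; // normalise negative bases
  while (exp > 0n) {
    if (exp & 1n) result = (result * base) % mod;
    base = (base * base) % mod;
    exp >>= 1n;
  }
  return result;
}

// Toy randomness for illustration; NOT cryptographically secure.
function randomBelow(n) {
  return BigInt(Math.floor(Math.random() * Number(n)));
}

// Both roles are simulated in one function; messages must be below N.
function obliviousTransfer(m0, m1, b, key) {
  const { N, e, d } = key; // Bob knows d, Alice only N and e
  // Bob: send two random values x0, x1
  const x = [randomBelow(N), randomBelow(N)];
  // Alice: blind the chosen x_b with a random k
  const k = randomBelow(N);
  const v = (x[b] + modPow(k, e, N)) % N;
  // Bob: derive both candidate keys; only k_b equals Alice's k
  const k0 = modPow(v - x[0], d, N);
  const k1 = modPow(v - x[1], d, N);
  const c = [(m0 + k0) % N, (m1 + k1) % N];
  // Alice: unblind the chosen ciphertext
  return (((c[b] - k) % N) + N) % N;
}

// Classic toy RSA key: N = 61 * 53, e * d = 1 mod lcm(60, 52)
const toyKey = { N: 3233n, e: 17n, d: 2753n };
```

Bob cannot tell which of k0 and k1 equals Alice's k, so he does not learn b, while Alice can remove the blinding only from the message she chose. A real deployment needs a full-size RSA key and a secure random source.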
Naor and Pinkas [17] significantly improved their earlier 1-out-of-n approach, but they were surpassed by the OT scheme of Tzeng [26], who also devised enhanced k-out-of-n OTs [5]. The $OT^k_n$ protocol of Chu and Tzeng [5] will be presented. The protocol requires the system parameters $(g, h, G_q)$, where $g$ and $h$ are generators of the group $G_q$ of prime order $q$, a subgroup of $\mathbb{Z}_q^*$.

¹² Rabin's form was later improved by Shimon Even, Oded Goldreich and Abraham Lempel [3], in order to build protocols for secure multiparty computation. It is generalized to 1-out-of-n OT.
¹³ Yao's original construction is described in [29].
¹⁴ One-out-of-two oblivious transfer scheme, also denoted $OT^1_2$.

1. $P_2$ chooses the $k$ inputs he wants to retrieve, denoted $\sigma_1, \ldots, \sigma_k$, and constructs a