Specifying and Implementing SNOW3G with Cryptol


Pedro Pereira and Ulisses Costa
Universidade do Minho
{pedro.mdp,ulissesaraujocosta}@gmail.com

Abstract. This paper presents a non-traditional approach to the design of symmetric-key cryptographic algorithms. SNOW 3G is the chosen algorithm and a single tool, Cryptol, is used throughout the process. Cryptol also provides a push-button verification framework for equivalence and safety checking of both specification and implementation.

Keywords: Stream cipher, SNOW 3G, Cryptol, formal methods

1 Introduction

"We claim that non-hardware people can get good results by working in Cryptol and would like to confirm or deny that claim." (Galois, Inc.)

The ever-growing complexity of cryptographic algorithms is requiring fundamental changes to the traditional way they are tested and implemented in hardware. The whole specify-implement process has become increasingly time-consuming, and different tools and languages are required for different stages of the process. Since functional specifications are written in software for clarity and are generally neither optimized nor intended for synthesis, the hardware implementation must be proven logically equivalent to them. This functional validation is done in software and adds further effort and time.

Cryptography can be seen as the mathematical manipulation of sequences of data. This is reflected in the design of Cryptol, a domain-specific language (DSL) which consists of arithmetic operations and manipulation of sequences. As a DSL, it allows cryptographers to design and implement cryptographic algorithms using familiar concepts and constructs. With the toolset, it's possible to provide a high degree of assurance in the correctness of a design and, at the same time, produce high-performance implementations in C and VHDL. Cryptol was developed in Haskell during the past decade by Galois, Inc [15].
The Cryptol interpreter is the toolset's frontend and interacts with an Intermediate Representation (IR; footnote 1) explicitly annotated with types, which allows for type-directed evaluation and translation in the backends. In this project all interpreter modes were used:

 – bit mode performs interpretation on the IR and supports the entire Cryptol language. This is where Cryptol programs are run;
 – symbolic mode performs symbolic interpretation on the IR and supports equivalence checking;
 – in C mode, programs are translated to the C language;
 – sbv mode compiles programs into an IR called symbolic bit-vector (SBV) and can also translate them to C. This mode also supports safety checking;
 – in spir mode, the IR is compiled to Signal Processing Intermediate Representation (SPIR). This mode also provides rough profiling information about the final circuit and supports equivalence checking;
 – fpga mode compiles to SPIR, translates to VHDL and uses external tools to synthesize the VHDL to an architecture-dependent netlist.

All necessary details about these modes will be presented in the following sections.

Footnote 1: Generated by the interpreter after parsing and type-checking.

The paper is structured as follows. Section 2 presents a brief overview of the SNOW 3G cipher. Section 3 explains a few design choices in the specification of the cipher in Cryptol. Section 4 details the refinement of the specification. Section 5 addresses usage of the verification framework. Section 6 discusses related work. Section 7 concludes the paper.

2 SNOW 3G cipher

SNOW 3G is a word-based synchronous stream cipher developed by Thomas Johansson and Patrik Ekdahl at Lund University in 2001. It was chosen as the stream cipher for the 3GPP encryption algorithms UEA2 and UIA2 [12].

SNOW [7], the cipher's first version, was originally submitted to NESSIE [22]. The NESSIE research project was funded from 2000 to 2003 to identify secure cryptographic primitives. During SNOW's evaluation some weaknesses were found [4, 16] and, as a result, it was not included in the NESSIE suite of algorithms. The authors then developed version 2.0 [8] of the cipher, which addresses these weaknesses and has improved performance.
When submitted to the ETSI [11] Security Algorithms Group of Experts (SAGE) for evaluation, the design was further modified to increase its resistance against algebraic attacks, and the result became SNOW 3G [14].

SNOW 3G generates a keystream of 32-bit words which mask the plaintext. The cipher requires a 128-bit key and a 128-bit initialization vector. Its structure is essentially a combination of a sixteen-stage Linear Feedback Shift Register (LFSR) and a Finite State Machine (FSM) composed of three registers R1, R2 and R3, as represented in Figure 1.

First, a key initialization phase consisting of thirty-two clock cycles is performed, altering the state of both the LFSR and the FSM. The cipher then enters the keystream generation phase, in which the first clocked output is discarded. With every subsequent clock tick, a 32-bit word of the keystream is produced.

The bitwise xor operation is denoted by $\oplus$ and addition modulo $2^{32}$ by $\boxplus$. The MULα operation is represented by $\alpha$ and its inverse, DIVα, by $\alpha^{-1}$. These are bit-mapping functions.
Fig. 1. SNOW 3G

The LFSR feeds input into the FSM and its state at time t is denoted by $(s_t, \ldots, s_{t+15})$. Each of these stages can be divided into four 8-bit words:

    $s_t = (s_{t,0} \parallel s_{t,1} \parallel s_{t,2} \parallel s_{t,3})$

In the equations below, stage indices are relative to the current state, so $s_0$ stands for $s_t$, $s_2$ for $s_{t+2}$ and so on. The LFSR's feedback word V is defined as:

    $V = (s_{0,1} \parallel s_{0,2} \parallel s_{0,3} \parallel \mathtt{0x00}) \oplus \alpha(s_{0,0}) \oplus s_2 \oplus (\mathtt{0x00} \parallel s_{11,0} \parallel s_{11,1} \parallel s_{11,2}) \oplus \alpha^{-1}(s_{11,3})$

The FSM's output F is defined as:

    $F = (s_{15} \boxplus R1) \oplus R2$

The output $z_t$, i.e. the keystream, is:

    $z_t = F \oplus s_t$

The register R1 is updated as:

    $R1 = (s_5 \oplus R3) \boxplus R2$

R2 is updated through an S-box transformation S1(R1), and R3 likewise through S2(R2). S1 is based on Rijndael's round function and S-box $S_R$ [5], while S2 is based on the $S_Q$ S-box. $S_Q$ is constructed using a Dickson polynomial [6]. For further details, SNOW 3G's complete specification can be found in [13].
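The word-level arithmetic in these equations can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's Cryptol code: byte 0 of a stage is taken to be the most significant byte (as in [13]), the function names are hypothetical, and the MULα and DIVα mappings are passed in as parameters rather than defined, since the text above only names them. The S-boxes S1 and S2 are left out.

```python
MASK32 = 0xFFFFFFFF  # keeps results within 32 bits (addition modulo 2^32)


def lfsr_feedback(s0, s2, s11, mul_alpha, div_alpha):
    """Feedback word V built from stages s0, s2 and s11.

    mul_alpha and div_alpha are caller-supplied functions mapping one
    byte to a 32-bit word (placeholders for MULalpha and DIValpha).
    Assumption: byte 0 of a stage is its most significant byte.
    """
    s0_byte0 = (s0 >> 24) & 0xFF   # s_{0,0}
    s11_byte3 = s11 & 0xFF         # s_{11,3}
    shifted_s0 = (s0 << 8) & MASK32  # s_{0,1} || s_{0,2} || s_{0,3} || 0x00
    shifted_s11 = s11 >> 8           # 0x00 || s_{11,0} || s_{11,1} || s_{11,2}
    return (shifted_s0 ^ mul_alpha(s0_byte0) ^ s2
            ^ shifted_s11 ^ div_alpha(s11_byte3))


def fsm_output(s15, r1, r2):
    """F = (s15 + R1 mod 2^32) xor R2."""
    return ((s15 + r1) & MASK32) ^ r2


def next_r1(s5, r2, r3):
    """New R1 = ((s5 xor R3) + R2) mod 2^32."""
    return ((s5 ^ r3) + r2) & MASK32


if __name__ == "__main__":
    zero = lambda byte: 0  # stand-in so the shifts can be inspected in isolation
    print(hex(lfsr_feedback(0x11223344, 0, 0x55667788, zero, zero)))
    print(hex(fsm_output(0xFFFFFFFF, 1, 0)))
```

Passing the two byte-to-word mappings as arguments keeps the sketch independent of how they are realized, whether computed on the fly or read from lookup tables.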
3 Specifying SNOW 3G in Cryptol

Cryptol has a Hindley-Milner style polymorphic type system extended with size polymorphism and arithmetic predicates. This design precisely captures constraints that naturally arise in cryptographic specifications. For instance, consider the following description from [13]:

    SNOW 3G (...) generates a sequence of 32-bit words under the control of a 128-bit key and a 128-bit initialization variable.

Hence, our keystream generation function has the following type:

    GenKS : ([4][32], [4][32]) -> [inf][32]

Note how it rigorously corresponds to the textual description: it is statically ensured that both key and initialization vector are 128 bits long (each of them is represented by four 32-bit words) and that the keystream is a sequence of 32-bit words of unbounded size (inf). We're allowed to declare finite or infinite sequences of data in Cryptol thanks to lazy evaluation.

Due to the simple and mathematical nature of SNOW 3G's components, they are trivially written in Cryptol. To illustrate, let's consider the specification for MULx, which maps 16 to 8 bits:

    $\mathrm{MULx}(v, c) = \begin{cases} (v \ll 1) \oplus c & \text{if the most significant bit of } v \text{ is } 1 \\ v \ll 1 & \text{otherwise} \end{cases}$

And its equivalent in Cryptol:

    MULx : ([8], [8]) -> [8];
    MULx (v, c) = if (v ! 0) then (v << 1) ^ c else v << 1;

Cryptol indexes words in little-endian order by default, hence the ! operator to retrieve v's most significant bit. Since the other components' definitions are practically identical to their specifications, we're omitting them from this section, but they can be viewed in appendix A.

During initialization mode, the cipher executes two clocking operations: one for the LFSR and the other for the FSM.
These were written as recursive streams which successfully capture the cyclic essence of the operations:

    Init : ([4][32], [4][32]) -> ([16][32], [3][32]);
    Init (k, iv) = (ClockLFSR_IM @ 32, ClockFSM @ 32)
      where {
        ClockLFSR_IM : [inf][16][32];
        ClockLFSR_IM = [(Init_LFSR (k, iv))] #
                       [| (drop (1, LFSR) #
                           [(V (LFSR @ 0, LFSR @ 2, LFSR @ 11) ^ F (R @ 0, R @ 1, LFSR @ 15))])
                       || LFSR <- ClockLFSR_IM
                       || R <- ClockFSM |];
        ClockFSM : [inf][3][32];
        ClockFSM = [[0 0 0]] #
                   [| [((R @ 1 + (R @ 2 ^ LFSR @ 5)) & 0xFFFFFFFF)
                       (S1 (R @ 0))
                       (S2 (R @ 1))]
                   || LFSR <- ClockLFSR_IM
                   || R <- ClockFSM |];
      };

Each element of both streams represents a particular iteration, and we're only interested in the element at index 32, i.e. the state after 32 clockings. Although the streams' size isn't finitely restricted, by requesting only that element of each stream, lazy evaluation guarantees that the recursion ends after this iteration.

The keystream generation mode was written in a similar way and its definition can also be viewed in appendix A.

4 Refining into an implementation

Unfortunately, there are some restrictions that must be applied to the code in order for the compiler to successfully translate from Cryptol to VHDL:

 – No division by values other than powers of 2;
 – No polymorphic definitions;
 – No recursive functions;
 – No higher-order functions (partially).

The first restriction is a hardware limitation and is not required by the Cryptol compiler itself. Regarding the second one, the compiler can't generate an infinite number of definitions and therefore a specific size must be attributed to all functions. The third restriction forbids definitions of recursive functions; we can, however, define recursive sequences, and every recursive function can be rewritten with a recursive sequence. Finally, although we can't have functions returning other functions, functions can be passed as parameters to others. For this particular algorithm, first-order functions are sufficient. Therefore, the presented specification only needs to be rewritten to satisfy the third restriction.
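The idea behind the third restriction, replacing a recursive function by indexing into a recursively defined sequence, can be sketched outside Cryptol. The Python below is an illustration under assumptions, not the paper's code: the recursive form of MULxPOW is the one given in the SNOW 3G specification [13], the helper names are hypothetical, and a Python generator stands in for Cryptol's lazy stream.

```python
from itertools import islice


def mulx(v, c):
    """MULx: shift v left by one within 8 bits; xor with c if v's MSB was set."""
    shifted = (v << 1) & 0xFF
    return shifted ^ c if v & 0x80 else shifted


def mulxpow_recursive(v, i, c):
    """Recursive form: apply MULx to v exactly i times."""
    if i == 0:
        return v
    return mulx(mulxpow_recursive(v, i - 1, c), c)


def mulx_stream(v, c):
    """Unbounded sequence v, MULx(v, c), MULx(MULx(v, c), c), ... produced lazily."""
    element = v
    while True:
        yield element
        element = mulx(element, c)


def mulxpow_sequence(v, i, c):
    """Sequence form: take element i of the stream; nothing past it is computed."""
    return next(islice(mulx_stream(v, c), i, None))


if __name__ == "__main__":
    agree = all(
        mulxpow_recursive(v, i, 0xA9) == mulxpow_sequence(v, i, 0xA9)
        for v in range(256)
        for i in (0, 1, 7, 64)
    )
    print("recursive and sequence forms agree:", agree)
```

Because the generator only advances as far as the requested index, the stream can be conceptually infinite while the work done stays finite, which is the same property the text relies on when indexing into the initialization streams.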
The only recursive function defined is MULxPOW, which can be trivially rewritten as:

    MULxPOW : ([8], [32], [8]) -> [8];
    MULxPOW (v, i, c) = res @ i
      where res = [v] # [| MULx (e, c) || e <- res |];

On a different subject, Init will be laid over time (footnote 2) because its more liberal definition (as expected in a specification) deals with infinite streams even though only the first 32 iterations are actually required. One could argue for restricting the streams' size to 33 elements, since it seems useless to keep them infinite. However, this is ignored, as explained later in this section.

Footnote 2: An infinite stream of output would also require infinite hardware; instead, circuitry is reused, forcing data to be processed over time.

There are two ways of representing a sequential circuit in Cryptol: the unclocked step model and the clocked stream model. An accurate performance analysis requires data to be processed over time because of the useful clocking constraints. The only way to explicitly force processing over time is by converting the top-level function into the stream model, which essentially implies receiving and/or producing infinite streams of data. Our GenKS already outputs infinite data, so no changes are required.

The interpreter provides rough timing analysis and size estimates when translating to VHDL in spir mode if spir_profile=detailed is set. It's best to keep refining an implementation while using this mode, because translating to SPIR takes significantly less time than synthesis yet still provides enough information to help produce an efficient implementation.

In order to generate an efficient circuit, some optimizations are required. Optimizations rely on space-time tradeoffs. A possible optimization would be to reduce some of the computational effort by converting mapping functions to static lookup tables, trading more space for less time. Static lookup tables are also automatically translated into BlockRAMs (a fast component of FPGA circuits) by the Cryptol compiler. For instance, MULxPOW is a mapping function which receives three 8-bit parameters, which means that $256^2$ tables with 256 elements each would be required to maintain equivalence. Realistically, this wouldn't be an optimization: the resulting circuit would be fast, but the gain does not compensate for the large amount of space traded.

However, in all MULxPOW calls, the third parameter is always the same and the second one only assumes eight different values.
We can then shorten the previous range to just eight tables of 256 8-bit elements each, which only requires about 16 Kbit of memory (8 × 256 × 8 bits), making it a much more desirable optimization. But we can do better if MULa and DIVa are converted instead. The space required for this optimization is also 16 Kbit (two tables of 256 32-bit elements) and it would imply even less computational logic.

Here is a detailed report of the resulting implementation in SPIR:

    snow3g-impl> :set spir spir_profile=detailed
    snow3g-impl> GenKS
    ...
    === Summary of Path Timing Estimates ===
    Overall clock period:     8.38 ns (119.3 MHz)
    Input pin to flip-flop:   1.94 ns (514.7 MHz)
    Flip-flop to flip-flop:   7.72 ns (129.6 MHz)
    Flip-flop to output pin:  8.38 ns (119.3 MHz)
    Input pin to output pin:  No paths
    === Summary of Size Estimates ===
    Estimated total size: about 6848 LUTs, 2776 Flipflops
    === Circuit Timing ===
    circuit latency: 37 cycles (36 cycles plus propagation delay)
    circuit rate:    one element per cycle
    output length:   unbounded
    total time:      unbounded

Although the report still doesn't look promising, these numbers are rough estimates and a few options used in a later phase will influence them.

There are essentially three ways of controlling data flow: by parallelizing, sequencing or pipelining. Each approach implies a different space-time tradeoff and translates into different VHDL code. Cryptol provides three pragmas to automate these tradeoffs: par, seq and reg respectively. The par pragma causes circuitry to be replicated, whereas the seq pragma causes circuitry to be reused over multiple clock cycles. By default, the compiler replicates circuitry as much as possible in exchange for performance. The user overrides this behavior using seq; par is only useful for switching back to the default behavior within an instance of seq. The reg pragma imposes pipelining. In a pipelined design, one separates a function into several smaller computational units, each of which is a stage in the pipeline that consumes output from the previous stage and produces output for the next one. Each stage is a relatively small circuit with some propagation delay. The clock rate is limited by the stage in the pipeline with the highest propagation delay, whereas the un-pipelined implementation would be limited by the sum of the propagation delays of all stages. So, rather than perform one large computation on one input during a very long clock cycle, an n-stage pipeline performs n parallel computations on n partial results, each corresponding to a different input to the pipeline.

Our circuit's remaining computational logic resides in Init and GenKS. These functions deal with infinite streams, so they're going to be translated as sequential circuits.
Their throughput, however, could probably be improved by pipelining some of their components. In fact, using reg on a section did result in a greater clock rate, which influenced throughput.

A detailed report concerning this resulting implementation can be found in section 4.2.

4.1 C code generation

Both the C and sbv backends generate C code, although for different subsets of the language and with different goals. C mode can deal with almost the entire Cryptol language, while only monomorphic, first-order, symbolically terminating and finite functions can be translated in sbv mode. This is because sbv was designed for formal verification using SMT solvers, whereas the C backend was mainly designed for integration with external C projects. The other difference between the C and sbv modes is that the code generated by sbv does not perform memory allocation or deallocation at run-time, as opposed to the code generated in C mode.

The simplicity of the SBV representation is what allows Cryptol to generate really fast C code. However, translation in sbv mode fails:

    Loading "snow3g_impl.cry".. Checking types.. Processing.. Done!
    snow3g_impl> :set sbv
    snow3g_impl> :translate GenKS
    PANIC: SBV2C: Not yet implemented: BVRor over unsupported sizes s198:[8] -0x1:[3]

The reason is that the SBV to C compiler was done as a proof of concept and currently only processes specific constructs.

C code generation in the C backend depends on the following libraries:

 – Cryptol.h: contains all the necessary prototypes, macros and a few standard C includes;
 – CryAlloc.o: implements a custom memory allocator/deallocator for the Cryptol run-time;
 – CryPrim.o: implements C equivalents of Cryptol's built-in functions;
 – CryStream.o: C library for representing and manipulating infinite streams.

Compiling (:compile) in this mode produces fast code, although it's not as fast as the hand-written C implementation found in [13]. Also, the generated code is a bit more cryptic, as demonstrated by the C definition of MULxPOW:

    uint8
    MULxPOW (uint8 v_MULxPOW, uint32 i_MULxPOW, uint8 c_MULxPOW)
    {
        uint32 local7 = 0x0;
        uint8 local8 = 0x0;
        uint8 MULxPOW_res = 0x0;
        uint32 *mrk449 = getAllocMark ();
        MULxPOW_res = v_MULxPOW;
        for (local7 = 0x0; local7 < i_MULxPOW; local7 += 0x1) {
            local8 = MULx0 (MULxPOW_res, c_MULxPOW);
            MULxPOW_res = local8;
        }
        freeUntil (mrk449);
        return MULxPOW_res;
    }

4.2 VHDL code generation

Modes spir and fpga provide VHDL generation via :translate. This process depends on the following libraries:

 – RTLib.vhdl: run-time library which is linked with the generated VHDL code;
 – RTLib Xilinx.vhdl: defines the Xilinx-specific parts of the VHDL run-time library.

Ultimately, however, synthesis, simulation and exact performance reports require external tools. For synthesis, Cryptol currently supports xst from Xilinx and Synplicity's synplify-pro. Regarding simulation, GHDL, ModelSim and Xilinx's own simulator are among those supported. After installing any of these, Cryptol should be ready to interact with them out of the box. We used the following: fpga_synthesis=xst and vhdl_simulation=ise.

In this mode, a more exact profiling report of the proposed implementation may be generated:

    snow3g-impl-pipe> :set fpga fpga_board=spartan3e fpga_part=xc3s1600e-5fg484 fpga_netlist=vhdl fpga_blockram=behavioural fpga_optlevel=6 +v
    snow3g-impl-pipe> GenKS
    ...
    Timing Summary:
    ---------------
    Speed Grade: -5
    Minimum period: 6.214 ns (Maximum Frequency: 160.930 MHz)
    Minimum input arrival time before clock: 2.892 ns
    Maximum output required time after clock: 11.497 ns
    Maximum combinational path delay: No path found

    Device Utilization (size summary):
    ----------------------------------
    Selected Device: 3s1600efg484-5
    Number of Slices:            1212 out of 14752   8%
    Number of Slice Flip Flops:  1810 out of 29504   6%
    Number of 4 input LUTs:      2192 out of 29504   7%

The following interpreter options were used: fpga_board=spartan3e to specify which FPGA board the circuit should be placed on (Cryptol currently supports two: spartan3e or avnet_v4mb), fpga_part=xc3s1600e-5fg484 for the specific FPGA part, fpga_netlist=vhdl for VHDL netlist generation, fpga_blockram=behavioural to take advantage of one-cycle-latency BlockRAMs, fpga_optlevel=6 for maximum code optimization and +v for displaying the reports of the various stages during implementation.

Regarding space, the proposed implementation is quite compact. But in order to assess whether it's fast, comparisons need to be made with other implementations, and this requires the throughput value, which can be calculated as:

    $\text{throughput} = \dfrac{\text{clock rate} \times \text{output width}}{\text{output rate}}$

The circuit's clock rate is 160 MHz and it produces a 32-bit word per cycle, as seen previously in spir's timing report.
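Plugging these values into the formula is a one-line computation. The sketch below assumes that "output rate" means the number of clock cycles needed per output element (one, according to the timing report); the function and parameter names are illustrative only.

```python
def throughput_mbps(clock_mhz, output_width_bits, cycles_per_output):
    """Throughput in Mbps: (clock rate * output width) / output rate.

    Assumption: the output rate is expressed in clock cycles per output
    element, so one element per cycle gives a divisor of 1.
    """
    return clock_mhz * output_width_bits / cycles_per_output


if __name__ == "__main__":
    # 160 MHz clock, one 32-bit keystream word per cycle
    print(throughput_mbps(160, 32, 1))  # 5120.0
```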
Therefore, the proposed implementation's throughput is equal to 5120 Mbps. Comparing it with other implementations:

    Implementation         Frequency (MHz)   Throughput (Mbps)
    Proposed SNOW 3G       160               5120
    SNOW 3G [19]           249               7968
    SNOW 3G [9]            100               2500
    SNOW 2.0 [18]          141               4512
    SNOW 1.0 [2]           66.5              2128

    Table 1. Experimental results

5 Verification framework

Since specifications are geared towards a clear and rigorous understanding of behavior, while implementations must be optimized and designed for synthesis, the two are bound to become very different even when written in the same language. It is therefore imperative to check whether an implementation is functionally equivalent to its specification. And since we're talking about assurance, it would be desirable to assess whether an implementation can be safely executed, i.e. won't produce run-time errors.

Cryptol's verification framework has been designed to check these equivalence and safety problems. The :eq command checks whether two functions f and g agree on all inputs. If f and g are not equivalent, Cryptol identifies a value x such that f x ≠ g x. This is accomplished by generating their formal models and feeding them for comparison to a Boolean Satisfiability (SAT; footnote 3) solver. Cryptol currently supports two SAT solvers: JAIG [24] and ABC [1]. The latter was the fastest with our code and was therefore used instead of the default one.

A formal model is either a symbolic bit-vector (SBV) or an and-inverter graph (AIG). The former is generated in sbv mode and the latter can be generated in symbolic, spir or fpga modes. This means that currently, with SBV, it's only possible to do Cryptol ⇔ Cryptol equivalence checking, while AIG-based equivalence checking may be done across different backends.

SBV is a much simpler language designed for formal verification with Satisfiability Modulo Theories (SMT; footnote 4) solvers. It's completely monomorphised, there are no jumps because all function calls are unrolled, and it consists only of bit-vector data and arithmetic/logical operations. Further details about SBV in Cryptol can be consulted in [10].
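The question that :eq answers, whether two functions agree on all inputs and, if not, which input separates them, can be illustrated on a domain small enough to enumerate. The Python sketch below is only an analogy: it uses exhaustive search where Cryptol uses formal models and a SAT solver, and the function names, including the deliberately faulty variant, are hypothetical.

```python
from itertools import product


def find_counterexample(f, g, domains):
    """Return the first argument tuple on which f and g differ, or None."""
    for args in product(*domains):
        if f(*args) != g(*args):
            return args
    return None


def mulx_spec(v, c):
    """Specification-style MULx on 8-bit values."""
    shifted = (v << 1) & 0xFF
    return shifted ^ c if v & 0x80 else shifted


def mulx_impl(v, c):
    """Branch-free variant: v >> 7 is 1 exactly when the MSB of v is set."""
    return ((v << 1) & 0xFF) ^ (c * (v >> 7))


def mulx_buggy(v, c):
    """Faulty variant that forgets to truncate the shift to 8 bits."""
    return (v << 1) ^ c if v & 0x80 else v << 1


if __name__ == "__main__":
    bytes_ = range(256)
    print(find_counterexample(mulx_spec, mulx_impl, [bytes_, bytes_]))   # None
    print(find_counterexample(mulx_spec, mulx_buggy, [bytes_, bytes_]))  # (128, 0)
```

Enumeration is feasible here because the domain has only 65,536 points; for inputs such as a 128-bit key and initialization vector it is hopeless, which is why a solver-based approach is needed.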
An And-Inverter Graph (AIG) is a directed, acyclic graph representing a boolean logic circuit composed only of inverters and two-input AND gates. Optional inverters are modeled as labels on the edges, and AND gates correspond to graph nodes. AIGs can represent arbitrary boolean functions and allow for efficient manipulation of such functions [20]. Also, the recent emergence of much more efficient SAT solvers, coupled with AIGs as the circuit representation, has led to remarkable speedups in solving a wide variety of boolean problems [21].

Footnote 3: Decision problem for determining if the variables of a given boolean formula can be assigned in such a way as to make the formula evaluate to True.

Footnote 4: Decision problem for determining whether logical formulas are satisfiable with respect to combinations of background theories expressed in classical first-order logic with equality.

On the other hand, :safe checks for possible run-time exceptions such as divisions by zero or out-of-bounds indexes and, if any exist, outputs the values which result in these exceptions. Guaranteeing the safe execution of a Cryptol program implies that its subsequent translations to C will be safe as well.

However, for the full Cryptol language, both the equivalence and safety checking problems are undecidable. They do become solvable if a restricted subset of the language is adopted. Therefore, Cryptol's verification framework only supports functions that are:

 – Monomorphic;
 – Finite;
 – Symbolically terminating;
 – First-order.

The first restriction comes from the fact that the framework's underlying logic is that of fixed-size bit vectors. Functions must also be finite because the system lacks induction capabilities. The third restriction is required because the symbolic termination problem is undecidable in general; therefore stream recursions must be used. And because the only available data types in the underlying logic are fixed-size bit vectors, everything is expanded away, so it's impossible to represent a higher-order function.

But even with this restricted subset, the equivalence checking problem remains NP-complete. While most practical instances should be solved in a feasible amount of time, one cannot expect a fast analysis for every instance. Some instances can be solved much faster, though, if human guidance is introduced. Cryptol's equivalence checker can translate problems into Isabelle/HOL notation via the :isabelle command, reducing the equivalence question to a theorem to be proved in higher-order logic [23].

The proposed implementation is already monomorphic, symbolically terminating and first-order, but the finite restriction applies.
GenKS is the only infinite function and so the size of its output is fixed at, for instance, ten 32-bit words with the inclusion of take functions.

The following shows how to check the equivalence and safety of GenKS for the first 10 words of output:

    Loading "snow3g_spec.cry".. Checking types.. Processing.. Done!
    snow3g_spec> :set symbolic abc
    snow3g_spec> :fm (\(x, y) -> take (10, GenKS (x, y))) "genks_spec.aig"
    snow3g_spec> :load ./snow3g_impl.cry
    Loading "snow3g_impl.cry".. Checking types.. Processing.. Done!
    snow3g_impl> :set symbolic abc
    snow3g_impl> :eq (\(x, y) -> take (10, GenKS (x, y))) "genks_spec.aig"
    True
    snow3g_impl> :set sbv
    snow3g_impl> :safe (\(x, y) -> take (10, GenKS (x, y)))
    "(\(x, y) -> take (10, GenKS (x, y)))" is safe; no safety violations exist.

Equivalence checking may be used for yet another purpose. The Cryptol compiler is a verifying one [17], so when translating from Cryptol to VHDL, for instance, it's necessary to prove the functional equivalence between the two:

    snow3g_impl> :set spir
    snow3g_impl> :eq (\(x, y) -> take (10, GenKS (x, y))) "genks_spec.aig"
    True

There's also a :sat command which can be used to find satisfying assignments for bit-valued functions. :sat can be used to check interesting properties. For instance, given the following finite definitions of the cipher and decipher operations:

    encrypt, decrypt : ([10][32], [4][32], [4][32]) -> [10][32];
    encrypt (pt, key, iv) = [| p ^ k || p <- pt || k <- GenKS (key, iv) |];
    decrypt (ct, key, iv) = [| c ^ k || c <- ct || k <- GenKS (key, iv) |];

Can encrypt produce the result 0?

    :sat (\(pt, key, iv) -> encrypt (pt, key, iv) == zero)

Are there any different plaintext values p1 and p2 such that they map to the same ciphertext for the same key?

    :sat (\(p1, p2, key, iv) -> (p1 != p2) & (encrypt (p1, key, iv) == encrypt (p2, key, iv)))

In each of the two situations above, two formal models are generated: one for the bit-valued function (property) being checked and another for a function f defined as:

    f : ([10][32], [4][32], [4][32]) -> Bit;
    f x = False;

The SAT solver then checks these two models for equivalence, and the first counter-example found is returned as the satisfying solution.

It's also worth mentioning that the equivalence checking problem can be posed as a satisfiability problem and vice versa.
In general, the following two queries semantically encode the same problem:

    :eq f g
    :sat (\x -> f x != g x)

Cryptol also supports a flexible way of checking certain properties of an algorithm with theorem proving. In Cryptol, theorems are simple bit-valued functions returning either True or False. This theorem-function correspondence provides consistency and avoids the need for an extra language to express properties. The :prove command generates two formal models, one for the theorem and the other for a function f defined as:

    f : ([10][32], [4][32], [4][32]) -> Bit;
    f x = True;

The two models are then checked for equivalence. The following theorem represents the cipher's correctness (i.e. decryption undoes encryption):

    correct : ([10][32], [4][32], [4][32]) -> Bit;
    theorem correct : {pt key iv}. decrypt (encrypt (pt, key, iv), key, iv) == pt;

This can be proved as follows:

    snow3g-impl> :set symbolic abc
    snow3g-impl> :prove correct
    Q.E.D.

Evidently, this holds only for the first 10 words. Although an algorithm's total correctness can't be proven with this restricted subset of the Cryptol language, the verification system helps to gain confidence in the algorithm's behavior. For further details regarding the framework, [10] should be consulted.

6 Related work

The use of tools such as Frama-C, which implement automatic proving of algorithms in C, is possible and has already been done successfully [3] for cryptographic algorithms such as RC4. However, the provers are guided by special annotations which represent properties such as Hoare-style pre/post-conditions and invariants. Some properties may be impossible to prove automatically, in which case additional proofs have to be performed with the help of an interactive proof assistant such as Coq. Another problem with this approach is having to deal with unrelated details inherent to a low-level and architecture-dependent language like C, such as (de)allocations, pointer manipulation and valid array accesses.

Other tools, like CryptoVerif, provide a generic mechanism for specifying the security assumptions on cryptographic primitives. CryptoVerif is based on observational equivalence, which induces rewriting rules applicable in contexts that satisfy some properties.
The generated security proofs are sequences of games [25], and the desired properties are proven if each individual proof remains valid for a polynomial number of sessions (security parameter) in the presence of an active adversary. This method requires the correct transcription of C code or the exact security properties described in the CryptoVerif language.

7 Conclusions

We wrote a specification for SNOW 3G. We then optimized it and also generated a hardware implementation. This was done with a single tool, Cryptol. The generated VHDL implementation is both compact and fast. We have successfully confirmed Galois' claim: non-hardware people such as us can get good results by working in Cryptol. A user's perspective on the Cryptol language and toolset was also presented.

During the writing of this paper, Cryptol version 1.8.4 was used, and since it's constantly being developed, newer versions might differ regarding some of the aspects discussed.

8 Acknowledgments

We want to thank and express our profound respect to our tutor Prof. Manuel Alcino Cunha for his reliable guidance. We also want to thank Mr. Levent Erkök from Galois for his invaluable help.

References

 1. ABC: A System for Sequential Synthesis and Verification. http://www.eecs.berkeley.edu/~alanmi/abc/. Berkeley Logic Synthesis and Verification Group.
 2. K. Alexander, R. Karri, I. Minkin, K. Wu, P. Mishra, and X. Li. Towards 10-100 Gbps Cryptographic Architectures. In International Symposium On Computer and Information Sciences, Orlando, pages 25–30, 2002.
 3. J. B. Almeida, M. Barbosa, J. S. Pinto, and B. Vieira. Deductive Verification of Cryptographic Software. Technical Report DI-CCTC-09-03, Universidade do Minho, 2009.
 4. D. Coppersmith, S. Halevi, and C. Jutla. Cryptanalysis of Stream Ciphers With Linear Masking. In Proc. of CRYPTO'02, pages 515–532. Springer-Verlag, 2002.
 5. J. Daemen and V. Rijmen. The Design of Rijndael. Springer-Verlag, 2002.
 6. L. E. Dickson. The analytic representation of substitutions on a power of a prime number of letters with a discussion of the linear group. Annals of Mathematics, 11:65–120, 161–183, 1897.
 7. P. Ekdahl. LFSR Based Stream Ciphers Analysis and Design. PhD thesis, Department of Information Technology, Lund University, Sweden, 2003.
 8. P. Ekdahl and T. Johansson. A New Version of the Stream Cipher SNOW. In SAC '02: Revised Papers from the 9th Annual International Workshop on Selected Areas in Cryptography, pages 47–61. Springer-Verlag, 2003.
 9. Elliptic Semiconductor Inc. CLP-41 SNOW 3G Cipher Core, available at http://www.ellipticsemi.com/products-clp-41.php.
10. L. Erkök and J. Matthews. Pragmatic Equivalence and Safety Checking in Cryptol. In Programming Languages meets Program Verification, PLPV'09, Georgia, USA, pages 73–81. ACM Press, 2009.
11. European Telecommunications Standards Institute. http://www.etsi.org.
12. ETSI/SAGE. Specification of the 3GPP Confidentiality and Integrity Algorithms UEA2 and UIA2. Document 1: UEA2 and UIA2 Specification, version: 1.1. http://www.gsmworld.com/documents/etsi_sage_06_09_06.pdf, 2006.
13. ETSI/SAGE. Specification of the 3GPP Confidentiality and Integrity Algorithms UEA2 and UIA2. Document 2: SNOW 3G Specification, version: 1.1. http://www.gsmworld.com/documents/snow_3g_spec.pdf, 2006.
14. ETSI/SAGE. Specification of the 3GPP Confidentiality and Integrity Algorithms UEA2 and UIA2. Document 5: Design and Evaluation report, version: 1.0. http://www.gsmworld.com/documents/uea2_design_evaluation.pdf, 2006.
15. Galois, Inc. http://www.galois.com.
16. P. Hawkes and G. G. Rose. Guess-and-Determine Attacks on SNOW. In SAC '02: Revised Papers from the 9th Annual International Workshop on Selected Areas in Cryptography, pages 37–46. Springer-Verlag, 2003.
17. T. Hoare. Towards the Verifying Compiler. In Formal Methods at the Crossroads: From Panacea to Foundational Support, volume 2757 of LNCS, pages 151–160. Springer, 2003.
18. P. Kitsos. Hardware Implementations for the ISO/IEC 18033-4:2005 Standard for Stream Ciphers. International Journal of Signal Processing (IJSP), Number 1, 3:66–73, 2006.
19. P. Kitsos, G. Selimis, and O. Koufopavlou. High Performance ASIC Implementation of the SNOW 3G Stream Cipher. In IFIP/IEEE VLSI-SoC 2008 - International Conference on Very Large Scale Integration (VLSI SOC), Rhodes Island, Greece, October, pages 13–15, 2008.
20. A. Kuehlmann, V. Paruthi, F. Krohm, and M. K. Ganai. Robust Boolean Reasoning for Equivalence Checking and Functional Property Verification. IEEE Trans. CAD, 21:1377–1394, 2002.
21. A. Mishchenko, S. Chatterjee, and R. Brayton. Improvements to technology mapping for LUT-based FPGAs. In FPGA '06: Proceedings of the 2006 ACM/SIGDA 14th international symposium on Field programmable gate arrays, California, USA, pages 41–49. ACM, 2006.
22. New European Schemes for Signature, Integrity, and Encryption. https://www.cosic.esat.kuleuven.be/nessie/.
23. T. Nipkow, L. C. Paulson, and M. Wenzel. Isabelle/HOL — A Proof Assistant for Higher-Order Logic, volume 2283 of LNCS. Springer, 2002.
24. T. Nordin. The JAIG equivalence checker, 2005.
25. D. Nowak. A Framework for Game-Based Security Proofs. In Information and Communications Security, 9th International Conference, ICICS 2007, Zhengzhou, China, Proceedings, volume 4861 of LNCS, pages 319–333. Springer, 2007.

A SNOW 3G Reference Specification

// SNOW 3G Specification
// ---------------------
//
// Pedro Pereira & Ulisses Costa
/////////////////////////////////

// Components
/////////////

MULx : ([8], [8]) -> [8];
MULx (v, c) = if (v ! 0) then (v << 1) ^ c
              else (v << 1);

MULxPOW : ([8], [32], [8]) -> [8];
MULxPOW (v, i, c) = if (i == 0) then v
                    else MULx (MULxPOW (v, (i - 1), c), c);

MULa : [8] -> [32];
MULa (c) = join [ (MULxPOW (c, 239, 0xA9))
                  (MULxPOW (c, 48, 0xA9))
                  (MULxPOW (c, 245, 0xA9))
                  (MULxPOW (c, 23, 0xA9)) ];

DIVa : [8] -> [32];
DIVa (c) = join [ (MULxPOW (c, 64, 0xA9))
                  (MULxPOW (c, 6, 0xA9))
                  (MULxPOW (c, 39, 0xA9))
                  (MULxPOW (c, 16, 0xA9)) ];

// Rijndael S-box
Sr : [8] -> [8];
Sr (x) = sb@x
  where sb = [ 0x63 0x7C 0x77 0x7B 0xF2 0x6B 0x6F 0xC5 0x30 0x01 0x67 0x2B 0xFE 0xD7 0xAB 0x76
               0xCA 0x82 0xC9 0x7D 0xFA 0x59 0x47 0xF0 0xAD 0xD4 0xA2 0xAF 0x9C 0xA4 0x72 0xC0
               0xB7 0xFD 0x93 0x26 0x36 0x3F 0xF7 0xCC 0x34 0xA5 0xE5 0xF1 0x71 0xD8 0x31 0x15
               0x04 0xC7 0x23 0xC3 0x18 0x96 0x05 0x9A 0x07 0x12 0x80 0xE2 0xEB 0x27 0xB2 0x75
               0x09 0x83 0x2C 0x1A 0x1B 0x6E 0x5A 0xA0 0x52 0x3B 0xD6 0xB3 0x29 0xE3 0x2F 0x84
               0x53 0xD1 0x00 0xED 0x20 0xFC 0xB1 0x5B 0x6A 0xCB 0xBE 0x39 0x4A 0x4C 0x58 0xCF
               0xD0 0xEF 0xAA 0xFB 0x43 0x4D 0x33 0x85 0x45 0xF9 0x02 0x7F 0x50 0x3C 0x9F 0xA8
               0x51 0xA3 0x40 0x8F 0x92 0x9D 0x38 0xF5 0xBC 0xB6 0xDA 0x21 0x10 0xFF 0xF3 0xD2
               0xCD 0x0C 0x13 0xEC 0x5F 0x97 0x44 0x17 0xC4 0xA7 0x7E 0x3D 0x64 0x5D 0x19 0x73
               0x60 0x81 0x4F 0xDC 0x22 0x2A 0x90 0x88 0x46 0xEE 0xB8 0x14 0xDE 0x5E 0x0B 0xDB
               0xE0 0x32 0x3A 0x0A 0x49 0x06 0x24 0x5C 0xC2 0xD3 0xAC 0x62 0x91 0x95 0xE4 0x79
               0xE7 0xC8 0x37 0x6D 0x8D 0xD5 0x4E 0xA9 0x6C 0x56 0xF4 0xEA 0x65 0x7A 0xAE 0x08
               0xBA 0x78 0x25 0x2E 0x1C 0xA6 0xB4 0xC6 0xE8 0xDD 0x74 0x1F 0x4B 0xBD 0x8B 0x8A
               0x70 0x3E 0xB5 0x66 0x48 0x03 0xF6 0x0E 0x61 0x35 0x57 0xB9 0x86 0xC1 0x1D 0x9E
               0xE1 0xF8 0x98 0x11 0x69 0xD9 0x8E 0x94 0x9B 0x1E 0x87 0xE9 0xCE 0x55 0x28 0xDF
               0x8C 0xA1 0x89 0x0D 0xBF 0xE6 0x42 0x68 0x41 0x99 0x2D 0x0F 0xB0 0x54 0xBB 0x16 ];

Sq : [8] -> [8];
Sq (x) = sb@x
  where sb = [ 0x25 0x24 0x73 0x67 0xD7 0xAE 0x5C 0x30 0xA4 0xEE 0x6E 0xCB 0x7D 0xB5 0x82 0xDB
               0xE4 0x8E 0x48 0x49 0x4F 0x5D 0x6A 0x78 0x70 0x88 0xE8 0x5F 0x5E 0x84 0x65 0xE2
               0xD8 0xE9 0xCC 0xED 0x40 0x2F 0x11 0x28 0x57 0xD2 0xAC 0xE3 0x4A 0x15 0x1B 0xB9
               0xB2 0x80 0x85 0xA6 0x2E 0x02 0x47 0x29 0x07 0x4B 0x0E 0xC1 0x51 0xAA 0x89 0xD4
               0xCA 0x01 0x46 0xB3 0xEF 0xDD 0x44 0x7B 0xC2 0x7F 0xBE 0xC3 0x9F 0x20 0x4C 0x64
               0x83 0xA2 0x68 0x42 0x13 0xB4 0x41 0xCD 0xBA 0xC6 0xBB 0x6D 0x4D 0x71 0x21 0xF4
               0x8D 0xB0 0xE5 0x93 0xFE 0x8F 0xE6 0xCF 0x43 0x45 0x31 0x22 0x37 0x36 0x96 0xFA
               0xBC 0x0F 0x08 0x52 0x1D 0x55 0x1A 0xC5 0x4E 0x23 0x69 0x7A 0x92 0xFF 0x5B 0x5A
               0xEB 0x9A 0x1C 0xA9 0xD1 0x7E 0x0D 0xFC 0x50 0x8A 0xB6 0x62 0xF5 0x0A 0xF8 0xDC
               0x03 0x3C 0x0C 0x39 0xF1 0xB8 0xF3 0x3D 0xF2 0xD5 0x97 0x66 0x81 0x32 0xA0 0x00
               0x06 0xCE 0xF6 0xEA 0xB7 0x17 0xF7 0x8C 0x79 0xD6 0xA7 0xBF 0x8B 0x3F 0x1F 0x53
               0x63 0x75 0x35 0x2C 0x60 0xFD 0x27 0xD3 0x94 0xA5 0x7C 0xA1 0x05 0x58 0x2D 0xBD
               0xD9 0xC7 0xAF 0x6B 0x54 0x0B 0xE0 0x38 0x04 0xC8 0x9D 0xE7 0x14 0xB1 0x87 0x9C
               0xDF 0x6F 0xF9 0xDA 0x2A 0xC4 0x59 0x16 0x74 0x91 0xAB 0x26 0x61 0x76 0x34 0x2B
               0xAD 0x99 0xFB 0x72 0xEC 0x33 0x12 0xDE 0x98 0x3B 0xC0 0x9B 0x3E 0x18 0x10 0x3A
               0x56 0xE1 0x77 0xC9 0x1E 0x9E 0x95 0xA3 0x90 0x19 0xA8 0x6C 0x09 0xD0 0xF0 0x86 ];

S1 : [32] -> [32];
S1 (w) = join [ (Sr (w0) ^ Sr (w1) ^ MULx (Sr (w2), 0x1B) ^ Sr (w2) ^ MULx (Sr (w3), 0x1B))
                (Sr (w0) ^ MULx (Sr (w1), 0x1B) ^ Sr (w1) ^ MULx (Sr (w2), 0x1B) ^ Sr (w3))
                (MULx (Sr (w0), 0x1B) ^ Sr (w0) ^ MULx (Sr (w1), 0x1B) ^ Sr (w2) ^ Sr (w3))
                (MULx (Sr (w0), 0x1B) ^ Sr (w1) ^ Sr (w2) ^ MULx (Sr (w3), 0x1B) ^ Sr (w3)) ]
  where [w3 w2 w1 w0] = split w;

S2 : [32] -> [32];
S2 (w) = join [ (Sq (w0) ^ Sq (w1) ^ MULx (Sq (w2), 0x69) ^ Sq (w2) ^ MULx (Sq (w3), 0x69))
                (Sq (w0) ^ MULx (Sq (w1), 0x69) ^ Sq (w1) ^ MULx (Sq (w2), 0x69) ^ Sq (w3))
                (MULx (Sq (w0), 0x69) ^ Sq (w0) ^ MULx (Sq (w1), 0x69) ^ Sq (w2) ^ Sq (w3))
                (MULx (Sq (w0), 0x69) ^ Sq (w1) ^ Sq (w2) ^ MULx (Sq (w3), 0x69) ^ Sq (w3)) ]
  where [w3 w2 w1 w0] = split w;

// Clocking Operations
//////////////////////

Init : ([4][32], [4][32]) -> ([16][32], [3][32]);
Init (k, iv) = (ClockLFSR_IM@32, ClockFSM@32)
  where {
    ClockLFSR_IM : [inf][16][32];
    ClockLFSR_IM = [ (Init_LFSR (k, iv)) ] #
                   [| (drop (1, LFSR) #
                       [ (V (LFSR@0, LFSR@2, LFSR@11) ^ F (R@0, R@1, LFSR@15)) ])
                   || LFSR <- ClockLFSR_IM
                   || R <- ClockFSM |];

    ClockFSM : [inf][3][32];
    ClockFSM = [ [ 0 0 0 ] ] #
               [| [ ((R@1 + (R@2 ^ LFSR@5)) & 0xFFFFFFFF)
                    (S1 (R@0))
                    (S2 (R@1)) ]
               || LFSR <- ClockLFSR_IM
               || R <- ClockFSM |];
  };

GenKS : ([4][32], [4][32]) -> [inf][32];
GenKS (k, iv) = tail zt
  where {
    (lfsr, fsm) = Init (k, iv);

    ClockLFSR_KSM : [inf][16][32];
    ClockLFSR_KSM = [ lfsr ] #
                    [| (drop (1, LFSR) # [ (V (LFSR@0, LFSR@2, LFSR@11)) ])
                    || LFSR <- ClockLFSR_KSM |];

    zt : [inf][32];
    zt = [| F (R@0, R@1, LFSR@15) ^ LFSR@0
         || LFSR <- ClockLFSR_KSM
         || R <- ClockFSM |];

    ClockFSM : [inf][3][32];
    ClockFSM = [ fsm ] #
               [| [ ((R@1 + (R@2 ^ LFSR@5)) & 0xFFFFFFFF)
                    (S1 (R@0))
                    (S2 (R@1)) ]
               || LFSR <- ClockLFSR_KSM
               || R <- ClockFSM |];
  };

// Auxiliary
////////////

Init_LFSR : ([4][32], [4][32]) -> [16][32];
Init_LFSR (k, iv) = [ (k@0 ^ 0xFFFFFFFF)
                      (k@1 ^ 0xFFFFFFFF)
                      (k@2 ^ 0xFFFFFFFF)
                      (k@3 ^ 0xFFFFFFFF)
                      (k@0)
                      (k@1)
                      (k@2)
                      (k@3)
                      (k@0 ^ 0xFFFFFFFF)
                      (k@1 ^ 0xFFFFFFFF ^ iv@3)
                      (k@2 ^ 0xFFFFFFFF ^ iv@2)
                      (k@3 ^ 0xFFFFFFFF)
                      (k@0 ^ iv@1)
                      (k@1)
                      (k@2)
                      (k@3 ^ iv@0) ];

F : ([32], [32], [32]) -> [32];
F (R0, R1, LFSR_15) = ((LFSR_15 + R0) & 0xFFFFFFFF) ^ R1;

V : ([32], [32], [32]) -> [32];
V (LFSR_0, LFSR_2, LFSR_11) = join (reverse (drop (1, s0) # [0x00])) ^
                              MULa (s0 @ 0) ^
                              LFSR_2 ^
                              join (reverse ([0x00] # take (3, s11))) ^
                              DIVa (s11 @ 3)
  where {
    s0 = reverse (split LFSR_0) : [4][8];
    s11 = reverse (split LFSR_11) : [4][8];
  };
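
The listing above can be spot-checked with a few Bit-valued definitions. The sketch below is not part of the original paper: the names ending in _ok and the name AllChecks are hypothetical, and the expected values are derived by hand from the definitions of MULx, MULxPOW and F and from the first and last entries of the two S-box tables. It uses only constructs that already appear in the listing, plus the Bit type and ==, and assumes (as the listing's MULx does) that v ! 0 selects the most significant bit of v.

```cryptol
// Hypothetical sanity checks; not part of the original specification.

// MULx: with the top bit clear the result is a plain left shift;
// with the top bit set the constant c is also XORed in.
MULx_ok : Bit;
MULx_ok = (MULx (0x01, 0xA9) == 0x02) &
          (MULx (0x80, 0xA9) == 0xA9);

// MULxPOW applies MULx i times: seven shifts take 0x01 to 0x80,
// and the eighth shifts the top bit out and XORs in c.
MULxPOW_ok : Bit;
MULxPOW_ok = (MULxPOW (0x01, 0, 0xA9) == 0x01) &
             (MULxPOW (0x01, 8, 0xA9) == 0xA9);

// F adds modulo 2^32 and then XORs:
// (2 + 1) ^ 3 == 0, and 0xFFFFFFFF + 1 wraps around to 0.
F_ok : Bit;
F_ok = (F (1, 3, 2) == 0) &
       (F (1, 0, 0xFFFFFFFF) == 0);

// First and last entries of the two S-box tables.
Sbox_ok : Bit;
Sbox_ok = (Sr (0x00) == 0x63) & (Sr (0xFF) == 0x16) &
          (Sq (0x00) == 0x25) & (Sq (0xFF) == 0x86);

AllChecks : Bit;
AllChecks = MULx_ok & MULxPOW_ok & F_ok & Sbox_ok;
```

Loading these definitions together with the specification and evaluating AllChecks should yield True if the listing was transcribed faithfully. Checks of this kind only exercise individual components; they do not replace comparing GenKS against the official test vectors in the ETSI/SAGE documents [12, 13].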
