Published in IJERA (International Journal of Engineering Research and Applications), an international online peer-reviewed journal, www.ijera.com.

Puneet Paruthi, Tanvi Kumar, Himanshu Singh / International Journal of Engineering Research and Applications (IJERA), ISSN: 2248-9622, www.ijera.com, Vol. 2, Issue 5, September-October 2012, pp. 1761-1766

Simulation of IEEE 754 Standard Double Precision Multiplier using Booth Techniques

1 Puneet Paruthi, 2 Tanvi Kumar, 3 Himanshu Singh
1,2 Dept. of ECE, BMSCE, Mukhsar; 3 Asstt. Professor, ISTK, Kalawad

Abstract
Multiplication is an important fundamental function in arithmetic operations. It can be performed with the help of different multipliers using different techniques. The objective of a good multiplier is to provide a physically compact, high-speed and low-power design. To save significant power in a multiplier design, a good direction is to reduce the number of operations, thereby reducing dynamic power, which is a major part of total power dissipation. The main objective of this work is to design and simulate an IEEE 754 standard double precision multiplier using VHDL.

Keywords: IEEE floating point arithmetic; Rounding; Floating point multiplier

I. INTRODUCTION
Every computer has a floating point processor or a dedicated accelerator that fulfils the requirements of precision using detailed floating point arithmetic. The main applications of floating point today are in the fields of medical imaging, biometrics, motion capture and audio. Since multiplication dominates the execution time of most DSP algorithms, there is a need for a high-speed multiplier with more accuracy. Reducing time delay and power consumption are essential requirements for many applications.

Floating Point Numbers: The term floating point is derived from the fact that there is no fixed number of digits before and after the decimal point; that is, the decimal point can float. There are also representations in which the number of digits before and after the decimal point is fixed, called fixed-point representations. Floating point numbers are numbers that can contain a fractional part, e.g. 3.0, -111.5, ½, 3E-5.

IEEE Standard for Binary Floating Point Arithmetic: The Institute of Electrical and Electronics Engineers (IEEE) sponsored a standard format for 32-bit and larger floating point numbers, known as the IEEE 754 standard.

This paper presents a new floating-point multiplier which can perform a double-precision floating-point multiplication or simultaneous single-precision floating-point multiplications. Since the single-precision results are generated in parallel, the multiplier's performance is almost doubled compared to a conventional floating point multiplier.

A. Floating Point Arithmetic
The IEEE Standard for Binary Floating-Point Arithmetic (IEEE 754) is the most widely used standard for floating-point computation, and is followed by many CPU and FPU implementations. The standard defines formats for representing floating-point numbers (including ±zero and denormals) and special values (infinities and NaNs), together with a set of floating-point operations that operate on these values. It also specifies four rounding modes and five exceptions. IEEE 754 specifies four formats for representing floating-point values: single precision (32-bit), double precision (64-bit), single-extended precision (at least 43 bits, not commonly used) and double-extended precision (at least 79 bits, usually implemented with 80 bits). Many languages specify that IEEE formats and arithmetic be implemented, although sometimes this is optional. For example, the C programming language, which pre-dated IEEE 754, now allows but does not require IEEE arithmetic (the C float type is typically IEEE single precision and double is IEEE double precision).

B. Double Precision Floating Point Numbers
A total of 64 bits is needed for double-precision number representation. A bias equal to 2^(n-1) - 1 is added to the actual exponent in order to obtain the stored exponent; this equals 1023 for the 11-bit exponent of the double precision format. The addition of the bias allows the use of exponents in the range from -1023 to +1024, corresponding to stored values from 0 to 2047. The double precision format thus offers a range from 2^-1023 to 2^+1023, which is equivalent to about 10^-308 to 10^+308.
Sign: 1 bit wide, used to denote the sign of the number; 0 indicates a positive number and 1 a negative number.
Exponent: 11 bits wide, a signed exponent in excess-1023 representation.
Mantissa: 52 bits wide, the fractional component.

   1        11           52
| Sign | Exponent |   Mantissa   |
|<------------- 64 ------------->|
Fig. 1 Double Precision Floating Point Format
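As a quick illustration of the 1/11/52-bit layout in Fig. 1 and the excess-1023 bias, the fields of a double can be unpacked in a few lines of Python (an illustrative sketch only; the paper's design itself is written in VHDL):

```python
import struct

def fields(x: float):
    """Split an IEEE 754 double into its sign (1 bit), stored exponent
    (11 bits, excess-1023) and mantissa (52-bit fraction) fields."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]  # raw 64-bit pattern
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF       # biased: true exponent + 1023
    mantissa = bits & ((1 << 52) - 1)     # hidden leading 1 is not stored
    return sign, exponent, mantissa

# -18.0 = -1.001 x 2^4: sign 1, stored exponent 1023 + 4 = 1027,
# fraction field 001 followed by 49 zeros
print(fields(-18.0))
```

For -18.0 this returns sign 1 and stored exponent 1027 (binary 10000000011), matching the bias rule 2^(11-1) - 1 = 1023 described above.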
C. Floating-Point Multiplication
Multiplication of two normalized floating point numbers is performed by multiplying the fractional components, adding the exponents, and an exclusive-or operation on the sign fields of both operands. The most complicated part is performing the integer-like multiplication on the fraction fields. Essentially the multiplication is done in two steps: partial product generation and partial product addition. For double precision operands (53-bit fraction fields), a total of 53 53-bit partial products are generated.

The general form of the floating point representation is:
(-1)^S * M * 2^E
where S represents the sign bit, M the mantissa and E the exponent.

Given two FP numbers n1 and n2, the product of both, denoted as n, can be expressed as:
n = n1 x n2
  = (-1)^S1 * p1 * 2^E1 x (-1)^S2 * p2 * 2^E2
  = (-1)^(S1+S2) * (p1 * p2) * 2^(E1+E2)

In order to perform floating-point multiplication, a simple algorithm is realized:
- Add the exponents and subtract 1023.
- Multiply the mantissas and determine the sign of the result.
- Normalize the resulting value, if necessary.

D. ModelSim Overview
ModelSim is a verification and simulation tool for VHDL, Verilog, SystemVerilog, SystemC, and mixed-language designs. ModelSim VHDL implements the VHDL language as defined by IEEE Standards 1076-1987, 1076-1993, and 1076-2002. ModelSim also supports the 1164-1993 Standard Multivalue Logic System for VHDL Interoperability and the 1076.2-1996 Standard VHDL Mathematical Packages. Any design developed with ModelSim will be compatible with any other VHDL system that is compliant with the 1076 specifications.

II. LITERATURE SURVEY
A few research works have been conducted to explain the concept of floating point numbers. D. Goldberg [11] explained the concept of floating point numbers, used to describe very small to very large numbers with a varying level of precision; they are comprised of three fields: a sign, a fraction and an exponent field. B. Parhami [8] discussed the IEEE-754 standard, which defines several floating point number formats and the sizes of the fields that comprise them; the standard defines several rounding schemes, including round to zero, round to infinity, round to negative infinity, and round to nearest. Michael L. Overton [7] performed the multiplication of two normalized floating point numbers by multiplying the fractional components, adding the exponents, and an exclusive-OR operation on the sign fields of both operands. Cho, J. Hong et al. and N. Besli et al. [5][6] multiplied double precision operands (53-bit fraction fields), in which a total of 53 53-bit partial products are generated; to speed up this process, the two obvious solutions are to generate fewer partial products and to sum them faster. Sumit Vaidya et al. [1] compared different multipliers on the basis of power, speed, delay and area to find the most efficient multiplier; they concluded that the array multiplier requires more power and an optimum number of components, but its delay is larger than that of the Wallace tree multiplier. Hasan Krad et al. [4] presented a performance analysis of two different multipliers for unsigned data, one using a carry-look-ahead adder and the other a ripple adder; the multiplier with the carry-look-ahead adder showed better performance in terms of gate delays, with approximately twice the speed of the multiplier with the ripple adder in the worst case. Soojin Kim et al. [2] described the pipelined architecture of high-speed modified Booth multipliers; the proposed circuits are based on the modified Booth algorithm and the pipeline technique, the approaches most widely used to accelerate multiplication, and many experiments were conducted to implement optimally pipelined multipliers. P. Assady [3] presented a new high-speed algorithm covering the three important steps of multiplication: partial product generation, partial product reduction, and final addition. In the partial product generation step a new Booth algorithm was presented, in the partial product reduction step a new tree structure was designed, and in the final addition step a new hybrid adder using 4-bit blocks was proposed.

III. METHODOLOGY
A. Methods for multiplication
There are a number of techniques that can be used to perform multiplication. In general, the choice is based upon factors such as latency, throughput, area, and design complexity:
a) Array Multiplier
b) Booth Multiplier

Booth's multiplication algorithm multiplies two signed binary numbers in two's complement notation. The algorithm was invented by Andrew Donald Booth.

1) Booth Multiplier
Conventional array multipliers, like the Braun multiplier and the Baugh-Wooley multiplier, achieve comparatively good performance but require a large area of silicon, unlike the add-shift algorithms, which require less hardware but exhibit poorer performance.
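Booth encoding reduces the number of partial products by recoding the multiplier bits into signed digits. As a software illustration (a hedged Python sketch, not the authors' VHDL), the radix-4 modified Booth recoding, a widely used variant, maps an n-bit two's-complement multiplier to roughly n/2 digits in {-2, -1, 0, 1, 2}:

```python
def booth4_digits(r: int, n: int):
    """Radix-4 Booth recoding of the n-bit two's-complement multiplier r.
    Each overlapping bit triple (b[i+1], b[i], b[i-1]) yields one digit,
    so only about n/2 partial products are needed."""
    r &= (1 << n) - 1
    digits, prev = [], 0              # prev = implicit bit right of the LSB
    for i in range(0, n, 2):
        b0 = (r >> i) & 1
        b1 = (r >> (i + 1)) & 1 if i + 1 < n else (r >> (n - 1)) & 1
        digits.append(-2 * b1 + b0 + prev)   # digit in {-2,-1,0,1,2}
        prev = b1
    return digits                     # least-significant digit first

def booth4_multiply(m: int, r: int, n: int) -> int:
    """Sum the shifted partial products m * digit * 4^i."""
    return sum((d * m) << (2 * i) for i, d in enumerate(booth4_digits(r, n)))
```

For example, booth4_multiply(3, -4, 4) recodes -4 (binary 1100) into the digits [0, -1] and returns -12.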
The Booth multiplier makes use of the Booth encoding algorithm in order to reduce the number of partial products by considering two bits of the multiplier at a time, thereby achieving a speed advantage over other multiplier architectures. The algorithm is valid for both signed and unsigned numbers, and accepts numbers in 2's complement form.

2) Array Multiplier
The array multiplier is an efficient layout of a combinational multiplier. Multiplication of two binary numbers can be obtained with one micro-operation by using a combinational circuit that forms all the product bits at once, making it a fast way of multiplying two numbers, since the only delay is the time for the signals to propagate through the gates that form the multiplication array. In an array multiplier, consider two binary numbers A and B, of m and n bits. There are mn summands, produced in parallel by a set of mn AND gates. An n x n multiplier requires n(n-2) full adders, n half adders and n^2 AND gates. The worst case delay of an array multiplier is (2n+1)td.

B. Booth Multiplication (Proposed Work)
Booth's algorithm involves repeatedly adding one of two predetermined values A and S to a product P, then performing a rightward arithmetic shift on P. Let m and r be the multiplicand and multiplier, respectively, and let x and y represent the number of bits in m and r.

Determine the values of A and S, and the initial value of P. All of these numbers should have a length equal to (x + y + 1).
1) A: Fill the most significant (leftmost) bits with the value of m. Fill the remaining (y + 1) bits with zeros.
2) S: Fill the most significant bits with the value of (-m) in two's complement notation. Fill the remaining (y + 1) bits with zeros.
3) P: Fill the most significant x bits with zeros. To the right of this, append the value of r. Fill the least significant (rightmost) bit with a zero.
4) Determine the two least significant (rightmost) bits of P:
   a. If they are 01, find the value of P + A. Ignore any overflow.
   b. If they are 10, find the value of P + S. Ignore any overflow.
   c. If they are 00, do nothing. Use P directly in the next step.
   d. If they are 11, do nothing. Use P directly in the next step.
5) Arithmetically shift the value obtained in the previous step a single place to the right. Let P now equal this new value.
6) Repeat steps 4 and 5 until they have been done y times.
7) Drop the least significant (rightmost) bit from P. This is the product of m and r.

Figure 2: Block Diagram of the Multiplier

C. Double Precision Booth Multiplication
Let us suppose a multiplication of two floating-point numbers A and B, where A = -18.0 and B = 9.5.

Binary representation of the operands:
A = -10010.0
B = +1001.1

Normalized representation of the operands:
A = -1.001 x 2^4
B = +1.0011 x 2^3

IEEE representation of the operands:
A = 1 10000000011 0010000000000000000000000000000000000000000000000000
B = 0 10000000010 0011000000000000000000000000000000000000000000000000

Multiplication of the mantissas: we must extract the mantissas, adding a 1 as the most significant bit for normalization:
A = 10010000000000000000000000000000000000000000000000000
B = 10011000000000000000000000000000000000000000000000000
The 106-bit result of the multiplication is 0x558000000000...; only the most significant bits are useful. After normalization (elimination of the most significant 1), we get the 52-bit mantissa of the result. This normalization can lead to a correction of the result's exponent.
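The numbered steps above translate almost line for line into code. Below is a hedged Python sketch of this classic radix-2 Booth procedure (an illustration, not the paper's hardware description):

```python
def booth_multiply(m: int, r: int, x: int, y: int) -> int:
    """Booth's algorithm following the steps above: m, r are the signed
    multiplicand and multiplier; x, y are their bit widths."""
    total = x + y + 1                       # length of A, S and P
    mask = (1 << total) - 1
    A = (m << (y + 1)) & mask               # m in the leftmost x bits
    S = ((-m) << (y + 1)) & mask            # -m (two's complement) likewise
    P = (r & ((1 << y) - 1)) << 1           # r in the middle, 0 appended
    for _ in range(y):                      # repeat y times
        if P & 0b11 == 0b01:
            P = (P + A) & mask              # last two bits 01: add A
        elif P & 0b11 == 0b10:
            P = (P + S) & mask              # last two bits 10: add S
        # 00 or 11: leave P unchanged, then arithmetic shift right by one
        P = (P >> 1) | (P & (1 << (total - 1)))
    P >>= 1                                 # drop the appended rightmost bit
    if P >> (x + y - 1):                    # reinterpret as signed (x+y bits)
        P -= 1 << (x + y)
    return P
```

For example, booth_multiply(3, -4, 4, 4) returns -12 and booth_multiply(6, 7, 4, 4) returns 42.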
In our case, the 106-bit product begins 01 0101011 and is followed by zeros, so the 52-bit mantissa field of the result is:
0101011000000000000000000000000000000000000000000000

Addition of the exponents: the exponent of the result is equal to the sum of the operands' exponents. A 1 can be added if needed by the normalization of the mantissa multiplication (this is not the case in our example). As the exponent fields (Ea and Eb) are biased, the bias must be removed in order to do the addition, and then the bias must be added again to obtain the value entered into the exponent field of the result (Er):

Er = (Ea - 1023) + (Eb - 1023) + 1023 = Ea + Eb - 1023

In our example, we have:
Ea    = 10000000011
Eb    = 10000000010
-1023 = 10000000001 (11-bit two's complement)
Er    = 10000000110
which is actually 7, the true exponent of the result.

Calculation of the sign of the result: the sign of the result (Sr) is given by the exclusive-or of the operands' signs (Sa and Sb):
Sr = Sa ⊕ Sb
In our example, we get Sr = 1 ⊕ 0 = 1, i.e. a negative sign.

Composition of the result: assembling the three intermediate results (sign, exponent and mantissa) gives the final result of our multiplication:
1 10000000110 0101011000000000000000000000000000000000000000000000
A x B = -18.0 x 9.5 = -1.0101011 x 2^(1030-1023) = -10101011.0 = -171.0

Figure 3: Schematic Diagram of Double Precision Multiplier

IV. SYNTHESIS RESULTS
This design has been implemented in VHDL, simulated on ModelSim and synthesized. The HDL code uses VHDL 2001 constructs that provide certain benefits over the VHDL 95 standard in terms of scalability and code reusability. Simulation-based verification is one method of functional verification of a design. In this method, test inputs are provided using standard test benches; the test bench forms the top module that instantiates the other modules. Simulation-based verification ensures that the design is functionally correct when tested with a given set of inputs. Though it is not fully complete, by picking a random set of inputs as well as corner cases, simulation-based verification can still yield reasonably good results.

Table 1. The comparison of Multipliers

The following snapshots are taken from ModelSim after the timing simulation of the floating point multiplier core. Consider the inputs to the floating point multiplier:
1) A = -18.0 = -1.001 x 2^4
A = 1 10000000011 0010000000000000000000000000000000000000000000000000 = 0x40F00000
B = 9.5 = 1.0011 x 2^3
B = 0 10000000010 0011000000000000000000000000000000000000000000000000 = 0x41780000
A x B = -171.0 = -1.0101011 x 2^7
A x B = 1 10000000110 0101011000000000000000000000000000000000000000000000 = 0x42E88000

Figure 4: Simulation of Double Precision Multiplier
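The bookkeeping of the worked example (Er = Ea + Eb - 1023, Sr = Sa ⊕ Sb, and the significand product 1.001 x 1.0011 = 1.0101011) can be cross-checked with a few lines of Python; the helper name below is illustrative, not taken from the paper's VHDL:

```python
import struct

def double_bits(x: float) -> int:
    """Raw 64-bit pattern of an IEEE 754 double."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]

# Exponent: remove one bias so the sum stays in excess-1023 form
Ea, Eb = 0b10000000011, 0b10000000010       # biased exponents of -18.0, 9.5
Er = Ea + Eb - 1023
assert Er == 0b10000000110                  # 1030, i.e. true exponent 7

# Sign: exclusive-or of the operand signs
Sr = 1 ^ 0
assert Sr == 1                              # negative result

# Significands: 1.001 x 1.0011 = 1.0101011, no normalization shift needed
assert 0b1001 * 0b10011 == 0b10101011

# Composition: sign | exponent | 52-bit mantissa equals the encoding of -171.0
result = (Sr << 63) | (Er << 52) | (0b0101011 << 45)
assert result == double_bits(-171.0)
```

Every assertion passes, confirming that the composed sign, exponent and mantissa fields reproduce the IEEE 754 encoding of -171.0.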
2) A = 134.0625 = 1.00001100001 x 2^7
A = 0 10000000110 0000110000100000000000000000000000000000000000000000
B = -2.25 = -1.001 x 2^1
B = 1 10000000000 0010000000000000000000000000000000000000000000000000
A x B = -301.640625 = -1.00101101101001 x 2^8
A x B = 1 10000000111 0010110110100100000000000000000000000000000000000000

Figure 5: Simulation of Double Precision Multiplier

V. CONCLUSION
In this study a floating point multiplier design capable of executing a double precision floating-point multiplication using the Booth algorithm has been presented. One of the important aspects of the presented design method is that it is applicable to all kinds of floating-point multipliers. The presented design was compared with an ordinary floating point array multiplier via synthesis. The synthesis results showed that the proposed design is more efficient than the conventional multiplier, and the critical path increment is only one or two gate delays. Since modern floating-point multiplier designs have significantly larger area than the standard floating-point multiplier, the percentage of extra hardware will be smaller for those units. The methods presented in this study will be used in the design of floating-point multiplier-adder circuits. Future work will also enhance the proposed designs to support all IEEE 754 rounding modes.

REFERENCES
[1] Sumit Vaidya and Deepak Dandekar, "Delay-Power Performance Comparison of Multipliers in VLSI Circuit Design", International Journal of Computer Networks & Communications (IJCNC), Vol. 2, No. 4, pp. 47-55, July 2010.
[2] Soojin Kim and Kyeongsoon Cho, "Design of High-speed Modified Booth Multipliers Operating at GHz Ranges", World Academy of Science, Engineering and Technology, 61, 2010.
[3] P. Assady, "A New Multiplication Algorithm Using High-Speed Counters", European Journal of Scientific Research, ISSN 1450-216X, Vol. 26, No. 3, pp. 362-368, 2009.
[4] Hasan Krad and Aws Yousif Al-Taie, "Performance Analysis of a 32-Bit Multiplier with a Carry-Look-Ahead Adder and a 32-Bit Multiplier with a Ripple Adder using VHDL", Journal of Computer Science, 4, ISSN 1549-3636, pp. 305-308, 2008.
[5] Cho, J. Hong, and G. Choi, "54x54-bit Radix-4 Multiplier Based on Modified Booth Algorithm", 13th ACM Symp. VLSI, pp. 233-236, Apr. 2003.
[6] N. Besli and R. G. DeshMukh, "A 54x54-bit Multiplier with a New Redundant Booth's Encoding", IEEE Conf. Electrical and Computer Engineering, Vol. 2, pp. 597-602, 12-15 May 2002.
[7] Michael L. Overton, "Numerical Computing with IEEE Floating Point Arithmetic", Society for Industrial and Applied Mathematics, 2001.
[8] B. Parhami, "Computer Arithmetic: Algorithms and Hardware Designs", Oxford University Press, 2000.
[9] N. Itoh, Y. Naemura, H. Makino, and Y. Nakase, "A Compact 54x54-bit Multiplier with Improved Wallace-Tree Structure", Dig. Technical Papers of Symp. VLSI Circuits, pp. 15-16, Jun. 1999.
[10] R. K. Yu and G. B. Zyner, "167 MHz Radix-4 Floating-Point Multiplier", IEEE Symp. on Computer Arithmetic, pp. 149-154, Jul. 1995.
[11] D. Goldberg, "What Every Computer Scientist Should Know About Floating-Point Arithmetic", ACM Computing Surveys, Vol. 23, No. 1, pp. 5-48, 1991.
[12] A. Goldovsky et al., "Design and Implementation of a 16 by 16 Low-Power Two's Complement Multiplier", in Design and Implementation of Adder/Subtractor and Multiplication Units for Floating-Point Arithmetic, IEEE International Symposium on Circuits and Systems, 5, pp. 345-348, 2000.
[13] P. Seidel, L. McFearin, and D. Matula, "Binary Multiplication Radix-32 and Radix-256", 15th Symp. on Computer Arithmetic, pp. 23-32, 2001.
[14] U. Kulisch, "Advanced Arithmetic for the Digital Computer", Springer-Verlag, Vienna, 2002.
[15] M. Schulte and E. Swartzlander, "Hardware Design and Arithmetic Algorithms for a Variable-Precision, Interval Arithmetic Coprocessor", Proc. IEEE 12th Symposium on Computer Arithmetic (ARITH-12), pp. 222-230, 1995.
[16] R. V. K. Pillai, D. Al-Khalili, and A. J. Al-Khalili, "A Low Power Approach to Floating Point Adder Design", Proceedings of the 1997 International Conference on Computer Design, pp. 178-185.
[17] Kari Kalliojarvi and Yrjo Neuvo, "Distribution of Roundoff Noise in Binary Floating-Point Addition", Proceedings of ISCAS '92, pp. 1796-1799.
[18] John F. Wakerly, "Digital Design: Principles and Practices", Tata McGraw-Hill.
[19] Ivan Flores, "The Logic of Computer Arithmetic", Prentice Hall, Englewood Cliffs, N.J.
[20] G. Todorov, "BASIC Design, Implementation and Analysis of a Scalable High-Radix Montgomery Multiplier", Master's Thesis, Oregon State University, USA, December 2000.
[21] L. A. Tawalbeh, "Radix-4 ASIC Design of a Scalable Montgomery Modular Multiplier Using Encoding Techniques", M.S. Thesis, Oregon State University, USA, October 2002.
[22] W. Gallagher and E. Swartzlander, "High Radix Booth Multipliers Using Reduced Area Adder Trees", Twenty-Eighth Asilomar Conference on Signals, Systems and Computers, 1, pp. 545-549, 1994.
[23] B. Cherkauer and E. Friedman, "A Hybrid Radix-4/Radix-8 Low Power, High Speed Multiplier Architecture for Wide Bit Widths", IEEE Intl. Symp. on Circuits and Systems, 4, pp. 53-56, 1996.
[24] Israel Koren, "Computer Arithmetic Algorithms", Prentice Hall, Englewood Cliffs, N.J., 1993.
[25] Nabeel Shirazi, Al Walters, and Peter Athanas, "Quantitative Analysis of Floating Point Arithmetic on FPGA Based Custom Computing Machines", IEEE Symposium on FPGAs for Custom Computing Machines, pp. 155-162, April 1995.
[26] Y. Wang, Y. Jiang, and E. Sha, "On Area-Efficient Low Power Array Multipliers", 8th IEEE International Conference on Electronics, Circuits and Systems, pp. 1429-1432, 2001.
[27] D. A. Patterson and J. L. Hennessy, "Computer Architecture: A Quantitative Approach", Morgan Kaufmann, San Mateo, CA, 1996, Appendix A.
[28] K. Yano et al., "A 3.8-ns CMOS 16x16-b Multiplier Using Complementary Pass-Transistor Logic", IEEE Journal of Solid-State Circuits, 25:388-395, 1990.
[29] I. Khater, A. Bellaouar, and M. Elmasry, "Circuit Techniques for CMOS Low-Power High-Performance Multipliers", IEEE Journal of Solid-State Circuits, 31:1535-1546, 1996.
[30] G. Bewick, "Fast Multiplication: Algorithms and Implementation", Ph.D. Thesis, Stanford University, 1992.