1st Int'l Conf. on Recent Advances in Information Technology | RAIT-2012 |

Fig. 1: Array Multiplier using CSA Hardware Architecture

III. URDHAVA MULTIPLIER

Urdhava Tiryakbhyam (Vertically and Crosswise) is one of the sixteen Vedic Sutras and deals with the multiplication of numbers. The sutra is illustrated in Example 2 and the hardware architecture is depicted in Fig. 3. In this example two decimal numbers (31 x 35) are multiplied. Line diagrams for the multiplication of two-, three- and four-digit numbers using the Urdhava method are shown in Fig. 2. The digits on the two ends of each line are multiplied and the result is added to the previous carry. When more than one line is present in a step, all the products are added to the previous carry. The least significant digit of the number thus obtained becomes one of the result digits and the rest acts as the carry for the next step. Initially the carry is taken to be zero.

Example 2: 31 x 35 = 1085

      3 1
    x 3 5

    Step 1 (vertical, ones digits):  1 x 5 = 5                        result digit 5, carry 0
    Step 2 (crosswise):              (3 x 5) + (1 x 3) = 15 + 3 = 18  result digit 8, carry 1 to next stage
    Step 3 (vertical, tens digits):  (3 x 3) + 1 = 9 + 1 = 10         result digits 10

Answer: 31 x 35 = 1085

In general, for two-digit operands ab and cd, the product is formed from the three terms

    ac | (ad + bc) | bd

If there is a carry in the (ad + bc) term, it is added to ac.

Fig. 2: Line Diagram for Urdhava Multiplication of 2, 3 and 4 digits

From Example 2, it is observed that all the partial products are generated in parallel. So the speed of the multiplier is higher compared to the array multiplier.

The above discussion can now be extended to multiplication in the binary number system, with the preliminary knowledge that the multiplication of two bits $a_0$ and $b_0$ is just an AND operation and can be implemented using a simple AND gate. To illustrate this multiplication scheme in the binary number system, consider the multiplication of two binary numbers $a_3a_2a_1a_0$ and $b_3b_2b_1b_0$. As the result of this multiplication would be more than 4 bits, the product is expressed as $r_7r_6r_5r_4r_3r_2r_1r_0$. The least significant bit $r_0$ is obtained by multiplying the least significant bits of the multiplicand and the multiplier, as shown in Fig. 2. The digits on both sides of the line are multiplied and added to the carry from the previous step. This generates one of the bits of the result ($r_n$) and a carry ($c_n$). This carry is added in the next step, and thus the process goes on. If there is more than one line in a step, all the results are added to the previous carry. In each step, the least significant bit acts as the result bit and all the other bits act as the carry. For example, if some intermediate step gives 110, then 0 acts as the result bit and 11 as the carry (referred to as $c_n$ in this text). It should be clearly noted that $c_n$ may be a multi-bit number. Thus the following expressions (1) to (7) are derived:

$$r_0 = a_0 b_0 \tag{1}$$
$$c_1 r_1 = a_1 b_0 + a_0 b_1 \tag{2}$$
$$c_2 r_2 = c_1 + a_2 b_0 + a_1 b_1 + a_0 b_2 \tag{3}$$
$$c_3 r_3 = c_2 + a_3 b_0 + a_2 b_1 + a_1 b_2 + a_0 b_3 \tag{4}$$
$$c_4 r_4 = c_3 + a_3 b_1 + a_2 b_2 + a_1 b_3 \tag{5}$$
$$c_5 r_5 = c_4 + a_3 b_2 + a_2 b_3 \tag{6}$$
$$c_6 r_6 = c_5 + a_3 b_3 \tag{7}$$
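Expressions (1) to (7) can be sketched in software as a column-by-column accumulation. The following is a minimal behavioural sketch, not the paper's hardware description: the function names `urdhava_multiply`, `to_bits` and `from_bits` are hypothetical, bits are stored least significant first, and the loop processes the columns one after another, whereas the hardware forms all partial products in parallel.

```python
def to_bits(value, width):
    """Return `width` bits of `value`, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Rebuild an integer from a list of bits, least significant bit first."""
    return sum(bit << i for i, bit in enumerate(bits))


def urdhava_multiply(a_bits, b_bits):
    """Multiply two equal-length bit lists column by column.

    Column k adds every crosswise product a[i] AND b[k - i] to the carry
    from column k - 1, following the pattern of expressions (1) to (7).
    The least significant bit of the column sum is result bit r_k and the
    remaining bits form the carry c_k, which may be wider than one bit.
    """
    n = len(a_bits)
    if len(b_bits) != n:
        raise ValueError("operands must have the same number of bits")
    result = []
    carry = 0
    for k in range(2 * n - 1):
        column = carry
        # Only index pairs (i, k - i) that fall inside both operands.
        for i in range(max(0, k - n + 1), min(k, n - 1) + 1):
            column += a_bits[i] & b_bits[k - i]
        result.append(column & 1)   # r_k
        carry = column >> 1         # c_k
    # The last carry is the top result bit (c_6 for 4-bit operands); it is
    # at most 1 because an n-bit by n-bit product fits in 2n bits.
    result.append(carry)
    return result


if __name__ == "__main__":
    product_bits = urdhava_multiply(to_bits(13, 4), to_bits(11, 4))
    print(from_bits(product_bits))
```

For 4-bit operands the returned list holds the eight bits $r_0 \dots r_6$ followed by $c_6$, in that order.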
In expressions (1) to (7), $c_6r_6r_5r_4r_3r_2r_1r_0$ is the final product. Partial products are calculated in parallel and hence the delay involved is just the time it takes for the signal to propagate through the gates.

Fig. 3: Urdhava Multiplier Hardware Architecture

The main advantage of the Vedic multiplication algorithm (Urdhava Tiryakbhyam Sutra) stems from the fact that it can be easily implemented in FPGA due to its simplicity and regularity. The digital hardware realization of a 4-bit multiplier using this Sutra is shown in Fig. 3. This hardware design is very similar to that of the array multiplier, where an array of adders is required to arrive at the final product. Here in Urdhava, all the partial products are calculated in parallel and the delay associated is mainly the time taken by the carry to propagate through the adders.

IV. PROPOSED METHOD

The proposed method is based on the ROM approach; however, both inputs of the multiplier can be variables. In this proposed method a ROM is used for storing the squares of numbers, as compared to KCM, where the multiples are stored.

Method: To find (a x b), first determine whether the difference between a and b is odd or even. Based on the difference, the product is calculated using (8) and (9).

i. In case of even difference:

$$\text{Result of Multiplication} = [\text{Average}]^2 - [\text{Deviation}]^2 \tag{8}$$

ii. In case of odd difference:

$$\text{Result of Multiplication} = [\text{Average} \times (\text{Average} + 1)] - [\text{Deviation} \times (\text{Deviation} + 1)] \tag{9}$$

where Average = [(a+b)/2] and Deviation = [Average - smallest(a,b)]. In (9) the integer parts of Average and Deviation are used, as Example 4 shows.

Example 3 (even difference) and Example 4 (odd difference) depict the multiplication process. Thus the two-variable multiplication is performed by averaging, squaring and subtraction. The average [(a+b)/2] involves a division by 2, which is performed by right shifting the sum by one bit. If the squares of the numbers are stored in a ROM, the result can be read out directly. However, in the case of odd difference the process is different, as the average is not an integer. To avoid floating point arithmetic, Ekadikena Purvena, the Vedic Sutra used to find the square of numbers ending in 5, is applied. Example 5 illustrates this. In this case, instead of squaring the average and deviation, [Average x (Average + 1)] - [Deviation x (Deviation + 1)] is used. However, instead of performing the multiplications, the same ROM is used and the result is obtained using equation (10):

$$n(n+1) = n^2 + n \tag{10}$$

Here $n^2$ is obtained from the ROM and is added to its address $n$, which gives $n(n+1)$. Sample ROM contents are given in Table 1.

TABLE 1: ROM CONTENTS

    Address    Memory Content (Square)
    1          1
    2          4
    3          9
    4          16
    ...        ...

Thus, division and multiplication operations are effectively converted to subtraction and addition operations using Vedic Maths. The squares of both Average and Deviation are read out simultaneously using a two-port memory to reduce memory access time.

Example 3: 16 x 12 = 192
1) Find the difference: (16 - 12) = 4 (even number)
2) For even difference, Product = [Average]^2 - [Deviation]^2
   i. Average = [(a+b)/2] = [(16+12)/2] = [28/2] = 14
   ii. Smallest(a,b) = smallest(16,12) = 12
   iii. Deviation = Average - Smallest(a,b) = 14 - 12 = 2
3) Product = 14^2 - 2^2 = 196 - 4 = 192

Example 4: 15 x 12 = 180
1) Find the difference: (15 - 12) = 3 (odd number)
2) For odd difference, find the Average and Deviation.
   i. Average = [(a+b)/2] = [(12+15)/2] = 13.5
   ii. Deviation = [Average - smallest(a,b)] = [13.5 - smallest(15,12)] = [13.5 - 12] = 1.5
3) Product = (13 x 14) - (1 x 2) = 182 - 2 = 180

Example 5: 25^2 = 625
1) To find the square of 25, first find the square of 5, which is 25, and put 2 in the tens place and 5 in the ones place of the answer.
2) To find the number in the hundreds place, multiply 2 by its immediate next number, 3, which gives (2 x 3) = 6.
3) Answer: 25^2 = 625

Fig. 4 depicts the RTL view of the proposed multiplier for 4x4 as a sample case, implemented on a Cyclone II device. The 8x8 multiplier is implemented using the ROM approach, by storing the squares of the numbers in the memory at addresses 0000 0000 to 1111 1111. The memory requirement for an 8x8 multiplication is 8 Kb. But in the case of a 16x16 multiplier the memory requirement would be huge: $2^{16} \times 32$ bits = 2 Mb. So, in order to reduce the memory requirements for higher-order multiplication (16x16, 32x32, etc.), lower-order (8x8) multipliers can be instantiated. By this process the constraint of larger memory requirements can be overcome.

Fig. 4: RTL View of Proposed Multiplier (4x4)

V. EXPERIMENTAL RESULTS

From Table 2 and Table 3, it is inferred that the proposed multiplier is best suited for higher-order multiplication (i.e., more than 8x8). Since an FPGA has a sufficient amount of on-chip memory, which can be used to store the squares of the numbers, the proposed multiplier consumes fewer logic elements for its implementation.

TABLE 3: RESULTS FOR 16x16 MULTIPLIER

    Criterion                          Array        Urdhava      Proposed
                                       Multiplier   Multiplier   Multiplier
    Area
      Total Combinational Functions    738          694          291
      Dedicated Logic Registers        96           96           48
      Total Memory Bits (Kb)           0            0            16
    Transitions                        3173         3084         5341
    Speed (After Pipelining) (MHz)     77.65        80.23        119.76
    Power
      Static Power (mW)                46.20        46.19        46.17
      Dynamic Power (mW)               4.41         3.61         9.57
      I/O Power (mW)                   17.37        17.34        17.41

Fig. 5a: Area Comparison for (8x8)
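As a concrete illustration of the ROM-based scheme evaluated here, equations (8) to (10) can be sketched as a table lookup followed by additions and subtractions. This is a minimal software sketch under stated assumptions, not the authors' RTL: the names `build_square_rom` and `rom_multiply` are hypothetical, the ROM is modelled as a Python list indexed by address, and the integer parts of Average and Deviation are used in the odd case, as in Example 4.

```python
def build_square_rom(address_bits):
    """Model the ROM of Table 1: the entry at address n holds n squared."""
    return [n * n for n in range(1 << address_bits)]


def rom_multiply(a, b, rom):
    """Multiply two non-negative integers using only shifts, adds and lookups."""
    average = (a + b) >> 1            # division by 2 as a one-bit right shift
    deviation = average - min(a, b)   # integer part of Average - smallest(a, b)
    if ((a ^ b) & 1) == 0:
        # Even difference, equation (8): Average^2 - Deviation^2.
        return rom[average] - rom[deviation]
    # Odd difference, equation (9), with each n(n + 1) formed as n^2 + n
    # according to equation (10): the ROM word plus its own address.
    return (rom[average] + average) - (rom[deviation] + deviation)


if __name__ == "__main__":
    rom = build_square_rom(8)          # 256 entries, enough for 8-bit operands
    print(rom_multiply(16, 12, rom))   # Example 3
    print(rom_multiply(15, 12, rom))   # Example 4
```

Because the integer part of the average never exceeds the larger operand, a ROM with one entry per operand value is sufficient in this sketch.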
Fig. 5b: Speed Comparison for (8x8)

TABLE 2: RESULTS FOR 8x8 MULTIPLIER

    Criterion                          Array        Urdhava      Proposed
                                       Multiplier   Multiplier   Multiplier
    Area
      Total Combinational Functions    163          149          114
      Dedicated Logic Registers        48           48           16
      Total Memory Bits (Kb)           0            0            8
    Transitions                        1557         1501         1782
    Speed (After Pipelining) (MHz)     137.46       142.67       129.52
    Power
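The 8x8 multiplier of Table 2 is also the building block for the 16x16 case, which the paper realizes by instantiating lower-order multipliers. The paper does not state how the 8x8 results are combined, so the sketch below assumes the conventional split of each 16-bit operand into a high and a low byte, with four 8x8 products recombined by shifts and additions. The names `mul8` and `mul16` are hypothetical, and `mul8` reuses a 256-entry square table rather than a 65536-entry one.

```python
SQUARES = [n * n for n in range(256)]   # square ROM for 8-bit addresses


def mul8(a, b):
    """8x8 multiply from the square table (even and odd difference cases)."""
    average = (a + b) >> 1
    deviation = average - min(a, b)
    if ((a ^ b) & 1) == 0:
        return SQUARES[average] - SQUARES[deviation]
    return (SQUARES[average] + average) - (SQUARES[deviation] + deviation)


def mul16(a, b):
    """16x16 multiply assembled from four 8x8 multiplies (assumed scheme)."""
    a_hi, a_lo = a >> 8, a & 0xFF
    b_hi, b_lo = b >> 8, b & 0xFF
    low = mul8(a_lo, b_lo)                       # weight 2^0
    mid = mul8(a_hi, b_lo) + mul8(a_lo, b_hi)    # weight 2^8
    high = mul8(a_hi, b_hi)                      # weight 2^16
    return (high << 16) + (mid << 8) + low


if __name__ == "__main__":
    print(mul16(1234, 5678))
```

In this arrangement the table stays at 256 entries regardless of operand width, at the cost of extra adders for recombining the partial results.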
Fig. 5c: Power Comparison for (8x8)

Fig. 6a: Area Comparison for (16x16)

Fig. 6b: Speed Comparison for (16x16)

Fig. 6c: Power Comparison for (16x16)

From Fig. 5 and Fig. 6 it is clear that the proposed multiplier outperforms both the Array Multiplier and the Urdhava Multiplier.

VI. CONCLUSIONS

Thus the proposed multiplier provides higher performance for higher-order multiplication. For 16x16 and above, the proposed multiplier is realized by instantiating lower-order multipliers such as 8x8. This is mainly due to memory constraints. Effective memory implementation and deployment of memory compression algorithms can yield even better results.

REFERENCES

[1] Swami Bharati Krshna Tirthaji, Vedic Mathematics. Delhi: Motilal Banarsidass Publishers, 1965.
[2] K. K. Parhi, "VLSI Digital Signal Processing Systems: Design and Implementation," John Wiley & Sons, 1999.
[3] Harpreet Singh Dhillon and Abhijit Mitra, "A Digital Multiplier Architecture using Urdhava Tiryakbhyam Sutra of Vedic Mathematics," IEEE Conference Proceedings, 2008.
[4] Asmita Haveliya, "A Novel Design for High Speed Multiplier for Digital Signal Processing Applications (Ancient Indian Vedic mathematics approach)," International Journal of Technology and Engineering System (IJTES), Jan - March 2011, Vol. 2, No. 1.
[5] Raminder Preet Pal Singh, Parveen Kumar, Balwinder Singh, "Performance Analysis of 32-Bit Array Multiplier with a Carry Save Adder and with a Carry-Look-Ahead Adder," International Journal of Recent Trends in Engineering, Vol. 2, No. 6, November 2009.
[6] Parth Mehta, Dhanashri Gawali, "Conventional versus Vedic mathematical method for Hardware implementation of a multiplier," 2009 International Conference on Advances in Computing, Control, and Telecommunication Technologies.
[7] Prabir Saha, Arindam Banerjee, Partha Bhattacharyya, Anup Dandapat, "High Speed ASIC Design of Complex Multiplier Using Vedic Mathematics," Proceeding of the 2011 IEEE Students' Technology Symposium, 14-16 January 2011, IIT Kharagpur.
[8] H. D. Tiwari, G. Gankhuyag, C. M. Kim, and Y. B. Cho, "Multiplier design based on ancient Indian Vedic Mathematics," in Proceedings IEEE International SoC Design Conference, Busan, Nov. 24-25, 2008, pp. 65-68.
[9] H. Thapliyal, M. B. Srinivas and H. R. Arabnia, "Design and Analysis of a VLSI Based High Performance Low Power Parallel Square Architecture," in Proc. Int. Conf. Algo. Math. Comp. Sc., Las Vegas, June 2005, pp. 72-76.
[10] P. D. Chidgupkar and M. T. Karad, "The Implementation of Vedic Algorithms in Digital Signal Processing," Global J. of Engg. Edu., vol. 8, no. 2, pp. 153-158, 2004.
[11] H. Thapliyal and M. B. Srinivas, "High Speed Efficient N x N Bit Parallel Hierarchical Overlay Multiplier Architecture Based on Ancient Indian Vedic Mathematics," Enformatika Trans., vol. 2, pp. 225-228, Dec. 2004.
[12] J. F. Wakerly, "Digital Design: Principles and Practices," 4th Edition, Pearson Prentice Hall, 2006.
[13] J. Bhasker, "Verilog HDL Primer," BSP Publishers, 2003.
[14] Himanshu Thapliyal, S. Kotiyal and M. B. Srinivas, "Design and Analysis of a Novel Parallel Square and Cube Architecture Based on Ancient Indian Vedic Mathematics," Proceedings of the 48th IEEE International Midwest Symposium on Circuits and Systems (MWSCAS 2005).
[15] Himanshu Thapliyal and Hamid R. Arabnia, "A Time-Area-Power Efficient Multiplier and Square Architecture Based on Ancient Indian Vedic Mathematics," Proceedings of VLSI'04, Las Vegas, U.S.A., June 2004.
[16] M. Ramalatha, K. Deena Dayalan, P. Dharani, S. Deborah Priya, "High Speed Energy Efficient ALU Design using Vedic Multiplication Techniques," ACTEA 2009, Zouk Mosbeh, Lebanon.
[17] B. C. Paul, S. Fujita, M. Okajima, "ROM Based Logic (RBL) Design: A Low-Power 16 Bit Multiplier," IEEE Journal of Solid State Circuits, Volume 44, Issue 11, pp. 2935-2942, Nov. 2009.