SlideShare a Scribd company logo
FPGA BASED IMPLEMENTATION OF DELAY
OPTIMISED DOUBLE PRECISION IEEE
FLOATING-POINT ADDER
Dissertation
Submitted in partial fulfillment of the requirement for the
degree of
Bachelor of Engineering
In
Electrical Engineering
SESSION 2009-2013
	
	
	
	
	 	
	
Submitted	by	
Somsubhra	Ghosh	
BE	(Final	year)	
Under	the	Supervision	of	
Somnath	Ghosh,	Scientist,	ADRIN,	
Dept	of	Space/ISRO,	Govt.	of	INDIA	
DEPT.	OF	ELECTRICAL	ENGINEERING	
JADAVPUR	UNIVERSITY	
KOLKATA-700032	
INDIA	
June,	2012	
Roll	No-	000910801089	
Reg.	No-	107973	of	2009-10
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 1	
	
	
	
	
	
	
	
	
	
	
	
	
	
	
	
	
	
	
	
	
JADAVPUR UNIVERSITY
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 2
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 3	
	
CONTENTS
SECTION PAGE NO
Preface 4
Acknowledgements 5
Abstract 6
Section-I Introduction and summary 7
Section-II Notation 10
Section-III NaΓ―ve FP-Adder Algorithm 11
Section-IV Optimization techniques 13
4.1 Separation of FP-Adder in parallel paths 13
4.2 Unification of significant result ranges 14
4.3 Reduction of IEEE rounding modes 15
4.4 Sign magnitude computation of a difference 16
4.5 Compound addition 17
4.6 Approximate counting of leading zeros 18
4.7 Precomputation of post normalization shift 20
Section- V Overview of present algorithm 21
Section-VI Specification and detailed description 26
Section VII Delay analysis reporting 32
Section-VIII Implementation 33
Simulation screen shots 42
Section-VIII Future Scope	 	 	 	 	 	 48
Bibliography 49
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 4	
	
Preface β€”
This dissertation is a part of Bachelor of Electrical Engineering curriculum required in
partial fulfillment for the degree of B.E. Being a student of Electrical Engineering I
opted to take up a topic related to my preferred specialization. A prerequisite about
optimal Floating Point addition is essential and important for ensuring the faster and
optimal management of resources demanded by the complex circuit models
employing complex arithmetic operations. Addition and subtraction are the basis of the
arithmetic operations. So I was fascinated by the topic greatly and started my work.
Thereafter I studied the basic algorithmic steps and started to build the basic building
blocks using VHDL. Afterwards I integrated the blocks as per the requirement and
used the debugging tool for help. At the end circuit has been simulated and verified
using standard test benches.
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 5	
	
ACKNOWLEDGEMENT
I sincerely acknowledge to my mentor Mr. Somnath Ghosh, Scientist, ADRIN. To D.
Mallikharjuna. Rao, Section Head, ES&DS, Advanced Data Processing And
Research Institute (ADRIN) for providing laboratory facility and other facilities. I am
grateful to them for introducing me with exposure in research and development area.
I am thankful to Director ADRIN and Dr. Venkatraman (GD), for allowing to carry on
my project work here.
My mentor Mr. Somnath Ghosh has been a great source of encouragement throughout
the project work. His appropriate guidance, critical appraisal, scholarly approach has
helped me to reach my target.
It is a great pleasure to express my deep and heartiest gratitude to Sri Shashank
Adimulyam, Scientist, ADRIN, for his guidance and helpful discussion. His valuable
suggestions, ideas, and encouragement given in the course of this work were
unforgettable.
This is a reference implementation of β€œDELAY OPTIMIZED IMPLEMENTATION OF
IEEE FLOATING POINT ADDITION” appeared on IEEE trans. On computers vol. 53,
NO.2, February 2004.
Somsubhra Ghosh
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 6	
	
Abstract β€”
Hereby is presented an implementation of an IEEE double precision floating-point
adder (FP-adder) design mentioned in the IEEE publication β€œDELAY OPTIMIZED
IMPLEMENTATION OF IEEE FLOATING POINT ADDITION” authored by P.M. Seidel
and G. Even. The adder accepts normalized numbers, supports IEEE rounding mode,
and outputs the correctly normalized rounded sum/difference in the format required by
the IEEE Standard. The FP-adder design achieves a low latency by combining various
optimization techniques such as: A nonstandard separation into two paths, a simple
rounding algorithm, unification of rounding cases for addition and subtraction, sign-
magnitude computation of a difference based on one’s complement subtraction,
compound adders. A technology-independent analysis and optimization of this
implementation based on the Logical Effort hardware model is done and optimal gate
sizes and optimal buffer insertion has been determined. The estimated delay of this
optimized design at 30.6 FO4 delays for double precision operands (15.3 FO4 delays
per stage between latches). It has been concluded that this algorithm has shorter
latency (-13 percent) and cycle time (-22 percent) compared to the next fastest
algorithm.
	
Index Terms β€” Floating-point addition, IEEE rounding, delay optimization, dual path
algorithm, logical effort, optimized gate sizing, buffer insertion.
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 7	
	
SECTION I. INTRODUCTION AND SUMMERY	
FLOATING-POINT addition and subtraction are the most frequent floating-point
operations. Both operations use a floating-point adder (FP-adder). Therefore, a lot of
effort has been spent on reducing the latency of FP-adders (see [2], [9], [20], [22], [23],
[25], [27], [28], and the references that appear there). Many patents deal with FP-
adder design (ref. [6], [10], [11], [14], [15], [19], [21], [31], [32], [35]). In this dissertation
an FP-adder design is implemented that accepts normalized double precision
significands, supports IEEE rounding modes, and outputs the normalized
sum/difference that is rounded according to the IEEE FP standard 754 [13]. The
latency of this design is analyzed using the Logical Effort Model [33]. This model allows
for technology-independent delay analysis of CMOS circuits. The model enables
rigorous delay analysis that takes into account fanouts, drivers, and gate-sizing.
Following Horowitz [12], the delay of an inverter is used, the fanout of which equals 4,
as a technology-independent unit of delay. An inverter with fanout 4 is denoted by
FO4. The analysis using the Logical Effort Model shows that the delay of this FP-adder
design is 30:6 FO4 delays. This design is partitioned into two pipeline stages, the delay
of which is bounded by 15:3 FO4 delays. Extensions of the algorithm that deal with
denormal inputs and outputs are discussed in [16], [27]. It is shown there that the delay
overhead for supporting denormal numbers can be reduced to 1-2 logic levels (i.e.,
XOR delays). Several optimization techniques are employed in this algorithm. A
detailed examination of these techniques combined, enables implementation of an
overall fast FP-adder design. In particular, effective reduction of latency by parallel
paths requires balancing the delay of the paths. Such a balance is achieved by a gate-
level consideration of the design.
The optimization techniques, that has been used are included in the following -
1. A two path design with a nonstandard separation criterion. Instead of separation based
on the magnitude of the exponent difference [10], A separation criterion is defined that
also considers whether the operation is effective subtraction and the value of the
significand difference. This separation criterion maintains the advantages of the
standard two-path designs, namely, alignment shift and normalization shift take place
only in one of the paths and the full exponent difference is computed only in one path.
In addition, this separation technique requires rounding to take place only in one path.
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 8	
	
2. Reduction of IEEE rounding to three modes [25] and use of injection based rounding
[8].
3. A simpler design is obtained by using unconditional preshifts for effective subtractions
to reduce to 2 the number of binades that the significands’ sum and difference may
belong to.
4. The sign-magnitude representation of the difference of the exponents and the
significands is derived from one’s complement representation of the difference.
5. A parallel-prefix adder is used to compute the sum and the incremented sum of the
significands [34].
6. Recordings are used to estimate the number of leading zeros in the nonredundant
representation of a number represented as a borrow-save number [20].
7. Postnormalization is advanced and takes place before the rounding decision is ready.
Form an overview of FP-adder algorithms from technical papers and patents, the
optimization techniques that are used in each of these designs are summarized. The
algorithms from two particular implementations are also analyzed from literature in
some more detail [11], [21]. To allow for a β€œfair” comparison, the functionality of these
designs are adopted to match the functionality of the present design. This design uses
simpler rounding circuitry and is more amenable to partitioning into two pipeline stages
of equal latency. The latency of [21] is also analyzed using the Logical Effort Model.
The analysis for this algorithm gives the following results:
a) The total delay is 35:2 FO4 delays (i.e., 13 percent slower than this design) and the
delay of the slowest pipeline stage is 19:7 FO4 delays (i.e., 22 percent slower).
b) This paper focuses on double precision FP-adder implementations. Many FP-adders
support multiple precisions (e.g., x86 architectures support single, double, and
extended double precision). In [27], it is shown that by aligning the rounding position
(i.e., 23 positions to the right of the binary point in single precision and 52 positions to
the right of the binary point in double precision) of the significands before they are
input to the design and post aligning the outcome of the FP-adder, it is possible to use
the FP-adder presented in the mentioned paper for multiple precisions. Hence, the
FP-addition algorithm presented in this paper can be used to support multiple
precisions.
SECTION II. NOTATION	
Values and their representation:
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 9	
	
Binary strings are denoted by upper case letters (e.g., S, E, F). The value represented
by a binary string is represented in italics (e.g., s; e; f) IEEE FP-numbers. Here
normalized IEEE FPnumbers are considered. In double precision, IEEE FP-numbers
are represented by three fields (S, E [10:0], F [0:52]) with sign bit S {0 : 1}, exponent
string E[10:0]	∈ {0,1}11
and significand string F[0:52]	∈ {0,1}53
. The values of exponent
and significand are defined by:
𝑒 =	 𝐸[𝑖] βˆ™ 2+,-
+.- βˆ’ 1023, 		𝑓 =	 𝐹[𝑖] βˆ™ 25+,-
+.- .
Since only normalized FPnumbers are considered, it can be safely concluded that f	∈
[1,2). The value represented by a FP-number (S, E[10:0], F[0:52]) is:
𝑓𝑝9:; 𝑆, 𝐸, 𝐹 = (βˆ’1)>
βˆ™ 2?
βˆ™ 𝑓.
Factorings: Given an IEEE FP-number (S,E,F) it is referred to the triple (s,e,f) as the
factoring of the FP-number. Note that s = S since S is a single bit. The advantage of
using factorings is the ability to ignore representation details and focus on values.
Inputs: The inputs of a FP-addition/subtraction are:
Operands denoted by (SA,EA[10:0],FA[0:52]) and (SB,EB[10:0],FB[0:52] ).
Operation denoted by SOP ∈ {0,1} to indicate addition/subtraction.
IEEE rounding mode.
Output: The output is a FP-number (S,E[10:0],F[0:52] ). The value represented by the
output equals the IEEE rounded value of
π‘“π‘π‘ π‘’π‘š = 	𝑓𝑝9:;	 𝑆𝐴, 𝐸𝐴 10: 0 , 𝐹𝐴 0: 52 +		(βˆ’1)>JK
βˆ™ 𝑓𝑝9:; 𝑆𝐡, 𝐸𝐡 10: 0 , 𝐹𝐡 0: 52 .
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 10	
	
SECTION III. NAIVE FP-ADDER ALGORITHM
In this section, the β€œvanilla” FP-addition algorithm has been discussed. To simplify
notation, representation has been neglected and only the values of the inputs, outputs,
and intermediate results are taken into account. Throughout the paper, the notation
defined for the naive algorithm has been used. Let π‘ π‘Ž, π‘’π‘Ž, π‘“π‘Ž 	π‘Žπ‘›π‘‘	(𝑠𝑏, 𝑒𝑏, 𝑓𝑏) denote
the factorings of the operands with a sign-bit, an exponent, and a significand, and let
SOP indicate whether the operation is an addition or a subtraction. The requested
computation is the IEEE FP representation of the rounded sum:
π‘Ÿπ‘›π‘‘ π‘ π‘’π‘š = π‘Ÿπ‘›π‘‘ βˆ’1 >:
	βˆ™	2?:	
βˆ™ 		π‘“π‘Ž +	 βˆ’1 >JKR>S
βˆ™ 2?S
	βˆ™ 𝑓𝑏 .	
Let 𝑆. 𝐸𝐹𝐹 = π‘ π‘Ž 𝑠𝑏 π‘ π‘œπ‘ .
The case that S.EFF = 0 is called effective addition and the case that S.EFF = 1 is
called effective subtraction. The exponent difference is defined as 𝛿 = π‘’π‘Ž βˆ’ 𝑒𝑏	. The
β€œlarge” operand, (sl,el,fl), and the β€œsmall” operand, (ss,es,fs), are defined as follows:
			 𝑠𝑙, 𝑒𝑙, 𝑓𝑙 =
π‘ π‘Ž, π‘’π‘Ž, π‘“π‘Ž 													𝑖𝑓	𝛿 β‰₯ 0
		(𝑆𝑂𝑃	 𝑠𝑏, 𝑒𝑏 , 𝑓𝑏)					π‘œπ‘‘β„Žπ‘’π‘Ÿπ‘€π‘–π‘ π‘’
			
𝑠𝑠, 𝑒𝑠, 𝑓𝑠 =
		(𝑆𝑂𝑃	 	𝑠𝑏, 𝑒𝑏, 𝑓𝑏) 				𝑖𝑓	𝛿 β‰₯ 0
							(	π‘ π‘Ž, π‘’π‘Ž, 𝑓𝑏)											π‘œπ‘‘β„Žπ‘’π‘Ÿπ‘€π‘–π‘ π‘’
	
The sum can be written as: π‘ π‘’π‘š = (βˆ’1)>;
βˆ™	2?;
(𝑓𝑙 +	 βˆ’1 ]^__ 𝑓𝑠	 β‹…	25a
).
To simplify the description of the datapaths, the computation of the result’s
significand has been given greater attention, which is assumed to be normalized
(i.e., in the range ∈ [1,2) after rounding). The significand sum is defined by
π‘“π‘ π‘’π‘š =	(βˆ’1)].bcc
	(𝑓𝑠	 βˆ™	25a
).
The significand sum is computed, normalized, and rounded as follows :
1. Exponent subtraction: 𝛿 = π‘’π‘Ž βˆ’ 𝑒𝑏
2. Operand swapping: compute 𝑠𝑙, 𝑒𝑙, 𝑓𝑙, and 𝑓𝑠,
3. Limitation of the alignment shift amount: 𝛿;+d = min 𝛼, π‘Žπ‘π‘  𝛿 , where 𝛼 is a
constant greater than or equal to 55,
4. Alignment shift of 𝑓𝑠: π‘“π‘ π‘Ž = 𝑓𝑠	 β‹…	25a_;+d
	,
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 11	
	
5. Significand negation: π‘“π‘ π‘Žπ‘› =	(βˆ’1)].bcc
βˆ™ π‘“π‘ π‘Ž	;
6. Significand addition: π‘“π‘ π‘Žπ‘› = 𝑓𝑙 + π‘“π‘ π‘Žπ‘›	;
7. Conversion: π‘Žπ‘π‘ j>kd = π‘Žπ‘π‘ (π‘“π‘ π‘’π‘š) (the sign S of the result is determined by 𝑆 =
𝑠𝑙	 (π‘“π‘ π‘’π‘š	 < 0),
8. Normalization shift: 𝑛j>kd = π‘›π‘œπ‘Ÿπ‘š	(π‘Žπ‘π‘ _π‘“π‘ π‘’π‘š), and
9. Rounding and Postnormalization of 𝑛_π‘“π‘ π‘’π‘š.
The naive FP-adder implements the nine steps from above sequentially, where the
delay of Steps 4, 6, 7, 8, 9 is logarithmic in the significand’s length. Therefore, this is
a slow FP-adder implementation.
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 12	
	
SECTION IV. OPTIMIZATION TECHNIQUES
Optimization techniques that were employed in this FP-adder design are discussed in
this section
	
4.1 Separation of FP-Adder into Two Parallel Paths
The FP-adder pipeline is separated into two parallel paths that work under different
assumptions. The partitioning into two parallel paths enables one to optimize each
path separately by simplifying and skipping some steps of the naive addition algorithm
(Section 3). Such a dual path approach for FP-addition was first described by
Farmwald [9]. Since Farmwald’s dual path FP-addition algorithm, the common criterion
for partitioning the computation into two paths has been the exponent difference. The
exponent difference criterion is defined as follows: The near path is defined for small
exponent differences (i.e., βˆ’1,0, +1), and the far path is defined for the remaining
cases. Here a different partitioning criterion has been used for partitioning the
algorithm into two paths: namely the N-path for the computation of all effective
subtractions with small significand sums π‘“π‘ π‘’π‘š	 ∈ (βˆ’1,1) and small exponent
differences 𝛿 ≀ 1, and the R-path for all the remaining cases. The path selection
signal IS_R as follows :
𝐼𝑆_ 𝑅 ⟺	 𝑆. 𝐸𝐹𝐹	𝑂𝑅	 𝛿 β‰₯ 2	𝑂𝑅	π‘“π‘ π‘’π‘š ∈ [1,2). (1)
The outcome of the R-path is selected for the final result if 𝐼𝑆_ 𝑅 = 1, otherwise the
outcome of the N-path is selected. This partitioning has the following advantages:
a) In the R-path, the normalization shift is limited to a shift by one position (in Section 4.2,
it is shown how the normalization shift may be restricted to one direction).
b) Moreover, the addition or subtraction of the significands in the R-path always results
with a positive significand and, therefore, the conversion step can be skipped.
In the N-path, the alignment shift is limited to a shift by one position to the right. Under
the assumptions of the N-path, the exponent difference is in the range{βˆ’1,0,1}.
Therefore, a 2-bit subtraction suffices for extracting the exponent difference.
Moreover, in the N-path, the significand difference can be exactly represented with 53
bits, hence, no rounding is required. Note that the N-path applies only to effective
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 13	
	
subtractions in which the significand difference π‘“π‘ π‘’π‘š is less than 1. Thus, in the N-
path it is assumed that π‘“π‘ π‘’π‘š ∈ (βˆ’1,1).
The advantages of this partitioning criterion compared to the exponent difference
criterion stem from the following two observations:
a) A conventional implementation of a far path can be used to implement also the R-path
b) The N-path is simpler than the near path since no rounding is required and the N-path
applies only to effective subtractions. Hence, the N-path is simpler and faster than the
near path presented in [28].
4.2 Unification of Significand Result Ranges
In the R-path, the range of the resulting significand is different in effective addition and
effective subtraction. Using the notation of Section 3, in effective addition, 𝑓𝑙 ∈ [1,2)
and 	π‘“π‘ π‘Žπ‘› ∈ [0,2). Therefore, π‘“π‘ π‘’π‘š ∈ [1,4). It follows from the definition of the path
selection condition that, in effective subtractions π‘“π‘ π‘’π‘š ∈ (
,
r
, 2) in the R-path. The
ranges of π‘“π‘ π‘’π‘š are unified in these two cases to [1, 4) by multiplying the significands
by 2 in the case of effective subtraction (i.e., preshifting by one position to the left).
The unification of the range of the significand sum in effective subtraction and effective
addition simplifies the rounding circuitry. To simplify the notation and the
implementation of the path selection condition, the operands are also preshifted for
effective subtractions in the N-path.
It has to be noted that, in this way, the preshift is computed in the N-path
unconditionally, because in the N-path all operations are effective subtractions.
In the following, a few examples of values that include the conditional preshift are given
(it may be noted that an additional β€œp” is included in the names of the preshifted
versions) -
𝑓𝑙𝑝 =
2 βˆ™ 𝑓𝑙							𝑖𝑓	𝑆. 𝐸𝐹𝐹
				𝑓𝑙						π‘œπ‘‘β„Žπ‘’π‘Ÿπ‘€π‘–π‘ π‘’
π‘“π‘ π‘π‘Žπ‘› =
2 βˆ™ π‘“π‘ π‘Žπ‘›							𝑖𝑓	𝑆. 𝐸𝐹𝐹
				π‘“π‘ π‘Žπ‘›						π‘œπ‘‘β„Žπ‘’π‘Ÿπ‘€π‘–π‘ π‘’
π‘“π‘π‘ π‘’π‘š =
2 βˆ™ π‘“π‘ π‘’π‘š							𝑖𝑓	𝑆. 𝐸𝐹𝐹
				π‘“π‘’π‘š						π‘œπ‘‘β„Žπ‘’π‘Ÿπ‘€π‘–π‘ π‘’
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 14	
	
Based on the significand sum π‘“π‘π‘ π‘’π‘š, which includes the conditional preshift, the
path selection condition (1) can be rewritten as
𝐼𝑆_ 𝑅 ⟺	 𝑆. 𝐸𝐹𝐹	𝑂𝑅	 𝛿 β‰₯ 2	𝑂𝑅	π‘“π‘π‘ π‘’π‘š ∈ [2,4)
4.3 Reduction of IEEE Rounding Modes
The IEEE-754-1985 Standard defines four rounding modes - Round toward 0, round
toward ∞, round toward βˆ’βˆž, and round to nearest (even) [13]. Following Quach et al.
[25], the four IEEE rounding modes are reduced to three rounding modes: Round-to-
zero RZ, round-to-infinity RI, and round-to-nearest-up RNU. The discrepancy between
round-to-nearest-even and RNU is fixed by pulling down the LSB of the fraction (see
[8] for more details). In the rounding implementation in the R-path, the three rounding
modes RZ, RNU, and RI are further reduced to truncation using injection based
rounding [8]. The reduction is based on adding an injection that depends only on the
rounding mode. Let X = X0,X1,X2……XK denote the binary representation of a
significand with the value π‘₯ =	 𝑋 ∈ [1,2) for which 𝐾 β‰₯ 53 (double precision rounding
is trivial for k < 53), then the injection is defined by:
inj =
0																																				if	RZ	
25z{
																											if	RNU
25z{
βˆ’	25~
																		if	RI
For double precision and π‘šπ‘œπ‘‘π‘’	 ∈ {𝑅𝑍, π‘…π‘π‘ˆ, 𝑅𝐼} , the effect of adding 𝑖𝑛𝑗 is
summarized by:
𝑋 ∈ 1,2 ⟹ π‘Ÿπ‘›π‘‘dJ…? 𝑋 = 	π‘Ÿπ‘›π‘‘β€ β€‘( 𝑋 = 𝑖𝑛𝑗).
(3)
4.4 Sign-Magnitude Computation of a Difference
In this technique, the sign-magnitude computation of a difference is computed using
one’s complement representation [22]. This technique is applied in two situations:
1. Exponent difference: The sign-magnitude representation of the exponent difference is
used for two purposes:
a) The sign determines which operand is selected as the ”large” operand and
b) The magnitude determines the amount of the alignment shift.
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 15	
	
2. Significand difference: In case the exponent difference is zero and an effective
subtraction takes place, the significand difference might be negative. The sign of the
significand difference is used to update the sign of the result and the magnitude is
normalized to become the result’s significand.
Let A and B denote binary strings and let 𝐴 denote the value represented by A (i.e.
𝐴 =	 𝐴+ βˆ™ 2+
+ ). The technique is based on the following observation:
π‘Žπ‘π‘ ( 𝐴 βˆ’ 𝐡 ) =
𝐴 + 𝐡 + 1, 𝑖𝑓 𝐴 βˆ’ 𝐡 > 0
𝐴 + 𝐡 																	𝑖𝑓	 𝐴 βˆ’ 𝐡 ≀ 0
The actual computation proceeds as follows: The binary string D is computed such
that 𝐷 = 𝐴 + 𝐡 . D is referred to as the one’s complement lazy difference of A and
B. Two cases are considered to explain the scenario:
If the difference is positive, then 𝐷 is off by an ULP and is needed to be incremented.
However, to save delay, the increment is avoided as follows:
a) In the case of the exponent difference that determines the alignment shift amount,
the significands are preshifted by one position to compensate for the error.
b) In the case of the significand difference, the missing ULP is provided by computing
the incremented sum of 𝐴 	π‘Žπ‘›π‘‘	 𝐡 using a compound adder.
On the otherhand, if the exponent difference is negative, then the bits of D are
negated to obtain an exact representation of the magnitude of the difference.
4.5 Compound Addition
The technique of computing in parallel the sum of the significands as well as the
incremented sum is well known. The rounding decision controls which of the sums is
selected for the final result, thus enabling the computation of the sum and the rounding
decision in parallel technique. The technique suggested by Tyagi [34] for implementing
a compound adder is followed here. This technique is based on a parallel prefix adder
in which the carry-generate and carry-propagate strings, denoted by Gen_C and
Prop_C, are computed [3]. Let Gen_C[i] equal the carry bit that is fed to position i. The
bits of the sum S of the addends A and B are obtained as usual by:
𝑆 𝑖 = 𝑋𝑂𝑅(𝐴 𝑖 , 𝐡 𝑖 , 𝐺𝑒𝑛_𝑐[𝑖])
The bits of the incremented sum 𝑆𝐼 are obtained by:
𝑆𝐼 𝑖 = 𝑋𝑂𝑅(𝐴 𝑖 , 𝐡 𝑖 , 𝑂𝑅(𝐺𝑒𝑛Œ + , π‘ƒπ‘Ÿπ‘œπ‘_𝑐[𝑖]))
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 16	
	
Applications - There are two instances of a compound adder in this FP-addition
algorithm. One instance appears in the second pipeline stage of the R-path where the
delay analysis relies on the assumption that the MSB of the sum is valid one logic level
prior to the slowest sum bit.
The second instance of a compound adder appears in the N-path. In this case, the
problem that the compound adder does not β€œfit” in the first pipeline stage according to
the delay analysis, has been properly addressed. This critical path is broken by
partitioning the compound adder between the first and second pipeline stages as
following:
A parallel prefix adder placed in the first pipeline stage computes the carry-generate
and carry-propagate signals as well as the bitwise XOR of the addends. From these
three binary strings, the sum and incremented sum are computed within two logic
levels as described above. However, these two logic levels must belong to different
pipeline stages. Therefore, first, the three binary strings 𝑆 𝑖 , 𝑃 𝑖 =
𝐴 𝑖 	𝑋𝑂𝑅	𝐡 𝑖 	, π‘Žπ‘›π‘‘	𝐺𝑃_𝐢[𝑖] = 𝑂𝑅(𝐺𝑒𝑛_𝐢[𝑖], π‘ƒπ‘Ÿπ‘œπ‘_𝐢[𝑖]) , which are passed to the
second pipeline stage. In this way, the computation of the sum is already completed
in the first pipeline stage and only an XOR-line is required in the second pipeline stage
to compute also the incremented sum.
4.6 Approximate Counting of Leading Zeros
In the N-path, a resulting significand in the range (βˆ’1,1) must be normalized. The
amount of the normalization shift is determined by approximating the number of
leading zeros. Following Nielsen et al. [20], the number of leading zeros are
approximated so that a normalization shift by this amount yields a significand in the
range [1, 4). The final normalization is then performed by post-normalization. The input
used for counting leading zeros in this design is a borrow-save representation of the
difference. This design is amenable to partitioning into pipeline stages, and admits an
elegant correctness proof that avoids a tedious case analysis. Nielsen et. al. [20]
presented the following technique for approximately counting the number of leading
zeros. The method is based on a constant delay reduction of a borrowsave encoded
string to a binary vector. The number of leading zeros in the binary vector almost
equals the number of leading zeros in the binary representation of the absolute value
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 17	
	
of the number represented by the borrow-save encoded string. This reduction is
described below. The input consists of a borrow-save encoded digit string 𝐹 βˆ’1: 52 ∈
βˆ’1,0,1 54
. The borrow-save encoded string is computed 𝐹′
βˆ’2: 52 =
𝑃(𝑁(𝐹[βˆ’1: 52])), where P() and N() denote P-recoding and N-recoding [5], [20]. (P-
recoding is like a β€œsigned half-adder” in which the carry output has a positive sign, N-
recoding is similar but has an output carry with a negative sign). The correctness of
the technique is based on the following claim.
Claim 1. [20] - Suppose the borrow-save encoded string 𝐹′
βˆ’2: 52 is of the form
𝐹′
βˆ’2: 52 = 0Ε½
βˆ™ 𝜎 βˆ™ 𝑑[1: 54 βˆ’ π‘˜], where β‹… denotes concatenation of strings, 0k
denotes
a block of k zeros,𝜎 ∈ (βˆ’1,1) , and 𝑑 ∈ (βˆ’1,0,1)zβ€˜5Ε½
. Then, the following holds:
1) If 𝜎 = 1, then the value represented by the borrowsave encoded string 𝜎 βˆ™ t satisfies:
𝜎 + 𝑑[𝑖] βˆ™ 25+zβ€˜5Ε½
+., ∈ (
,
β€˜
, 1)
2) If 𝜎 = -1, then the value represented by the borrowsave encoded string 𝜎 βˆ™t satisfies:
𝜎 + 𝑑[𝑖] βˆ™ 25+zβ€˜5Ε½
+., ∈ βˆ’
{
r
, βˆ’
,
r
The implication of Claim 1 is that after PN-recoding, the number of leading zeros in
the borrow-save encoded string 𝐹′
[βˆ’2: 53] (denoted by k in the claim) can be used as
the normalization shift amount to bring the normalized result into one of two binades
(i.e., in the positive case either (
,
β€˜
,
,
r
) or [
,
r
, 1), and in the negative case after negation
either (
,
r
, 1) or [1,
{
r
). Since we are only interested in the number of leading zeros of
𝐹′
[βˆ’2: 53], this borrowsave encoded string may be reduced to a binary string by
applying bitwise- XOR. This technique is implemented so that the normalized
significand is in the range [1,4) as follows:
1. In the positive case, the shift amount is 𝑙𝑧2 = π‘˜ = π‘™β€œ?”J(𝐹′
[βˆ’2: 52]). (Signal LZP2[5:0]
in Fig. 3).
2. In the negative case, the shift amount is 𝑙𝑧1 = π‘˜ βˆ’ 1 = π‘™β€œ?”J(𝐹′
[βˆ’1: 52]). (Signal
LZP1[5:0] in Fig. 3).
The reduction of the borrow-save encoded string to the binary vector involves a P-
recoding, an N-recoding, and bitwise XOR. Hence, the cost per bit is two half-adders
and a XOR-gate (i.e., 3 XOR-gates and two AND-gates). The delay of this reduction
is 3 XOR-gates. In a recent survey of leading zeroes anticipation and detection by
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 18	
	
Schmookler and Nowka [26], a method that is faster by roughly one logic level is
presented. The steps of this method are as follows:
a) Transform the negative vector of the borrow-save encoded string into a positive vector
by using 2’s complement negation (the increment can be avoided by padding a carry-
save 2 digit from the right). Hence, the problem of computing the normalization shift
amount is reduced to counting the number of leading zeros (ones) of the binary
representation of the sum in case the sum is positive (negative).
b) The position of the leftmost 1 in the binary representation of the sum (when this sum
is positive) is computed as follows: Reduce the carry-save encoded string into a binary
string as follows: 𝛼+ = 𝑋𝑂𝑅(𝑝+, π‘˜β€’R,), where 𝑝+ is the 𝑖th β€œpropagate”- bit (namely, 𝑝+ is
the XOR of the bits in position i of the carry-save encoded string) and π‘˜+ is the ith β€œkill”-
bit (namely, the NOR of the above bits). The position of the leftmost one in the vector
𝛼+ is either the position of the leading digit or one position to the right of the leading
digit. In case the sum is negative, the position of the leftmost zero is computed by the
following reduction: 𝛽+ = 𝑋𝑂𝑅(𝑝+, 𝐺‒R,), where 𝐺+ is the ith β€œgenerate”- bit (namely, the
AND of the bits in position i of the carry-save encoded string).
Each of the binary vectors 𝛼 and 𝛽 are fed to a priority encoder. The cost of this
method is five gates per bit (i.e., 3 XOR-gates, one NOR-gate, and one AND-gate).
The delay is two logic levels (two XOR-gates). A further improvement described in [26]
is to the following reductions:
𝛼+
β€²
= 𝐴𝑁𝐷(𝑃‒, π‘˜β€’R,).
This improvement reduces the cost per bit to three AND-gates, one XOR-gate, and
one NOR-gate. The delay is reduced to one XOR-gate plus an AND-gate.
4.7 Precomputation of Postnormalization Shift
In the R-path, two choices for the rounded significand sum are computed by the
compound adder (see Section 4.5). Either the β€œsum” or the β€œincremented sum” output
of the compound adder is chosen for the rounded result. Because the significand after
the rounding selection is in the range [1, 4) (due to the preshifts from Section 4.2 only
these two binades have to be considered for rounding and for the postnormalization
shift), postnormalization requires at most a right-shift by one bit position. Because the
outputs of the compound adder have to wait for the computation of the rounding
decision (selection based on the range of the sum output), so the postnormalization
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 19	
	
shift is precomputed on both outputs of the compound adder before the rounding
selection, so that the rounding selection already outputs the normalized significand
result of the R-path.
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 20	
	
SECTION V. OVERVIEW OF THE PRESENT ALGORITHM
In this section, an overview of this FP-adder algorithm is given. A brief description of
the partitioning of the design, implementing and integrating the optimization
techniques from the previous section is presented. The algorithm is a two-staged
pipeline partitioned into two parallel paths called the R-path and the N-path. The final
result is selected between the outcomes of the two paths based on the signal IS_R
(see (2)). Several block diagrams of this algorithm are depicted in Fig. 1. An overview
of the two paths is in the following.
Overview R-Path
The R-path works under the assumption that
1. An effective addition takes place or
2. An effective subtraction with a significand difference (after preshifting) greater than or
equal to 2 takes place or
3. The absolute value of the exponent difference 𝛿 is larger than or equal to 2.
Note that these assumptions imply that the sign-bit of the sum equals SL. The
algorithm performs various operations on the significands before adding them.
These operations include: swapping, negation, preshifting, and alignment.
For the sake of brevity, it is referred in the outline to a bit string that originates from the
significands 𝑓𝑠 or 𝑓𝑙 as the "small operand" or "large operand", respectively. The R-
path is divided into two pipeline stages. In the first pipeline stage, the exponent
difference is computed and the significands are swapped. In effective subtraction, both
significands are preshifted and only the small operand is negated. The small operand
is then aligned according to the exponent difference.
In the Significand One’s Complement box, the small operand is negated (recall that
one’s complement representation is used).
In the Align 1 box, the small operand is 1) preshifted to the right if an effective
subtraction takes place and 2) aligned to the left by one position if the exponent
difference is positive. The alignment to the left by one position compensates for the
error in the computation of the exponent difference when the difference is positive due
to the one’s complement representation (see Section 4.4). In the Swap box, the large
and small operands are selected according to the sign of the exponent difference.
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 21	
	
Note that since swapping takes place only after complementation and Align 1, one
must consider complementing and aligning both significands; only one of these
significands is used as the small operand. In the Align2 box, the small operand is
aligned according to the computed exponent difference. The exponent difference box
computes the swap decision and the alignment shift amount. This box is further
partitioned into two paths for medium and large exponent differences. A detailed block
diagram for the implementation of the first cycle of the R-path is depicted in Fig. 2. The
input to the second pipeline stage consists of the large and small operands. The goal
is to compute their sum and round it while taking into account the error due to the one’s
complement representation for effective subtractions. The second pipeline stage is
very similar to the rounding algorithm presented in [8]. The large and small operands
are divided into a low part and a high part that are processed in parallel. The LSB of
the final result is computed based on the low part and the range of the sum. The rest
of the final result is computed based on the high part (i.e., either the sum or the
incremented sum of the high part). The outputs of the compound adder are
postnormalized before the rounding selection is performed. A detailed block diagram
for the implementation of the second cycle of the R-path is depicted in Fig. 2.
Overview N-Path
The N-path works under the assumption that an effective subtraction takes place, the
significand difference (after the swapping of the addends and preshifting) is less than
2, and the absolute value of the exponent difference is less than 2. The N-path has
the following properties:
The exponent difference must be in the set {-1,0,1}. Hence, the exponent difference
can be computed by subtracting the two LSBs of the exponent strings. The alignment
shift is by at most one position. This is implemented in the exponent difference
prediction box.
An effective subtraction takes place, hence, the small operand is always negated. The
one’s complement representation has been used for the negated subtrahend.
The significand difference (after swapping and preshifting) is in the range (-2,2) and
can be exactly represented using 52 bits to the right of the binary point. Hence, no
rounding is required
Based on the exponent difference prediction, the significands are swapped and
aligned by at most one bit position in the align and swap box. The leading zero
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 22	
	
approximation and the significand difference are then computed in parallel. The result
of the leading zero approximation is selected based on the sign of the significand
difference according to Section 4.6 in the leading zero selection box. The conversion
box computes the absolute value of the difference (Section 4.4) and the normalization
and postnormalization box normalizes the absolute significand difference as a result
of the N-path.
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 23	
	
	
Fig.	1	Structure	of	the	FP-addition	algorithm	in	different	levels.	Vertical	dashed	line	separates	two	pipelines:	R-
path	and	N-path.	Horizontal	dashed	line	separates	two	pipeline	stages.	
High	level	view	of	the	FP-Adder	algorithm
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 24	
	
	
Fig.2	Detailed	description	of	the	R-path
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 25	
	
SECTION VI. SPECIFICATION AND DETAILED
DESCRIPTION
In this section, specifically the computations in the R-path are discussed since they
cover most of the varieties of the input. A more detailed study is necessary for the
implementing prospects in the N-path. The two stages of the R-path are specified as
following.
Specification: R-Path First Cycle
The computation performed by the first pipeline stage in the R-path outputs the
significands and , represented by FLP[1 : 52] and FSOPA[1 : 116]. The
significands and are defined by,
(𝑓𝑙𝑝 , π‘“π‘ π‘œπ‘π‘Ž ) =
𝑓𝑙, π‘“π‘ π‘Žπ‘› 																																				 	 𝑖𝑓	𝑆. 𝐸𝐹𝐹 = 0
(2 βˆ™ 𝑓𝑙 , 2 βˆ™ π‘“π‘ π‘Žπ‘› βˆ’ 25116
)														π‘œπ‘‘ β„Žπ‘’π‘Ÿπ‘€π‘–π‘ π‘’
The detailed view in Fig. 1 depicts how the computation of FLP[-1 : 52] and FSOPA[-
1 : 116] is performed. For each box in the high level view, a region surrounded by
dashed lines is depicted to assist the reader in matching the regions with blocks.
The exponent difference is computed for two ranges: The medium exponent difference
interval consists of [-63; 64], and the big exponent difference intervals consist of [-∞,-
64] and [65,∞]. The outputs of the exponent difference box are specified as follows:
Loosely speaking, the 𝑆𝐼𝐺𝑁_𝑀𝐸𝐷 and 𝑀𝐴𝐺_𝑀𝐸𝐷 are the sign-magnitude
representations of 𝛿, if 𝛿 is in the medium exponent difference interval. Formally,
βˆ’1 ]Λœβ„’Ε‘_β€ΊbΕ“
βˆ™ 𝑀𝐴𝐺_𝑀𝐸𝐷 =
𝛿 βˆ’ 1																									𝑖𝑓	64 β‰₯ 𝛿 β‰₯ 1
		𝛿																													𝑖𝑓	0 β‰₯ 𝛿 β‰₯	βˆ’63	
"don't care" 													 otherwise
The reason for missing 𝛿 by 1 in the positive case is due to the one’s complement
subtraction of the exponents. This error term is compensated for in the Align1 box.
SIGN_BIG is the sign bit of exponent difference 𝛿. 𝐼𝑆_𝐡𝐼𝐺 is a flag defined by:
𝐼𝑆_𝐡𝐼𝐺 =
1												𝑖𝑓	𝛿 β‰₯ 65	𝑂𝑅	𝛿 ≀	βˆ’64
0																																				π‘œπ‘‘β„Žπ‘’π‘Ÿπ‘€π‘–π‘ π‘’
In the big exponent difference intervals, the β€œrequired” alignment shift is at least 64
positions. Since all alignment shifts of 55 positions or more are equivalent (i.e., beyond
the sticky-bit position), the shift amount in this case may be limited. In the Align2
region, one of the following alignment shift occurs:
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 26	
	
a. a fixed alignment shift by 64 positions in case the exponent difference belongs to the
big exponent difference intervals (this alignment ignores the preshifting altogether) or
b. an alignment shift by π‘šπ‘Žπ‘”_π‘šπ‘’π‘‘ positions in case the exponent difference belongs to
the medium exponent difference interval.
In the One’s Complement box, the signals FAO, FBO, and S.EFF are computed. The
FAO and FBO signals are defined by
FAO[0 : 52],FBO[0 : 52] =
𝐹𝐴 0: 52 , 𝐹𝐡 0: 52 																							𝑖𝑓	𝑆. 𝐸𝐹𝐹 = 0
π‘›π‘œπ‘‘ 𝐹𝐴 0: 52 , π‘›π‘œπ‘‘ 𝐹𝐡 0: 52 							π‘œπ‘‘β„Žπ‘’π‘Ÿπ‘€π‘–π‘ π‘’
The computations performed in the Preshift and Align 1 region are relevant only if the
exponent difference is in the medium exponent difference interval. The significands
are pre-shifted if an effective subtraction takes place. After the preshifting, an
alignment shift by one position takes place if sign med = 1. Table 1 summarizes the
specification of FSOP’[-1 : 53].
In the Swap region, the large operand is selected based on sign big. The small
operand is selected for the medium exponent difference (based on sign med) interval
and for the large exponent difference interval (based on sign big).
The Preshift 2 region deals with preshifting the minuend in case an effective
subtraction takes place.
Specification: R-path Second Cycle
Table 1 – Value of FSOP’ [-1:53] according to Fig. 1
SIGN_MED S.EFF Pre
shift
(Left)
Align
shift
(Right)
Accumulated
right shift
FSOP’[-1:53]
0 0 0 1 1 (00,FBO[0:52])
1 1 1 0 (1,FBO[0:52],1)
1 0 0 0 0 (0,FAO[0:52],0)
1 1 0 -1 (FAO[0:52],11)
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 27	
	
The input to the second cycle consists of: The sign bit SL, a representation of the
exponent el, the significand strings FLP[-1:52] and FSOPA[-1:116], and the rounding
mode.
Together with the sign bit SL, the rounding mode is reduced to one of the three
rounding modes: RZ, RNE or RI (see Section 4.3). The output consists of the sign bit
SL, the exponent string, and the normalized and rounded significand 𝑓_π‘“π‘Žπ‘Ÿ ∈	[1,2)
represented by F_FAR[0 : 52]. If the significand sum (after pre-shifting) is greater than
or equal to 1, then the output of the second cycle of the R-path satisfies:
π‘Ÿπ‘›π‘‘ π‘ π‘’π‘š = π‘Ÿπ‘›π‘‘( βˆ’1 >;
βˆ™ (𝑓𝑙𝑝 + π‘“π‘ π‘œπ‘π‘Ž + 𝑆. 𝐸𝐹𝐹 βˆ™ 25,,ΕΈ
))
In effective subtraction, 25,,ΕΈ
is added to correct the sum of the one’s complement
representations to the sum of two’s complement representations by the lazy increment
from the first clock cycle. Fig. 1 depicts the partitioning of the computations in the 2nd
cycle of the R-path into basic blocks and specifies the input- and output-signals of
each of these basic blocks.
Details: R-Path First Cycle
Fig. 2 depicts a detailed block diagram of the first cycle of the R-path. The
nonstraightforward regions are described below.
The Exponent Difference region is implemented by cascading a 7-bit adder with a 5-
bit adder. The 7-bit adder computes the lazy one’s complement exponent difference if
the exponent difference is in the medium interval. This difference is converted to a
sign and magnitude representation denoted by 𝑠𝑖𝑔𝑛_π‘šπ‘’π‘‘ and π‘šπ‘Žπ‘”_π‘šπ‘’π‘‘ . The
cascading of the adders enables us to evaluate the exponent difference (for the
medium interval) in parallel with determining whether the exponent difference is in the
big range. The SIGN_BIG signal is simply the MSB of the lazy one’s complement
exponent difference. The IS_BIG signal is computed by OR-ing the bits in positions [6
: 10] of the magnitude of the lazy one’s complement exponent difference. This explains
why the medium interval is not symmetric around zero.
The Align 1 region depicted in Fig. 2 is an optimization of the Preshift and Align1 region
in Fig. 1. The reader can verify that the implementation of the Align 1 region satisfies
the specification of FSOP’[-1 : 53] that is summarized in Table 1.
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 28	
	
The following condition is computed during the computation of the exponent
difference 𝐼𝑆_𝑅1	 ⟺	( π‘’π‘Ž βˆ’ 𝑒𝑏 ) β‰₯ 2
⟺ π‘‚π‘…π‘‘π‘Ÿπ‘’π‘’(𝐼𝑆_𝐡𝐼𝐺	, 𝑀𝐴𝐺_𝑀𝐸𝐷[5: 0], 𝐴𝑁𝐷(𝑀𝐴𝐺_𝑀𝐸𝐷(0), 𝑁𝑂𝑇(𝑆𝐼𝐺𝑁_𝐡𝐼𝐺))),
which will be used later for the selection of the valid path. Note that, the exponent
difference is computed using one’s complement representation. This implies that the
magnitude is off by one when the exponent difference is positive. In particular, the
case of the exponent difference equal to 2 yields a magnitude of 1 and a sign bit of 0.
This is why the expression AND(MAG_MED[0],NOT(SIGN BIG)) appears in the OR-
tree used to compute the IS_R signal.
Details: R-Path Second Cycle
Fig. 2 depicts a detailed block diagram of the second cycle of the R-path. The details
of the implementation are described below. Our implementation of the R-path in the
second cycle consists of two parallel paths called the upper part and the lower part.
The upper part deals with positions [-1:52] of the significands and the lower part deals
with positions [53:116] of the significands. The processing of the lower part has to take
into account two additional values: the rounding injection, which depends only on the
reduced rounding mode, and the missing ULP (2_116) in effective subtraction due to
the one’s complement representation. The processing of FSOPA[53:116], INJ[53:116]
and S.EFF βˆ™ 25,,ΕΈ
is based on:
𝑇𝐴𝐼𝐿[52: 116] =	
𝐹𝑆𝑂𝑃𝐴[53: 116] + 𝐼𝑁𝐽[53: 116] 																																	𝑖𝑓	𝑆. 𝐸𝐹𝐹 = 0
𝐹𝑆𝑂𝑃𝐴[53: 116] + 𝐼𝑁𝐽[53: 116] + 25,,ΕΈ
																	𝑖𝑓	𝑆. 𝐸𝐹𝐹 = 1
The bits 𝐢[52], 𝑅’, 𝑆’ are defined by 𝐢[52] = 𝑇𝐴𝐼𝐿[52] , R’=TAIL[53],
S’=OR(TAIL[54:116]). They are computed by using a 2-bit injection string that has a
different value for effective addition and subtraction.
Effective addition- Let 𝑆:…… denote the sticky bit that corresponds to FSOPA[54:116],
then 𝑆:……= OR(FSOPA[54], . . . ,FSOPA[116]). The injection can be restricted to two
bits INJ[53 : 54] and we simply perform a 2-bit addition to obtain the three bits C[52],R’,
S’:
(𝐢 52 , 𝑅′
, 𝑆′) = 𝐼𝑁𝐽[53: 54] + (𝐹𝑆𝑂𝑃𝐴 53 , 𝑆:……) .
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 29	
	
Effective subtraction- In this case, we still have to add to FSOPA the missing 25,,ΕΈ
that we neglected to add during the first cycle. Let 𝑆>kS denote the sticky bit that
corresponds to bit positions [54:116] in the binary representation of
𝐹𝑆𝑂𝑃𝐴[54: 116] + 25,,ΕΈ
, then 𝑆>kS = 𝑂𝑅(𝑁𝑂𝑇(𝐹𝑆𝑂𝑃𝐴[54: 116])),
The addition of 25,,ΕΈ
can create a carry to position [53], which we denote by C[53].
The value of C[53] is one if FSOPA[54:116] is all ones, in which case the addition of
25,,ΕΈ
creates a carry that ripples to position [53]. Therefore, C[53]=NOT(𝑆>kS). Again,
the injection can be restricted to two bits INJ[53:54], and we compute C[52],R’, S’ by
(𝐢 52 , 𝑅′
, 𝑆′) = 𝐼𝑁𝐽[53: 54] + (𝐹𝑆𝑂𝑃𝐴 53 , 𝑆:……) + 2𝐢[53]
The result of this addition cannot be greater than 7 βˆ™ 25zβ€˜
, because C[53]= NOT(𝑠>kS).
A fast implementation of the computation of C[52], R’, S’ proceeds as follows. Let S
=𝑆:…… in effective addition, and S= 𝑆>kS in effective subtraction. Based on S.EFF,
FSOPA[53] and INJ[53:54], the signals C[52], R’, S’ are computed in two paths: one
assuming that S = 1 and the other assuming that S = 0.
Fig. 2 depicts a naive method of computing the sticky bit S to keep the presentation
structured rather than obscure it with optimizations. A conditional inversion of the bits
of FSOPA[54 : 116] is performed by XOR-ing the bits with S.EFF. The possibly
inverted bits are then input to an OR-tree. This suggestion is somewhat slow and
costly. A better method would be to compute the OR and AND of (most of) the bits of
FS[54 : 116] during the alignment shift in the first cycle. The advantages of advancing
(most of) the sticky bit computation to the first cycle is twofold:
1) There is ample time during the alignment shift whereas the sticky bit should be ready
after at most 5 logic levels in the second cycle; and
2) This saves the need to latch all 63 bits (corresponding to FS[54 : 116]) between the
two pipeline stages. The upper part computes the correctly rounded sum (including
post-normalization) and uses for the computation of the strings FLP[-1:52],FSOPA[-
1:52], and (C[52],R’, S’).
The rest of the algorithm is identical to the rounding presented, analyzed, and proven
in [8] for FP multiplication.
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 30	
	
SECTION VII. DELAY ANALYSIS REPORTING
In this section, the delay of this FP-adder algorithm has been reported. The original
analysis was made by using the Logical Effort Model [33]. In the analysis of gate
delays, this delay model allows a rigorous treatment of the contributions of fanouts
and gate sizing independent of the used technology, process, voltage, or feature size.
Based on this delay model, a proposal of the delay-optimal implementation of this FP
addition algorithm was made by the authors, by considering optimized gate sizing and
driver insertion. For comparisons the delay estimations are scaled from this model to
units of FO4 inverter delays (inverter delay with a fanout of four). The use of this unit
has been a popular choice in literature on circuit design whenever delays are to be
determined in technology-independent terms. It has been demonstrated for several
examples that delay estimations in FO4 delay units are rather insensitive to choices
of technological parameters [12]. The description of, how the delay analysis, provided
by the authors, differ from the traditional one, has been described in [28], [29]. In the
more recent analysis, each gate is assigned an independent size and delay. This
allows for precise modeling of fanout contributions and full custom gate size
optimizations for minimum delay. The previously estimated the total delay of this
algorithm was 24 logic levels between latches, which could be evenly partitioned into
two stages with 12 logic levels each. One logic level (XOR delay with a fanout of 1)
has a delay of 1.6 FO4 inverter delay units in the current framework (this is because
d(XOR1) = 8 logical effort delays and d(FO4) = 5 logical effort delays), so that the
previous analysis estimated the delay of this algorithm to be 38:4 FO4 delays. This is
approximately 19 FO4 delays for each of two pipeline stages between latches.
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 31	
	
SECTION VIII. IMPLEMENTATION
This algorithm has been implemented using the Xilinx ISE v.9.2i with a XC2V6000
device of virtex2 family, package FF1152. The top level source is Hardware
Description Language type, with a XST(VHDL/VERILOG) environment. The basic
block level implementation is done with VHDL following the connections as per the
algorithm. Instead of a compound adder as it was in the algorithm a simpler model of
addition circuitry has been used in the implementation level. A further study is
necessary for the scope of using the aforementioned adder circuit along with the
implementation of the β€œN-path” mentioned in the algorithm.
The detailed block diagram of the block level implementation is given inFig.1 and Fig.2.
The system requirement for the above mentioned environment is given below.
The synthesis report of the generated circuit model is presented here.
The general debugging tool used for the implementation process is the ISE editor and
Modelsim v.6.3.
The implementation process is shown hierarchically:
Algorithm undestanding
Developing building blocks
(in VHDL)
Simulation of the abstact
model Debugging
Pipelining and optimizing
Chip implementation
(RC10)
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 32	
	
System Information
Machine name: ADRIN-E68E783C5
Operating System: Windows XP Professional (5.1, Build 2600) Service Pack 2
(2600.xpsp_sp2_rtm.040803-2158)
Language: English (Regional Setting: English)
System Manufacturer: Hewlett-Packard
System Model: HP xw4300 Workstation
BIOS: Default System BIOS
Processor: Intel(R) Pentium(R) 4 CPU 3.80GHz (2 CPUs)
Memory: 1536MB RAM
Page File: 485MB used, 2945MB available
Windows Dir: C:WINDOWS
DirectX Version: DirectX 9.0c (4.09.0000.0904)
DX Setup Parameters: Not found
DxDiag Version: 5.03.2600.2180 32bit Unicode
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 33	
	
SYNTHESIS REPORT:
Synthesis Options Summary
Source Parameters ----
Input File Name : "top_module.prj"
Input Format : mixed
Ignore Synthesis Constraint File : NO
Target Parameters ----
Output File Name : "top_module"
Output Format : NGC
Target Device : xc2v6000-4-ff1152
Source Options ----
Top Module Name : top_module
Automatic FSM Extraction : YES
FSM Encoding Algorithm : Auto
Safe Implementation : No
FSM Style : lut
RAM Extraction : Yes
RAM Style : Auto
ROM Extraction : Yes
Mux Style : Auto
Decoder Extraction : YES
Priority Encoder Extraction : YES
Shift Register Extraction : YES
Logical Shifter Extraction : YES
XOR Collapsing : YES
ROM Style : Auto
Mux Extraction : YES
Resource Sharing : YES
Asynchronous To Synchronous : NO
Multiplier Style : auto
Automatic Register Balancing : No
Target Options ----
Add IO Buffers : YES
Global Maximum Fanout : 500
Add Generic Clock Buffer(BUFG) : 16
Register Duplication : YES
Slice Packing : YES
Optimize Instantiated Primitives : NO
Convert Tristates To Logic : Yes
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 34	
	
Use Clock Enable : Yes
Use Synchronous Set : Yes
Use Synchronous Reset : Yes
Pack IO Registers into IOBs : auto
Equivalent register Removal : YES
General Options ----
Optimization Goal : Speed
Optimization Effort : 1
Library Search Order : top_module.lso
Keep Hierarchy : NO
RTL Output : Yes
Global Optimization : AllClockNets
Read Cores : YES
Write Timing Constraints : NO
Cross Clock Analysis : NO
Hierarchy Separator : /
Bus Delimiter : <>
Case Specifier : maintain
Slice Utilization Ratio : 100
BRAM Utilization Ratio : 100
Verilog 2001 : YES
Auto BRAM Packing : NO
Slice Utilization Ratio Delta : 5
HDL Synthesis Report
Macro Statistics
# Adders/Subtractors : 5
11-bit adder : 1
5-bit adder carry in : 1
53-bit adder : 2
8-bit adder : 1
# Registers : 12
1-bit register : 5
11-bit register : 3
118-bit register : 1
52-bit register : 2
54-bit register : 1
# Latches : 1
118-bit latch : 1
# Multiplexers : 1
11-bit 4-to-1 multiplexer : 1
# Logic shifters : 1
118-bit shifter logical right : 1
# Xors : 16
1-bit xor2 : 15
1-bit xor3 : 1
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 35	
	
Final Report
Final Results
RTL Top Level Output File Name : top_module.ngr
Top Level Output File Name : top_module
Output Format : NGC
Optimization Goal : Speed
Keep Hierarchy : NO
Design Statistics
# IOs : 195
Cell Usage:
# BELS : 1316
# GND : 1
# LUT1 : 52
# LUT2 : 141
# LUT2_D : 3
# LUT2_L : 1
# LUT3 : 284
# LUT3_D : 3
# LUT4 : 441
# LUT4_D : 4
# LUT4_L : 3
# MUXCY : 111
# MUXF5 : 161
# VCC : 1
# XORCY : 110
# Flip Flops/Latches : 308
# FD : 7
# FDC : 240
# FDS : 4
# LD_1 : 57
# Clock Buffers : 2
# BUFG : 1
# BUFGP : 1
# IO Buffers : 194
# IBUF : 130
# OBUF : 64
Device utilization summary:
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 36	
	
Selected Device : 2v6000ff1152-4
Number of Slices: 528 out of 33792 1%
Number of Slice Flip Flops: 308 out of 67584 0%
Number of 4 input LUTs: 932 out of 67584 1%
Number of IOs: 195
Number of bonded IOBs: 195 out of 824 23%
Number of GCLKs: 2 out of 16 12%
TIMING REPORT
NOTE: THESE TIMING NUMBERS ARE ONLY A SYNTHESIS ESTIMATE.
FOR ACCURATE TIMING INFORMATION PLEASE REFER TO THE TRACE
REPORT GENERATED AFTER PLACE-and-ROUTE.
Clock Information:
-----------------------------------------------+-----------------------------------------+-------+
Clock Signal | Clock buffer(FF name) | Load |
-----------------------------------------------+-----------------------------------------+-------+
Clock | BUFGP | 251 |
-----------------------------------------------+-----------------------------------------+--------|
seff1 (Inst_compliment5353/ | | |
Mxor_s_temp_xo<1>1:O) | BUFG(*)(Inst_allignment/ | |
|goout_101) | 57 |
-----------------------------------------------+-----------------------------------------+--------+
Asynchronous Control Signals Information:
-----------------------------------+-----------------------------+--------+
Control Signal | Buffer(FF name) | Load |
-----------------------------------+------------------------------+-------+
Preset | IBUF | 240 |
-----------------------------------+------------------------------+-------+
Timing Summary:
Speed Grade: -4
Minimum period: 10.631ns (Maximum Frequency: 94.061MHz)
Minimum input arrival time before clock: 1.712ns
Maximum output required time after clock: 16.396ns
Maximum combinational path delay: No path found
Timing Detail:
All values displayed in nanoseconds (ns)
Timing constraint: Default period analysis for Clock 'clock'
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 37	
	
Clock period: 10.631ns (frequency: 94.061MHz)
Total number of paths / destination ports: 44285 / 125
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 38	
	
PLACE AND ROUTE REPORT:
Release 9.2i par J.36
Copyright (c) 1995-2007 Xilinx, Inc. All rights reserved.
ADRIN-E68E783C5:: Wed Oct 05 14:42:59 2011
Device speed data version: "PRODUCTION 1.121 2007-04-13".
Device Utilization Summary:
Number of BUFGMUXs 2 out of 16 12%
Number of External IOBs 195 out of 824 23%
Number of LOCed IOBs 0 out of 195 0%
Number of SLICEs 546 out of 33792 1%
Overall effort level (-ol): Standard
Placer effort level (-pl): High
Placer cost table entry (-t): 1
Router effort level (-rl): Standard
REAL time consumed by placer: 20 secs
CPU time consumed by placer: 19 secs
Writing design to file top_module.ncd
Total REAL time to Placer completion: 21 secs
Total CPU time to Placer completion: 20 secs
Total REAL time to Router completion: 40 secs
Total CPU time to Router completion: 39 secs
Generating Clock Report:
+---------------------+------------------+----------+---------+------------------+--------------------+
| Clock Net | Resource |Locked |Fanout |Net Skew(ns) |Max Delay(ns) |
+---------------------+------------------+----------+---------+------------------+--------------------+
| clock_BUFGP | BUFGMUX3P| No | 197 | 0.353 | 2.396 |
+---------------------+------------------+----------+---------+------------------+---------------------+
| Seff |BUFGMUX4P| No | 53 | 0.095 | 2.134 |
+---------------------+------------------+----------+---------+------------------+---------------------+
* Net Skew is the difference between the minimum and maximum routing
only delays for the net. Note this is different from Clock Skew which
is reported in TRCE timing report. Clock Skew is the difference between
the minimum and maximum path delays which includes logic delays.
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 39	
	
The Delay Summary Report
The NUMBER OF SIGNALS NOT COMPLETELY ROUTED for this design is: 0
The AVERAGE CONNECTION DELAY for this design is: 2.675
The MAXIMUM PIN DELAY IS: 14.081
The AVERAGE CONNECTION DELAY on the 10 WORST NETS is: 9.965
Listing Pin Delays by value: (nsec)
d < 3.00 < d < 6.00 < d < 9.00 < d < 12.00 < d < 15.00 d >= 15.00
--------- --------- --------- --------- --------- ---------
2723 702 391 79 36 0
Summary report for the RC10 chip implementation:
Clock: 110 MHz
ISE Project Status
Project File: FPadder.ise Current State: Programming File Generated
Module Name: top_module
β€’ Errors:
No Errors
Target Device: xc3s1500l-4fg320
β€’ Warnings:
20 Warnings
Product Version: ISE 9.2i
β€’ Updated:
Tue Jun 26 16:00:21 2012
ISE Partition Summary
No partition information was found.
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 40	
	
Device Utilization Summary
Logic Utilization Used Available Utilization Note(s)
Number of Slice Flip Flops 421 26,624 1%
Number of 4 input LUTs 492 26,624 1%
Logic Distribution
Number of occupied Slices 491 13,312 3%
Number of Slices containing only related logic 491 491 100%
Number of Slices containing unrelated logic 0 491 0%
Total Number of 4 input LUTs 668 26,624 2%
Number used as logic 492
Number used as a route-thru 139
Number used for Dual Port RAMs 36
Number used as Shift registers 1
Number of bonded IOBs 39 221 17%
IOB Flip Flops 15
Number of Block RAMs 1 32 3%
Number of GCLKs 4 8 50%
Number of DCMs 2 4 50%
Total equivalent gate count for design 89,436
Additional JTAG gate count for IOBs 1,872
Performance Summary
Final Timing Score: 0 Pinout Data: Pinout Report
Routing Results: All Signals Completely Routed Clock Data: Clock Report
Timing Constraints: All Constraints Met
Detailed Reports
Report Name Status Generated Errors Warnings Infos
Synthesis Report
Translation Report Current Tue Jun 26 15:59:03 2012 0 0 0
Map Report Current Tue Jun 26 15:59:15 2012 0 3 Warnings 4 Infos
Place and Route Report Current Tue Jun 26 15:59:52 2012 0 9 Warnings 1 Info
Static Timing Report Current Tue Jun 26 16:00:00 2012 0 8 Warnings 2 Infos
Bitgen Report Current Tue Jun 26 16:00:19 2012 0 0 1 Info
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 41	
	
SCREENSHOTS
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 42	
	
				
	
	
SCREEN SHOT FOR THE SIMULATION RESULT
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 43	
	
		
	
SCREEN SHOT OF THE BEHAVIORAL SIMULATION
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 44	
	
SCREEN SHOT OF THE POST ROUTE SIMULATION
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 45	
	
SCREEN SHOT OF THE ADDER5 COMPONENT SCHEMATIC
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 46	
	
SCREEN SHOT OF THE TOP SCHEMATIC
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 47	
	
SCREEN SHOT OF THE RC10 FLASH INTERFACE
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 48	
	
SECTION VIII. FUTURE SCOPE
The IEEE double precision FP adder circuit finds usefulness in numerous occasions.
Therefore this present implementation can be stretched to cover up the whole
algorithmic process and it has a greater affinity for the optimization of the total process,
including the rounding mechanisms and the addition process. Also the computation of
the carry and sticky bit requires a lot of logical efforts which are can be minimized.
Upon optimization this can be used in various floating point operation incorporating
with different FPGAs.
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 49	
	
BIBLIOGRAPHY
[1] S. Bar-Or, G. Even, and Y. Levin, β€œGeneration of Representative Input Vectors for
Parametric Designs: from Low Precision to High Precision,” Integration, the VLSI J.,
vol. 36, issues 1-2, pp. 69-82, Sept. 2003.
[2] A. Beaumont-Smith, N. Burgess, S. Lefrere, and C. Lim, β€œReduced Latency IEEE
Floating-Point Standard Adder Architectures,” Proc.14th IEEE Symp. Computer
Arithmetic, pp. 35-43, 1999.
[3] R. Brent and H. Kung, β€œA Regular Layout for Parallel Adders,” IEEE Trans. On
computers, vol. 31, no. 3, pp. 260-264, Mar. 1982.
[4] J. Bruguera and T. Lang, β€œUsing the Reverse-Carry Approach for Double-Datapath
Floating-Point Addition,” Proc. 15th IEEE Int’l Symp. Computer Arithmetic (Arith15),
pp. 203-210, June 2001.
[5] M. Daumas and D. Matula, β€œRecoders for Partial Compression and Rounding,”
Technical Report RR97-01, Ecole Normale Superieur de Lyon, LIP 1996.
[6] L. Eisen, T. Elliott, R. Golla, and C. Olson, β€œMethod and System for Performing a
High Speed Floating Point Add Operation.” IBM Corporation, US patent 5790445,
1998.
[7] G. Even, S. MuΒ¨ ller, and P. Seidel, β€œDual Precision IEEE Floating- Point Multiplier,”
INTEGRATION The VLSI J., vol. 29, no. 2, pp. 167-180, Sept. 2000.
[8] G. Even and P.-M. Seidel, β€œA Comparison of Three Rounding Algorithms for IEEE
Floating-Point Multiplication,” IEEE Trans Computers, vol. 49, no. 7, pp. 638-650, July
2000.
[9] P. Farmwald, β€œOn the Design of High Performance Digital Arithmetic Units,” PhD
thesis, Stanford Univ., Aug. 1981
[10] P. Farmwald, β€œBifurcated Method and Apparatus for Floating-Point Addition with
Decreased Latency Time,” US patent 4639887,1987.
[11] V. Gorshtein, A. Grushin, and S. Shevtsov, β€œFloating Point Addition Methods and
Apparatus.” Sun Microsystems, US patent 5808926, 1998.
[12]M.Horowitz,β€œVLSI Scaling for Architects,” http://velox.stanford.
du/papers/VLSIScaling.pdf, 2000.
[13] β€œIEEE Standard for Binary Floating Point Arithmetic, ”ANSI/ IEEE754-1985, 1985.
[14] T. Ishikawa, β€œMethod for Adding/Subtracting Floating-Point Representation Data
and Apparatus for the Same,” Toshiba, K.K., US patent 5063530, 1991.
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 50	
	
[15] T. Kawaguchi, β€œFloating Point Addition and Subtraction Arithmetic Circuit
Performing Preprocessing of Addition or Subtraction Operation Rapidly,” NEC, US
patent 5931896, 1999.
[16] Y. Levin, β€œSupporting Demoralized Numbers in an IEEE Compliant Floating Point
Adder Optimized for Speed,” http:// hyde.eng.tau.ac.il/Projects/FPADD/index.html,
2001.
[17] A. Naini, A. Dhablania, W. James, and D. Das Sarma, β€œ1-GHz HAL SPARC64
Dual Floating Point Unit with RAS Features,” Proc. 15th IEEE Int’l Symp. Computer
Arithmetic (Arith15), pp. 173- 183, June 2001.
[18] T. Nakayama, β€œHardware Arrangement for Floating-Point Addition and
Subtraction,” NEC, US patent 5197023, 1993.
[19] K. Ng, β€œFloating-Point ALU with Parallel Paths,” US patent 5136536, Weitek Corp.,
1992.
[20] A. Nielsen, D. Matula, C.-N. Lyu, and G. Even, β€œIEEE Compliant Floating-Point
Adder that Conforms with the Pipelined Packet- Forwarding Paradigm,” IEEE Trans.
Computers, vol. 49, no. 1,pp. 33-47, Jan. 2000.
[21] S. Oberman, β€œFloating-Point Arithmetic Unit Including an Efficient Close Data
Path,” AMD, US patent 6094668, 2000.
[22] S. Oberman, H. Al-Twaijry, and M. Flynn, β€œThe SNAP Project: Design of Floating
Point Arithmetic Units,” Proc. 13th IEEE Symp.Computer Arithmetic, pp. 156-165,
1997.
[23] W.-C. Park, T.-D. Han, S.-D. Kim, and S.-B. Yang, β€œFloating Point
Adder/Subtractor Performing IEEE Rounding and Addition/Subtraction in Parallel,”
IEICE Trans. Information and Systems, vol. 4, pp. 297-305, 1996.
[24] N. Quach and M. Flynn, β€œDesign and Implementation of the SNAP Floating-Point
Adder,” Technical Report CSL-TR-91-501,Stanford Univ., Dec. 1991.
[25] N. Quach, N. Takagi, and M. Flynn, β€œOn fast IEEE Rounding,” Technical Report
CSL-TR-91-459, Stanford Univ., Jan. 1991.
[26] M. Schmookler and K. Nowka, β€œLeading Zero Anticipation and Detectionβ€”A
Comparison of Methods, Proc. 15th IEEE Symp.Computer Arithmetic, pp. 7-12, 2001.
[27] P.-M. Seidel, β€œOn The Design of IEEE Compliant Floating-Point Units and Their
Quantitative Analysis,” PhD thesis, Univ. of Saarland, Germany, Dec. 1999.
[28] P.-M. Seidel and G. Even, β€œHow Many Logic Levels Does Floating-Point Addition
Require?” Proc. 1998 Int’l Conf. Computer Design (ICCD ’98): VLSI, in Computers &
Processors, pp. 142-149, Oct. 1998.
JADAVPUR	UNIVERSITY	|	FPGA	BASED	IMPLEMENTATION	OF	DOUBLE	PRECISION	IEEE	FP-ADDER		 51	
	
[29] P.-M. Seidel and G. Even, β€œOn the Design of Fast IEEE Floating- Point Adders,”
Proc. 15th IEEE Int’l Symp. Computer Arithmetic (Arith15), pp. 184-194, June 2001.
[30] P.-M. Seidel and G. Even, β€œDelay-Optimized Implementation of IEEE Floating-
Point Addition,” http://www.engr.smu.edu/ seidel/research/fpadd/, 2002.
[31] H. Sit, D. Galbi, and A. Chan, β€œCircuit for Adding/Subtracting Two Floating-Point
Operands,” Intel, US patent 5027308, 1991.
[32] D. Stiles, β€œMethod and Apparatus for Performing Floating-Point Addition,” AMD,
US patent 5764556, 1998.
[33] I. Sutherland, B. Sproull, and D. Harris, Logical Effort: Designing Fast CMOS
Circuits. Morgan Kaufmann 1999.
[34] A. Tyagi, β€œA Reduced-Area Scheme for Carry-Select Adders,” IEEE Trans.
Computers, vol. 42, no. 10, Oct. 1993.
[35] H. Yamada, F. Murabayashi, T. Yamauchi, T. Hotta, H. Sawamoto, T. Nishiyama,
Y. Kiyoshige, and N. Ido, β€œFloating-Point Addition/ Subtraction Processing Apparatus
and Method Thereof,” Hitachi, US patent 5684729, 1997.
[36] Stefan Sjoholm, Lennart Lindh β€œVHDL for designers” Prentice Hall Europe, ISBN
0-13-473414-9.
[37] Kavin Skahill β€œVHDL for PROGRAMMABLE LOGIC”
[38] ACTEL β€œHDL CODING style guide”.

More Related Content

What's hot

EVALUATION OF OPTICALLY ILLUMINATED MOSFET CHARACTERISTICS BY TCAD SIMULATION
EVALUATION OF OPTICALLY ILLUMINATED MOSFET  CHARACTERISTICS BY TCAD SIMULATIONEVALUATION OF OPTICALLY ILLUMINATED MOSFET  CHARACTERISTICS BY TCAD SIMULATION
EVALUATION OF OPTICALLY ILLUMINATED MOSFET CHARACTERISTICS BY TCAD SIMULATION
VLSICS Design
Β 
FPGA based Efficient Interpolator design using DALUT Algorithm
FPGA based Efficient Interpolator design using DALUT AlgorithmFPGA based Efficient Interpolator design using DALUT Algorithm
FPGA based Efficient Interpolator design using DALUT Algorithm
cscpconf
Β 
IRJET- A Survey on Reconstruct Structural Design of FPGA
IRJET-  	  A Survey on Reconstruct Structural Design of FPGAIRJET-  	  A Survey on Reconstruct Structural Design of FPGA
IRJET- A Survey on Reconstruct Structural Design of FPGA
IRJET Journal
Β 
An fpga implementation of the lms adaptive filter
An fpga implementation of the lms adaptive filterAn fpga implementation of the lms adaptive filter
An fpga implementation of the lms adaptive filter
eSAT Publishing House
Β 
An fpga implementation of the lms adaptive filter
An fpga implementation of the lms adaptive filter An fpga implementation of the lms adaptive filter
An fpga implementation of the lms adaptive filter
eSAT Journals
Β 
Novel high functionality fault tolerant ALU
Novel high functionality fault tolerant ALUNovel high functionality fault tolerant ALU
Novel high functionality fault tolerant ALU
TELKOMNIKA JOURNAL
Β 
High functionality reversible arithmetic logic unit
High functionality reversible arithmetic logic unitHigh functionality reversible arithmetic logic unit
High functionality reversible arithmetic logic unit
IJECEIAES
Β 
Cg34503507
Cg34503507Cg34503507
Cg34503507
IJERA Editor
Β 
IRJET- Implementation of TPG-LFSR with Reseeding Pattern Value
IRJET-  	  Implementation of TPG-LFSR with Reseeding Pattern ValueIRJET-  	  Implementation of TPG-LFSR with Reseeding Pattern Value
IRJET- Implementation of TPG-LFSR with Reseeding Pattern Value
IRJET Journal
Β 
IRJET - High Speed Approximation Error Tolerance Adders for Image Processing ...
IRJET - High Speed Approximation Error Tolerance Adders for Image Processing ...IRJET - High Speed Approximation Error Tolerance Adders for Image Processing ...
IRJET - High Speed Approximation Error Tolerance Adders for Image Processing ...
IRJET Journal
Β 
Qualitative Analysis of Optical Interleave Division Multiple Access using Spe...
Qualitative Analysis of Optical Interleave Division Multiple Access using Spe...Qualitative Analysis of Optical Interleave Division Multiple Access using Spe...
Qualitative Analysis of Optical Interleave Division Multiple Access using Spe...
IRJET Journal
Β 
40120140504006
4012014050400640120140504006
40120140504006
IAEME Publication
Β 
Ijecet 06 08_004
Ijecet 06 08_004Ijecet 06 08_004
Ijecet 06 08_004
IAEME Publication
Β 
Ig3415411546
Ig3415411546Ig3415411546
Ig3415411546
IJERA Editor
Β 
OPTIMIZATION OF HEURISTIC ALGORITHMS FOR IMPROVING BER OF ADAPTIVE TURBO CODES
OPTIMIZATION OF HEURISTIC ALGORITHMS FOR IMPROVING BER OF ADAPTIVE TURBO CODESOPTIMIZATION OF HEURISTIC ALGORITHMS FOR IMPROVING BER OF ADAPTIVE TURBO CODES
OPTIMIZATION OF HEURISTIC ALGORITHMS FOR IMPROVING BER OF ADAPTIVE TURBO CODES
IAEME Publication
Β 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)
ijceronline
Β 
Hardware simulation for exponential blind equal throughput algorithm using sy...
Hardware simulation for exponential blind equal throughput algorithm using sy...Hardware simulation for exponential blind equal throughput algorithm using sy...
Hardware simulation for exponential blind equal throughput algorithm using sy...
IJECEIAES
Β 

What's hot (17)

EVALUATION OF OPTICALLY ILLUMINATED MOSFET CHARACTERISTICS BY TCAD SIMULATION
EVALUATION OF OPTICALLY ILLUMINATED MOSFET  CHARACTERISTICS BY TCAD SIMULATIONEVALUATION OF OPTICALLY ILLUMINATED MOSFET  CHARACTERISTICS BY TCAD SIMULATION
EVALUATION OF OPTICALLY ILLUMINATED MOSFET CHARACTERISTICS BY TCAD SIMULATION
Β 
FPGA based Efficient Interpolator design using DALUT Algorithm
FPGA based Efficient Interpolator design using DALUT AlgorithmFPGA based Efficient Interpolator design using DALUT Algorithm
FPGA based Efficient Interpolator design using DALUT Algorithm
Β 
IRJET- A Survey on Reconstruct Structural Design of FPGA
IRJET-  	  A Survey on Reconstruct Structural Design of FPGAIRJET-  	  A Survey on Reconstruct Structural Design of FPGA
IRJET- A Survey on Reconstruct Structural Design of FPGA
Β 
An fpga implementation of the lms adaptive filter
An fpga implementation of the lms adaptive filterAn fpga implementation of the lms adaptive filter
An fpga implementation of the lms adaptive filter
Β 
An fpga implementation of the lms adaptive filter
An fpga implementation of the lms adaptive filter An fpga implementation of the lms adaptive filter
An fpga implementation of the lms adaptive filter
Β 
Novel high functionality fault tolerant ALU
Novel high functionality fault tolerant ALUNovel high functionality fault tolerant ALU
Novel high functionality fault tolerant ALU
Β 
High functionality reversible arithmetic logic unit
High functionality reversible arithmetic logic unitHigh functionality reversible arithmetic logic unit
High functionality reversible arithmetic logic unit
Β 
Cg34503507
Cg34503507Cg34503507
Cg34503507
Β 
IRJET- Implementation of TPG-LFSR with Reseeding Pattern Value
IRJET-  	  Implementation of TPG-LFSR with Reseeding Pattern ValueIRJET-  	  Implementation of TPG-LFSR with Reseeding Pattern Value
IRJET- Implementation of TPG-LFSR with Reseeding Pattern Value
Β 
IRJET - High Speed Approximation Error Tolerance Adders for Image Processing ...
IRJET - High Speed Approximation Error Tolerance Adders for Image Processing ...IRJET - High Speed Approximation Error Tolerance Adders for Image Processing ...
IRJET - High Speed Approximation Error Tolerance Adders for Image Processing ...
Β 
Qualitative Analysis of Optical Interleave Division Multiple Access using Spe...
Qualitative Analysis of Optical Interleave Division Multiple Access using Spe...Qualitative Analysis of Optical Interleave Division Multiple Access using Spe...
Qualitative Analysis of Optical Interleave Division Multiple Access using Spe...
Β 
40120140504006
4012014050400640120140504006
40120140504006
Β 
Ijecet 06 08_004
Ijecet 06 08_004Ijecet 06 08_004
Ijecet 06 08_004
Β 
Ig3415411546
Ig3415411546Ig3415411546
Ig3415411546
Β 
OPTIMIZATION OF HEURISTIC ALGORITHMS FOR IMPROVING BER OF ADAPTIVE TURBO CODES
OPTIMIZATION OF HEURISTIC ALGORITHMS FOR IMPROVING BER OF ADAPTIVE TURBO CODESOPTIMIZATION OF HEURISTIC ALGORITHMS FOR IMPROVING BER OF ADAPTIVE TURBO CODES
OPTIMIZATION OF HEURISTIC ALGORITHMS FOR IMPROVING BER OF ADAPTIVE TURBO CODES
Β 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)
Β 
Hardware simulation for exponential blind equal throughput algorithm using sy...
Hardware simulation for exponential blind equal throughput algorithm using sy...Hardware simulation for exponential blind equal throughput algorithm using sy...
Hardware simulation for exponential blind equal throughput algorithm using sy...
Β 

Viewers also liked

Dk szeged 2012
Dk szeged 2012Dk szeged 2012
Dk szeged 2012
GΓ‘bor Orosz
Β 
Low Voltage Power MOSFET_Final
Low Voltage Power MOSFET_FinalLow Voltage Power MOSFET_Final
Low Voltage Power MOSFET_Final
Somsubhra Ghosh
Β 
Bipolaridad
Bipolaridad Bipolaridad
Bipolaridad
Andrea Velastegui Yanez
Β 
Neoplasm in ruminants
Neoplasm in ruminantsNeoplasm in ruminants
Neoplasm in ruminants
Prof.Dr. Gamal Karrouf
Β 
Security and ethical challenges
Security and ethical challengesSecurity and ethical challenges
Security and ethical challenges
Vishakha Joshi
Β 
Ser un buen ciudadano
Ser un buen ciudadanoSer un buen ciudadano
Ser un buen ciudadano
Ana Cepeda
Β 
Cyclone
CycloneCyclone
Cyclone
Subhrajyoti Roy
Β 
Arthrodesis
ArthrodesisArthrodesis
Arthrodesis
Prof.Dr. Gamal Karrouf
Β 
Implementation of 32 Bit Binary Floating Point Adder Using IEEE 754 Single Pr...
Implementation of 32 Bit Binary Floating Point Adder Using IEEE 754 Single Pr...Implementation of 32 Bit Binary Floating Point Adder Using IEEE 754 Single Pr...
Implementation of 32 Bit Binary Floating Point Adder Using IEEE 754 Single Pr...
iosrjce
Β 
Hernias
HerniasHernias
Importance Of Dance
Importance Of Dance Importance Of Dance
Importance Of Dance
Subhrajyoti Roy
Β 
floating point multiplier
floating point multiplierfloating point multiplier
floating point multiplier
Bipin Likhar
Β 
Cmos Arithmetic Circuits
Cmos Arithmetic CircuitsCmos Arithmetic Circuits
Cmos Arithmetic Circuits
ankitgoel
Β 
Quick tutorial on IEEE 754 FLOATING POINT representation
Quick tutorial on IEEE 754 FLOATING POINT representationQuick tutorial on IEEE 754 FLOATING POINT representation
Quick tutorial on IEEE 754 FLOATING POINT representation
Ritu Ranjan Shrivastwa
Β 
Wallace tree multiplier.pptx1
Wallace tree multiplier.pptx1Wallace tree multiplier.pptx1
Wallace tree multiplier.pptx1
vamshi krishna
Β 
Wallace tree multiplier
Wallace tree multiplierWallace tree multiplier
Wallace tree multiplier
Sudhir Kumar
Β 
ISO 14000
ISO 14000ISO 14000
ISO 14000
Vishakha Joshi
Β 
Bit Serial multiplier using Verilog
Bit Serial multiplier using VerilogBit Serial multiplier using Verilog
Bit Serial multiplier using Verilog
BhargavKatkam
Β 
Array multiplier
Array multiplierArray multiplier
Array multiplier
Mathew George
Β 

Viewers also liked (20)

Dk fordulat
Dk fordulatDk fordulat
Dk fordulat
Β 
Dk szeged 2012
Dk szeged 2012Dk szeged 2012
Dk szeged 2012
Β 
Low Voltage Power MOSFET_Final
Low Voltage Power MOSFET_FinalLow Voltage Power MOSFET_Final
Low Voltage Power MOSFET_Final
Β 
Bipolaridad
Bipolaridad Bipolaridad
Bipolaridad
Β 
Neoplasm in ruminants
Neoplasm in ruminantsNeoplasm in ruminants
Neoplasm in ruminants
Β 
Security and ethical challenges
Security and ethical challengesSecurity and ethical challenges
Security and ethical challenges
Β 
Ser un buen ciudadano
Ser un buen ciudadanoSer un buen ciudadano
Ser un buen ciudadano
Β 
Cyclone
CycloneCyclone
Cyclone
Β 
Arthrodesis
ArthrodesisArthrodesis
Arthrodesis
Β 
Implementation of 32 Bit Binary Floating Point Adder Using IEEE 754 Single Pr...
Implementation of 32 Bit Binary Floating Point Adder Using IEEE 754 Single Pr...Implementation of 32 Bit Binary Floating Point Adder Using IEEE 754 Single Pr...
Implementation of 32 Bit Binary Floating Point Adder Using IEEE 754 Single Pr...
Β 
Hernias
HerniasHernias
Hernias
Β 
Importance Of Dance
Importance Of Dance Importance Of Dance
Importance Of Dance
Β 
floating point multiplier
floating point multiplierfloating point multiplier
floating point multiplier
Β 
Cmos Arithmetic Circuits
Cmos Arithmetic CircuitsCmos Arithmetic Circuits
Cmos Arithmetic Circuits
Β 
Quick tutorial on IEEE 754 FLOATING POINT representation
Quick tutorial on IEEE 754 FLOATING POINT representationQuick tutorial on IEEE 754 FLOATING POINT representation
Quick tutorial on IEEE 754 FLOATING POINT representation
Β 
Wallace tree multiplier.pptx1
Wallace tree multiplier.pptx1Wallace tree multiplier.pptx1
Wallace tree multiplier.pptx1
Β 
Wallace tree multiplier
Wallace tree multiplierWallace tree multiplier
Wallace tree multiplier
Β 
ISO 14000
ISO 14000ISO 14000
ISO 14000
Β 
Bit Serial multiplier using Verilog
Bit Serial multiplier using VerilogBit Serial multiplier using Verilog
Bit Serial multiplier using Verilog
Β 
Array multiplier
Array multiplierArray multiplier
Array multiplier
Β 

Similar to FPGA BASED IMPLEMENTATION OF DELAY OPTIMISED DOUBLE PRECISION IEEE FLOATING-POINT ADDER

AN EFFICIENT DSP ARCHITECTURE DESIGN IN FPGA USING LOOP BACK ALGORITHM
AN EFFICIENT DSP ARCHITECTURE DESIGN IN FPGA USING LOOP BACK ALGORITHMAN EFFICIENT DSP ARCHITECTURE DESIGN IN FPGA USING LOOP BACK ALGORITHM
AN EFFICIENT DSP ARCHITECTURE DESIGN IN FPGA USING LOOP BACK ALGORITHM
IJARIDEA Journal
Β 
A study to Design and comparison of Full Adder using Various Techniques
A study to Design and comparison of Full Adder using Various TechniquesA study to Design and comparison of Full Adder using Various Techniques
A study to Design and comparison of Full Adder using Various Techniques
IOSR Journals
Β 
FPGA IMPLEMENTATION OF PRIORITYARBITER BASED ROUTER DESIGN FOR NOC SYSTEMS
FPGA IMPLEMENTATION OF PRIORITYARBITER BASED ROUTER DESIGN FOR NOC SYSTEMSFPGA IMPLEMENTATION OF PRIORITYARBITER BASED ROUTER DESIGN FOR NOC SYSTEMS
FPGA IMPLEMENTATION OF PRIORITYARBITER BASED ROUTER DESIGN FOR NOC SYSTEMS
IAEME Publication
Β 
FPGA IMPLEMENTATION OF PRIORITYARBITER BASED ROUTER DESIGN FOR NOC SYSTEMS
FPGA IMPLEMENTATION OF PRIORITYARBITER BASED ROUTER DESIGN FOR NOC SYSTEMSFPGA IMPLEMENTATION OF PRIORITYARBITER BASED ROUTER DESIGN FOR NOC SYSTEMS
FPGA IMPLEMENTATION OF PRIORITYARBITER BASED ROUTER DESIGN FOR NOC SYSTEMS
IAEME Publication
Β 
Efficient fpga mapping of pipeline sdf fft cores
Efficient fpga mapping of pipeline sdf fft coresEfficient fpga mapping of pipeline sdf fft cores
Efficient fpga mapping of pipeline sdf fft cores
Nxfee Innovation
Β 
CANONIC SIGNED DIGIT BASED DESIGN OF MULTIPLIER-LESS FIR FILTER USING SELFORG...
CANONIC SIGNED DIGIT BASED DESIGN OF MULTIPLIER-LESS FIR FILTER USING SELFORG...CANONIC SIGNED DIGIT BASED DESIGN OF MULTIPLIER-LESS FIR FILTER USING SELFORG...
CANONIC SIGNED DIGIT BASED DESIGN OF MULTIPLIER-LESS FIR FILTER USING SELFORG...
ijaia
Β 
Area Efficient and high-speed fir filter implementation using divided LUT method
Area Efficient and high-speed fir filter implementation using divided LUT methodArea Efficient and high-speed fir filter implementation using divided LUT method
Area Efficient and high-speed fir filter implementation using divided LUT method
IJMER
Β 
design of FPGA based traffic light controller system
design of FPGA based traffic light controller systemdesign of FPGA based traffic light controller system
design of FPGA based traffic light controller system
Vinny Chweety
Β 
Iaetsd pipelined parallel fft architecture through folding transformation
Iaetsd pipelined parallel fft architecture through folding transformationIaetsd pipelined parallel fft architecture through folding transformation
Iaetsd pipelined parallel fft architecture through folding transformation
Iaetsd Iaetsd
Β 
An Area Efficient Mixed Decimation MDF Architecture for Radix 22 Parallel FFT
An Area Efficient Mixed Decimation MDF Architecture for Radix 22  Parallel FFTAn Area Efficient Mixed Decimation MDF Architecture for Radix 22  Parallel FFT
An Area Efficient Mixed Decimation MDF Architecture for Radix 22 Parallel FFT
IRJET Journal
Β 
Linear Phase FIR Low Pass Filter Design Based on Firefly Algorithm
Linear Phase FIR Low Pass Filter Design Based on Firefly Algorithm Linear Phase FIR Low Pass Filter Design Based on Firefly Algorithm
Linear Phase FIR Low Pass Filter Design Based on Firefly Algorithm
IJECEIAES
Β 
Design of Multiplier Less 32 Tap FIR Filter using VHDL
Design of Multiplier Less 32 Tap FIR Filter using VHDLDesign of Multiplier Less 32 Tap FIR Filter using VHDL
Design of Multiplier Less 32 Tap FIR Filter using VHDL
IJMER
Β 
FPGA based Efficient Interpolator design using DALUT Algorithm
FPGA based Efficient Interpolator design using DALUT AlgorithmFPGA based Efficient Interpolator design using DALUT Algorithm
FPGA based Efficient Interpolator design using DALUT Algorithm
cscpconf
Β 
Modified design for Full Swing SERF and High Speed SERF
Modified design for Full Swing SERF and High Speed SERFModified design for Full Swing SERF and High Speed SERF
Modified design for Full Swing SERF and High Speed SERF
IJERA Editor
Β 
Modified design for Full Swing SERF and High Speed SERF
Modified design for Full Swing SERF and High Speed SERFModified design for Full Swing SERF and High Speed SERF
Modified design for Full Swing SERF and High Speed SERF
IJERA Editor
Β 
Review on optimized area,delay and power efficient carry select adder using n...
Review on optimized area,delay and power efficient carry select adder using n...Review on optimized area,delay and power efficient carry select adder using n...
Review on optimized area,delay and power efficient carry select adder using n...
IRJET Journal
Β 
IRJET - Design and Implementation of FFT using Compressor with XOR Gate Topology
IRJET - Design and Implementation of FFT using Compressor with XOR Gate TopologyIRJET - Design and Implementation of FFT using Compressor with XOR Gate Topology
IRJET - Design and Implementation of FFT using Compressor with XOR Gate Topology
IRJET Journal
Β 
Review On Design Of Digital FIR Filters
Review On Design Of Digital FIR FiltersReview On Design Of Digital FIR Filters
Review On Design Of Digital FIR Filters
IRJET Journal
Β 
E0364025031
E0364025031E0364025031
E0364025031
theijes
Β 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)
ijceronline
Β 

Similar to FPGA BASED IMPLEMENTATION OF DELAY OPTIMISED DOUBLE PRECISION IEEE FLOATING-POINT ADDER (20)

AN EFFICIENT DSP ARCHITECTURE DESIGN IN FPGA USING LOOP BACK ALGORITHM
AN EFFICIENT DSP ARCHITECTURE DESIGN IN FPGA USING LOOP BACK ALGORITHMAN EFFICIENT DSP ARCHITECTURE DESIGN IN FPGA USING LOOP BACK ALGORITHM
AN EFFICIENT DSP ARCHITECTURE DESIGN IN FPGA USING LOOP BACK ALGORITHM
Β 
A study to Design and comparison of Full Adder using Various Techniques
A study to Design and comparison of Full Adder using Various TechniquesA study to Design and comparison of Full Adder using Various Techniques
A study to Design and comparison of Full Adder using Various Techniques
Β 
FPGA IMPLEMENTATION OF PRIORITYARBITER BASED ROUTER DESIGN FOR NOC SYSTEMS
FPGA IMPLEMENTATION OF PRIORITYARBITER BASED ROUTER DESIGN FOR NOC SYSTEMSFPGA IMPLEMENTATION OF PRIORITYARBITER BASED ROUTER DESIGN FOR NOC SYSTEMS
FPGA IMPLEMENTATION OF PRIORITYARBITER BASED ROUTER DESIGN FOR NOC SYSTEMS
Β 
FPGA IMPLEMENTATION OF PRIORITYARBITER BASED ROUTER DESIGN FOR NOC SYSTEMS
FPGA IMPLEMENTATION OF PRIORITYARBITER BASED ROUTER DESIGN FOR NOC SYSTEMSFPGA IMPLEMENTATION OF PRIORITYARBITER BASED ROUTER DESIGN FOR NOC SYSTEMS
FPGA IMPLEMENTATION OF PRIORITYARBITER BASED ROUTER DESIGN FOR NOC SYSTEMS
Β 
Efficient fpga mapping of pipeline sdf fft cores
Efficient fpga mapping of pipeline sdf fft coresEfficient fpga mapping of pipeline sdf fft cores
Efficient fpga mapping of pipeline sdf fft cores
Β 
CANONIC SIGNED DIGIT BASED DESIGN OF MULTIPLIER-LESS FIR FILTER USING SELFORG...
CANONIC SIGNED DIGIT BASED DESIGN OF MULTIPLIER-LESS FIR FILTER USING SELFORG...CANONIC SIGNED DIGIT BASED DESIGN OF MULTIPLIER-LESS FIR FILTER USING SELFORG...
CANONIC SIGNED DIGIT BASED DESIGN OF MULTIPLIER-LESS FIR FILTER USING SELFORG...
Β 
Area Efficient and high-speed fir filter implementation using divided LUT method
Area Efficient and high-speed fir filter implementation using divided LUT methodArea Efficient and high-speed fir filter implementation using divided LUT method
Area Efficient and high-speed fir filter implementation using divided LUT method
Β 
design of FPGA based traffic light controller system
design of FPGA based traffic light controller systemdesign of FPGA based traffic light controller system
design of FPGA based traffic light controller system
Β 
Iaetsd pipelined parallel fft architecture through folding transformation
Iaetsd pipelined parallel fft architecture through folding transformationIaetsd pipelined parallel fft architecture through folding transformation
Iaetsd pipelined parallel fft architecture through folding transformation
Β 
An Area Efficient Mixed Decimation MDF Architecture for Radix 22 Parallel FFT
An Area Efficient Mixed Decimation MDF Architecture for Radix 22  Parallel FFTAn Area Efficient Mixed Decimation MDF Architecture for Radix 22  Parallel FFT
An Area Efficient Mixed Decimation MDF Architecture for Radix 22 Parallel FFT
Β 
Linear Phase FIR Low Pass Filter Design Based on Firefly Algorithm
Linear Phase FIR Low Pass Filter Design Based on Firefly Algorithm Linear Phase FIR Low Pass Filter Design Based on Firefly Algorithm
Linear Phase FIR Low Pass Filter Design Based on Firefly Algorithm
Β 
Design of Multiplier Less 32 Tap FIR Filter using VHDL
Design of Multiplier Less 32 Tap FIR Filter using VHDLDesign of Multiplier Less 32 Tap FIR Filter using VHDL
Design of Multiplier Less 32 Tap FIR Filter using VHDL
Β 
FPGA based Efficient Interpolator design using DALUT Algorithm
FPGA based Efficient Interpolator design using DALUT AlgorithmFPGA based Efficient Interpolator design using DALUT Algorithm
FPGA based Efficient Interpolator design using DALUT Algorithm
Β 
Modified design for Full Swing SERF and High Speed SERF
Modified design for Full Swing SERF and High Speed SERFModified design for Full Swing SERF and High Speed SERF
Modified design for Full Swing SERF and High Speed SERF
Β 
Modified design for Full Swing SERF and High Speed SERF
Modified design for Full Swing SERF and High Speed SERFModified design for Full Swing SERF and High Speed SERF
Modified design for Full Swing SERF and High Speed SERF
Β 
Review on optimized area,delay and power efficient carry select adder using n...
Review on optimized area,delay and power efficient carry select adder using n...Review on optimized area,delay and power efficient carry select adder using n...
Review on optimized area,delay and power efficient carry select adder using n...
Β 
IRJET - Design and Implementation of FFT using Compressor with XOR Gate Topology
IRJET - Design and Implementation of FFT using Compressor with XOR Gate TopologyIRJET - Design and Implementation of FFT using Compressor with XOR Gate Topology
IRJET - Design and Implementation of FFT using Compressor with XOR Gate Topology
Β 
Review On Design Of Digital FIR Filters
Review On Design Of Digital FIR FiltersReview On Design Of Digital FIR Filters
Review On Design Of Digital FIR Filters
Β 
E0364025031
E0364025031E0364025031
E0364025031
Β 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)
Β 

FPGA BASED IMPLEMENTATION OF DELAY OPTIMISED DOUBLE PRECISION IEEE FLOATING-POINT ADDER

  • 1. FPGA BASED IMPLEMENTATION OF DELAY OPTIMISED DOUBLE PRECISION IEEE FLOATING-POINT ADDER Dissertation Submitted in partial fulfillment of the requirement for the degree of Bachelor of Engineering In Electrical Engineering SESSION 2009-2013 Submitted by Somsubhra Ghosh BE (Final year) Under the Supervision of Somnath Ghosh, Scientist, ADRIN, Dept of Space/ISRO, Govt. of INDIA DEPT. OF ELECTRICAL ENGINEERING JADAVPUR UNIVERSITY KOLKATA-700032 INDIA June, 2012 Roll No- 000910801089 Reg. No- 107973 of 2009-10
  • 4. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 3 CONTENTS SECTION PAGE NO Preface 4 Acknowledgements 5 Abstract 6 Section-I Introduction and summary 7 Section-II Notation 10 Section-III NaΓ―ve FP-Adder Algorithm 11 Section-IV Optimization techniques 13 4.1 Separation of FP-Adder in parallel paths 13 4.2 Unification of significant result ranges 14 4.3 Reduction of IEEE rounding modes 15 4.4 Sign magnitude computation of a difference 16 4.5 Compound addition 17 4.6 Approximate counting of leading zeros 18 4.7 Precomputation of post normalization shift 20 Section- V Overview of present algorithm 21 Section-VI Specification and detailed description 26 Section VII Delay analysis reporting 32 Section-VIII Implementation 33 Simulation screen shots 42 Section-VIII Future Scope 48 Bibliography 49
  • 5. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 4 Preface β€” This dissertation is a part of Bachelor of Electrical Engineering curriculum required in partial fulfillment for the degree of B.E. Being a student of Electrical Engineering I opted to take up a topic related to my preferred specialization. A prerequisite about optimal Floating Point addition is essential and important for ensuring the faster and optimal management of resources demanded by the complex circuit models employing complex arithmetic operations. Addition and subtraction are the basis of the arithmetic operations. So I was fascinated by the topic greatly and started my work. Thereafter I studied the basic algorithmic steps and started to build the basic building blocks using VHDL. Afterwards I integrated the blocks as per the requirement and used the debugging tool for help. At the end circuit has been simulated and verified using standard test benches.
  • 6. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 5 ACKNOWLEDGEMENT I sincerely acknowledge to my mentor Mr. Somnath Ghosh, Scientist, ADRIN. To D. Mallikharjuna. Rao, Section Head, ES&DS, Advanced Data Processing And Research Institute (ADRIN) for providing laboratory facility and other facilities. I am grateful to them for introducing me with exposure in research and development area. I am thankful to Director ADRIN and Dr. Venkatraman (GD), for allowing to carry on my project work here. My mentor Mr. Somnath Ghosh has been a great source of encouragement throughout the project work. His appropriate guidance, critical appraisal, scholarly approach has helped me to reach my target. It is a great pleasure to express my deep and heartiest gratitude to Sri Shashank Adimulyam, Scientist, ADRIN, for his guidance and helpful discussion. His valuable suggestions, ideas, and encouragement given in the course of this work were unforgettable. This is a reference implementation of β€œDELAY OPTIMIZED IMPLEMENTATION OF IEEE FLOATING POINT ADDITION” appeared on IEEE trans. On computers vol. 53, NO.2, February 2004. Somsubhra Ghosh
  • 7. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 6 Abstract β€” Hereby is presented an implementation of an IEEE double precision floating-point adder (FP-adder) design mentioned in the IEEE publication β€œDELAY OPTIMIZED IMPLEMENTATION OF IEEE FLOATING POINT ADDITION” authored by P.M. Seidel and G. Even. The adder accepts normalized numbers, supports IEEE rounding mode, and outputs the correctly normalized rounded sum/difference in the format required by the IEEE Standard. The FP-adder design achieves a low latency by combining various optimization techniques such as: A nonstandard separation into two paths, a simple rounding algorithm, unification of rounding cases for addition and subtraction, sign- magnitude computation of a difference based on one’s complement subtraction, compound adders. A technology-independent analysis and optimization of this implementation based on the Logical Effort hardware model is done and optimal gate sizes and optimal buffer insertion has been determined. The estimated delay of this optimized design at 30.6 FO4 delays for double precision operands (15.3 FO4 delays per stage between latches). It has been concluded that this algorithm has shorter latency (-13 percent) and cycle time (-22 percent) compared to the next fastest algorithm. Index Terms β€” Floating-point addition, IEEE rounding, delay optimization, dual path algorithm, logical effort, optimized gate sizing, buffer insertion.
  • 8. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 7 SECTION I. INTRODUCTION AND SUMMERY FLOATING-POINT addition and subtraction are the most frequent floating-point operations. Both operations use a floating-point adder (FP-adder). Therefore, a lot of effort has been spent on reducing the latency of FP-adders (see [2], [9], [20], [22], [23], [25], [27], [28], and the references that appear there). Many patents deal with FP- adder design (ref. [6], [10], [11], [14], [15], [19], [21], [31], [32], [35]). In this dissertation an FP-adder design is implemented that accepts normalized double precision significands, supports IEEE rounding modes, and outputs the normalized sum/difference that is rounded according to the IEEE FP standard 754 [13]. The latency of this design is analyzed using the Logical Effort Model [33]. This model allows for technology-independent delay analysis of CMOS circuits. The model enables rigorous delay analysis that takes into account fanouts, drivers, and gate-sizing. Following Horowitz [12], the delay of an inverter is used, the fanout of which equals 4, as a technology-independent unit of delay. An inverter with fanout 4 is denoted by FO4. The analysis using the Logical Effort Model shows that the delay of this FP-adder design is 30:6 FO4 delays. This design is partitioned into two pipeline stages, the delay of which is bounded by 15:3 FO4 delays. Extensions of the algorithm that deal with denormal inputs and outputs are discussed in [16], [27]. It is shown there that the delay overhead for supporting denormal numbers can be reduced to 1-2 logic levels (i.e., XOR delays). Several optimization techniques are employed in this algorithm. A detailed examination of these techniques combined, enables implementation of an overall fast FP-adder design. In particular, effective reduction of latency by parallel paths requires balancing the delay of the paths. Such a balance is achieved by a gate- level consideration of the design. The optimization techniques, that has been used are included in the following - 1. A two path design with a nonstandard separation criterion. Instead of separation based on the magnitude of the exponent difference [10], A separation criterion is defined that also considers whether the operation is effective subtraction and the value of the significand difference. This separation criterion maintains the advantages of the standard two-path designs, namely, alignment shift and normalization shift take place only in one of the paths and the full exponent difference is computed only in one path. In addition, this separation technique requires rounding to take place only in one path.
  • 9. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 8 2. Reduction of IEEE rounding to three modes [25] and use of injection based rounding [8]. 3. A simpler design is obtained by using unconditional preshifts for effective subtractions to reduce to 2 the number of binades that the significands’ sum and difference may belong to. 4. The sign-magnitude representation of the difference of the exponents and the significands is derived from one’s complement representation of the difference. 5. A parallel-prefix adder is used to compute the sum and the incremented sum of the significands [34]. 6. Recordings are used to estimate the number of leading zeros in the nonredundant representation of a number represented as a borrow-save number [20]. 7. Postnormalization is advanced and takes place before the rounding decision is ready. Form an overview of FP-adder algorithms from technical papers and patents, the optimization techniques that are used in each of these designs are summarized. The algorithms from two particular implementations are also analyzed from literature in some more detail [11], [21]. To allow for a β€œfair” comparison, the functionality of these designs are adopted to match the functionality of the present design. This design uses simpler rounding circuitry and is more amenable to partitioning into two pipeline stages of equal latency. The latency of [21] is also analyzed using the Logical Effort Model. The analysis for this algorithm gives the following results: a) The total delay is 35:2 FO4 delays (i.e., 13 percent slower than this design) and the delay of the slowest pipeline stage is 19:7 FO4 delays (i.e., 22 percent slower). b) This paper focuses on double precision FP-adder implementations. Many FP-adders support multiple precisions (e.g., x86 architectures support single, double, and extended double precision). In [27], it is shown that by aligning the rounding position (i.e., 23 positions to the right of the binary point in single precision and 52 positions to the right of the binary point in double precision) of the significands before they are input to the design and post aligning the outcome of the FP-adder, it is possible to use the FP-adder presented in the mentioned paper for multiple precisions. Hence, the FP-addition algorithm presented in this paper can be used to support multiple precisions. SECTION II. NOTATION Values and their representation:
  • 10. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 9 Binary strings are denoted by upper case letters (e.g., S, E, F). The value represented by a binary string is represented in italics (e.g., s; e; f) IEEE FP-numbers. Here normalized IEEE FPnumbers are considered. In double precision, IEEE FP-numbers are represented by three fields (S, E [10:0], F [0:52]) with sign bit S {0 : 1}, exponent string E[10:0] ∈ {0,1}11 and significand string F[0:52] ∈ {0,1}53 . The values of exponent and significand are defined by: 𝑒 = 𝐸[𝑖] βˆ™ 2+,- +.- βˆ’ 1023, 𝑓 = 𝐹[𝑖] βˆ™ 25+,- +.- . Since only normalized FPnumbers are considered, it can be safely concluded that f ∈ [1,2). The value represented by a FP-number (S, E[10:0], F[0:52]) is: 𝑓𝑝9:; 𝑆, 𝐸, 𝐹 = (βˆ’1)> βˆ™ 2? βˆ™ 𝑓. Factorings: Given an IEEE FP-number (S,E,F) it is referred to the triple (s,e,f) as the factoring of the FP-number. Note that s = S since S is a single bit. The advantage of using factorings is the ability to ignore representation details and focus on values. Inputs: The inputs of a FP-addition/subtraction are: Operands denoted by (SA,EA[10:0],FA[0:52]) and (SB,EB[10:0],FB[0:52] ). Operation denoted by SOP ∈ {0,1} to indicate addition/subtraction. IEEE rounding mode. Output: The output is a FP-number (S,E[10:0],F[0:52] ). The value represented by the output equals the IEEE rounded value of π‘“π‘π‘ π‘’π‘š = 𝑓𝑝9:; 𝑆𝐴, 𝐸𝐴 10: 0 , 𝐹𝐴 0: 52 + (βˆ’1)>JK βˆ™ 𝑓𝑝9:; 𝑆𝐡, 𝐸𝐡 10: 0 , 𝐹𝐡 0: 52 .
  • 11. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 10 SECTION III. NAIVE FP-ADDER ALGORITHM In this section, the β€œvanilla” FP-addition algorithm has been discussed. To simplify notation, representation has been neglected and only the values of the inputs, outputs, and intermediate results are taken into account. Throughout the paper, the notation defined for the naive algorithm has been used. Let π‘ π‘Ž, π‘’π‘Ž, π‘“π‘Ž π‘Žπ‘›π‘‘ (𝑠𝑏, 𝑒𝑏, 𝑓𝑏) denote the factorings of the operands with a sign-bit, an exponent, and a significand, and let SOP indicate whether the operation is an addition or a subtraction. The requested computation is the IEEE FP representation of the rounded sum: π‘Ÿπ‘›π‘‘ π‘ π‘’π‘š = π‘Ÿπ‘›π‘‘ βˆ’1 >: βˆ™ 2?: βˆ™ π‘“π‘Ž + βˆ’1 >JKR>S βˆ™ 2?S βˆ™ 𝑓𝑏 . Let 𝑆. 𝐸𝐹𝐹 = π‘ π‘Ž 𝑠𝑏 π‘ π‘œπ‘ . The case that S.EFF = 0 is called effective addition and the case that S.EFF = 1 is called effective subtraction. The exponent difference is defined as 𝛿 = π‘’π‘Ž βˆ’ 𝑒𝑏 . The β€œlarge” operand, (sl,el,fl), and the β€œsmall” operand, (ss,es,fs), are defined as follows: 𝑠𝑙, 𝑒𝑙, 𝑓𝑙 = π‘ π‘Ž, π‘’π‘Ž, π‘“π‘Ž 𝑖𝑓 𝛿 β‰₯ 0 (𝑆𝑂𝑃 𝑠𝑏, 𝑒𝑏 , 𝑓𝑏) π‘œπ‘‘β„Žπ‘’π‘Ÿπ‘€π‘–π‘ π‘’ 𝑠𝑠, 𝑒𝑠, 𝑓𝑠 = (𝑆𝑂𝑃 𝑠𝑏, 𝑒𝑏, 𝑓𝑏) 𝑖𝑓 𝛿 β‰₯ 0 ( π‘ π‘Ž, π‘’π‘Ž, 𝑓𝑏) π‘œπ‘‘β„Žπ‘’π‘Ÿπ‘€π‘–π‘ π‘’ The sum can be written as: π‘ π‘’π‘š = (βˆ’1)>; βˆ™ 2?; (𝑓𝑙 + βˆ’1 ]^__ 𝑓𝑠 β‹… 25a ). To simplify the description of the datapaths, the computation of the result’s significand has been given greater attention, which is assumed to be normalized (i.e., in the range ∈ [1,2) after rounding). The significand sum is defined by π‘“π‘ π‘’π‘š = (βˆ’1)].bcc (𝑓𝑠 βˆ™ 25a ). The significand sum is computed, normalized, and rounded as follows : 1. Exponent subtraction: 𝛿 = π‘’π‘Ž βˆ’ 𝑒𝑏 2. Operand swapping: compute 𝑠𝑙, 𝑒𝑙, 𝑓𝑙, and 𝑓𝑠, 3. Limitation of the alignment shift amount: 𝛿;+d = min 𝛼, π‘Žπ‘π‘  𝛿 , where 𝛼 is a constant greater than or equal to 55, 4. Alignment shift of 𝑓𝑠: π‘“π‘ π‘Ž = 𝑓𝑠 β‹… 25a_;+d ,
  • 12. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 11 5. Significand negation: π‘“π‘ π‘Žπ‘› = (βˆ’1)].bcc βˆ™ π‘“π‘ π‘Ž ; 6. Significand addition: π‘“π‘ π‘Žπ‘› = 𝑓𝑙 + π‘“π‘ π‘Žπ‘› ; 7. Conversion: π‘Žπ‘π‘ j>kd = π‘Žπ‘π‘ (π‘“π‘ π‘’π‘š) (the sign S of the result is determined by 𝑆 = 𝑠𝑙 (π‘“π‘ π‘’π‘š < 0), 8. Normalization shift: 𝑛j>kd = π‘›π‘œπ‘Ÿπ‘š (π‘Žπ‘π‘ _π‘“π‘ π‘’π‘š), and 9. Rounding and Postnormalization of 𝑛_π‘“π‘ π‘’π‘š. The naive FP-adder implements the nine steps from above sequentially, where the delay of Steps 4, 6, 7, 8, 9 is logarithmic in the significand’s length. Therefore, this is a slow FP-adder implementation.
  • 13. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 12 SECTION IV. OPTIMIZATION TECHNIQUES Optimization techniques that were employed in this FP-adder design are discussed in this section 4.1 Separation of FP-Adder into Two Parallel Paths The FP-adder pipeline is separated into two parallel paths that work under different assumptions. The partitioning into two parallel paths enables one to optimize each path separately by simplifying and skipping some steps of the naive addition algorithm (Section 3). Such a dual path approach for FP-addition was first described by Farmwald [9]. Since Farmwald’s dual path FP-addition algorithm, the common criterion for partitioning the computation into two paths has been the exponent difference. The exponent difference criterion is defined as follows: The near path is defined for small exponent differences (i.e., βˆ’1,0, +1), and the far path is defined for the remaining cases. Here a different partitioning criterion has been used for partitioning the algorithm into two paths: namely the N-path for the computation of all effective subtractions with small significand sums π‘“π‘ π‘’π‘š ∈ (βˆ’1,1) and small exponent differences 𝛿 ≀ 1, and the R-path for all the remaining cases. The path selection signal IS_R as follows : 𝐼𝑆_ 𝑅 ⟺ 𝑆. 𝐸𝐹𝐹 𝑂𝑅 𝛿 β‰₯ 2 𝑂𝑅 π‘“π‘ π‘’π‘š ∈ [1,2). (1) The outcome of the R-path is selected for the final result if 𝐼𝑆_ 𝑅 = 1, otherwise the outcome of the N-path is selected. This partitioning has the following advantages: a) In the R-path, the normalization shift is limited to a shift by one position (in Section 4.2, it is shown how the normalization shift may be restricted to one direction). b) Moreover, the addition or subtraction of the significands in the R-path always results with a positive significand and, therefore, the conversion step can be skipped. In the N-path, the alignment shift is limited to a shift by one position to the right. Under the assumptions of the N-path, the exponent difference is in the range{βˆ’1,0,1}. Therefore, a 2-bit subtraction suffices for extracting the exponent difference. Moreover, in the N-path, the significand difference can be exactly represented with 53 bits, hence, no rounding is required. Note that the N-path applies only to effective
  • 14. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 13 subtractions in which the significand difference π‘“π‘ π‘’π‘š is less than 1. Thus, in the N- path it is assumed that π‘“π‘ π‘’π‘š ∈ (βˆ’1,1). The advantages of this partitioning criterion compared to the exponent difference criterion stem from the following two observations: a) A conventional implementation of a far path can be used to implement also the R-path b) The N-path is simpler than the near path since no rounding is required and the N-path applies only to effective subtractions. Hence, the N-path is simpler and faster than the near path presented in [28]. 4.2 Unification of Significand Result Ranges In the R-path, the range of the resulting significand is different in effective addition and effective subtraction. Using the notation of Section 3, in effective addition, 𝑓𝑙 ∈ [1,2) and π‘“π‘ π‘Žπ‘› ∈ [0,2). Therefore, π‘“π‘ π‘’π‘š ∈ [1,4). It follows from the definition of the path selection condition that, in effective subtractions π‘“π‘ π‘’π‘š ∈ ( , r , 2) in the R-path. The ranges of π‘“π‘ π‘’π‘š are unified in these two cases to [1, 4) by multiplying the significands by 2 in the case of effective subtraction (i.e., preshifting by one position to the left). The unification of the range of the significand sum in effective subtraction and effective addition simplifies the rounding circuitry. To simplify the notation and the implementation of the path selection condition, the operands are also preshifted for effective subtractions in the N-path. It has to be noted that, in this way, the preshift is computed in the N-path unconditionally, because in the N-path all operations are effective subtractions. In the following, a few examples of values that include the conditional preshift are given (it may be noted that an additional β€œp” is included in the names of the preshifted versions) - 𝑓𝑙𝑝 = 2 βˆ™ 𝑓𝑙 𝑖𝑓 𝑆. 𝐸𝐹𝐹 𝑓𝑙 π‘œπ‘‘β„Žπ‘’π‘Ÿπ‘€π‘–π‘ π‘’ π‘“π‘ π‘π‘Žπ‘› = 2 βˆ™ π‘“π‘ π‘Žπ‘› 𝑖𝑓 𝑆. 𝐸𝐹𝐹 π‘“π‘ π‘Žπ‘› π‘œπ‘‘β„Žπ‘’π‘Ÿπ‘€π‘–π‘ π‘’ π‘“π‘π‘ π‘’π‘š = 2 βˆ™ π‘“π‘ π‘’π‘š 𝑖𝑓 𝑆. 𝐸𝐹𝐹 π‘“π‘’π‘š π‘œπ‘‘β„Žπ‘’π‘Ÿπ‘€π‘–π‘ π‘’
  • 15. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 14 Based on the significand sum π‘“π‘π‘ π‘’π‘š, which includes the conditional preshift, the path selection condition (1) can be rewritten as 𝐼𝑆_ 𝑅 ⟺ 𝑆. 𝐸𝐹𝐹 𝑂𝑅 𝛿 β‰₯ 2 𝑂𝑅 π‘“π‘π‘ π‘’π‘š ∈ [2,4) 4.3 Reduction of IEEE Rounding Modes The IEEE-754-1985 Standard defines four rounding modes - Round toward 0, round toward ∞, round toward βˆ’βˆž, and round to nearest (even) [13]. Following Quach et al. [25], the four IEEE rounding modes are reduced to three rounding modes: Round-to- zero RZ, round-to-infinity RI, and round-to-nearest-up RNU. The discrepancy between round-to-nearest-even and RNU is fixed by pulling down the LSB of the fraction (see [8] for more details). In the rounding implementation in the R-path, the three rounding modes RZ, RNU, and RI are further reduced to truncation using injection based rounding [8]. The reduction is based on adding an injection that depends only on the rounding mode. Let X = X0,X1,X2……XK denote the binary representation of a significand with the value π‘₯ = 𝑋 ∈ [1,2) for which 𝐾 β‰₯ 53 (double precision rounding is trivial for k < 53), then the injection is defined by: inj = 0 if RZ 25z{ if RNU 25z{ βˆ’ 25~ if RI For double precision and π‘šπ‘œπ‘‘π‘’ ∈ {𝑅𝑍, π‘…π‘π‘ˆ, 𝑅𝐼} , the effect of adding 𝑖𝑛𝑗 is summarized by: 𝑋 ∈ 1,2 ⟹ π‘Ÿπ‘›π‘‘dJ…? 𝑋 = π‘Ÿπ‘›π‘‘β€ β€‘( 𝑋 = 𝑖𝑛𝑗). (3) 4.4 Sign-Magnitude Computation of a Difference In this technique, the sign-magnitude computation of a difference is computed using one’s complement representation [22]. This technique is applied in two situations: 1. Exponent difference: The sign-magnitude representation of the exponent difference is used for two purposes: a) The sign determines which operand is selected as the ”large” operand and b) The magnitude determines the amount of the alignment shift.
  • 16. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 15 2. Significand difference: In case the exponent difference is zero and an effective subtraction takes place, the significand difference might be negative. The sign of the significand difference is used to update the sign of the result and the magnitude is normalized to become the result’s significand. Let A and B denote binary strings and let 𝐴 denote the value represented by A (i.e. 𝐴 = 𝐴+ βˆ™ 2+ + ). The technique is based on the following observation: π‘Žπ‘π‘ ( 𝐴 βˆ’ 𝐡 ) = 𝐴 + 𝐡 + 1, 𝑖𝑓 𝐴 βˆ’ 𝐡 > 0 𝐴 + 𝐡 𝑖𝑓 𝐴 βˆ’ 𝐡 ≀ 0 The actual computation proceeds as follows: The binary string D is computed such that 𝐷 = 𝐴 + 𝐡 . D is referred to as the one’s complement lazy difference of A and B. Two cases are considered to explain the scenario: If the difference is positive, then 𝐷 is off by an ULP and is needed to be incremented. However, to save delay, the increment is avoided as follows: a) In the case of the exponent difference that determines the alignment shift amount, the significands are preshifted by one position to compensate for the error. b) In the case of the significand difference, the missing ULP is provided by computing the incremented sum of 𝐴 π‘Žπ‘›π‘‘ 𝐡 using a compound adder. On the otherhand, if the exponent difference is negative, then the bits of D are negated to obtain an exact representation of the magnitude of the difference. 4.5 Compound Addition The technique of computing in parallel the sum of the significands as well as the incremented sum is well known. The rounding decision controls which of the sums is selected for the final result, thus enabling the computation of the sum and the rounding decision in parallel technique. The technique suggested by Tyagi [34] for implementing a compound adder is followed here. This technique is based on a parallel prefix adder in which the carry-generate and carry-propagate strings, denoted by Gen_C and Prop_C, are computed [3]. Let Gen_C[i] equal the carry bit that is fed to position i. The bits of the sum S of the addends A and B are obtained as usual by: 𝑆 𝑖 = 𝑋𝑂𝑅(𝐴 𝑖 , 𝐡 𝑖 , 𝐺𝑒𝑛_𝑐[𝑖]) The bits of the incremented sum 𝑆𝐼 are obtained by: 𝑆𝐼 𝑖 = 𝑋𝑂𝑅(𝐴 𝑖 , 𝐡 𝑖 , 𝑂𝑅(𝐺𝑒𝑛Œ + , π‘ƒπ‘Ÿπ‘œπ‘_𝑐[𝑖]))
  • 17. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 16 Applications - There are two instances of a compound adder in this FP-addition algorithm. One instance appears in the second pipeline stage of the R-path where the delay analysis relies on the assumption that the MSB of the sum is valid one logic level prior to the slowest sum bit. The second instance of a compound adder appears in the N-path. In this case, the problem that the compound adder does not β€œfit” in the first pipeline stage according to the delay analysis, has been properly addressed. This critical path is broken by partitioning the compound adder between the first and second pipeline stages as following: A parallel prefix adder placed in the first pipeline stage computes the carry-generate and carry-propagate signals as well as the bitwise XOR of the addends. From these three binary strings, the sum and incremented sum are computed within two logic levels as described above. However, these two logic levels must belong to different pipeline stages. Therefore, first, the three binary strings 𝑆 𝑖 , 𝑃 𝑖 = 𝐴 𝑖 𝑋𝑂𝑅 𝐡 𝑖 , π‘Žπ‘›π‘‘ 𝐺𝑃_𝐢[𝑖] = 𝑂𝑅(𝐺𝑒𝑛_𝐢[𝑖], π‘ƒπ‘Ÿπ‘œπ‘_𝐢[𝑖]) , which are passed to the second pipeline stage. In this way, the computation of the sum is already completed in the first pipeline stage and only an XOR-line is required in the second pipeline stage to compute also the incremented sum. 4.6 Approximate Counting of Leading Zeros In the N-path, a resulting significand in the range (βˆ’1,1) must be normalized. The amount of the normalization shift is determined by approximating the number of leading zeros. Following Nielsen et al. [20], the number of leading zeros are approximated so that a normalization shift by this amount yields a significand in the range [1, 4). The final normalization is then performed by post-normalization. The input used for counting leading zeros in this design is a borrow-save representation of the difference. This design is amenable to partitioning into pipeline stages, and admits an elegant correctness proof that avoids a tedious case analysis. Nielsen et. al. [20] presented the following technique for approximately counting the number of leading zeros. The method is based on a constant delay reduction of a borrowsave encoded string to a binary vector. The number of leading zeros in the binary vector almost equals the number of leading zeros in the binary representation of the absolute value
  • 18. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 17 of the number represented by the borrow-save encoded string. This reduction is described below. The input consists of a borrow-save encoded digit string 𝐹 βˆ’1: 52 ∈ βˆ’1,0,1 54 . The borrow-save encoded string is computed 𝐹′ βˆ’2: 52 = 𝑃(𝑁(𝐹[βˆ’1: 52])), where P() and N() denote P-recoding and N-recoding [5], [20]. (P- recoding is like a β€œsigned half-adder” in which the carry output has a positive sign, N- recoding is similar but has an output carry with a negative sign). The correctness of the technique is based on the following claim. Claim 1. [20] - Suppose the borrow-save encoded string 𝐹′ βˆ’2: 52 is of the form 𝐹′ βˆ’2: 52 = 0Ε½ βˆ™ 𝜎 βˆ™ 𝑑[1: 54 βˆ’ π‘˜], where β‹… denotes concatenation of strings, 0k denotes a block of k zeros,𝜎 ∈ (βˆ’1,1) , and 𝑑 ∈ (βˆ’1,0,1)zβ€˜5Ε½ . Then, the following holds: 1) If 𝜎 = 1, then the value represented by the borrowsave encoded string 𝜎 βˆ™ t satisfies: 𝜎 + 𝑑[𝑖] βˆ™ 25+zβ€˜5Ε½ +., ∈ ( , β€˜ , 1) 2) If 𝜎 = -1, then the value represented by the borrowsave encoded string 𝜎 βˆ™t satisfies: 𝜎 + 𝑑[𝑖] βˆ™ 25+zβ€˜5Ε½ +., ∈ βˆ’ { r , βˆ’ , r The implication of Claim 1 is that after PN-recoding, the number of leading zeros in the borrow-save encoded string 𝐹′ [βˆ’2: 53] (denoted by k in the claim) can be used as the normalization shift amount to bring the normalized result into one of two binades (i.e., in the positive case either ( , β€˜ , , r ) or [ , r , 1), and in the negative case after negation either ( , r , 1) or [1, { r ). Since we are only interested in the number of leading zeros of 𝐹′ [βˆ’2: 53], this borrowsave encoded string may be reduced to a binary string by applying bitwise- XOR. This technique is implemented so that the normalized significand is in the range [1,4) as follows: 1. In the positive case, the shift amount is 𝑙𝑧2 = π‘˜ = π‘™β€œ?”J(𝐹′ [βˆ’2: 52]). (Signal LZP2[5:0] in Fig. 3). 2. In the negative case, the shift amount is 𝑙𝑧1 = π‘˜ βˆ’ 1 = π‘™β€œ?”J(𝐹′ [βˆ’1: 52]). (Signal LZP1[5:0] in Fig. 3). The reduction of the borrow-save encoded string to the binary vector involves a P- recoding, an N-recoding, and bitwise XOR. Hence, the cost per bit is two half-adders and a XOR-gate (i.e., 3 XOR-gates and two AND-gates). The delay of this reduction is 3 XOR-gates. In a recent survey of leading zeroes anticipation and detection by
  • 19. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 18 Schmookler and Nowka [26], a method that is faster by roughly one logic level is presented. The steps of this method are as follows: a) Transform the negative vector of the borrow-save encoded string into a positive vector by using 2’s complement negation (the increment can be avoided by padding a carry- save 2 digit from the right). Hence, the problem of computing the normalization shift amount is reduced to counting the number of leading zeros (ones) of the binary representation of the sum in case the sum is positive (negative). b) The position of the leftmost 1 in the binary representation of the sum (when this sum is positive) is computed as follows: Reduce the carry-save encoded string into a binary string as follows: 𝛼+ = 𝑋𝑂𝑅(𝑝+, π‘˜β€’R,), where 𝑝+ is the 𝑖th β€œpropagate”- bit (namely, 𝑝+ is the XOR of the bits in position i of the carry-save encoded string) and π‘˜+ is the ith β€œkill”- bit (namely, the NOR of the above bits). The position of the leftmost one in the vector 𝛼+ is either the position of the leading digit or one position to the right of the leading digit. In case the sum is negative, the position of the leftmost zero is computed by the following reduction: 𝛽+ = 𝑋𝑂𝑅(𝑝+, 𝐺‒R,), where 𝐺+ is the ith β€œgenerate”- bit (namely, the AND of the bits in position i of the carry-save encoded string). Each of the binary vectors 𝛼 and 𝛽 are fed to a priority encoder. The cost of this method is five gates per bit (i.e., 3 XOR-gates, one NOR-gate, and one AND-gate). The delay is two logic levels (two XOR-gates). A further improvement described in [26] is to the following reductions: 𝛼+ β€² = 𝐴𝑁𝐷(𝑃‒, π‘˜β€’R,). This improvement reduces the cost per bit to three AND-gates, one XOR-gate, and one NOR-gate. The delay is reduced to one XOR-gate plus an AND-gate. 4.7 Precomputation of Postnormalization Shift In the R-path, two choices for the rounded significand sum are computed by the compound adder (see Section 4.5). Either the β€œsum” or the β€œincremented sum” output of the compound adder is chosen for the rounded result. Because the significand after the rounding selection is in the range [1, 4) (due to the preshifts from Section 4.2 only these two binades have to be considered for rounding and for the postnormalization shift), postnormalization requires at most a right-shift by one bit position. Because the outputs of the compound adder have to wait for the computation of the rounding decision (selection based on the range of the sum output), so the postnormalization
  • 20. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 19 shift is precomputed on both outputs of the compound adder before the rounding selection, so that the rounding selection already outputs the normalized significand result of the R-path.
  • 21. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 20 SECTION V. OVERVIEW OF THE PRESENT ALGORITHM In this section, an overview of this FP-adder algorithm is given. A brief description of the partitioning of the design, implementing and integrating the optimization techniques from the previous section is presented. The algorithm is a two-staged pipeline partitioned into two parallel paths called the R-path and the N-path. The final result is selected between the outcomes of the two paths based on the signal IS_R (see (2)). Several block diagrams of this algorithm are depicted in Fig. 1. An overview of the two paths is in the following. Overview R-Path The R-path works under the assumption that 1. An effective addition takes place or 2. An effective subtraction with a significand difference (after preshifting) greater than or equal to 2 takes place or 3. The absolute value of the exponent difference 𝛿 is larger than or equal to 2. Note that these assumptions imply that the sign-bit of the sum equals SL. The algorithm performs various operations on the significands before adding them. These operations include: swapping, negation, preshifting, and alignment. For the sake of brevity, it is referred in the outline to a bit string that originates from the significands 𝑓𝑠 or 𝑓𝑙 as the "small operand" or "large operand", respectively. The R- path is divided into two pipeline stages. In the first pipeline stage, the exponent difference is computed and the significands are swapped. In effective subtraction, both significands are preshifted and only the small operand is negated. The small operand is then aligned according to the exponent difference. In the Significand One’s Complement box, the small operand is negated (recall that one’s complement representation is used). In the Align 1 box, the small operand is 1) preshifted to the right if an effective subtraction takes place and 2) aligned to the left by one position if the exponent difference is positive. The alignment to the left by one position compensates for the error in the computation of the exponent difference when the difference is positive due to the one’s complement representation (see Section 4.4). In the Swap box, the large and small operands are selected according to the sign of the exponent difference.
  • 22. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 21 Note that since swapping takes place only after complementation and Align 1, one must consider complementing and aligning both significands; only one of these significands is used as the small operand. In the Align2 box, the small operand is aligned according to the computed exponent difference. The exponent difference box computes the swap decision and the alignment shift amount. This box is further partitioned into two paths for medium and large exponent differences. A detailed block diagram for the implementation of the first cycle of the R-path is depicted in Fig. 2. The input to the second pipeline stage consists of the large and small operands. The goal is to compute their sum and round it while taking into account the error due to the one’s complement representation for effective subtractions. The second pipeline stage is very similar to the rounding algorithm presented in [8]. The large and small operands are divided into a low part and a high part that are processed in parallel. The LSB of the final result is computed based on the low part and the range of the sum. The rest of the final result is computed based on the high part (i.e., either the sum or the incremented sum of the high part). The outputs of the compound adder are postnormalized before the rounding selection is performed. A detailed block diagram for the implementation of the second cycle of the R-path is depicted in Fig. 2. Overview N-Path The N-path works under the assumption that an effective subtraction takes place, the significand difference (after the swapping of the addends and preshifting) is less than 2, and the absolute value of the exponent difference is less than 2. The N-path has the following properties: The exponent difference must be in the set {-1,0,1}. Hence, the exponent difference can be computed by subtracting the two LSBs of the exponent strings. The alignment shift is by at most one position. This is implemented in the exponent difference prediction box. An effective subtraction takes place, hence, the small operand is always negated. The one’s complement representation has been used for the negated subtrahend. The significand difference (after swapping and preshifting) is in the range (-2,2) and can be exactly represented using 52 bits to the right of the binary point. Hence, no rounding is required Based on the exponent difference prediction, the significands are swapped and aligned by at most one bit position in the align and swap box. The leading zero
  • 23. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 22 approximation and the significand difference are then computed in parallel. The result of the leading zero approximation is selected based on the sign of the significand difference according to Section 4.6 in the leading zero selection box. The conversion box computes the absolute value of the difference (Section 4.4) and the normalization and postnormalization box normalizes the absolute significand difference as a result of the N-path.
  • 26. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 25 SECTION VI. SPECIFICATION AND DETAILED DESCRIPTION In this section, specifically the computations in the R-path are discussed since they cover most of the varieties of the input. A more detailed study is necessary for the implementing prospects in the N-path. The two stages of the R-path are specified as following. Specification: R-Path First Cycle The computation performed by the first pipeline stage in the R-path outputs the significands and , represented by FLP[1 : 52] and FSOPA[1 : 116]. The significands and are defined by, (𝑓𝑙𝑝 , π‘“π‘ π‘œπ‘π‘Ž ) = 𝑓𝑙, π‘“π‘ π‘Žπ‘› 𝑖𝑓 𝑆. 𝐸𝐹𝐹 = 0 (2 βˆ™ 𝑓𝑙 , 2 βˆ™ π‘“π‘ π‘Žπ‘› βˆ’ 25116 ) π‘œπ‘‘ β„Žπ‘’π‘Ÿπ‘€π‘–π‘ π‘’ The detailed view in Fig. 1 depicts how the computation of FLP[-1 : 52] and FSOPA[- 1 : 116] is performed. For each box in the high level view, a region surrounded by dashed lines is depicted to assist the reader in matching the regions with blocks. The exponent difference is computed for two ranges: The medium exponent difference interval consists of [-63; 64], and the big exponent difference intervals consist of [-∞,- 64] and [65,∞]. The outputs of the exponent difference box are specified as follows: Loosely speaking, the 𝑆𝐼𝐺𝑁_𝑀𝐸𝐷 and 𝑀𝐴𝐺_𝑀𝐸𝐷 are the sign-magnitude representations of 𝛿, if 𝛿 is in the medium exponent difference interval. Formally, βˆ’1 ]Λœβ„’Ε‘_β€ΊbΕ“ βˆ™ 𝑀𝐴𝐺_𝑀𝐸𝐷 = 𝛿 βˆ’ 1 𝑖𝑓 64 β‰₯ 𝛿 β‰₯ 1 𝛿 𝑖𝑓 0 β‰₯ 𝛿 β‰₯ βˆ’63 "don't care" otherwise The reason for missing 𝛿 by 1 in the positive case is due to the one’s complement subtraction of the exponents. This error term is compensated for in the Align1 box. SIGN_BIG is the sign bit of exponent difference 𝛿. 𝐼𝑆_𝐡𝐼𝐺 is a flag defined by: 𝐼𝑆_𝐡𝐼𝐺 = 1 𝑖𝑓 𝛿 β‰₯ 65 𝑂𝑅 𝛿 ≀ βˆ’64 0 π‘œπ‘‘β„Žπ‘’π‘Ÿπ‘€π‘–π‘ π‘’ In the big exponent difference intervals, the β€œrequired” alignment shift is at least 64 positions. Since all alignment shifts of 55 positions or more are equivalent (i.e., beyond the sticky-bit position), the shift amount in this case may be limited. In the Align2 region, one of the following alignment shift occurs:
  • 27. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 26 a. a fixed alignment shift by 64 positions in case the exponent difference belongs to the big exponent difference intervals (this alignment ignores the preshifting altogether) or b. an alignment shift by π‘šπ‘Žπ‘”_π‘šπ‘’π‘‘ positions in case the exponent difference belongs to the medium exponent difference interval. In the One’s Complement box, the signals FAO, FBO, and S.EFF are computed. The FAO and FBO signals are defined by FAO[0 : 52],FBO[0 : 52] = 𝐹𝐴 0: 52 , 𝐹𝐡 0: 52 𝑖𝑓 𝑆. 𝐸𝐹𝐹 = 0 π‘›π‘œπ‘‘ 𝐹𝐴 0: 52 , π‘›π‘œπ‘‘ 𝐹𝐡 0: 52 π‘œπ‘‘β„Žπ‘’π‘Ÿπ‘€π‘–π‘ π‘’ The computations performed in the Preshift and Align 1 region are relevant only if the exponent difference is in the medium exponent difference interval. The significands are pre-shifted if an effective subtraction takes place. After the preshifting, an alignment shift by one position takes place if sign med = 1. Table 1 summarizes the specification of FSOP’[-1 : 53]. In the Swap region, the large operand is selected based on sign big. The small operand is selected for the medium exponent difference (based on sign med) interval and for the large exponent difference interval (based on sign big). The Preshift 2 region deals with preshifting the minuend in case an effective subtraction takes place. Specification: R-path Second Cycle Table 1 – Value of FSOP’ [-1:53] according to Fig. 1 SIGN_MED S.EFF Pre shift (Left) Align shift (Right) Accumulated right shift FSOP’[-1:53] 0 0 0 1 1 (00,FBO[0:52]) 1 1 1 0 (1,FBO[0:52],1) 1 0 0 0 0 (0,FAO[0:52],0) 1 1 0 -1 (FAO[0:52],11)
  • 28. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 27 The input to the second cycle consists of: The sign bit SL, a representation of the exponent el, the significand strings FLP[-1:52] and FSOPA[-1:116], and the rounding mode. Together with the sign bit SL, the rounding mode is reduced to one of the three rounding modes: RZ, RNE or RI (see Section 4.3). The output consists of the sign bit SL, the exponent string, and the normalized and rounded significand 𝑓_π‘“π‘Žπ‘Ÿ ∈ [1,2) represented by F_FAR[0 : 52]. If the significand sum (after pre-shifting) is greater than or equal to 1, then the output of the second cycle of the R-path satisfies: π‘Ÿπ‘›π‘‘ π‘ π‘’π‘š = π‘Ÿπ‘›π‘‘( βˆ’1 >; βˆ™ (𝑓𝑙𝑝 + π‘“π‘ π‘œπ‘π‘Ž + 𝑆. 𝐸𝐹𝐹 βˆ™ 25,,ΕΈ )) In effective subtraction, 25,,ΕΈ is added to correct the sum of the one’s complement representations to the sum of two’s complement representations by the lazy increment from the first clock cycle. Fig. 1 depicts the partitioning of the computations in the 2nd cycle of the R-path into basic blocks and specifies the input- and output-signals of each of these basic blocks. Details: R-Path First Cycle Fig. 2 depicts a detailed block diagram of the first cycle of the R-path. The nonstraightforward regions are described below. The Exponent Difference region is implemented by cascading a 7-bit adder with a 5- bit adder. The 7-bit adder computes the lazy one’s complement exponent difference if the exponent difference is in the medium interval. This difference is converted to a sign and magnitude representation denoted by 𝑠𝑖𝑔𝑛_π‘šπ‘’π‘‘ and π‘šπ‘Žπ‘”_π‘šπ‘’π‘‘ . The cascading of the adders enables us to evaluate the exponent difference (for the medium interval) in parallel with determining whether the exponent difference is in the big range. The SIGN_BIG signal is simply the MSB of the lazy one’s complement exponent difference. The IS_BIG signal is computed by OR-ing the bits in positions [6 : 10] of the magnitude of the lazy one’s complement exponent difference. This explains why the medium interval is not symmetric around zero. The Align 1 region depicted in Fig. 2 is an optimization of the Preshift and Align1 region in Fig. 1. The reader can verify that the implementation of the Align 1 region satisfies the specification of FSOP’[-1 : 53] that is summarized in Table 1.
  • 29. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 28 The following condition is computed during the computation of the exponent difference 𝐼𝑆_𝑅1 ⟺ ( π‘’π‘Ž βˆ’ 𝑒𝑏 ) β‰₯ 2 ⟺ π‘‚π‘…π‘‘π‘Ÿπ‘’π‘’(𝐼𝑆_𝐡𝐼𝐺 , 𝑀𝐴𝐺_𝑀𝐸𝐷[5: 0], 𝐴𝑁𝐷(𝑀𝐴𝐺_𝑀𝐸𝐷(0), 𝑁𝑂𝑇(𝑆𝐼𝐺𝑁_𝐡𝐼𝐺))), which will be used later for the selection of the valid path. Note that, the exponent difference is computed using one’s complement representation. This implies that the magnitude is off by one when the exponent difference is positive. In particular, the case of the exponent difference equal to 2 yields a magnitude of 1 and a sign bit of 0. This is why the expression AND(MAG_MED[0],NOT(SIGN BIG)) appears in the OR- tree used to compute the IS_R signal. Details: R-Path Second Cycle Fig. 2 depicts a detailed block diagram of the second cycle of the R-path. The details of the implementation are described below. Our implementation of the R-path in the second cycle consists of two parallel paths called the upper part and the lower part. The upper part deals with positions [-1:52] of the significands and the lower part deals with positions [53:116] of the significands. The processing of the lower part has to take into account two additional values: the rounding injection, which depends only on the reduced rounding mode, and the missing ULP (2_116) in effective subtraction due to the one’s complement representation. The processing of FSOPA[53:116], INJ[53:116] and S.EFF βˆ™ 25,,ΕΈ is based on: 𝑇𝐴𝐼𝐿[52: 116] = 𝐹𝑆𝑂𝑃𝐴[53: 116] + 𝐼𝑁𝐽[53: 116] 𝑖𝑓 𝑆. 𝐸𝐹𝐹 = 0 𝐹𝑆𝑂𝑃𝐴[53: 116] + 𝐼𝑁𝐽[53: 116] + 25,,ΕΈ 𝑖𝑓 𝑆. 𝐸𝐹𝐹 = 1 The bits 𝐢[52], 𝑅’, 𝑆’ are defined by 𝐢[52] = 𝑇𝐴𝐼𝐿[52] , R’=TAIL[53], S’=OR(TAIL[54:116]). They are computed by using a 2-bit injection string that has a different value for effective addition and subtraction. Effective addition- Let 𝑆:…… denote the sticky bit that corresponds to FSOPA[54:116], then 𝑆:……= OR(FSOPA[54], . . . ,FSOPA[116]). The injection can be restricted to two bits INJ[53 : 54] and we simply perform a 2-bit addition to obtain the three bits C[52],R’, S’: (𝐢 52 , 𝑅′ , 𝑆′) = 𝐼𝑁𝐽[53: 54] + (𝐹𝑆𝑂𝑃𝐴 53 , 𝑆:……) .
  • 30. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 29 Effective subtraction- In this case, we still have to add to FSOPA the missing 25,,ΕΈ that we neglected to add during the first cycle. Let 𝑆>kS denote the sticky bit that corresponds to bit positions [54:116] in the binary representation of 𝐹𝑆𝑂𝑃𝐴[54: 116] + 25,,ΕΈ , then 𝑆>kS = 𝑂𝑅(𝑁𝑂𝑇(𝐹𝑆𝑂𝑃𝐴[54: 116])), The addition of 25,,ΕΈ can create a carry to position [53], which we denote by C[53]. The value of C[53] is one if FSOPA[54:116] is all ones, in which case the addition of 25,,ΕΈ creates a carry that ripples to position [53]. Therefore, C[53]=NOT(𝑆>kS). Again, the injection can be restricted to two bits INJ[53:54], and we compute C[52],R’, S’ by (𝐢 52 , 𝑅′ , 𝑆′) = 𝐼𝑁𝐽[53: 54] + (𝐹𝑆𝑂𝑃𝐴 53 , 𝑆:……) + 2𝐢[53] The result of this addition cannot be greater than 7 βˆ™ 25zβ€˜ , because C[53]= NOT(𝑠>kS). A fast implementation of the computation of C[52], R’, S’ proceeds as follows. Let S =𝑆:…… in effective addition, and S= 𝑆>kS in effective subtraction. Based on S.EFF, FSOPA[53] and INJ[53:54], the signals C[52], R’, S’ are computed in two paths: one assuming that S = 1 and the other assuming that S = 0. Fig. 2 depicts a naive method of computing the sticky bit S to keep the presentation structured rather than obscure it with optimizations. A conditional inversion of the bits of FSOPA[54 : 116] is performed by XOR-ing the bits with S.EFF. The possibly inverted bits are then input to an OR-tree. This suggestion is somewhat slow and costly. A better method would be to compute the OR and AND of (most of) the bits of FS[54 : 116] during the alignment shift in the first cycle. The advantages of advancing (most of) the sticky bit computation to the first cycle is twofold: 1) There is ample time during the alignment shift whereas the sticky bit should be ready after at most 5 logic levels in the second cycle; and 2) This saves the need to latch all 63 bits (corresponding to FS[54 : 116]) between the two pipeline stages. The upper part computes the correctly rounded sum (including post-normalization) and uses for the computation of the strings FLP[-1:52],FSOPA[- 1:52], and (C[52],R’, S’). The rest of the algorithm is identical to the rounding presented, analyzed, and proven in [8] for FP multiplication.
  • 31. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 30 SECTION VII. DELAY ANALYSIS REPORTING In this section, the delay of this FP-adder algorithm has been reported. The original analysis was made by using the Logical Effort Model [33]. In the analysis of gate delays, this delay model allows a rigorous treatment of the contributions of fanouts and gate sizing independent of the used technology, process, voltage, or feature size. Based on this delay model, a proposal of the delay-optimal implementation of this FP addition algorithm was made by the authors, by considering optimized gate sizing and driver insertion. For comparisons the delay estimations are scaled from this model to units of FO4 inverter delays (inverter delay with a fanout of four). The use of this unit has been a popular choice in literature on circuit design whenever delays are to be determined in technology-independent terms. It has been demonstrated for several examples that delay estimations in FO4 delay units are rather insensitive to choices of technological parameters [12]. The description of, how the delay analysis, provided by the authors, differ from the traditional one, has been described in [28], [29]. In the more recent analysis, each gate is assigned an independent size and delay. This allows for precise modeling of fanout contributions and full custom gate size optimizations for minimum delay. The previously estimated the total delay of this algorithm was 24 logic levels between latches, which could be evenly partitioned into two stages with 12 logic levels each. One logic level (XOR delay with a fanout of 1) has a delay of 1.6 FO4 inverter delay units in the current framework (this is because d(XOR1) = 8 logical effort delays and d(FO4) = 5 logical effort delays), so that the previous analysis estimated the delay of this algorithm to be 38:4 FO4 delays. This is approximately 19 FO4 delays for each of two pipeline stages between latches.
  • 32. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 31 SECTION VIII. IMPLEMENTATION This algorithm has been implemented using the Xilinx ISE v.9.2i with a XC2V6000 device of virtex2 family, package FF1152. The top level source is Hardware Description Language type, with a XST(VHDL/VERILOG) environment. The basic block level implementation is done with VHDL following the connections as per the algorithm. Instead of a compound adder as it was in the algorithm a simpler model of addition circuitry has been used in the implementation level. A further study is necessary for the scope of using the aforementioned adder circuit along with the implementation of the β€œN-path” mentioned in the algorithm. The detailed block diagram of the block level implementation is given inFig.1 and Fig.2. The system requirement for the above mentioned environment is given below. The synthesis report of the generated circuit model is presented here. The general debugging tool used for the implementation process is the ISE editor and Modelsim v.6.3. The implementation process is shown hierarchically: Algorithm undestanding Developing building blocks (in VHDL) Simulation of the abstact model Debugging Pipelining and optimizing Chip implementation (RC10)
  • 33. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 32 System Information Machine name: ADRIN-E68E783C5 Operating System: Windows XP Professional (5.1, Build 2600) Service Pack 2 (2600.xpsp_sp2_rtm.040803-2158) Language: English (Regional Setting: English) System Manufacturer: Hewlett-Packard System Model: HP xw4300 Workstation BIOS: Default System BIOS Processor: Intel(R) Pentium(R) 4 CPU 3.80GHz (2 CPUs) Memory: 1536MB RAM Page File: 485MB used, 2945MB available Windows Dir: C:WINDOWS DirectX Version: DirectX 9.0c (4.09.0000.0904) DX Setup Parameters: Not found DxDiag Version: 5.03.2600.2180 32bit Unicode
  • 34. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 33 SYNTHESIS REPORT: Synthesis Options Summary Source Parameters ---- Input File Name : "top_module.prj" Input Format : mixed Ignore Synthesis Constraint File : NO Target Parameters ---- Output File Name : "top_module" Output Format : NGC Target Device : xc2v6000-4-ff1152 Source Options ---- Top Module Name : top_module Automatic FSM Extraction : YES FSM Encoding Algorithm : Auto Safe Implementation : No FSM Style : lut RAM Extraction : Yes RAM Style : Auto ROM Extraction : Yes Mux Style : Auto Decoder Extraction : YES Priority Encoder Extraction : YES Shift Register Extraction : YES Logical Shifter Extraction : YES XOR Collapsing : YES ROM Style : Auto Mux Extraction : YES Resource Sharing : YES Asynchronous To Synchronous : NO Multiplier Style : auto Automatic Register Balancing : No Target Options ---- Add IO Buffers : YES Global Maximum Fanout : 500 Add Generic Clock Buffer(BUFG) : 16 Register Duplication : YES Slice Packing : YES Optimize Instantiated Primitives : NO Convert Tristates To Logic : Yes
  • 35. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 34 Use Clock Enable : Yes Use Synchronous Set : Yes Use Synchronous Reset : Yes Pack IO Registers into IOBs : auto Equivalent register Removal : YES General Options ---- Optimization Goal : Speed Optimization Effort : 1 Library Search Order : top_module.lso Keep Hierarchy : NO RTL Output : Yes Global Optimization : AllClockNets Read Cores : YES Write Timing Constraints : NO Cross Clock Analysis : NO Hierarchy Separator : / Bus Delimiter : <> Case Specifier : maintain Slice Utilization Ratio : 100 BRAM Utilization Ratio : 100 Verilog 2001 : YES Auto BRAM Packing : NO Slice Utilization Ratio Delta : 5 HDL Synthesis Report Macro Statistics # Adders/Subtractors : 5 11-bit adder : 1 5-bit adder carry in : 1 53-bit adder : 2 8-bit adder : 1 # Registers : 12 1-bit register : 5 11-bit register : 3 118-bit register : 1 52-bit register : 2 54-bit register : 1 # Latches : 1 118-bit latch : 1 # Multiplexers : 1 11-bit 4-to-1 multiplexer : 1 # Logic shifters : 1 118-bit shifter logical right : 1 # Xors : 16 1-bit xor2 : 15 1-bit xor3 : 1
  • 36. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 35 Final Report Final Results RTL Top Level Output File Name : top_module.ngr Top Level Output File Name : top_module Output Format : NGC Optimization Goal : Speed Keep Hierarchy : NO Design Statistics # IOs : 195 Cell Usage: # BELS : 1316 # GND : 1 # LUT1 : 52 # LUT2 : 141 # LUT2_D : 3 # LUT2_L : 1 # LUT3 : 284 # LUT3_D : 3 # LUT4 : 441 # LUT4_D : 4 # LUT4_L : 3 # MUXCY : 111 # MUXF5 : 161 # VCC : 1 # XORCY : 110 # Flip Flops/Latches : 308 # FD : 7 # FDC : 240 # FDS : 4 # LD_1 : 57 # Clock Buffers : 2 # BUFG : 1 # BUFGP : 1 # IO Buffers : 194 # IBUF : 130 # OBUF : 64 Device utilization summary:
  • 37. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 36 Selected Device : 2v6000ff1152-4 Number of Slices: 528 out of 33792 1% Number of Slice Flip Flops: 308 out of 67584 0% Number of 4 input LUTs: 932 out of 67584 1% Number of IOs: 195 Number of bonded IOBs: 195 out of 824 23% Number of GCLKs: 2 out of 16 12% TIMING REPORT NOTE: THESE TIMING NUMBERS ARE ONLY A SYNTHESIS ESTIMATE. FOR ACCURATE TIMING INFORMATION PLEASE REFER TO THE TRACE REPORT GENERATED AFTER PLACE-and-ROUTE. Clock Information: -----------------------------------------------+-----------------------------------------+-------+ Clock Signal | Clock buffer(FF name) | Load | -----------------------------------------------+-----------------------------------------+-------+ Clock | BUFGP | 251 | -----------------------------------------------+-----------------------------------------+--------| seff1 (Inst_compliment5353/ | | | Mxor_s_temp_xo<1>1:O) | BUFG(*)(Inst_allignment/ | | |goout_101) | 57 | -----------------------------------------------+-----------------------------------------+--------+ Asynchronous Control Signals Information: -----------------------------------+-----------------------------+--------+ Control Signal | Buffer(FF name) | Load | -----------------------------------+------------------------------+-------+ Preset | IBUF | 240 | -----------------------------------+------------------------------+-------+ Timing Summary: Speed Grade: -4 Minimum period: 10.631ns (Maximum Frequency: 94.061MHz) Minimum input arrival time before clock: 1.712ns Maximum output required time after clock: 16.396ns Maximum combinational path delay: No path found Timing Detail: All values displayed in nanoseconds (ns) Timing constraint: Default period analysis for Clock 'clock'
  • 38. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 37 Clock period: 10.631ns (frequency: 94.061MHz) Total number of paths / destination ports: 44285 / 125
  • 39. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 38 PLACE AND ROUTE REPORT: Release 9.2i par J.36 Copyright (c) 1995-2007 Xilinx, Inc. All rights reserved. ADRIN-E68E783C5:: Wed Oct 05 14:42:59 2011 Device speed data version: "PRODUCTION 1.121 2007-04-13". Device Utilization Summary: Number of BUFGMUXs 2 out of 16 12% Number of External IOBs 195 out of 824 23% Number of LOCed IOBs 0 out of 195 0% Number of SLICEs 546 out of 33792 1% Overall effort level (-ol): Standard Placer effort level (-pl): High Placer cost table entry (-t): 1 Router effort level (-rl): Standard REAL time consumed by placer: 20 secs CPU time consumed by placer: 19 secs Writing design to file top_module.ncd Total REAL time to Placer completion: 21 secs Total CPU time to Placer completion: 20 secs Total REAL time to Router completion: 40 secs Total CPU time to Router completion: 39 secs Generating Clock Report: +---------------------+------------------+----------+---------+------------------+--------------------+ | Clock Net | Resource |Locked |Fanout |Net Skew(ns) |Max Delay(ns) | +---------------------+------------------+----------+---------+------------------+--------------------+ | clock_BUFGP | BUFGMUX3P| No | 197 | 0.353 | 2.396 | +---------------------+------------------+----------+---------+------------------+---------------------+ | Seff |BUFGMUX4P| No | 53 | 0.095 | 2.134 | +---------------------+------------------+----------+---------+------------------+---------------------+ * Net Skew is the difference between the minimum and maximum routing only delays for the net. Note this is different from Clock Skew which is reported in TRCE timing report. Clock Skew is the difference between the minimum and maximum path delays which includes logic delays.
  • 40. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 39 The Delay Summary Report The NUMBER OF SIGNALS NOT COMPLETELY ROUTED for this design is: 0 The AVERAGE CONNECTION DELAY for this design is: 2.675 The MAXIMUM PIN DELAY IS: 14.081 The AVERAGE CONNECTION DELAY on the 10 WORST NETS is: 9.965 Listing Pin Delays by value: (nsec) d < 3.00 < d < 6.00 < d < 9.00 < d < 12.00 < d < 15.00 d >= 15.00 --------- --------- --------- --------- --------- --------- 2723 702 391 79 36 0 Summary report for the RC10 chip implementation: Clock: 110 MHz ISE Project Status Project File: FPadder.ise Current State: Programming File Generated Module Name: top_module β€’ Errors: No Errors Target Device: xc3s1500l-4fg320 β€’ Warnings: 20 Warnings Product Version: ISE 9.2i β€’ Updated: Tue Jun 26 16:00:21 2012 ISE Partition Summary No partition information was found.
  • 41. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 40 Device Utilization Summary Logic Utilization Used Available Utilization Note(s) Number of Slice Flip Flops 421 26,624 1% Number of 4 input LUTs 492 26,624 1% Logic Distribution Number of occupied Slices 491 13,312 3% Number of Slices containing only related logic 491 491 100% Number of Slices containing unrelated logic 0 491 0% Total Number of 4 input LUTs 668 26,624 2% Number used as logic 492 Number used as a route-thru 139 Number used for Dual Port RAMs 36 Number used as Shift registers 1 Number of bonded IOBs 39 221 17% IOB Flip Flops 15 Number of Block RAMs 1 32 3% Number of GCLKs 4 8 50% Number of DCMs 2 4 50% Total equivalent gate count for design 89,436 Additional JTAG gate count for IOBs 1,872 Performance Summary Final Timing Score: 0 Pinout Data: Pinout Report Routing Results: All Signals Completely Routed Clock Data: Clock Report Timing Constraints: All Constraints Met Detailed Reports Report Name Status Generated Errors Warnings Infos Synthesis Report Translation Report Current Tue Jun 26 15:59:03 2012 0 0 0 Map Report Current Tue Jun 26 15:59:15 2012 0 3 Warnings 4 Infos Place and Route Report Current Tue Jun 26 15:59:52 2012 0 9 Warnings 1 Info Static Timing Report Current Tue Jun 26 16:00:00 2012 0 8 Warnings 2 Infos Bitgen Report Current Tue Jun 26 16:00:19 2012 0 0 1 Info
  • 49. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 48 SECTION VIII. FUTURE SCOPE The IEEE double precision FP adder circuit finds usefulness in numerous occasions. Therefore this present implementation can be stretched to cover up the whole algorithmic process and it has a greater affinity for the optimization of the total process, including the rounding mechanisms and the addition process. Also the computation of the carry and sticky bit requires a lot of logical efforts which are can be minimized. Upon optimization this can be used in various floating point operation incorporating with different FPGAs.
  • 50. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 49 BIBLIOGRAPHY [1] S. Bar-Or, G. Even, and Y. Levin, β€œGeneration of Representative Input Vectors for Parametric Designs: from Low Precision to High Precision,” Integration, the VLSI J., vol. 36, issues 1-2, pp. 69-82, Sept. 2003. [2] A. Beaumont-Smith, N. Burgess, S. Lefrere, and C. Lim, β€œReduced Latency IEEE Floating-Point Standard Adder Architectures,” Proc.14th IEEE Symp. Computer Arithmetic, pp. 35-43, 1999. [3] R. Brent and H. Kung, β€œA Regular Layout for Parallel Adders,” IEEE Trans. On computers, vol. 31, no. 3, pp. 260-264, Mar. 1982. [4] J. Bruguera and T. Lang, β€œUsing the Reverse-Carry Approach for Double-Datapath Floating-Point Addition,” Proc. 15th IEEE Int’l Symp. Computer Arithmetic (Arith15), pp. 203-210, June 2001. [5] M. Daumas and D. Matula, β€œRecoders for Partial Compression and Rounding,” Technical Report RR97-01, Ecole Normale Superieur de Lyon, LIP 1996. [6] L. Eisen, T. Elliott, R. Golla, and C. Olson, β€œMethod and System for Performing a High Speed Floating Point Add Operation.” IBM Corporation, US patent 5790445, 1998. [7] G. Even, S. MuΒ¨ ller, and P. Seidel, β€œDual Precision IEEE Floating- Point Multiplier,” INTEGRATION The VLSI J., vol. 29, no. 2, pp. 167-180, Sept. 2000. [8] G. Even and P.-M. Seidel, β€œA Comparison of Three Rounding Algorithms for IEEE Floating-Point Multiplication,” IEEE Trans Computers, vol. 49, no. 7, pp. 638-650, July 2000. [9] P. Farmwald, β€œOn the Design of High Performance Digital Arithmetic Units,” PhD thesis, Stanford Univ., Aug. 1981 [10] P. Farmwald, β€œBifurcated Method and Apparatus for Floating-Point Addition with Decreased Latency Time,” US patent 4639887,1987. [11] V. Gorshtein, A. Grushin, and S. Shevtsov, β€œFloating Point Addition Methods and Apparatus.” Sun Microsystems, US patent 5808926, 1998. [12]M.Horowitz,β€œVLSI Scaling for Architects,” http://velox.stanford. du/papers/VLSIScaling.pdf, 2000. [13] β€œIEEE Standard for Binary Floating Point Arithmetic, ”ANSI/ IEEE754-1985, 1985. [14] T. Ishikawa, β€œMethod for Adding/Subtracting Floating-Point Representation Data and Apparatus for the Same,” Toshiba, K.K., US patent 5063530, 1991.
  • 51. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 50 [15] T. Kawaguchi, β€œFloating Point Addition and Subtraction Arithmetic Circuit Performing Preprocessing of Addition or Subtraction Operation Rapidly,” NEC, US patent 5931896, 1999. [16] Y. Levin, β€œSupporting Demoralized Numbers in an IEEE Compliant Floating Point Adder Optimized for Speed,” http:// hyde.eng.tau.ac.il/Projects/FPADD/index.html, 2001. [17] A. Naini, A. Dhablania, W. James, and D. Das Sarma, β€œ1-GHz HAL SPARC64 Dual Floating Point Unit with RAS Features,” Proc. 15th IEEE Int’l Symp. Computer Arithmetic (Arith15), pp. 173- 183, June 2001. [18] T. Nakayama, β€œHardware Arrangement for Floating-Point Addition and Subtraction,” NEC, US patent 5197023, 1993. [19] K. Ng, β€œFloating-Point ALU with Parallel Paths,” US patent 5136536, Weitek Corp., 1992. [20] A. Nielsen, D. Matula, C.-N. Lyu, and G. Even, β€œIEEE Compliant Floating-Point Adder that Conforms with the Pipelined Packet- Forwarding Paradigm,” IEEE Trans. Computers, vol. 49, no. 1,pp. 33-47, Jan. 2000. [21] S. Oberman, β€œFloating-Point Arithmetic Unit Including an Efficient Close Data Path,” AMD, US patent 6094668, 2000. [22] S. Oberman, H. Al-Twaijry, and M. Flynn, β€œThe SNAP Project: Design of Floating Point Arithmetic Units,” Proc. 13th IEEE Symp.Computer Arithmetic, pp. 156-165, 1997. [23] W.-C. Park, T.-D. Han, S.-D. Kim, and S.-B. Yang, β€œFloating Point Adder/Subtractor Performing IEEE Rounding and Addition/Subtraction in Parallel,” IEICE Trans. Information and Systems, vol. 4, pp. 297-305, 1996. [24] N. Quach and M. Flynn, β€œDesign and Implementation of the SNAP Floating-Point Adder,” Technical Report CSL-TR-91-501,Stanford Univ., Dec. 1991. [25] N. Quach, N. Takagi, and M. Flynn, β€œOn fast IEEE Rounding,” Technical Report CSL-TR-91-459, Stanford Univ., Jan. 1991. [26] M. Schmookler and K. Nowka, β€œLeading Zero Anticipation and Detectionβ€”A Comparison of Methods, Proc. 15th IEEE Symp.Computer Arithmetic, pp. 7-12, 2001. [27] P.-M. Seidel, β€œOn The Design of IEEE Compliant Floating-Point Units and Their Quantitative Analysis,” PhD thesis, Univ. of Saarland, Germany, Dec. 1999. [28] P.-M. Seidel and G. Even, β€œHow Many Logic Levels Does Floating-Point Addition Require?” Proc. 1998 Int’l Conf. Computer Design (ICCD ’98): VLSI, in Computers & Processors, pp. 142-149, Oct. 1998.
  • 52. JADAVPUR UNIVERSITY | FPGA BASED IMPLEMENTATION OF DOUBLE PRECISION IEEE FP-ADDER 51 [29] P.-M. Seidel and G. Even, β€œOn the Design of Fast IEEE Floating- Point Adders,” Proc. 15th IEEE Int’l Symp. Computer Arithmetic (Arith15), pp. 184-194, June 2001. [30] P.-M. Seidel and G. Even, β€œDelay-Optimized Implementation of IEEE Floating- Point Addition,” http://www.engr.smu.edu/ seidel/research/fpadd/, 2002. [31] H. Sit, D. Galbi, and A. Chan, β€œCircuit for Adding/Subtracting Two Floating-Point Operands,” Intel, US patent 5027308, 1991. [32] D. Stiles, β€œMethod and Apparatus for Performing Floating-Point Addition,” AMD, US patent 5764556, 1998. [33] I. Sutherland, B. Sproull, and D. Harris, Logical Effort: Designing Fast CMOS Circuits. Morgan Kaufmann 1999. [34] A. Tyagi, β€œA Reduced-Area Scheme for Carry-Select Adders,” IEEE Trans. Computers, vol. 42, no. 10, Oct. 1993. [35] H. Yamada, F. Murabayashi, T. Yamauchi, T. Hotta, H. Sawamoto, T. Nishiyama, Y. Kiyoshige, and N. Ido, β€œFloating-Point Addition/ Subtraction Processing Apparatus and Method Thereof,” Hitachi, US patent 5684729, 1997. [36] Stefan Sjoholm, Lennart Lindh β€œVHDL for designers” Prentice Hall Europe, ISBN 0-13-473414-9. [37] Kavin Skahill β€œVHDL for PROGRAMMABLE LOGIC” [38] ACTEL β€œHDL CODING style guide”.