A Study Of Different Floating Point
Units

PREPARED BY
Dipu P
dipugovind@gmail.com
Floating Point Unit
• Design of A Fully Pipelined Single-Precision Floating-Point
Unit.
• Energy-Efficient Floating-Point Unit Design.
• Improved Architectures for a Fused Floating-Point Add-Subtract Unit.
• Optimized Architecture for Floating Point Computation Unit.
• Unified Rectangular Floating-Point Pipelined Architecture.
• Design & Implementation of Floating point ALU on a FPGA
Processor
1.Design of A Fully Pipelined Single-Precision Floating-Point Unit.

• AUTHORS: Zhaolin Li (1), Xinyue Zhang (2), Gongqiong Li (2),
Runde Zhou (2)
(1) Research Institute of Information Technology, Tsinghua University,
Beijing 100084, P.R. China
(2) Institute of Microelectronics, Tsinghua University, Beijing 100084,
P.R. China

• Publication Year: July 2007

• Journal Name: IEEE TRANSACTIONS
Introduction
• A single-precision floating-point unit (FPU) is implemented in three
pipeline stages.
• The core of the design is a multiply-add-fused (MAF) unit.
• After verification, it is synthesized in 0.18 µm CMOS technology.
• The FPU is fully pipelined and can accept a new input every clock
cycle.
• It performs floating-point operations such as multiplication and
addition.
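The "new input each clock cycle" behaviour can be pictured with a small cycle-level sketch in Python. The function name `simulate_pipeline` and the shift-register model are illustrative assumptions, not taken from the paper; only the three-stage depth and the one-input-per-cycle rate come from the slide.

```python
def simulate_pipeline(ops, stages=3):
    """Return (operation, completion_cycle) pairs for a fully pipelined unit.

    Hypothetical model: one operation may enter per clock cycle and every
    operation spends exactly one cycle in each stage.
    """
    pipeline = [None] * stages      # pipeline[0] is the first stage
    pending = list(ops)
    completed = []
    cycle = 0
    while len(completed) < len(ops):
        cycle += 1
        incoming = pending.pop(0) if pending else None
        # Shift every operation one stage forward and admit the new one.
        pipeline = [incoming] + pipeline[:-1]
        # Whatever sits in the last stage finishes at the end of this cycle.
        if pipeline[-1] is not None:
            completed.append((pipeline[-1], cycle))
    return completed


results = simulate_pipeline(["mul", "add", "maf"])
```

In this model the first result appears after three cycles (the latency) and one further result appears every cycle after that (the throughput), so N operations take N + 2 cycles.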
Architecture of the proposed FPU
• The fundamental operation
implemented by the MAF
unit is given in Equation (1),
where A, B and C refer to
three operands.
±A ± (±B) × (±C)
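The numerical benefit of a fused multiply-add is that A + B × C is rounded once rather than twice. The sketch below is an illustration under stated assumptions, not the paper's hardware: it uses Python double-precision floats rather than single precision, and the helper names `fused_multiply_add` and `unfused_multiply_add` are hypothetical. The fused result is emulated with exact rational arithmetic followed by a single rounding.

```python
from fractions import Fraction


def fused_multiply_add(a, b, c):
    """a + b*c computed exactly, then rounded once to a double."""
    exact = Fraction(a) + Fraction(b) * Fraction(c)
    return float(exact)


def unfused_multiply_add(a, b, c):
    """a + b*c with the product rounded before the addition."""
    return a + b * c


# Operands chosen so that the exact product 1 - 2**-60 rounds to 1.0.
a = -1.0
b = 1.0 + 2.0 ** -30
c = 1.0 - 2.0 ** -30

fused = fused_multiply_add(a, b, c)      # keeps the small residual
unfused = unfused_multiply_add(a, b, c)  # residual lost in the product rounding
```

With these operands the separately rounded product loses the low-order term, so the unfused result is zero while the fused result retains it.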
The detailed structure of the proposed FPU
Advantages
• The proposed FPU implements the basic computations: addition/subtraction,
multiplication, multiply-add-fused operation, division and square root.
• In this design, the division and square-root algorithms use the
multiplicative method. This method converges at a quadratic rate, which
means the number of accurate digits in the estimate doubles after each
iteration.
• The multiplicative implementation adds only a small amount of hardware
because the multiply unit is shared.
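Quadratic convergence can be seen in a short Python sketch. It uses the Newton-Raphson reciprocal iteration, which is one common multiplicative method; the slides do not say which variant the paper uses, so treat the choice of iteration, the function name `reciprocal_newton` and the starting estimate as assumptions.

```python
def reciprocal_newton(d, x0, iterations):
    """Approximate 1/d using only multiplications and subtractions.

    Each step computes x <- x * (2 - d*x). If e = 1 - d*x is the relative
    error before a step, the error after the step is e**2.
    """
    x = x0
    errors = []
    for _ in range(iterations):
        x = x * (2.0 - d * x)
        errors.append(abs(1.0 - d * x))
    return x, errors


# Start from a rough estimate of 1/3 with 10% relative error.
estimate, errors = reciprocal_newton(3.0, 0.3, 5)
```

Starting from a 10% error, the recorded errors fall roughly as 1e-2, 1e-4, 1e-8, that is, the number of correct digits doubles per iteration until the limit of the floating-point format is reached. A quotient a/d is then obtained with one more multiplication, a × (1/d), which is why the multiplier can be shared.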
Disadvantages
• Since the instructions have irregular latencies, they must be
carefully scheduled to avoid collisions.
• The design complexity of the data path controller is considerably
higher.
• Compared with a single-precision MAF unit, the proposed three-stage
pipelined FPU introduces about 3% more delay.
2.Energy-Efficient Floating-Point Unit
Design
• AUTHORS: Sameh Galal, Mark Horowitz
• Publication Year: July 2011
• Journal Name: IEEE TRANSACTIONS ON
COMPUTERS
Block diagram for a single-precision fused
multiply-add unit
Block diagram for a single-precision cascade
multiply-add
Advantages
• Parallel architecture: supports high-performance
applications
• Energy efficient compared with fixed-point units and
other FPU designs
• Incorporates combined floating-point multiply-add
instructions that implement the A + B × C operation
• Better accuracy
• Provides a very large range
Disadvantages
• Large numbers are rounded off
• The order of evaluation can affect the accuracy
of the result
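Both points can be reproduced in a few lines of Python. This is a generic illustration using double-precision floats, not a measurement of the unit described in the paper.

```python
# Floating-point addition is not associative: grouping changes the rounding.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

# Near 1e16 the spacing between adjacent doubles is 2, so adding 1 is lost.
big = 1e16
absorbed = (big + 1.0) - big   # the 1.0 is rounded away first
kept = (big - big) + 1.0       # cancelling first preserves the 1.0
```

The two groupings of the same three addends differ in the last bit, and the small addend survives only when the large values are cancelled before it is added.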
3.Improved Architectures for a Fused
Floating-Point Add-Subtract Unit

• AUTHORS: Earl E. Swartzlander, Jongwook Sohn
• Publication Year: October 2010
• Journal Name: IEEE TRANSACTIONS ON CIRCUITS AND
SYSTEMS
Block Diagram of Fused Floating Point
Adder
Applications
• The fused floating-point add-subtract unit is used
mainly in Digital Signal Processing (DSP)
applications.
• The main applications are the "Fast Fourier Transform"
(FFT) and the "Discrete Cosine Transform" (DCT).
• FFT butterfly operations benefit from the fused
unit through lower power consumption.
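A butterfly needs both the sum and the difference of the same pair of operands, which is exactly what a fused add-subtract unit produces together. The Python sketch below shows the data flow only; the function names `butterfly` and `fft` are illustrative, and the recursive radix-2 FFT is a textbook formulation rather than the architecture from the paper.

```python
import cmath


def butterfly(a, b, w):
    """Radix-2 butterfly: returns (a + w*b, a - w*b) from one shared product."""
    t = w * b
    return a + t, a - t


def fft(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * cmath.pi * k / n)   # twiddle factor
        out[k], out[k + n // 2] = butterfly(even[k], odd[k], w)
    return out


impulse_spectrum = fft([1, 0, 0, 0])
```

Every butterfly performs one add and one subtract on the same inputs, so an N-point FFT issues (N/2)·log2(N) such pairs; this is where a unit that computes both results at once saves work.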
Advantages and Disadvantages
• Highly optimized design for low power applications
in DSP field.
• Higher speed of computation due to fused
architecture.
• Design complexity is high.
• Very costly and difficult to implement.
4.Optimized Architecture for Floating
Point Computation Unit
• AUTHORS: Harish Anand T, D. Vaithiyanathan,
R. Seshasayanan
• Publication Year: July 2013
• Journal Name: IEEE TRANSACTIONS ON COMPUTERS
Introduction
• The unit performs all four basic arithmetic operations using simple
hardware such as adders, look-up tables (LUTs) and interpolation steps.
• A logarithmic approach is used.
• The LUT plays an important role: the size of the log values it stores
determines the accuracy of the result.
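The idea of replacing multiplication with an addition in the logarithmic domain can be sketched in Python as follows. Everything specific here is an assumption made for illustration: the 256-entry table, linear interpolation, positive operands only, and the names `LOG_LUT`, `approx_log2`, `approx_mul` and `approx_div`. The final antilog uses `2.0 ** x` directly for brevity, where hardware would use a second table.

```python
import math

LUT_BITS = 8                      # assumed table size: 2**8 intervals
LUT_SIZE = 1 << LUT_BITS
# log2 of the mantissa sampled on [1, 2], one extra entry for interpolation.
LOG_LUT = [math.log2(1.0 + i / LUT_SIZE) for i in range(LUT_SIZE + 1)]


def approx_log2(x):
    """log2(x) for x > 0 from the exponent, a table look-up and interpolation."""
    m, e = math.frexp(x)          # x = m * 2**e with 0.5 <= m < 1
    m, e = m * 2.0, e - 1         # renormalise so that 1 <= m < 2
    pos = (m - 1.0) * LUT_SIZE
    i = int(pos)
    frac = pos - i
    return e + LOG_LUT[i] + frac * (LOG_LUT[i + 1] - LOG_LUT[i])


def approx_mul(a, b):
    """Multiply two positive numbers with an addition in the log domain."""
    return 2.0 ** (approx_log2(a) + approx_log2(b))


def approx_div(a, b):
    """Divide two positive numbers with a subtraction in the log domain."""
    return 2.0 ** (approx_log2(a) - approx_log2(b))


product = approx_mul(3.0, 7.0)
quotient = approx_div(10.0, 4.0)
```

Changing `LUT_BITS` shows the trade-off noted in the bullets: a larger table gives a smaller interpolation error at the cost of more storage.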
Conventional Floating Point Multiplier
Low power Arithmetic circuit model
Application
• Hybrid FPGAs
• Applications in FPGAs
Advantages
• 36% less power than the existing FPU
• Area is reduced by 28%
• Simplified datapath
5.Unified Rectangular Floating-Point
Pipelined Architecture
• AUTHORS: Sateesh Reddy, Vineet J Kanojian
• Publication Year: July 2013
• Journal Name: International Journal of Advanced Engineering
Science And Technologies.
Block diagram
Advantages
• High performance in terms of area and power
• Latency is reduced
• Low complexity in designing architecture
Disadvantages
• The range of numbers handled is limited.
• Precision decreases with range.
• Consumes around 40-70% of hardware.
6.Design & Implementation of Floating point
ALU on a FPGA Processor

• AUTHORS: Prashanth B.U.V, P. Anil Kumar, G. Sreenivasulu
• Publication Year: 2012
• Journal Name: 2012 International Conference on Computing,
Electronics and Electrical Technologies (ICCEET).
BLOCK DIAGRAM OF FLOATING POINT
MULTIPLIER:
FUTURE WORK
• This ALU can also be extended to perform
square root, exponential and logarithmic operations.
• Pipelining the above FPU can further increase
its efficiency.
THANK YOU

Editor's Notes

  • #20 Numerical transformation to the logarithmic domain reduces the overall computation burden. The size of the LUT also has a major impact on the performance of the model: because the whole multiplication architecture depends on logarithmic principles, the size of the log values stored in the LUTs determines the accuracy of the result. As given in [3]
  • #24 the model to be completely immune to unwanted glitches and toggles. The datapath is also one of the main factors that influence power consumption in the circuitry.