RR Osorio FPGA

  1. Field-Programmable Gate Arrays as tracking devices
     Roberto Rodríguez Osorio, Javier Díaz Bruguera
     Group of Computer Architecture
     Dept. of Electronics and Computer Science
     University of Santiago de Compostela
  2. Outline
     Application-specific computing machines
     ASIC vs FPGA
     FPGA technology basics
     Hard cores in FPGAs
     Performance
     Design effort
     Choices
     Applications
  3. Application-specific computing machines
     [Figure: block diagrams of a microprocessor (code memory, data memory, PC, IR, register file, control logic, functional units) and of an application-specific integrated circuit (MAC, control logic), each divided into a control section and a datapath section]
     Microprocessor
        Performance: 10 cycles @ 3 GHz
        Dissipated power: ~35 W
     Application-Specific Integrated Circuit
        Performance: 1 cycle @ 1 GHz
        Dissipated power: ~mW
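The two performance figures on this slide can be converted into a time per operation. The sketch below is only illustrative: it assumes both figures refer to the same operation, and the function name `time_per_operation_ns` is made up for this example.

```python
def time_per_operation_ns(cycles: int, clock_hz: float) -> float:
    """Latency of one operation in nanoseconds: cycles divided by clock frequency."""
    return cycles / clock_hz * 1e9


# Figures taken from the slide.
cpu_ns = time_per_operation_ns(10, 3e9)   # microprocessor: 10 cycles @ 3 GHz
asic_ns = time_per_operation_ns(1, 1e9)   # ASIC: 1 cycle @ 1 GHz
ratio = cpu_ns / asic_ns

print(f"microprocessor: {cpu_ns:.2f} ns, ASIC: {asic_ns:.2f} ns, ratio: {ratio:.2f}")
```

Under that assumption the ASIC finishes the operation roughly three times sooner despite its lower clock frequency, while dissipating milliwatts rather than tens of watts.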
  4. ASIC vs FPGA
     [Figure: NRE cost (axis from $1M to $4M) versus technology (0.35 to 0.05 micrometers)]
  5. ASIC vs FPGA
     [Figure: computational efficiency (Mops/W, logarithmic axis from 10^0 to 10^6) versus technology (2 to 0.07 µm) and year (1986 to 2006). Labels in the plot: maximum efficiency (ASIC), FPGA, ASSP, MPPA, GPGPU, VLIW, ASIP, ManyCore, ...]
     Source: Theo A.C.M Claasen, ISSCC 99
  6. FPGA technology basics – Computing
     [Figure: full adder (FA) block with inputs a, b and c in and outputs s and c out, together with its gate-level circuit]
     carry input   a   b   s   carry output
          0        0   0   0        0
          0        0   1   1        0
          0        1   0   1        0
          0        1   1   0        1
          1        0   0   1        0
          1        0   1   0        1
          1        1   0   0        1
          1        1   1   1        1
  7. FPGA technology basics – Do not compute
     Logic blocks
     [Figure: inputs a, b and cin address two 8x1-bit SRAM memories; one memory outputs s, the other outputs cout]
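The "do not compute" idea can be sketched in a few lines: instead of evaluating gates, the full adder reads its answers from two 8x1-bit tables. This is a minimal illustration, not vendor code. The table contents follow the full-adder truth table, while the address bit order and the names `SUM_LUT`, `COUT_LUT` and `full_adder_lut` are assumptions made for this example.

```python
# Each list plays the role of one 8x1-bit SRAM memory.
# Address = (cin << 2) | (a << 1) | b, i.e. the row order of the truth table.
SUM_LUT = [0, 1, 1, 0, 1, 0, 0, 1]
COUT_LUT = [0, 0, 0, 1, 0, 1, 1, 1]


def full_adder_lut(a: int, b: int, cin: int) -> tuple[int, int]:
    """Return (s, cout) by reading the two memories; no logic is evaluated."""
    address = (cin << 2) | (a << 1) | b
    return SUM_LUT[address], COUT_LUT[address]


for cin in (0, 1):
    for a in (0, 1):
        for b in (0, 1):
            s, cout = full_adder_lut(a, b, cin)
            print(cin, a, b, s, cout)
```

Loading different contents into the same two memories would implement any other pair of 3-input functions, which is what makes the logic block reconfigurable.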
  8. FPGA technology basics – Interconnect
     [Figure: grid of blocks]
  9. FPGA technology basics – Interconnect
  10. FPGA technology basics – Interconnect
  11. FPGA technology basics – Interconnect + memory
      FPGA fabric consists of a huge number of simple memory elements connected by means of a reconfigurable network
      Design software must break every computing task into 1-bit operations with no more than 4, 5 or 6 variables
      Operations are spatially distributed according to proximity criteria
      Routing may be troublesome
         Long paths are slow
         Routing through logic blocks increases area
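As a rough illustration of breaking a task into 1-bit operations with few variables, the sketch below splits a multi-bit addition into 1-bit table reads with three inputs each. It is not how any particular design tool works; the helpers `make_lut`, `read_lut` and `add` are hypothetical names, and a ripple-carry structure is assumed for simplicity.

```python
from itertools import product


def make_lut(func, k):
    """Precompute a k-input, 1-bit-output function as a table of 2**k bits."""
    return [func(*bits) & 1 for bits in product((0, 1), repeat=k)]


def read_lut(lut, *bits):
    """Read the stored bit; the inputs form the address, first input is the MSB."""
    address = 0
    for bit in bits:
        address = (address << 1) | bit
    return lut[address]


K = 3  # three variables per operation, within the 4-to-6 input limit
SUM = make_lut(lambda a, b, c: a ^ b ^ c, K)
CARRY = make_lut(lambda a, b, c: (a & b) | (a & c) | (b & c), K)


def add(x, y, width):
    """Add two unsigned integers using only 1-bit table reads, one bit position at a time."""
    carry, result = 0, 0
    for i in range(width):
        a, b = (x >> i) & 1, (y >> i) & 1
        result |= read_lut(SUM, a, b, carry) << i
        carry = read_lut(CARRY, a, b, carry)
    return result, carry


print(add(9, 5, 4))  # prints (14, 0)
```

Each bit position maps to two small tables, and the carry that links neighbouring positions is the kind of signal the routing network has to deliver, which is why placement by proximity matters.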
  12. Hard cores in FPGAs
      Memory blocks
      Multipliers
      DSP blocks
      Microprocessors
      Floating point units?
  13. Memory blocks
      Hundreds or thousands of small memory blocks
         Dual-port blocks
         18 Kbit each for Xilinx
         Flexible configurations
            Many short words or a few long words
      Independent access
         Huge aggregated bandwidth
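The trade-off between many short words and a few long words, and the aggregated bandwidth of independently accessed blocks, can be worked out with simple arithmetic. The sketch below assumes a block of exactly 18 × 1024 bits. The word widths, block count and clock frequency are hypothetical values chosen for illustration, not figures from a datasheet, and both function names are made up.

```python
BLOCK_BITS = 18 * 1024  # 18 Kbit per block, as stated on the slide


def depth_for_width(width_bits, capacity_bits=BLOCK_BITS):
    """Number of words that fit in one block for a given word width."""
    if capacity_bits % width_bits:
        raise ValueError("width does not divide the block capacity")
    return capacity_bits // width_bits


def aggregate_bandwidth_bits_per_s(blocks, ports, width_bits, clock_hz):
    """Peak bandwidth when every port of every block is accessed on each cycle."""
    return blocks * ports * width_bits * clock_hz


for width in (9, 18, 36):  # illustrative widths, not a vendor list
    print(f"{depth_for_width(width):5d} words x {width:2d} bits")

# Hypothetical device: 100 dual-port blocks, 36-bit words, 200 MHz clock.
peak = aggregate_bandwidth_bits_per_s(100, 2, 36, 200e6)
print(f"peak aggregate bandwidth: {peak / 1e9:.0f} Gbit/s")
```

Because every block has its own ports, the peak figure grows linearly with the number of blocks that the design keeps busy.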
  14. Multipliers and DSP blocks
      As FPGAs became larger, some people tried to implement DSP algorithms on them
      However: multipliers take too much area
      Therefore: hardwired multipliers were introduced
      DSP algorithms are often based on
         multiply & add
         multiply & accumulate
      DSP blocks in modern FPGAs implement hardwired:
         multiply, multiply & add, multiply & accumulate
         optional addition before multiplying
         three-input add
         1 large, 2 medium or 4 small operations on the same hardware
         shifting, comparisons, bit-wise operations, …
      Up to 2000 DSP blocks in current FPGAs for massive parallelism
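To show why multiply & accumulate is the basic step of many DSP algorithms, here is a minimal software sketch of a FIR filter written purely in terms of that step. It models the arithmetic only, not the hardware; the names `mac` and `fir` are chosen for this example, and samples before the start of the input are assumed to be zero.

```python
def mac(acc, a, b):
    """One multiply & accumulate step: acc + a * b."""
    return acc + a * b


def fir(samples, coeffs):
    """FIR filter built from MAC steps; samples before the start count as zero."""
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc = mac(acc, c, samples[n - k])
        out.append(acc)
    return out


print(fir([1, 2, 3, 4], [1, 1, 1]))  # prints [1, 3, 6, 9]
```

In software the inner loop runs one MAC after another. On an FPGA each coefficient could be given its own DSP block so that the steps run in parallel.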
  15. Microprocessors
      Xilinx:
         IBM PowerPC processors
            Virtex-II Pro
            Virtex-4 FX
            Virtex-5 FX
         MicroBlaze soft processors
      Altera:
         ARM RISC processors
         Nios soft processor
  16. Floating point units
      Not implemented so far
      • Suggested as a way to accelerate scientific computing
      • For engineering, fixed-point arithmetic is usually enough
      Will it happen? ☺
         It happened with multipliers, transceivers, DSP blocks, …
         GPUs already have a strong position in this field
  17. Performance
      Compared to an ASIC
         10 times slower, larger and more power-hungry
      Compared to a microprocessor
         Fast, depending on:
            Potential parallelism
            Required bandwidth
         Small and simple, even standalone
         Reduced power consumption (< 1 W), so they may run on batteries
  18. Design effort
      Several scenarios:
      Pure VHDL or Verilog coding
         Higher flexibility, efficiency and performance
         Long design time
         Costly debugging
      Use macros combined with VHDL or Verilog
         Libraries of IP blocks ease the design process
         It is not guaranteed that the required functionality can be found
      High-level languages (DSP logic (Matlab), Impulse-C, Handel-C, …)
         Efficient and simple implementation for simple algorithms
         Lack of expressiveness for complex algorithms
  19. Choices
      Xilinx
         Virtex
         Spartan
      Altera
         Stratix
         Cyclone
      Others
         Actel
         Lattice Semiconductor
         …
  20. Choices - Xilinx
                               Spartan 3            Spartan 6        Virtex 6
      Logic Cells              1728 – 74880         3840 – 147443    74496 – 566784
      Block RAM (Kbits)        12 – 1872            216 – 4824       5616 – 32832
      Multipliers / DSP        4 – 104 / 84 – 126   8 – 180          288 – 2016
      Evaluation board cost    < $200               $300 – $1000     $2000 – $2500
  21. In the context of these applications
      Device choice
      • Logic bounded
          • Standard logic
          • Multipliers
      • IO bounded
      Parallel acquisition
      • Switching memory blocks for acquisition and computation
      High computing speed
      • Via pipelining
      Results storage
      • Internal or external memory
      Power consumption
      Configuration
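Switching memory blocks between acquisition and computation is commonly called ping-pong (double) buffering. The sketch below is a sequential software model of that idea under simple assumptions: a fixed block size, a made-up `ping_pong` helper, and a trailing partial block that is simply dropped. Real hardware would run both activities at the same time.

```python
def ping_pong(stream, block_size, compute):
    """Process a stream block by block using two alternating buffers."""
    buffers = [[], []]
    fill = 0            # index of the buffer currently being filled
    results = []
    for sample in stream:
        buffers[fill].append(sample)
        if len(buffers[fill]) == block_size:
            ready = fill
            fill = 1 - fill          # acquisition switches to the other buffer
            buffers[fill] = []       # the new acquisition block starts empty
            # In hardware this computation would overlap with the next
            # acquisition; here it simply runs before the next sample arrives.
            results.append(compute(buffers[ready]))
    return results


print(ping_pong(range(8), 4, sum))  # prints [6, 22]
```

The acquisition side never waits for the computation as long as each block is processed before the other buffer fills up, which is the timing constraint a real design would need to check.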
