Fast FPGA Resource Estimation
Paul Schumacher & Pradip Jha
Xilinx, Inc.
Outline
• Motivations
• Estimation Details
• Results, Examples, & Demo
• Conclusions & Future Improvements
The Need for More Information
Logic Utilization:
  Number of Slice Flip Flops: 301 out of 12,288 2%
  Number of 4 input LUTs: 900 out of 12,288 7%
Logic Distribution:
  Number of occupied Slices: 498 out of 6,144 8%
  Total Number 4 input LUTs: 920 out of 12,288 7%
    Number used as logic: 900
    Number used as a route-thru: 20
  Number of bonded IOBs: 21 out of 320 6%
  Number of BUFG/BUFGCTRLs: 1 out of 32 3%
    Number used as BUFGs: 1
    Number used as BUFGCTRLs: 0
Total equivalent gate count for design: 8,944
How are these resources being used?
[Figure: Virtex®-6 FPGA]
Exemplary FPGA Design Flow
[Flow diagram: HLL Definition -> Optimization -> RTL Generation -> Synthesis & Simulation -> Constraints met? (Y / N) -> Place & Route -> FPGA Bitstream]
• Issue #1: Perspective
– Survey the vast design space from a high level
– Your implementation: very low level
– You need to deftly navigate both levels
• Issue #2: Time
– Designing can be a very iterative process
– Performing what-if scenarios can be costly
– Possible tradeoffs: speed vs. accuracy
Exemplary FPGA Design Flow
[Flow diagram: the same flow with a High-Level Estimation block added: HLL Definition -> Optimization -> RTL Generation -> Synthesis & Simulation -> Constraints met? (Y / N) -> Place & Route -> FPGA Bitstream]
• In-depth information
– Supply the perspective designers want
– Provide context: where, when, why
• Immediate feedback
– Offer quick estimations
– Benefits increase with multiple runs
What Do We Have?
• Provides estimated FPGA resources of RTL designs
– Without running synthesis
– All resources reported
– All FPGA families supported
• Benefits
– Fast: 100x faster than synthesis
– Accurate: 15.2% average error in slice count
– Transparent: part of the default RTL flow
– Useful: helps the user select a part, etc.
• Released with ISE® 11 PlanAhead™
Resource Estimation Tool Flow
• User-provided input
– RTL source code
– Settings (e.g., FPGA family)
– Tcl script (batch mode)
• Obtain netlist from HDL parser/elaborator
• Estimate each macro in netlist
• Refine using synthesis modeling
• Estimation output
– Interactive database (GUI)
– Report files (XML, Excel)
[Flow diagram: HDL Source Code -> HDL Parser/Elaborator -> Design Netlist -> Macro-Level Estimations -> Refinements -> Estimation Database -> PlanAhead Tool, with Library Characterization as a side input]
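The tool flow above (estimate each macro in the netlist, then refine) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the estimator shipped in PlanAhead: the `Resources` type, the `LIBRARY` cost models, the per-macro numbers, and the `packing_factor` refinement are all hypothetical stand-ins for the real library characterization and synthesis modeling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Resource counts for one macro or a whole design (hypothetical type)."""
    luts: int = 0
    ffs: int = 0
    dsp48s: int = 0

    def __add__(self, other):
        return Resources(self.luts + other.luts,
                         self.ffs + other.ffs,
                         self.dsp48s + other.dsp48s)


# Hypothetical characterization library: one cost model per macro type,
# parameterized by bit width. Real characterization data would differ.
LIBRARY = {
    "adder": lambda w: Resources(luts=w),
    "register": lambda w: Resources(ffs=w),
    "mux2": lambda w: Resources(luts=w),
    "multiplier": lambda w: Resources(dsp48s=1 if w <= 18 else 4),
}


def estimate_macro(kind, width):
    """Look up the cost of a single macro in the library."""
    return LIBRARY[kind](width)


def refine(total, packing_factor=0.9):
    """Stand-in for synthesis modeling: assume some LUTs get merged."""
    return Resources(luts=round(total.luts * packing_factor),
                     ffs=total.ffs,
                     dsp48s=total.dsp48s)


def estimate_design(netlist):
    """Sum macro-level estimates over the netlist, then refine the total."""
    total = Resources()
    for kind, width in netlist:
        total = total + estimate_macro(kind, width)
    return refine(total)


# A toy netlist of (macro type, bit width) pairs.
netlist = [("adder", 16), ("register", 16), ("mux2", 8), ("multiplier", 18)]
print(estimate_design(netlist))
```

Because no synthesis is run, the cost of an estimate in this sketch is a table lookup per macro plus one pass over the netlist, which is the property that makes repeated what-if runs cheap.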
Benchmark Results
FPGA Resource    Post-Map Estimation Error (Avg.)*
Slices           15.2%
LUTs             15.6%
Flip-Flops       11.5%
BlockRAMs        6.4%
DSP48s           2.2%
Run-Time**       17.2 sec
* Using suite of 100 customer designs across three FPGA families
** Run-time includes HDL parser/elaborator + estimation
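The slides do not state how the average error is computed. A minimal sketch, assuming it is the mean absolute percentage error of the estimate against the post-map count for each design, could look like this (the function names and the sample numbers are illustrative only):

```python
def percent_error(estimated, actual):
    """Absolute error of an estimate, as a percentage of the post-map value."""
    return abs(estimated - actual) / actual * 100.0


def average_error(pairs):
    """Mean percentage error over (estimated, post-map) pairs, one per design."""
    errors = [percent_error(est, act) for est, act in pairs]
    return sum(errors) / len(errors)


# Made-up (estimated, post-map) slice counts for three designs.
designs = [(110, 100), (90, 100), (100, 100)]
print(f"{average_error(designs):.1f}%")
```

Under this definition, overestimates and underestimates do not cancel, so the average reflects how far a typical estimate is from the post-map result.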
Estimation Run-Time
Demo: MPEG-4 Decoder
PlanAhead Integration
Resources by Function
Resources by Hierarchy
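A per-hierarchy view like the one named above can be built by rolling leaf estimates up to every ancestor module. The sketch below assumes estimates are keyed by a `/`-separated instance path; the path format and the module names are hypothetical and not taken from the PlanAhead database.

```python
from collections import defaultdict


def rollup_by_hierarchy(leaf_luts):
    """Add each leaf's LUT estimate to the leaf itself and to all its ancestors."""
    totals = defaultdict(int)
    for path, luts in leaf_luts.items():
        parts = path.split("/")
        for depth in range(1, len(parts) + 1):
            totals["/".join(parts[:depth])] += luts
    return dict(totals)


# Hypothetical leaf estimates for a small design.
leaves = {"top/idct/mult": 120, "top/idct/add": 40, "top/vlc": 300}
for path, luts in sorted(rollup_by_hierarchy(leaves).items()):
    print(path, luts)
```

Sorting the rolled-up totals by size is one simple way to surface the "hot spot" modules that dominate a design.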
Results: MPEG-4 Decoder
[Charts: FPGA Family Comparison; Memory Comparison. Video formats shown: QCIF, CIF, 4CIF, 720p, 1080p]
Both experiments performed in less than one minute!
Estimation Use Models
• Early in design: immediate feedback
• Later in design: “hot spot” identification
• Design space exploration (DSE)
• Increased abstraction level
• Design benchmarking & comparison
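As one illustration of design space exploration with fast estimates, the sketch below picks the smallest part that still fits an estimate after padding it with a safety margin. The part names and capacities are hypothetical, and using a 16% margin (roughly the average LUT estimation error reported in the benchmark) is an assumption made for this example, not a recommendation from the slides.

```python
# Hypothetical candidate parts, ordered from smallest to largest.
PARTS = [
    ("part_small", {"luts": 6000, "ffs": 6000}),
    ("part_medium", {"luts": 12288, "ffs": 12288}),
    ("part_large", {"luts": 24576, "ffs": 24576}),
]


def fits(estimate, capacity, margin):
    """True if every estimated resource, padded by the margin, fits the part."""
    return all(estimate[r] * (1 + margin) <= capacity[r] for r in estimate)


def smallest_fitting_part(estimate, parts=PARTS, margin=0.16):
    """Return the first (smallest) part that fits, or None if none does."""
    for name, capacity in parts:
        if fits(estimate, capacity, margin):
            return name
    return None


print(smallest_fitting_part({"luts": 900, "ffs": 301}))
print(smallest_fitting_part({"luts": 5500, "ffs": 301}))
```

Since each estimate takes seconds rather than a full synthesis run, a loop like this can be repeated for many design variants or target families.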
Conclusions
• Resource estimator integrated into PlanAhead
• Provides detailed analysis of design
– Breakdown of resources by functionality & hierarchy
– Statistics on memories and bit widths
• Future releases
– Improved quality of results (QoR)
– Estimation of other requested design budgets
– Integration with other tools
• Contact us with any questions!
– Email: {paul.schumacher, pradip.jha}@xilinx.com
