The Microarchitecture of FPGA-Based Soft Processors

This presentation is on the paper "The Microarchitecture of FPGA-Based Soft Processors" by Peter Yiannacouras, Jonathan Rose and J. Gregory Steffan, Dept. of Electrical and Computer Engineering, University of Toronto.

Transcript

  • 1. The Microarchitecture of FPGA-Based Soft Processors
      Peter Yiannacouras, Jonathan Rose and J. Gregory Steffan
      Dept. of Electrical and Computer Engineering, University of Toronto
      Presented by: Deepak Tomar, CS08M054, M.Tech II Year, CS & E Dept.
  • 2. Outline
      • Aim
      • The Basics First
      • Motivation
      • Understanding Soft Processor Microarchitecture
      • Overview of SPREE System
      • Experimental Framework
      • Exploring Soft Processor Architecture (Partially)
  • 3. Aim
      • To build a system for automatically generating soft processors
      • To develop a methodology for comparing soft processor architectures
      • To begin to populate and analyze the soft processor design space
  • 4. The Basics First
      • What is an FPGA?
      • How is it different from an ASIC?
      • What is a soft processor?
      • Is there a hard processor too?
  • 5. Field Programmable Gate Array (FPGA)
      • FPGAs are programmable digital logic chips
      • They can be programmed to perform almost any digital function
      • Xilinx and Altera are important makers of FPGAs
      • ASICs are application-specific logic chips built for one dedicated task
  • 6. How FPGAs work
      • Logic cells
          • An FPGA is built from one logic cell duplicated hundreds or thousands of times. A logic cell is basically a small lookup table (LUT), a D flip-flop and a 2-to-1 mux. A LUT is a small RAM that can implement any logic function of its inputs.
      • Interconnect
          • Each logic cell can be connected to other logic cells through interconnect resources (wires and muxes placed around the logic cell). Each cell can do only a little, but with many of them connected together, complex logic functions can be created.
      • General workflow when working with FPGAs
          • The logic function is written as a text file, compiled on a computer into a binary file, and downloaded to the FPGA over a cable.
      [Figure: a logic cell containing a LUT and a flip-flop]
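The idea that a LUT is "a small RAM that can implement any logic function" can be illustrated with a short Python sketch. It is only a behavioural model under stated assumptions: the names `make_lut`, `lut_read` and `LogicCell` are hypothetical, and the 2-to-1 mux is assumed to select between the LUT output and the flip-flop output.

```python
from itertools import product


def make_lut(func, n_inputs):
    """Precompute the truth table that would be stored in the LUT's small RAM."""
    return [func(*bits) & 1 for bits in product((0, 1), repeat=n_inputs)]


def lut_read(table, *inputs):
    """Use the input bits as a RAM address (first input is the most significant bit)."""
    address = 0
    for bit in inputs:
        address = (address << 1) | (bit & 1)
    return table[address]


class LogicCell:
    """Behavioural model: LUT + D flip-flop + 2-to-1 output mux (assumed wiring)."""

    def __init__(self, table, registered):
        self.table = table
        self.registered = registered  # mux select: True picks the flip-flop output
        self.q = 0                    # flip-flop state

    def clock(self, *inputs):
        """On a clock edge the flip-flop captures the current LUT output."""
        self.q = lut_read(self.table, *inputs)

    def output(self, *inputs):
        """The mux picks either the registered or the combinational value."""
        return self.q if self.registered else lut_read(self.table, *inputs)


# The same 3-input cell implements different functions just by changing its contents.
xor3 = make_lut(lambda a, b, c: a ^ b ^ c, 3)
majority = make_lut(lambda a, b, c: (a & b) | (a & c) | (b & c), 3)
```

Only the table contents differ between `xor3` and `majority`, which is the sense in which configuring an FPGA amounts to loading data into many small memories.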
  • 7. Soft Processor / Hard Processor
      • In a soft processor, the processor is implemented using the FPGA fabric itself
      • In a hard processor, a ready-made processor core is built into the chip alongside the FPGA fabric
      • Examples:
          Developer   Hard Processor   Soft Processor
          Altera      Excalibur        Nios
          Xilinx      Virtex II Pro    Microblaze
  • 8. Motivation
      • More and more embedded systems use FPGA platforms
      • The cost and time-to-market of designing a state-of-the-art ASIC keep increasing
      • Drawbacks of hard processors
          ▪ Mismatch between the number of hard processors on the FPGA chip and the number the application requires
          ▪ Mismatch between the processor performance an application requires and the performance the available FPGA-based hard processors provide
          ▪ Difficulty in routing between the processor and custom logic
          ▪ Specialization of the FPGA chip, which impacts yield and customer base
  • 9. Understanding Soft Processor Microarchitecture
      • A soft processor is comparatively slower and less area-efficient than a hard processor
      • Processor architectures are usually studied with high-level functional simulators, because varying a design at the logic-layout level is difficult
      • In contrast, FPGA CAD tools give quick and accurate measurements of exact speed, area and power
      • A full understanding makes intelligent, application-specific architectural trade-offs possible
      • The Soft Processor Rapid Exploration Environment (SPREE) was developed to meet this aim
      • SPREE is a system for architectural exploration
  • 10. Overview of the SPREE system
      [Diagram: an architecture description feeds the SPREE RTL Generator, which emits efficiently synthesizable RTL. The RTL goes to an RTL simulator, together with embedded benchmark applications, and to the RTL CAD flow.]
      • Measured results:
          • Correctness
          • Cycle count
          • Area
          • Clock frequency
          • Power
  • 11. Preview of capabilities of SPREE
      [Figure: average wall-clock time (µs) versus area (equivalent LEs). Legend: Multiply Full Hardware Support; Multiply Software Routine; Altera NiosIIe; Altera NiosIIs; Altera NiosIIf]
  • 12. SPREE RTL Generator
      • Input: The Architecture Description
          • Describing the Datapath
          • Selecting and Interchanging Components
          • Creating and Describing Custom Components
          • Describing the Instruction Set Architecture (ISA)
      • Generating a Soft Processor
          • Datapath Verification
          • Datapath Instantiation
          • Control Generation
  • 13. SPREE RTL Generator
      [Diagram: the SPREE RTL Generator takes a datapath description, an ISA description and a component library (efficient RTL); it performs datapath verification, datapath instantiation and control generation, and outputs an efficient RTL description]
  • 15. Datapath Description as Interconnection of Components
      [Diagram: instruction memory, register file, two muxes, ALU, shifter and data memory connected as a datapath]
  • 17. Sample component description for a simplified ALU
          Module alu_small {
              Input opA 32
              Input opB 32
              Output result 32
              Opcode opcode 2 {
                  ADD 0 0
                  SUB 1 0
                  SLT 2 0
              }
          }
      • Annotations on the slide: the Input and Output lines form the interface, and the number after each port name is its bit width. The Opcode block gives the functionality: each row lists a GENOP (ADD, SUB and SLT), the opcode port value that selects it, and its latency in cycles.
      [Diagram: inputs inA and inB feed the ADD, SUB and SLT operations, which produce result]
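To make the structure of the component description concrete, here is a minimal Python sketch that reads the text on this slide into a dictionary. It is not SPREE's parser: `parse_component` is a hypothetical helper, it handles only the constructs visible on the slide, and it assumes each Opcode row means "GENOP, opcode port value, latency in cycles".

```python
DESCRIPTION = """
Module alu_small {
    Input opA 32
    Input opB 32
    Output result 32
    Opcode opcode 2 {
        ADD 0 0
        SUB 1 0
        SLT 2 0
    }
}
"""


def parse_component(text):
    """Parse the slide's component format (assumed grammar, illustration only)."""
    comp = {"name": None, "inputs": {}, "outputs": {},
            "opcode_port": None, "genops": {}}
    in_opcode = False
    for line in text.splitlines():
        tokens = line.replace("{", " ").replace("}", " ").split()
        if not tokens:
            if "}" in line:
                in_opcode = False  # a lone closing brace ends the Opcode block
            continue
        if in_opcode:
            # Assumed row layout: GENOP, opcode port value, latency in cycles
            genop, value, latency = tokens
            comp["genops"][genop] = {"opcode_value": int(value),
                                     "latency": int(latency)}
        elif tokens[0] == "Module":
            comp["name"] = tokens[1]
        elif tokens[0] == "Input":
            comp["inputs"][tokens[1]] = int(tokens[2])   # port name -> bit width
        elif tokens[0] == "Output":
            comp["outputs"][tokens[1]] = int(tokens[2])
        elif tokens[0] == "Opcode":
            comp["opcode_port"] = (tokens[1], int(tokens[2]))
            in_opcode = True
    return comp


alu = parse_component(DESCRIPTION)
```

With the description in this form, a generator can look up which GENOPs a component supports and which opcode value selects each one.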
  • 19. MIPS ADDI instruction shown as Data Dependence Graph
      [Graph nodes: IFETCH, REGREAD, SIGN_EXT, ADD, REGWRITE]
      • Rule: No GENOP can execute until all its inputs are ready
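The rule on this slide can be sketched as a simple ready-list schedule in Python. The node names come from the slide, but the transcript does not preserve the graph's edges, so the dependences below are an assumption based on what ADDI does (add a register to a sign-extended immediate); `schedule` is a hypothetical helper.

```python
# Each GENOP maps to the GENOPs whose results it consumes.
# Edges are assumed from ADDI semantics; the slide transcript lists only the nodes.
ADDI_GRAPH = {
    "IFETCH": [],
    "REGREAD": ["IFETCH"],
    "SIGN_EXT": ["IFETCH"],
    "ADD": ["REGREAD", "SIGN_EXT"],
    "REGWRITE": ["ADD"],
}


def schedule(graph):
    """Group GENOPs into steps so that none runs before all of its inputs are ready."""
    done = set()
    steps = []
    while len(done) < len(graph):
        ready = sorted(
            op for op, deps in graph.items()
            if op not in done and all(d in done for d in deps)
        )
        if not ready:
            raise ValueError("dependence graph contains a cycle")
        steps.append(ready)
        done.update(ready)
    return steps


steps = schedule(ADDI_GRAPH)
```

Under these assumed edges, REGREAD and SIGN_EXT fall into the same step because neither depends on the other, while ADD has to wait for both.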
  • 22. Generating a soft processor
      • Datapath Verification
          • Ensures that each instruction's GENOP graph in the ISA is a subgraph of the datapath's GENOP graph
      • Datapath Instantiation
          • Generates an equivalent Verilog description from the input datapath description
      • Control Generation
          • SPREE generates the logic that controls the datapath's operation so that it correctly implements the ISA
          • The control logic tells each component what operation to perform (opcodes) and when to perform it (enables)
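The datapath verification step can be illustrated with a simplified Python check. This is a sketch under assumptions, not SPREE's algorithm: graphs are reduced to sets of (producer, consumer) GENOP pairs, so "subgraph" becomes plain edge-set containment, and the example datapath and the `MUL` instruction are made up for illustration.

```python
def verify_instruction(instr_edges, datapath_edges):
    """Return (ok, missing): ok is True when every dependence of the instruction
    is also provided by the datapath (simplified edge-containment check)."""
    missing = sorted(set(instr_edges) - set(datapath_edges))
    return (not missing, missing)


# Hypothetical datapath: supports register-immediate and register-register add.
DATAPATH = {
    ("IFETCH", "REGREAD"), ("IFETCH", "SIGN_EXT"),
    ("REGREAD", "ADD"), ("SIGN_EXT", "ADD"), ("ADD", "REGWRITE"),
}

# Assumed GENOP edges for ADDI and for a multiply the datapath cannot execute.
ADDI = {
    ("IFETCH", "REGREAD"), ("IFETCH", "SIGN_EXT"),
    ("REGREAD", "ADD"), ("SIGN_EXT", "ADD"), ("ADD", "REGWRITE"),
}
MUL = {("IFETCH", "REGREAD"), ("REGREAD", "MUL"), ("MUL", "REGWRITE")}

addi_ok, addi_missing = verify_instruction(ADDI, DATAPATH)
mul_ok, mul_missing = verify_instruction(MUL, DATAPATH)
```

A failed check reports the missing connections, which indicates what component or wiring the datapath description would need before the ISA can be implemented on it.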
  • 23. Experimental Framework
      • Required for measuring and comparing the soft processors produced by SPREE
      • Processor verification
          • Trace-based verification, comparing a cycle-accurate industrial RTL simulator against MINT (a MIPS instruction set simulator)
      • FPGA used: Altera's Stratix I
      • CAD software: Quartus II v4.2 for synthesis, technology mapping, placement and routing
  • 24. An Altera Stratix FPGA
  • 25. Experimental Framework (contd.)
      • Metrics for measuring soft processors
      • Area: measured in Logic Elements (LEs)
          • An LE is composed of a 4-input lookup table (LUT) and a flip-flop
      • Performance: wall-clock time to execute a collection of benchmark (BM) applications
          • Wall-clock time = clock period × CPI × average number of instructions
      • Power: measured with Quartus' PowerPlay tool, based on the switching activity of post-place-and-route nodes, determined by simulating the BM applications
          • Static power and the power of the I/O pins are subtracted
          • For each benchmark, energy per instruction is calculated
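The wall-clock formula above can be worked through in a few lines of Python. The CPI and clock values are the ones quoted in the later comparison slides; the instruction count and the power figure are made-up inputs, and treating energy per instruction as power × time ÷ instruction count is an assumption about how that metric is derived.

```python
def wall_clock_time_us(clock_mhz, cpi, instruction_count):
    """Clock period x CPI x instructions; at f MHz one cycle takes 1/f microseconds."""
    clock_period_us = 1.0 / clock_mhz
    return clock_period_us * cpi * instruction_count


def energy_per_instruction_nj(power_mw, time_us, instruction_count):
    """Assumed definition: energy = power x time; 1 mW x 1 us = 1 nJ."""
    return power_mw * time_us / instruction_count


INSTRUCTIONS = 1_000_000  # hypothetical benchmark length

# CPI and clock figures as quoted in the comparison slides of this talk.
t_spree = wall_clock_time_us(clock_mhz=80, cpi=1.36, instruction_count=INSTRUCTIONS)
t_nios_s = wall_clock_time_us(clock_mhz=120, cpi=2.36, instruction_count=INSTRUCTIONS)

speedup = t_nios_s / t_spree

# 100 mW is a made-up dynamic power value, used only to show the units.
energy_nj = energy_per_instruction_nj(100.0, t_spree, INSTRUCTIONS)
```

The sketch shows why a lower clock frequency does not by itself mean a slower processor: the lower CPI more than compensates here. The resulting ratio is a per-instruction estimate and need not match the benchmark-averaged percentages reported in the talk.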
  • 26. Exploring Soft Processor Microarchitecture
      • Comparison of generated processors with NiosII variations
      • Three points in the design space: NiosIIe (smallest area, lowest performance), NiosIIf (largest area, highest performance) and NiosIIs (in between)
      • One SPREE-generated processor, an 80 MHz 3-stage pipeline, is 9% smaller and 11% faster than NiosIIs
      • This processor has a CPI of 1.36 at 80 MHz, whereas NiosIIs has a CPI of 2.36 at 120 MHz and NiosIIf has a CPI of 1.97 at 135 MHz
      • The smallest generated processor is within 15% of the area of NiosIIe and 11% faster
      • The smallest SPREE-generated processor needs only 2-3 CPI against 6 CPI for NiosIIe, but its clock runs at 82 MHz against 159 MHz, which reduces the CPI advantage to an 11% net win in wall-clock time
  • 27. Avg wall-clock-time vs area of NiosII and generated processor
      [Figure: average wall-clock time (µs) versus area (equivalent LEs). Legend: Multiply Full Hardware Support; Multiply Software Routine; Altera NiosIIe; Altera NiosIIs; Altera NiosIIf]
  • 28. Comparison with NiosII variations
          Processor                   CPI    Clock (MHz)   Comment
          SPREE Generated Processor   1.36   80            9% smaller and 11% faster than NiosIIs
          NiosIIs                     2.36   120
          NiosIIf                     1.97   135
  • 30. Comparison with NiosII variations
          Processor                          CPI   Clock (MHz)   Comment
          SPREE Smallest Generated Processor 2-3   82            Within 15% of area and 11% faster than NiosIIe
          NiosIIe                            6     159
  • 31. Conclusion
      • One generated processor came within 15% of the area of the smallest NiosII variation while outperforming it by 11%
      • Other generated processors were both faster and smaller than the standard NiosII variation
      • The generator can populate the design space while remaining relatively competitive with commercial, hand-optimized soft processors
  • 32. References
      • http://portal.acm.org/citation.cfm?id=1086297.1086325
      • http://www.fpga4fun.com/FPGAinfo1.html
      • http://en.wikipedia.org/wiki/Field-programmable_gate_array
  • 33. THANK YOU
  • 35. NiosII variations
      • NiosIIe: Unpipelined 6-CPI processor with a serial shifter and software multiplication support
      • NiosIIs: 5-stage pipeline with a multiplier-based shifter, hardware multiplication and an instruction cache
      • NiosIIf: Large 6-stage pipeline with dynamic branch prediction, instruction and data caches and an optional hardware divider
  • 36. Generation of Enable Signals
      • The generator collects all timing information from each component
      • It analyzes the datapath and infers the pipeline stage of each component
      • In each pipeline stage, local stall signals are extracted and propagated to earlier stages (the stall network)
      • An enable is generated for a component when it is not stalled
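The stall network described on this slide can be modelled in a few lines of Python. This is a behavioural sketch under assumptions, not the logic SPREE emits: stages are numbered from 0 (earliest), each stage has one local stall signal, and `generate_enables` is a hypothetical name.

```python
def generate_enables(local_stalls):
    """local_stalls[i] is True when stage i raises its own stall.

    A stall is propagated to every earlier stage, and a stage is enabled
    only when neither it nor any later stage is stalling.
    """
    enables = [True] * len(local_stalls)
    stalled_downstream = False
    # Walk from the last stage back to the first, accumulating stalls.
    for stage in range(len(local_stalls) - 1, -1, -1):
        stalled_downstream = stalled_downstream or local_stalls[stage]
        enables[stage] = not stalled_downstream
    return enables


# Stage 1 of a 3-stage pipeline stalls: stages 0 and 1 freeze, stage 2 keeps going.
enables = generate_enables([False, True, False])
```

Later stages stay enabled so that instructions already past the stalling stage can drain, while earlier stages hold their contents until the stall clears.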
  • 37.
      • FPGA-based soft processors are being adopted more widely in embedded processing, so there is a need to understand their architectural trade-offs in order to maximize efficiency
      • SPREE is an infrastructure for rapidly generating soft processors
      • The generated processors were compared with Altera's NiosII family of commercial soft processors