Xilinx track g

  • Example of a “real-time” processing solution that does not use a frame buffer. Several products require specialized streaming processing, and this example gives developers a quick and easy way to integrate their own algorithm into the existing design. The fully integrated HW-CoSim environment enables a faster validation cycle through its hardware-in-the-loop functionality.
  • DE Gen: Data Enable Generator.

    1. 1. Managing High Performance Data Pipeline Execution with an FPGA Processor Presenter: Ben Hor – Xilinx, Inc Authors: Glenn Steiner, Dan Isaacs – Xilinx, Inc. David Pellerin – Impulse Accelerated Technologies
    2. 2. Agenda <ul><li>What is Control Plane/Data Plane Processing and Why Might I Need It? </li></ul><ul><li>FPGAs Enable Balancing Computation Between a Processor and Application Specific Logic </li></ul><ul><li>Implementation of a Control/Data Plane System is Straightforward </li></ul><ul><li>Case Study: An HD Video Recognition System </li></ul><ul><li>Connecting the Embedded Processor to the FPGA with Linux </li></ul><ul><li>Summary </li></ul>
    3. 3. What is Control Plane / Data Plane Processing and Why Might I Need It?
    4. 4. Challenge Example: HD Video Streaming <ul><li>720p: 74.25 MHz pixel rate </li></ul><ul><ul><li>222.75 MB/s data rate (3 bytes per pixel) </li></ul></ul><ul><ul><li>Hypothetical Dual Core – 2.5 GHz, Dual Issue (2 instructions per clock) </li></ul></ul><ul><ul><ul><li>10 billion instructions per second peak </li></ul></ul></ul><ul><ul><ul><li>About 22.4 instructions per byte of data processed per core (about 44.9 across both cores) </li></ul></ul></ul><ul><li>What about OS overhead? </li></ul><ul><ul><li>Task switching times </li></ul></ul><ul><ul><li>Interrupt latency </li></ul></ul><ul><ul><li>All bus bandwidth eaten up with video data </li></ul></ul><ul><li>Can’t Do It With a Standard Processor </li></ul>
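The budget on this slide can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the 3 bytes per pixel (24-bit RGB) is inferred from the ratio of 222.75 MB/s to the 74.25 MHz pixel clock rather than stated on the slide.

```c
#include <stddef.h>

/* Bytes per second for an uncompressed pixel stream.
 * bytes_per_pixel = 3 is an assumption (24-bit RGB). */
double stream_bytes_per_sec(double pixel_clock_hz, unsigned bytes_per_pixel)
{
    return pixel_clock_hz * (double)bytes_per_pixel;
}

/* Peak instruction rate of a processor: cores x clock x issue width.
 * This is an upper bound; it ignores stalls, cache misses, and OS overhead. */
double peak_instr_per_sec(unsigned cores, double clock_hz, unsigned issue_width)
{
    return (double)cores * clock_hz * (double)issue_width;
}

/* Instructions available for each byte of streamed data. */
double instr_per_byte(double instr_per_sec, double bytes_per_sec)
{
    return instr_per_sec / bytes_per_sec;
}
```

With a 74.25 MHz pixel clock and 3 bytes per pixel, `stream_bytes_per_sec` gives 222.75 MB/s. One dual-issue 2.5 GHz core then has roughly 22.4 instructions per byte, and two such cores together have roughly 44.9, before any operating system overhead is subtracted.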
    5. 5. Coprocessing: An Effective Way of Accelerating Software <ul><li>Distributes the load </li></ul><ul><li>Move computational load where it belongs </li></ul><ul><li>Dedicated processing element(s) provide dramatic acceleration </li></ul>
    6. 6. A Look at Coprocessing Architectures <ul><li>Fully Decoupled </li></ul><ul><ul><li>Common, but not interesting for this topic </li></ul></ul><ul><li>Single / Multi-Instruction Accelerator </li></ul><ul><ul><li>FPU </li></ul></ul><ul><li>Loosely Coupled - Separated Functions </li></ul><ul><ul><li>Message / Control Passing </li></ul></ul><ul><ul><li>Typically Used for Control Plane / Data Plane Processing </li></ul></ul>
    7. 7. What is Control Plane / Data Plane [Diagram. Control plane: User Interface, Control Plane Processor (OS). Data plane: Data In, three Coprocessors, Data Out. Connection between the planes: Processor Bus or Dedicated Control Channel(s).]
    8. 8. Control / Data Plane Example <ul><li>Control plane: controls the state of network elements </li></ul><ul><ul><li>Route selection </li></ul></ul><ul><ul><li>RSVP, capability signaling, etc. </li></ul></ul><ul><ul><li>Exception handling </li></ul></ul><ul><li>Data plane: manages data packets </li></ul><ul><ul><li>Packet forwarding </li></ul></ul><ul><ul><li>Packet differentiation </li></ul></ul><ul><ul><li>Buffering, link scheduling </li></ul></ul>Adapted from: Active correlation between the control and data plane – Z. Morley Mao
    9. 9. FPGAs Enable Computation Balancing Between a Processor and Application Specific Logic
    10. 10. FPGAs: Ideal for Coprocessing <ul><li>Tight integration between FPGA & Processor </li></ul><ul><ul><li>Reduced Latency </li></ul></ul><ul><ul><li>Matched clock rates </li></ul></ul><ul><li>Configure the processors to meet system requirements </li></ul><ul><ul><li>Configure Processors </li></ul></ul><ul><ul><li>Configure the Coprocessors </li></ul></ul><ul><li>Flexible logic enables experimentation </li></ul>
    11. 11. External Processor Challenges <ul><li>Latency for control signals to coprocessor </li></ul><ul><li>Pin challenges </li></ul><ul><ul><li>Many pins reduce latency but at higher power & part cost </li></ul></ul><ul><ul><li>High speed serial (PCIe) minimizes pins at cost of latency & power </li></ul></ul><ul><li>May not be the lowest cost solution </li></ul><ul><li>FPGA embedded processors solve these challenges and enable performance balancing </li></ul>
    12. 12. Implementation of a Control Plane / Data Plane System is Straightforward
    13. 13. Building The Control Plane / Data Plane System <ul><li>Assemble the Control Plane processor </li></ul><ul><li>Assemble the Data Pipeline </li></ul><ul><ul><li>Combining IP generated by multiple tools </li></ul></ul><ul><ul><li>C to HDL Tools may be an effective option </li></ul></ul><ul><li>Control the Pipe with Processor and OS </li></ul>
    14. 14. Assemble the Control Plane Processor
    15. 17. Assemble and Connect the Data Plane <ul><li>Multiple Languages/Tools/Flows to create Coprocessors </li></ul><ul><ul><li>Low Level </li></ul></ul><ul><ul><ul><li>Hand Crafted - RTL (VHDL/Verilog) </li></ul></ul></ul><ul><ul><li>High Level </li></ul></ul><ul><ul><ul><li>Matlab / Simulink </li></ul></ul></ul><ul><ul><ul><li>‘C’ to FPGA (HDL) </li></ul></ul></ul><ul><ul><ul><li>‘C’ Variants </li></ul></ul></ul>
    16. 18. CASE STUDY: HD VIDEO RECOGNITION SYSTEM
    17. 19. The Case Study Problem <ul><li>720P HD Video Stream </li></ul><ul><ul><li>DVI Input and DVI Output </li></ul></ul><ul><li>Locate the clown fish in the video </li></ul><ul><li>Highlight the clown fish </li></ul><ul><li>Continuously track the fish </li></ul><ul><li>Adjust spotlight size based upon likelihood of match </li></ul>
    18. 20. The Architected Solution <ul><li>How Control Plane Processor Was Created </li></ul><ul><li>How the Data Processing Pipeline Was Created </li></ul>
    19. 21. Base Processor Reference Design [Block diagram labels: Linux, Xilinx MicroBlaze Processor, Block RAM, SystemAce Compact Flash, ICC, GPIO LEDs, GPIO DIP Switch, GPIO Push Buttons, Debug Module, UART, Multiport Memory Controller, DDR2 Memory, Clock Generator, Reset Module]
    20. 22. DVI Pass-through Reference Design <ul><li>Basic “real-time” video processing </li></ul>[Diagram: DVI Input → Image Processing → DVI Output]
    21. 23. DVI Pass-through Reference Design <ul><li>Basic “real-time” video processing </li></ul><ul><li>Streaming pixel processing </li></ul><ul><ul><li>Streaming video data </li></ul></ul><ul><ul><li>MicroBlaze controls filter coefficients in “real-time” </li></ul></ul><ul><li>Simple design example for customer IP integration </li></ul>[Diagram: DVI Input → Image Processing → DVI Output; other labels: System Generator, Custom video accelerator pcore]
    22. 24. Integrated Control/Data Plane System: The processor is used to dynamically configure filters. [Diagram labels: DVI In, Gamma In, 2D FIR Filter, Object Detection, New Pipeline Element, Gamma Out, DVI Out, Xilinx MicroBlaze Processor System, Processor Local Bus (PLB), Filter control (UART)]
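The control/data plane split on this slide can be illustrated with a small software model. This is a sketch only, not the case study's hardware: the `fir_regs` register layout, the 3-tap horizontal filter (standing in for the slide's 2D FIR), and both function names are hypothetical. The control plane writes coefficients occasionally; the data plane touches every pixel.

```c
#include <stddef.h>
#include <stdint.h>

#define FIR_TAPS 3

/* Hypothetical control registers written by the control-plane processor. */
struct fir_regs {
    int16_t coeff[FIR_TAPS]; /* signed filter taps */
    uint8_t shift;           /* right shift applied after accumulation */
};

/* Control plane: update the filter configuration. Runs rarely. */
void fir_configure(volatile struct fir_regs *regs,
                   const int16_t coeff[FIR_TAPS], uint8_t shift)
{
    for (size_t i = 0; i < FIR_TAPS; i++)
        regs->coeff[i] = coeff[i];
    regs->shift = shift;
}

/* Data plane (software model): filter one scanline, one output pixel per
 * input pixel. Edge pixels are replicated; results are clamped to 0..255. */
void fir_process_line(const volatile struct fir_regs *regs,
                      const uint8_t *in, uint8_t *out, size_t n)
{
    for (size_t x = 0; x < n; x++) {
        size_t left = (x == 0) ? 0 : x - 1;
        size_t right = (x + 1 < n) ? x + 1 : n - 1;
        int32_t acc = regs->coeff[0] * in[left]
                    + regs->coeff[1] * in[x]
                    + regs->coeff[2] * in[right];
        if (acc < 0)
            acc = 0;            /* clamp before shifting to avoid shifting a negative value */
        acc >>= regs->shift;
        if (acc > 255)
            acc = 255;
        out[x] = (uint8_t)acc;
    }
}
```

Calling `fir_configure` with taps `{1, 2, 1}` and a shift of 2 turns the pipeline stage into a simple smoothing filter; calling it again with `{0, 1, 0}` and a shift of 0 makes the stage a pass-through, without touching the per-pixel loop.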
    23. 25. HD Object Detection & Highlighting
    24. 26. Connecting the Embedded Processor to the FPGA with Linux
    25. 27. Control the Pipe with Linux <ul><li>Linux is Now the #1 OS for Embedded FPGA Systems </li></ul><ul><li>Newest Generation Is More “Real-Time” </li></ul><ul><li>Large Public Code Base </li></ul><ul><li>Mostly Free </li></ul><ul><li>FPGA IO Drivers Available </li></ul>
    26. 28. Configure Linux for the IO Device <ul><li>// Register the driver's init function, which runs when the custom driver is loaded into the Linux kernel </li></ul><ul><ul><li>module_init(xll_example_init); </li></ul></ul><ul><li>// Register the driver for a specific device number (major 253) </li></ul><ul><ul><li>err = register_chrdev_region(devno, 1, "custom_io_example"); </li></ul></ul><ul><li>// Create the matching character device node (major 253, minor 0) from the shell </li></ul><ul><ul><li>bash# mknod /dev/custom_io_example0 c 253 0 </li></ul></ul>
    27. 29. Controlling the Data Pipe with the Linux Application <ul><li>// Driver open handler, called when a Linux application opens the custom I/O device </li></ul><ul><ul><li>int custom_io_ex_open(struct inode *inode, struct file *filp) </li></ul></ul><ul><li>// Driver read / write handlers for the custom peripheral I/O, reached through the standard Linux read/write function calls </li></ul><ul><ul><li>ssize_t custom_io_ex_read(struct file *filp, char __user *buf, size_t count, loff_t *f_pos) </li></ul></ul><ul><ul><li>ssize_t custom_io_ex_write(struct file *filp, const char __user *buf, size_t count, loff_t *f_pos) </li></ul></ul>
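On the application side, the device node is used like any other file. The following is a minimal sketch, assuming the node `/dev/custom_io_example0` from the previous slide exists; the helper names `pipe_write_config` and `pipe_read_status` are hypothetical, and the byte layout the driver expects is not given in the slides, so the buffers are treated as opaque bytes.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Write len bytes of configuration to the device at dev_path.
 * Returns 0 on success, -1 on any failure. */
int pipe_write_config(const char *dev_path, const void *buf, size_t len)
{
    int fd = open(dev_path, O_WRONLY);
    if (fd < 0)
        return -1;

    const uint8_t *p = buf;
    size_t done = 0;
    while (done < len) {
        /* write() may accept fewer bytes than requested, so loop. */
        ssize_t n = write(fd, p + done, len - done);
        if (n <= 0) {
            close(fd);
            return -1;
        }
        done += (size_t)n;
    }
    return close(fd) == 0 ? 0 : -1;
}

/* Read up to len bytes from the device at dev_path.
 * Returns the number of bytes read, or -1 on failure. */
ssize_t pipe_read_status(const char *dev_path, void *buf, size_t len)
{
    int fd = open(dev_path, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = read(fd, buf, len);
    close(fd);
    return n;
}
```

An application would call, for example, `pipe_write_config("/dev/custom_io_example0", coeffs, sizeof coeffs)`. Because the helpers take the path as a parameter, a regular file can stand in for the device node when testing on a development host.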
    28. 30. SUMMARY <ul><li>FPGAs enable computational balancing between an FPGA-based processor and a data processing pipeline, reducing development risks </li></ul><ul><li>Offloading streaming data processing tasks to an FPGA data-plane processing pipeline can make it possible to meet performance objectives </li></ul><ul><li>An FPGA-based single-chip control-plane and data-plane processing solution can reduce cost and development time </li></ul><ul><li>Offloading frees the processor to handle a multitude of other tasks </li></ul>
    29. 31. Thank You Glenn Steiner, Dan Isaacs – Xilinx, Inc. David Pellerin – Impulse Accelerated Technologies