Lecture 16
RC Architecture Types & FPGA Internals
Lecturer: Simon Winberg
Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
• Reminders & YODA milestone dates
• Marking process
• RC architecture overview & main types
• Recap of FPGAs
• Evaluating performance of combinational logic / FPGA design (slide 22 onwards)
• Indicate your YODA team in the Wiki. Add a blog entry to describe your topic.
• 29 Apr – Blog about your product
• 15 May – Design Review
• 18-20 May – Demos
• 22 May – Final report & code (although no late penalty if submitted before 8am on 25 May)
See “EEE4084F YODA Mark Allocation Schema.pptx” for the process of allocating marks to the mark categories.
• Assignment work is marked in relation to:
  • Correctness
  • Completion
  • Structure, effectiveness of wording & layout
  • Adequate amount of detail/results shown & effectively dealing with the details
  • Indication of student’s understanding and engagement with the discipline
  • Clarity of explanations/motivation of results
  • Professionalism and overall quality
RC Architectures Overview
Reconfigurable Computing
• A determining factor is the ability to change hardware datapaths and control flows under software control.
• This change can happen either as a post-process / at compile time, or dynamically during runtime (it doesn’t have to be both).
[Figure labels: processing elements, datapath]
While the trivial case (a computer with one changeable datapath) could be argued to be reconfigurable, it is usually assumed that the computer system concerned has many changeable datapaths.
• Currently there are two basic forms:
  • Microprocessor-based RC
  • FPGA-based RC
Microprocessor-based RC:
• A few platform configurability features added to a microprocessor system (e.g., a multi-processor motherboard that can reroute the hardware links between processors)
• Besides that, we have already seen it all in the microprocessor parallelism part of the course
• Microprocessor-based RC
  • Multi-core processors dynamically joined to create a larger/smaller parallel system when needed
  • Assumed to be a single computer platform, as opposed to a cluster of computers
  • Needs to support software-controlled dynamic reconfiguration (see previous slide)
  • Tends to become: hardware essentially changeable in big blocks (“macro-level reconfiguration”, i.e. whole processors at a time)
• FPGA-based RC
  • Generally a much finer level of interconnect (closer to “micro-level reconfiguration”)
  • Processors that connect to FPGA(s)
• Generally, these systems follow a processors + coprocessors arrangement:
  • The CPU connects to reprogrammable hardware (usually FPGAs)
  • The CPU itself may be entirely in an FPGA
• The lower-level architecture is more involved… It is the topic of Seminar #8 (‘Interconnection Fabrics’) and is further discussed in later lectures.
[Figure: CPUs and FPGA-based accelerator cards connected by a high-speed bus]
FPGA Internals
EEE4084F
Skip to slide 22; this material is already covered in the textbook, but scan through these slides to ensure you are well versed in these issues.
FPGA internal structure
[Figure labels: programmable interconnect; programmable logic blocks; programmable logic element (PLE, or FPLE*). Image adapted from Maxfield (2004)]
* FPLE = Field Programmable Logic Element
Note: one programmable logic block (PLB) may contain a complex arrangement of programmable logic elements (PLEs).
The size of an FPGA or programmable logic device (PLD) is measured in the number of logic elements (LEs) that it has.
• You already know all your logic primitives…
  • The primitive logic gates: AND, OR, NOT, NOR, NAND, XOR; AND3, OR4, etc. (for multiple inputs)
  • Pins / sources / terminators: Ground, VCC; input, output
  • Storage elements: JK flip-flops, latches
  • Other items: delay, mux
[Figure: Altera Quartus II representations of an OR gate, an input pin and an output pin]
• A simple but powerful approach to FPGA design is to use lookup tables (LUTs) for the PLBs. These are usually implemented as a combination of a multiplexer and memory (possibly built from nothing more than NOR gates).
• Essentially, this approach builds complex circuits out of truth tables (each LUT enumerates one truth table).
Examples of the usual strategy for implementing PLBs follow…
Simple 3-LUT implementation for a PLB
A 3-bit input bus (the input lines) addresses an 8-bit static memory, which drives a 1-bit output.
in   out
000  0
001  1
010  1
011  0
100  1
101  0
110  0
111  1
Any guesses as to what logic circuit this LUT implements?
It’s an XOR of the 3 input lines!!!
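The 3-LUT above can be checked with a short sketch. The Python below models a LUT as a list of stored bits addressed by the input lines (the multiplexer-plus-memory view). The names `make_lut` and `XOR3_BITS`, and the choice of the first input as the most significant address bit, are illustrative assumptions rather than anything from the slides.

```python
from itertools import product

# Stored bits of the 8-bit static memory, in address order 000..111
# (copied from the LUT contents on the slide).
XOR3_BITS = [0, 1, 1, 0, 1, 0, 0, 1]


def make_lut(bits):
    """Return a function that looks up `bits` using the inputs as an address.

    The inputs act as the select lines of a multiplexer over the memory;
    the first input is treated as the most significant address bit
    (an assumption made for this sketch).
    """
    def lut(*inputs):
        address = 0
        for bit in inputs:
            address = (address << 1) | (bit & 1)
        return bits[address]
    return lut


xor3 = make_lut(XOR3_BITS)

# Compare the LUT against a direct 3-input XOR for all 8 input patterns.
for a, b, c in product((0, 1), repeat=3):
    assert xor3(a, b, c) == a ^ b ^ c

print("LUT matches XOR of the 3 input lines")
```

Changing only the stored bits reprograms the block: for instance, `make_lut([0, 0, 0, 1])` would behave as a 2-input AND.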
Mainstream* Programmable Logic Block (PLB)
[Figure: k inputs feed a k-input LUT, followed by a clocked D flip-flop (DFF) and a two-way multiplexer (inputs 0 and 1) that drives the output. Image adapted from Maxfield (2004)]
config_sync: configures a synchronous or asynchronous response (i.e., a line from another big LUT).
Another example of implementing an alternative logic function.
* Used by manufacturers like Xilinx
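A behavioural sketch of such a block may help. It assumes (as is common, but not stated on the slide) that `config_sync = 1` selects the registered DFF output and `config_sync = 0` passes the LUT output straight through; the class name `PLB`, its methods, and the reset value of the flip-flop are all hypothetical.

```python
class PLB:
    """Behavioural model: k-input LUT -> optional D flip-flop -> output mux."""

    def __init__(self, lut_bits, config_sync):
        self.lut_bits = list(lut_bits)   # 2**k stored bits
        self.config_sync = config_sync   # assumed: 1 = registered, 0 = combinational
        self.dff = 0                     # flip-flop state, assumed to reset to 0

    def _lut(self, inputs):
        # The inputs form the memory address, first input most significant.
        address = 0
        for bit in inputs:
            address = (address << 1) | (bit & 1)
        return self.lut_bits[address]

    def output(self, inputs):
        """Value on the output line for the current inputs (no clock edge)."""
        return self.dff if self.config_sync else self._lut(inputs)

    def clock(self, inputs):
        """Clock edge: the DFF samples the current LUT output."""
        self.dff = self._lut(inputs)


# The same 2-input AND truth table, configured both ways.
comb = PLB([0, 0, 0, 1], config_sync=0)
sync = PLB([0, 0, 0, 1], config_sync=1)

print(comb.output((1, 1)))  # combinational: follows the inputs immediately
print(sync.output((1, 1)))  # registered: still holds the reset value
sync.clock((1, 1))
print(sync.output((1, 1)))  # registered: updated only after the clock edge
```

The same stored truth table thus gives either asynchronous or clocked behaviour, depending only on one configuration bit.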
Logic block clusters (LBCs) and configurable logic blocks (CLBs)
• Assume a k-input LUT for each logic block (LB)
• Assume N LBs per logic cluster
• The LBs (also called basic logic elements, BLEs) in each logic cluster are fully connected or mostly connected
Diagram adapted from Sherief Reda (2007), EN2911X Lecture 2 Fall07, Brown University
The diagram shows that the same input lines (I) are sent to each LB, in addition to each of the N LBs’ output lines. Each LB operates on 4 input lines at a time, and a MUX is used to decide which input to sample. The MUXs may be configured from a separate LUT, or could be controlled by the LB they are connected to.
[Figure: a cluster of N LBs]
“Every slice contains four logic-function generators (or LUTs), eight storage elements, wide-function multiplexers, and carry logic. These elements are used by all slices to provide logic, arithmetic, and ROM functions. In addition to this, some slices support two additional functions: storing data using distributed RAM and shifting data with 32-bit registers. Slices that support these additional functions are called SLICEM; others are called SLICEL. SLICEM represents a superset of elements and connections found in all slices. Each CLB can contain zero or one SLICEM. Every other CLB column contains a SLICEMs. In addition, the two CLB columns to the left of the DSP48E columns both contain a SLICEL and a SLICEM.”
Source: http://www.xilinx.com/support/documentation/user_guides/ug364.pdf pg 8
SLICEM slices support additional functions; they are a superset of SLICELs, i.e. they have all the standard LEs plus some additions.
Source: http://www.xilinx.com/support/documentation/user_guides/ug364.pdf pg 9
SLICEL slices contain the standard set of LEs for the particular FPGA concerned. As the diagram shows, a SLICEL looks a little less complicated than a SLICEM.
Source: http://www.xilinx.com/support/documentation/user_guides/ug364.pdf pg 10
Evaluating Performance
Evaluating synthesis (simplified) of an FPGA design
HDL to FPGA execution & LE cost
In order to implement an HDL design, the design needs to be decomposed and mapped to the physical LBs on the FPGA, and the interconnects need to be appropriately configured.
Example:
x = AND(e,f,g)
y = AND(b,NAND(NAND(b,c),d))
out = NAND(NAND(x,y),NAND(a,y))
[Figure: gate-level circuit producing x, y and out]
Map ‘AND(e,f,g)’ to LB1
Map ‘NAND(NAND(x,y),NAND(a,y))’ to LB2
Map ‘AND(b,NAND(NAND(b,c),d))’ to LB3
Costing: 3 LBs, 8 LEs (assuming LBs have LEs that are two-input AND or NAND gates, so that AND(e,f,g) takes two LEs)
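The costing can be reproduced with a small sketch. The Python below writes the example as a netlist of two-input gates (one LE each), groups the gates by the LB they are mapped to, and counts them. The helper signal names `t1`..`t5`, the dictionaries and the `evaluate` function are illustrative assumptions; only the expressions and the LB mapping come from the slide.

```python
# Each signal is produced by one two-input gate (one LE): (gate, in1, in2).
# AND(e,f,g) is split into two ANDs; t1..t5 are helper names for illustration.
NETLIST = {
    "t1":  ("AND",  "e",  "f"),
    "x":   ("AND",  "t1", "g"),
    "t2":  ("NAND", "b",  "c"),
    "t3":  ("NAND", "t2", "d"),
    "y":   ("AND",  "b",  "t3"),
    "t4":  ("NAND", "x",  "y"),
    "t5":  ("NAND", "a",  "y"),
    "out": ("NAND", "t4", "t5"),
}

# Which logic block each gate is mapped to, following the slide.
LB_OF = {
    "t1": "LB1", "x": "LB1",
    "t4": "LB2", "t5": "LB2", "out": "LB2",
    "t2": "LB3", "t3": "LB3", "y": "LB3",
}


def evaluate(inputs):
    """Compute `out` for a dict of 0/1 values for the inputs a..g."""
    values = dict(inputs)
    for name, (kind, p, q) in NETLIST.items():  # entries are in dependency order
        both = values[p] & values[q]
        values[name] = both if kind == "AND" else 1 - both
    return values["out"]


# Count the LEs mapped to each LB.
les_per_lb = {}
for signal, lb in LB_OF.items():
    les_per_lb[lb] = les_per_lb.get(lb, 0) + 1

print(len(les_per_lb), "LBs,", len(NETLIST), "LEs:", les_per_lb)
```

Under these assumptions the count agrees with the slide: two LEs in LB1 and three each in LB2 and LB3.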
• The previous slide didn’t show whether the connections were synchronous (i.e., using a shared clock) or asynchronous. Since they are all logic gates and no clocks are shown, the design is probably asynchronous.
• Determining the timing constraints for synchronous configurations is generally easier, because everything is related to the clock speed. Still, you need to keep cascading calculations in mind.
• For asynchronous use, the implementation could run faster, but the design can also become more complicated, and the timing more difficult to work out…
• Keep in mind that the propagation delays of the various gates / LUTs may differ. In the previous example, let’s assume each AND takes 6 ns to stabilise and each NAND 10 ns.
• So the time to compute out is
  max(time to compute x, time to compute y) + 2 × 10 ns
  = (2 × 10 ns + 6 ns) + 20 ns = 46 ns
  Here y dominates: its two NANDs and one AND take 26 ns, whereas x (two cascaded two-input ANDs) takes only 12 ns.
  Pretty fast!! Or is it??
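The 46 ns figure can be reproduced mechanically. The sketch below writes the example as nested tuples that mirror the slide's expressions and computes when each signal becomes stable, assuming the inputs a..g are all valid at time 0, that AND(e,f,g) is built from two cascaded two-input ANDs, and that wiring delay is ignored. The tuple encoding and the `delay` function are illustrative choices, not part of the lecture.

```python
# Assumed gate delays from the slide, in nanoseconds.
DELAY_NS = {"AND": 6, "NAND": 10}

# Expressions as nested tuples: (gate, left operand, right operand).
# A plain string is a primary input, assumed stable at t = 0.
x = ("AND", ("AND", "e", "f"), "g")
y = ("AND", "b", ("NAND", ("NAND", "b", "c"), "d"))
out = ("NAND", ("NAND", x, y), ("NAND", "a", y))


def delay(expr):
    """Time until `expr` is stable: own gate delay plus its slowest operand."""
    if isinstance(expr, str):  # primary input
        return 0
    kind, left, right = expr
    return DELAY_NS[kind] + max(delay(left), delay(right))


print("x:", delay(x), "ns")
print("y:", delay(y), "ns")
print("out:", delay(out), "ns")
```

The slowest path runs through y and then the two NAND levels that produce out, which is why x does not affect the total.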
Compared to a 1 GHz CPU using just registers (and no memory access)?
Try this calculation for yourself...
(Assume each instruction takes on average 3 clocks due to pipelining, data dependencies, etc., as worst-case performance on a RISC processor.)
CPU running at 1 GHz → each clock has a 1 ns period
Assume each instruction takes ~3 clocks due to pipelining, etc.
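CODE:
/* Assumed helper macros (not on the original slide): bitwise gate operations */
#define AND(p, q)  ((p) & (q))
#define NAND(p, q) (~((p) & (q)))
unsigned doit(unsigned a, unsigned b, unsigned c, unsigned d,
              unsigned e, unsigned f, unsigned g) {
    unsigned x = AND(AND(e, f), g);               /* AND(e,f,g) as two ANDs */
    unsigned y = AND(b, NAND(NAND(b, c), d));
    unsigned out = NAND(NAND(x, y), NAND(a, y));
    return out;
}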
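Decomposed into one operation per instruction:
unsigned t1 = AND(e,f);    /* 1 instruction, i.e. AND t1,e,f */
unsigned x = AND(t1,g);
t1 = NAND(b,c);
unsigned t2 = NAND(t1,d);
unsigned y = AND(b,t2);
t1 = NAND(x,y);
t2 = NAND(a,y);
unsigned out = NAND(t1,t2);
In all, 8 instructions → 8 × 3 clocks each = 24 clocks = 24 ns (assuming all registers are pre-loaded).
A speed-up of 46 ns / 24 ns ≈ 1.92 over the FPGA case.
But some of these can’t be done as just 1 RISC instruction (for example, many RISC instruction sets have no NAND instruction, so a NAND needs an AND followed by a NOT).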
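The CPU-side estimate is simple enough to script, which makes it easy to vary the assumptions. The sketch below uses the slide's figures (1 GHz clock, 3 clocks per instruction, 46 ns for the FPGA) and adds one hypothetical variation that is not from the slides: a processor with no NAND instruction, where each of the five NANDs costs two instructions (AND then NOT). The function and constant names are illustrative.

```python
CLOCK_PERIOD_NS = 1       # 1 GHz CPU, so one clock lasts 1 ns
CLOCKS_PER_INSTR = 3      # the slide's assumed average
FPGA_NS = 46              # critical-path estimate for the FPGA version


def cpu_time_ns(num_instructions):
    """Estimated run time with all operands already in registers."""
    return num_instructions * CLOCKS_PER_INSTR * CLOCK_PERIOD_NS


# Slide's case: every gate (3 ANDs + 5 NANDs) is a single instruction.
best_case = cpu_time_ns(3 + 5)

# Hypothetical variation: each NAND needs two instructions (AND, then NOT).
no_nand = cpu_time_ns(3 + 5 * 2)

print(f"8 instructions:  {best_case} ns, speed-up {FPGA_NS / best_case:.2f}")
print(f"13 instructions: {no_nand} ns, speed-up {FPGA_NS / no_nand:.2f}")
```

Even under the less favourable assumption the CPU estimate stays below 46 ns, but the margin shrinks considerably, which is the point of the slide's closing caveat.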
• RC architecture case studies
  • IBM Blade & the Cell processor
  • Some large-scale RC systems
• Amdahl’s Law reviewed and critiqued
Image sources:
FYI Stamp – Wikipedia open commons
Reminder stamp – Open Clipart www.openclipart.org (public domain)
Xilinx FPGA related images & schematics – from Xilinx datasheets or their website
Disclaimers and copyright/licensing details
I have tried to follow the correct practices concerning copyright and licensing of material,
particularly image sources that have been used in this presentation. I have put much
effort into trying to make this material open access so that it can be of benefit to others in
their teaching and learning practice. Any mistakes or omissions with regards to these
issues I will correct when notified. To the best of my understanding the material in these
slides can be shared according to the Creative Commons “Attribution-ShareAlike 4.0
International (CC BY-SA 4.0)” license, and that is why I selected that license to apply to
this presentation (it’s not because I particularly want my slides referenced but more to
acknowledge the sources and generosity of others who have provided free material such
as the images I have used).

Lecture 16 RC Architecture Types & FPGA Interns Lecturer.pptx

  • 1. Lecture 16 RC Architecture Types & FPGA Interns Lecturer: Simon Winberg Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
  • 2.  Reminders & YODA milestone dates Marking process  RC Architecture overview & main types  Recap of FPGAs  Evaluating Performance of Combinational Logic / FPGA design (slides 22)
  • 3.  Indicate your YODA team in the Wiki. Add a blog entry to describe your topic  29 Apr – Blog about your product  15 May – Design Review  18-20 May – Demos  22 May final report & code (although no late penalty if submitted before 25 May 8am) See “EEE4084F YODA Mark Allocation Schema.pptx” for process of allocating marks for mark categories
  • 4.  Assignment work is marked in relation to Correctness Completion Structure, effectiveness of wording & layout Adequate amount of detail/results shown & effectively dealing with the details Indication of student’s understanding and engagement with the discipline Clarity of explanations/motivation of results Professionalism and overall quality
  • 6.  A determining factor is the ability to change hardware datapaths and control flows under software control  This change could happen either at post-process / compile time or dynamically during runtime (it doesn’t have to be both)  While the trivial case (a computer with one changeable datapath) could be argued to be reconfigurable, it is usually assumed that the computer system concerned has many changeable datapaths. [Figure labels: processing elements, datapath]
  • 7.  Currently there are two basic forms: Microprocessor-based RC and FPGA-based RC. Microprocessor-based RC: • A few platform configurability features added to a microprocessor system (e.g., a multi-processor motherboard that can reroute the hardware links between processors) • Besides that, we’ve already seen it all in the microprocessor parallelism part of the course
  • 8.  Microprocessor-based RC  Multi-core processors dynamically joined to create a larger/smaller parallel system when needed  Assumed to be a single computer platform as opposed to a cluster of computers  Needs to support software-controlled dynamic reconfiguration (see previous slide)  Tends to become: hardware essentially changeable in big blocks (“macro-level reconfiguration”, whole processors at a time)
  • 9.  FPGA-based RC: generally a much finer level of interconnects (closer to “micro-level reconfiguration”). Processors that connect to FPGA(s).
  • 10.  Generally, these systems follow a processors + coprocessors arrangement: the CPU connects to reprogrammable hardware (usually FPGAs). The CPU itself may be entirely in an FPGA.  The lower-level architecture is more involved… it is the topic of Seminar #8 (‘Interconnection Fabrics’) and is discussed further in later lectures. [Figure: CPUs and FPGA-based accelerator cards connected by a high-speed bus]
  • 11. FPGA Interns (EEE4084F). Skip to slide 22; this is already covered in the textbook, but scan through these slides to ensure you are well versed in these issues.
  • 12. FPGA internal structure: programmable interconnect, programmable logic blocks (PLBs) and programmable logic elements (PLEs, or FPLEs*). * FPLE = Field Programmable Logic Element. Note: one programmable logic block (PLB) may contain a complex arrangement of programmable logic elements (PLEs). The size of an FPGA or programmable logic device (PLD) is measured in the number of LEs (i.e., Logic Elements) that it has. Image adapted from Maxfield (2004).
  • 13.  You already know all your logic primitives… The primitive logic gates: AND, OR, NOT, NOR, NAND, XOR; AND3, OR4, etc. (for multiple inputs). Pins / sources / terminators: Ground, VCC; Input, output. Storage elements: JK flip-flops, latches. Other items: delay, mux. [Figure: Altera Quartus II representations of an OR gate, an input pin and an output pin]
  • 14.  A simple but powerful approach to FPGA design is to use lookup tables (LUTs) for the PLBs. These are usually implemented as a combination of a multiplexer and memory (even just using NOR gates)  Essentially, this approach builds complex circuits from truth tables (where each LUT enumerates a truth table). This is the usual strategy for implementing PLBs; examples follow…
  • 15. Simple 3-LUT implementation for a PLB: a 3-bit input bus addresses an 8-bit static memory, which drives a 1-bit output. Stored bit for each input value: 000 → 0, 001 → 1, 010 → 1, 011 → 0, 100 → 1, 101 → 0, 110 → 0, 111 → 1. Any guesses as to what logic circuit this LUT implements?
  • 16. Simple 3-LUT implementation for a PLB: it’s an XOR of the 3 input lines!!! Truth table (in → out): 000 → 0, 001 → 1, 010 → 1, 011 → 0, 100 → 1, 101 → 0, 110 → 0, 111 → 1.
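The LUT on the two slides above can be modelled as a plain memory lookup. The following is a minimal Python sketch, not a description of any vendor's hardware; the names `LUT3_BITS`, `lut3` and `matches_xor3` are made up for illustration. The three input lines form an address into the 8-bit memory, and the bit stored at that address is the output.

```python
# Contents of the 8-bit static memory from the slide, indexed by input value 000..111.
LUT3_BITS = [0, 1, 1, 0, 1, 0, 0, 1]


def lut3(a: int, b: int, c: int) -> int:
    """Return the stored bit addressed by inputs a (MSB), b, c (LSB)."""
    address = (a << 2) | (b << 1) | c
    return LUT3_BITS[address]


def matches_xor3() -> bool:
    """Check the LUT against a ^ b ^ c for all eight input combinations."""
    return all(
        lut3(a, b, c) == a ^ b ^ c
        for a in (0, 1)
        for b in (0, 1)
        for c in (0, 1)
    )
```

Changing the eight stored bits, and nothing else, turns the same structure into any other 3-input function; for example, `[0, 0, 0, 0, 0, 0, 0, 1]` would give a 3-input AND.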
  • 17. Mainstream* Programmable Logic Block (PLB): a k-input LUT (k inputs) feeds a D flip-flop (DFF) driven by the clock, and a two-input multiplexer (inputs 0 and 1) selects the block output. The config_sync line configures a synchronous or asynchronous response (i.e. a line from another big LUT). Another example for implementing an alternate logic function. * Used by manufacturers like Xilinx. Image adapted from Maxfield (2004).
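As a rough behavioural sketch of the block on this slide, the Python model below combines a k-input LUT, a flip-flop and an output multiplexer. It is illustrative only: the class name `LogicBlock`, its method names, and the choice that `config_sync = True` selects the registered (synchronous) output are assumptions, not details taken from the slide or from any vendor documentation.

```python
class LogicBlock:
    """Sketch of a PLB: k-input LUT, a D flip-flop, and an output multiplexer."""

    def __init__(self, lut_bits, config_sync):
        self.lut_bits = list(lut_bits)  # 2**k stored bits (the truth table)
        self.config_sync = config_sync  # assumed: True = registered output, False = direct LUT output
        self.dff = 0                    # state held by the flip-flop

    def lut(self, inputs):
        """Combinational part: use the input bits (MSB first) as a memory address."""
        address = 0
        for bit in inputs:
            address = (address << 1) | bit
        return self.lut_bits[address]

    def output(self, inputs):
        """Output multiplexer: pick the flip-flop state or the raw LUT value."""
        return self.dff if self.config_sync else self.lut(inputs)

    def clock_edge(self, inputs):
        """On a clock edge the flip-flop captures the current LUT value."""
        self.dff = self.lut(inputs)
```

In the asynchronous configuration the output follows the inputs directly; in the synchronous configuration it only changes after `clock_edge` has been called, which is the behaviour the config_sync line is meant to select.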
  • 18. Logic block clusters (LBCs) and configurable logic blocks (CLBs) • Assume a k-input LUT for each logic block (LB) • Assume N LBs per logic cluster • The LBs in each logic cluster are fully connected or mostly connected. The diagram shows that the same input lines (I) are sent to each LB, in addition to the output lines of each of the N LBs. Each LB operates on 4 input lines at a time, and a MUX is used to decide which input to sample. The MUXs may be configured from a separate LUT, or could be controlled by the LB they are connected to. Diagram adapted from Sherief Reda (2007), EN2911X Lecture 2 Fall07, Brown University.
  • 19. “Every slice contains four logic-function generators (or LUTs), eight storage elements, wide-function multiplexers, and carry logic. These elements are used by all slices to provide logic, arithmetic, and ROM functions. In addition to this, some slices support two additional functions: storing data using distributed RAM and shifting data with 32-bit registers. Slices that support these additional functions are called SLICEM; others are called SLICEL. SLICEM represents a superset of elements and connections found in all slices. Each CLB can contain zero or one SLICEM. Every other CLB column contains a SLICEMs. In addition, the two CLB columns to the left of the DSP48E columns both contain a SLICEL and a SLICEM.” Source: http://www.xilinx.com/support/documentation/user_guides/ug364.pdf pg 8
  • 20. SLICEM slices support additional functions; they are a superset of SLICELs, i.e. they have all the standard LEs plus some additions. Source: http://www.xilinx.com/support/documentation/user_guides/ug364.pdf pg 9
  • 21. SLICEL slices contain the standard set of LEs for the particular FPGA concerned. As the diagram shows, it looks a little less complicated than the design of a SLICEM. Source: http://www.xilinx.com/support/documentation/user_guides/ug364.pdf pg 10
  • 22. Evaluating Performance Evaluating synthesis (simplified) of an FPGA design
  • 23. HDL to FPGA execution & LE cost. In order to implement an HDL design, the design needs to be decomposed and mapped to the physical LBs on the FPGA, and the interconnects need to be configured appropriately. Example: x = AND(e,f,g); y = AND(b,NAND(NAND(b,c),d)); out = NAND(NAND(x,y),NAND(a,y)). Mapping: map ‘AND(e,f,g)’ to LB1; map ‘NAND(NAND(x,y),NAND(a,y))’ to LB2; map ‘AND(b,NAND(NAND(b,c),d))’ to LB3. Costing: 3 LBs, 8 LEs (assuming LBs have LEs that are 2-input AND or NAND gates, so AND(e,f,g) takes two of them).
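The 8-LE figure can be checked by writing the example as a list of 2-input gates. This is a minimal sketch under that 2-input assumption; the intermediate signal names `t1` to `t5` and the helper `evaluate` are invented for illustration and do not come from the slide.

```python
# Each entry: (output signal, gate type, first input, second input).
NETLIST = [
    ("t1", "AND", "e", "f"),
    ("x", "AND", "t1", "g"),      # x = AND(e,f,g), built from two 2-input ANDs
    ("t2", "NAND", "b", "c"),
    ("t3", "NAND", "t2", "d"),
    ("y", "AND", "b", "t3"),      # y = AND(b,NAND(NAND(b,c),d))
    ("t4", "NAND", "x", "y"),
    ("t5", "NAND", "a", "y"),
    ("out", "NAND", "t4", "t5"),  # out = NAND(NAND(x,y),NAND(a,y))
]


def evaluate(inputs):
    """Evaluate the netlist for a dict of 0/1 values for the inputs a..g."""
    signals = dict(inputs)
    for name, gate, p, q in NETLIST:
        both = signals[p] & signals[q]
        signals[name] = both if gate == "AND" else 1 - both
    return signals["out"]


LE_COUNT = len(NETLIST)  # one logic element per 2-input gate
```

With this decomposition `LE_COUNT` is 8, which matches the costing on the slide: 2 gates for x, 3 for y and 3 for out.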
  • 24.  The previous slide didn’t show whether the connections were synchronized (i.e., a shared clock) or asynchronous; since they are all logic gates and no clock is shown, it’s probably asynchronous  Determining the timing constraints for synchronous configurations is generally easier, because everything is related to the clock speed. Still, you need to keep in mind cascading calculations.  For asynchronous use, the implementation could run faster, but the design can also become more complicated and the timing more difficult to work out…
  • 25.  Keep in mind that the propagation delays for the various gates / LUTs may be different. For example, in the previous example, let’s assume each AND takes 6 ns to stabilise and each NAND 10 ns.  So the time to compute out = max(time to compute x, time to compute y) + 2 × 10 ns = (2 × 10 ns + 6 ns) + 20 ns = 46 ns, since y (two NANDs followed by an AND) is the slower of the two. Pretty fast!! Or is it?? Compared to a 1 GHz CPU using just registers (and no memory access)? Try this calculation for yourself... (assume each instruction takes on average 3 clocks due to pipeline, data dependencies, etc., as worst-case performance on a RISC processor)
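The 46 ns figure is a critical-path calculation: each signal settles one gate delay after the later of its two inputs. The sketch below reproduces it in Python, assuming the same 2-input gate decomposition as the 8-LE costing and all inputs valid at t = 0; the names `GATE_DELAY_NS`, `NETLIST` and `arrival_times` are illustrative.

```python
GATE_DELAY_NS = {"AND": 6, "NAND": 10}  # delays assumed on the slide
PRIMARY_INPUTS = "abcdefg"

# Each entry: (output signal, gate type, first input, second input).
NETLIST = [
    ("t1", "AND", "e", "f"),
    ("x", "AND", "t1", "g"),
    ("t2", "NAND", "b", "c"),
    ("t3", "NAND", "t2", "d"),
    ("y", "AND", "b", "t3"),
    ("t4", "NAND", "x", "y"),
    ("t5", "NAND", "a", "y"),
    ("out", "NAND", "t4", "t5"),
]


def arrival_times():
    """Return the time in ns at which each signal settles, inputs ready at t = 0."""
    ready = {name: 0 for name in PRIMARY_INPUTS}
    for name, gate, p, q in NETLIST:
        # A gate output is valid one gate delay after its slowest input.
        ready[name] = max(ready[p], ready[q]) + GATE_DELAY_NS[gate]
    return ready
```

Here y settles at 26 ns and x at 12 ns (two 2-input ANDs), so y dominates and out settles at 26 + 2 × 10 = 46 ns, in agreement with the slide.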
  • 26. CPU running at 1 GHz → each clock has a 1 ns period. Assume each instruction takes ~3 clocks due to pipeline etc. CODE:
        unsigned doit(unsigned a, unsigned b, unsigned c, unsigned d, unsigned e, unsigned f, unsigned g)
        {
            unsigned x = AND(e,f,g);
            unsigned y = AND(b,NAND(NAND(b,c),d));
            unsigned out = NAND(NAND(x,y),NAND(a,y));
            return out;
        }
    Decomposed into one operation per statement:
        unsigned t1 = AND(e,f);    /* 1 instruction, i.e. AND t1,e,f */
        unsigned x = AND(t1,g);
        t1 = NAND(b,c);
        unsigned t2 = NAND(t1,d);
        unsigned y = AND(b,t2);
        t1 = NAND(x,y);
        t2 = NAND(a,y);
        out = NAND(t1,t2);
    In all 8 instructions → 8 × 3 clocks each = 24 ns (assuming all registers are pre-loaded). A speed-up of 1.92 over the FPGA case. But some of these can’t be done as just 1 RISC instruction.
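The CPU-side estimate is simple arithmetic, sketched below in Python so the assumptions can be varied. The function name `cpu_time_ns` and the constants are illustrative; the 46 ns FPGA figure, the 8 instructions, the 3 clocks per instruction and the 1 GHz clock are the values used on the slides.

```python
def cpu_time_ns(instructions: int, clocks_per_instruction: int, clock_hz: float) -> float:
    """Execution time in ns for a straight-line instruction sequence."""
    clock_period_ns = 1e9 / clock_hz
    return instructions * clocks_per_instruction * clock_period_ns


FPGA_TIME_NS = 46.0                   # critical-path figure from the previous slide
CPU_TIME_NS = cpu_time_ns(8, 3, 1e9)  # 8 instructions, 3 clocks each, 1 GHz
RATIO = FPGA_TIME_NS / CPU_TIME_NS    # how many times faster the CPU estimate is
```

This gives 24 ns and a ratio of about 1.92. The result is sensitive to the clocks-per-instruction assumption: at 5 clocks per instruction the same sequence would take 40 ns and the ratio would drop to 1.15.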
  • 27.  RC architecture case studies IBM Blade & the cell processor Some large-scale RC systems  Amdahl’s Law reviewed and critiqued
  • 28. Image sources: FYI Stamp – Wikipedia open commons; Reminder stamp – Open Clipart www.openclipart.org (public domain); Xilinx FPGA related images & schematics – from Xilinx datasheets or their website. Disclaimers and copyright/licensing details: I have tried to follow the correct practices concerning copyright and licensing of material, particularly image sources that have been used in this presentation. I have put much effort into trying to make this material open access so that it can be of benefit to others in their teaching and learning practice. Any mistakes or omissions with regards to these issues I will correct when notified. To the best of my understanding the material in these slides can be shared according to the Creative Commons “Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)” license, and that is why I selected that license to apply to this presentation (it’s not because I particularly want my slides referenced but more to acknowledge the sources and generosity of others who have provided free material such as the images I have used).