FPGA & VHDL
1. Overview of FPGA & VHDL
F. De Canio(1)
Francesco_decanio@hotmail.com
11/12/2018
2. What is an FPGA?
• An FPGA is a type of integrated circuit (IC) that can be programmed for different
algorithms after fabrication.
• Modern FPGA devices consist of up to two million logic cells that can be configured
to implement a variety of software algorithms. Although the traditional FPGA
design flow is more similar to a regular IC than a processor, an FPGA provides
significant cost advantages in comparison to an IC development effort and offers
the same level of performance in most cases.
• Another advantage of the FPGA when compared to the IC is its ability to be
dynamically reconfigured. This process, which is the same as loading a program in a
processor, can affect part or all of the resources available in the FPGA fabric.
3. FPGA Architecture
• Configurable logic block (CLB) is the basic structure of an FPGA.
• CLBs are the main logic resources for implementing sequential as well as combinatorial
circuits.
• Each CLB element is connected to a switch matrix for access to the general routing
matrix.
• A CLB element contains a pair of slices.
• Each slice is composed of logic cells (LCs).
4. SLICE Architecture
• Logic cells are grouped into slices; each slice contains:
• Four six-input Look Up Tables (LUT)
• Wide multiplexers
• Carry chain
• Four flip-flop/latches
• Four additional flip-flops
6. LOGIC CELL Architecture
• Look-up table (LUT): this element performs logic operations.
• Flip-Flop (FF): this register element stores the result of the LUT.
• Wires: these elements connect elements to one another.
• Input/Output (I/O) pads: these physically available ports get data in
and out of the FPGA.
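The LUT and flip-flop pairing described above can be sketched in VHDL. This is only an illustrative model, not a vendor primitive: the entity name `logic_cell`, the `INIT` generic, and the 4-input width are assumptions chosen to keep the truth table short (the slices described earlier use six-input LUTs).

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical logic cell: a 4-input LUT followed by a D flip-flop.
entity logic_cell is
  generic (
    -- Truth table of the LUT: bit i is the output for input value i.
    -- x"8000" sets only bit 15, which makes the LUT a 4-input AND.
    INIT : std_logic_vector(15 downto 0) := x"8000"
  );
  port (
    clk    : in  std_logic;
    lut_in : in  std_logic_vector(3 downto 0);
    comb   : out std_logic;  -- unregistered LUT output
    q      : out std_logic   -- LUT output stored in the flip-flop
  );
end entity logic_cell;

architecture rtl of logic_cell is
  signal lut_out : std_logic;
begin
  -- LUT: the inputs select one bit of the truth table.
  lut_out <= INIT(to_integer(unsigned(lut_in)));
  comb    <= lut_out;

  -- Flip-flop: capture the LUT result on the rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      q <= lut_out;
    end if;
  end process;
end architecture rtl;
```

Changing `INIT` changes the logic function without changing the structure, which is the idea behind configuring a LUT from the bit stream.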
7. 6-Input LUT with Dual Output
• A 6-input LUT can be used as two 5-input LUTs with common inputs
– Minimal speed impact compared to a 6-input LUT
– One or two outputs
– Any function of six variables or two independent functions of five variables
[Figure: 6-LUT built from two 5-LUTs that share inputs A1 to A5; input A6 selects between them; outputs O6 and O5]
8. Wide Multiplexers
• Each F7MUX combines the outputs of
two LUTs together
– Can implement an arbitrary 7-input
function
– Can implement an 8-1 multiplexer
• The F8MUX combines the outputs of
the two F7MUXes
– Can implement an arbitrary 8-input
function
– Can implement a 16-1 multiplexer
• MUX is controlled by the BX/CX/DX
slice input
• MUX output can drive out
combinatorially or to the flip-flop/latch
[Figure: slice diagram with the four LUT/RAM/SRL blocks and the wide multiplexers]
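As a rough illustration of the 8-1 multiplexer case, the sketch below splits the selection into two 4-1 halves and a final 2-1 stage. The entity and signal names are hypothetical, and whether a synthesis tool actually places the final stage on an F7MUX depends on the tool and the target device.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 8-to-1 multiplexer written as two 4-to-1 halves plus a final 2-to-1 stage.
entity mux8to1 is
  port (
    data : in  std_logic_vector(7 downto 0);
    sel  : in  std_logic_vector(2 downto 0);
    y    : out std_logic
  );
end entity mux8to1;

architecture rtl of mux8to1 is
  signal low_half  : std_logic;  -- selects among data(3 downto 0)
  signal high_half : std_logic;  -- selects among data(7 downto 4)
begin
  -- Each half needs 4 data inputs + 2 select inputs = 6 inputs, i.e. one 6-input LUT.
  low_half  <= data(to_integer(unsigned(sel(1 downto 0))));
  high_half <= data(4 + to_integer(unsigned(sel(1 downto 0))));

  -- The final 2-to-1 selection is the role played by the wide multiplexer.
  y <= high_half when sel(2) = '1' else low_half;
end architecture rtl;
```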
9. Carry Chain
• Carry chain can implement fast
arithmetic addition and subtraction
– Carry out is propagated vertically
through the four LUTs in a slice
– The carry chain propagates from
one slice to the slice in the same
column in the CLB above
• Carry look-ahead
– Combinatorial carry look-ahead over
the four LUTs in a slice
– Implements faster carry cascading
from slice to slice
[Figure: slice diagram with the four LUT/RAM/SRL blocks and the carry chain]
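In practice the carry chain is rarely instantiated by hand. A minimal sketch, assuming the synthesis tool infers the carry logic from the `+` operator of `numeric_std` (the entity name `add8` and its ports are hypothetical):

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 8-bit adder with carry out.
entity add8 is
  port (
    a, b : in  std_logic_vector(7 downto 0);
    sum  : out std_logic_vector(7 downto 0);
    cout : out std_logic
  );
end entity add8;

architecture rtl of add8 is
  signal wide : unsigned(8 downto 0);  -- one extra bit to hold the carry out
begin
  -- Extend both operands by one bit so the carry is not lost.
  wide <= resize(unsigned(a), 9) + resize(unsigned(b), 9);
  sum  <= std_logic_vector(wide(7 downto 0));
  cout <= wide(8);
end architecture rtl;
```

With four LUTs per slice, an 8-bit adder of this kind would be expected to span two slices in the same column, with the carry passing from one to the next.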
10. Slice Flip-Flops and Flip-Flop/Latches
• Each slice has four flip-flop/latches
(FF/L)
– Can be configured as either flip-flops
or latches
– The D input can come from the O6
LUT output, the carry chain, the wide
multiplexer, or the AX/BX/CX/DX
slice input
• Each slice also has four flip-flops (FF)
– D input can come from O5 output or
the AX/BX/CX/DX input
• These don’t have access to the
carry chain, wide multiplexers, or
the slice inputs
• If any of the FF/L are configured as
latches, the four FFs are not available
[Figure: slice diagram with the four LUT/RAM/SRL blocks, the four flip-flop/latches (FF/L), and the four flip-flops (FF)]
11. Slice Flip-Flop Capabilities
• All flip-flops are D type
• All flip-flops have a single clock input (CLK)
– Clock can be inverted at the slice boundary
• All flip-flops have an active high chip enable (CE)
• All flip-flops have an active high SR input
– Input can be synchronous or asynchronous, as determined by the configuration bit stream
– Sets the flip-flop value to a pre-determined state, as determined by the configuration bit stream
[Figure: flip-flop symbols with D, CE, SR, and CK inputs and Q output]
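These capabilities can be modeled in VHDL as follows. This is a sketch under assumptions: the generics `SYNC_SR` and `SR_VALUE` stand in for the configuration bits, the entity name is hypothetical, and SR is given priority over CE.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical model of a slice flip-flop with chip enable and set/reset.
entity dff_ce_sr is
  generic (
    SYNC_SR  : boolean   := true;  -- true: synchronous SR, false: asynchronous SR
    SR_VALUE : std_logic := '0'    -- state loaded when SR is asserted
  );
  port (
    clk, ce, sr, d : in  std_logic;
    q              : out std_logic
  );
end entity dff_ce_sr;

architecture rtl of dff_ce_sr is
begin
  -- Synchronous SR: SR is only looked at on the rising clock edge.
  sync_gen : if SYNC_SR generate
    process (clk)
    begin
      if rising_edge(clk) then
        if sr = '1' then
          q <= SR_VALUE;
        elsif ce = '1' then
          q <= d;
        end if;
      end if;
    end process;
  end generate sync_gen;

  -- Asynchronous SR: SR takes effect immediately, independent of the clock.
  async_gen : if not SYNC_SR generate
    process (clk, sr)
    begin
      if sr = '1' then
        q <= SR_VALUE;
      elsif rising_edge(clk) then
        if ce = '1' then
          q <= d;
        end if;
      end if;
    end process;
  end generate async_gen;
end architecture rtl;
```

Only one of the two generate branches is elaborated, so `q` always has a single driver.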
14. Contemporary FPGA architectures
Contemporary FPGA architectures incorporate the basic elements along with additional
computational and data storage blocks that increase the computational density and
efficiency of the device.
These additional elements are:
• Embedded memories for distributed data storage
• Phase-locked loops (PLLs) for driving the FPGA fabric at different clock rates
• High-speed serial transceivers
• Off-chip memory controllers
• Multiply-accumulate blocks
The combination of these elements provides the FPGA with the flexibility to implement any software algorithm running on a processor and results in the contemporary FPGA.
15. FPGA Parallelism Versus Processor Architectures
• When compared with processor architectures, the structures that comprise the FPGA
fabric enable a high degree of parallelism in application execution.
17. VHDL – Introduction
• VHDL stands for VHSIC (Very High Speed Integrated Circuit) Hardware Description Language. It is a language that describes a logic circuit by function, data-flow behavior, and/or structure.
• This hardware description is used to configure a programmable logic device (PLD), such as a field programmable gate array (FPGA), with a custom logic design.
• VHDL is a formal language for specifying the behavior and structure of a digital circuit.
18. VHDL – Abstraction Levels
• BEHAVIORAL: functional description of the model. Used at the very beginning of a design in order to be able to run a simulation as soon as possible. Also used to describe testbenches. Such descriptions are usually simulatable, but not synthesizable.
• RTL: the description is divided into combinational logic and storage elements. The
storage elements (flip-flops, latches) are controlled by a system clock. The
description is synthesizable.
• GATE: the design is represented as a netlist with gates (AND, OR, NOT, ...) and
storage elements, all with cell delays. The description has been synthesized.
• LAYOUT (*): the different cells of the target technology are placed on the chip
and the connections are routed. After the layout has been verified, the circuit is
ready for the production process.
19. VHDL – BEHAVIORAL description
In a behavioral VHDL description, a Boolean
function, for example, can be modeled as a
simple equation (e.g. i1 + i2 * i3) plus a delay
of N ns. The worst case, i.e. the longest delay
to calculate a new output value, is assumed
here.
Functional behavior is modeled with the VHDL
statement: Process
…
process (A,B) begin
C <= A * B after 50 ns;
D <= A * B * C after 100 ns;
end process;
…
The keyword "after" has no meaning for synthesis.
20. VHDL – RTL description
Pure combinational: logic described with high-level operations such as +, *, and multiplexers.
Synchronous: clocked logic described with flip-flops driven by the clock (CLK).
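The split between combinational logic and clocked storage can be sketched as a small accumulator. The entity `acc_rtl` and its ports are hypothetical and serve only to show the two parts side by side.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical accumulator: combinational adder feeding a clocked register.
entity acc_rtl is
  port (
    clk : in  std_logic;
    rst : in  std_logic;  -- synchronous, active high
    din : in  std_logic_vector(7 downto 0);
    acc : out std_logic_vector(7 downto 0)
  );
end entity acc_rtl;

architecture rtl of acc_rtl is
  signal acc_reg  : unsigned(7 downto 0);  -- storage element
  signal acc_next : unsigned(7 downto 0);  -- combinational result
begin
  -- Pure combinational part: no clock involved.
  acc_next <= acc_reg + unsigned(din);

  -- Synchronous part: the register is updated only on the rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        acc_reg <= (others => '0');
      else
        acc_reg <= acc_next;
      end if;
    end if;
  end process;

  acc <= std_logic_vector(acc_reg);
end architecture rtl;
```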
21. VHDL – GATE description
• A gate level description contains a list of the gates of the design.
• Each element of the circuit is instantiated as a component and connected to the
corresponding signals (e.g. nets).
• All used gates are part of the technology library, where additional information such as area, propagation delay, and capacitance is stored.
• Delays can be applied to the gates for simulation, and timing information is part of the synthesis library. This enables a rough validation of the timing behavior.
22. VHDL - Terms
• Entity: an entity is the most basic building block in a design. The uppermost level of the
design is the top-level entity. If the design is hierarchical, then the top-level description
will have lower-level descriptions contained in it. These lower-level descriptions will be
lower-level entities contained in the top-level entity description.
• Architecture: all entities that can be simulated have an architecture description. The
architecture describes the behavior of the entity. A single entity can have multiple
architectures. One architecture might be behavioral while another might be a structural
description of the design.
• Bus: the term “bus” usually brings to mind a group of signals or a particular method of communication used in the design of hardware. In VHDL, a bus is a special kind of signal that may have its drivers turned off.
• Attribute: an attribute is data that are attached to VHDL objects or predefined data
about VHDL objects. Examples are the current drive capability of a buffer or the
maximum operating temperature of the device.
• Generic: a generic is VHDL’s term for a parameter that passes information to an entity.
For instance, if an entity is a gate level model with a rise and a fall delay, values for the
rise and fall delays could be passed into the entity with generics.
• Process: a process is the basic unit of execution in VHDL. All operations that are
performed in a simulation of a VHDL description are broken into single or multiple
processes.
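The generic described above can be illustrated with a gate model whose rise and fall delays are passed in from outside. All names here (`and2_delay`, `and2_user`, `RISE`, `FALL`) are hypothetical, and the delays matter only in simulation.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical gate-level model: 2-input AND with separate rise and fall delays.
entity and2_delay is
  generic (
    RISE : time := 1 ns;  -- delay applied when the output goes to '1'
    FALL : time := 1 ns   -- delay applied otherwise
  );
  port (
    a, b : in  std_logic;
    y    : out std_logic
  );
end entity and2_delay;

architecture behavior of and2_delay is
  signal y_int : std_logic;  -- undelayed gate output
begin
  y_int <= a and b;
  y <= y_int after RISE when y_int = '1' else
       y_int after FALL;
end architecture behavior;

library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical user of the gate: the generic map overrides the default delays.
entity and2_user is
  port (
    p, q : in  std_logic;
    r    : out std_logic
  );
end entity and2_user;

architecture structure of and2_user is
begin
  u0 : entity work.and2_delay
    generic map (RISE => 2 ns, FALL => 3 ns)
    port map (a => p, b => q, y => r);
end architecture structure;
```

The same entity can be instantiated several times with different delay values, which is the point of passing them as generics rather than hard-coding them.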
23. VHDL - Entity
ENTITY mux IS
PORT ( a, b, c, d : IN BIT;
s0, s1 : IN BIT;
x : OUT BIT);
END mux;
• A VHDL entity specifies the name of the entity, the ports of the
entity, and entity-related information. All designs are created using one
or more entities.
24. VHDL - Architecture
• The entity describes the interface to the VHDL model. The architecture describes
the underlying functionality of the entity and contains the statements that model
the behavior of the entity. An architecture is always related to an entity and
describes the behavior of that entity.
• This architecture is a behavioral (dataflow-style) description.
ARCHITECTURE dataflow OF mux IS
-- "select" is a reserved word in VHDL, so the internal signal is named sel
SIGNAL sel : INTEGER;
BEGIN
sel <= 0 WHEN s0 = '0' AND s1 = '0' ELSE
1 WHEN s0 = '1' AND s1 = '0' ELSE
2 WHEN s0 = '0' AND s1 = '1' ELSE
3;
x <= a AFTER 0.5 NS WHEN sel = 0 ELSE
b AFTER 0.5 NS WHEN sel = 1 ELSE
c AFTER 0.5 NS WHEN sel = 2 ELSE
d AFTER 0.5 NS;
END dataflow;
In a typical programming language such as C or C++, each assignment statement executes one after the other in a specified order, determined by the order of the statements in the source file. Inside a VHDL architecture, there is no specified ordering of the assignment statements. The order of execution is determined solely by events occurring on the signals that the assignment statements are sensitive to.
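A small sketch of this order independence, using a hypothetical entity `order_demo`: the assignment that reads `t` appears before the one that drives it, yet the result is the same as with the two lines swapped.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: y = (a and b) or c, written "out of order".
entity order_demo is
  port (
    a, b, c : in  std_logic;
    y       : out std_logic
  );
end entity order_demo;

architecture dataflow of order_demo is
  signal t : std_logic;
begin
  -- This statement is re-evaluated whenever t or c has an event.
  y <= t or c;
  -- This statement is re-evaluated whenever a or b has an event.
  t <= a and b;
end architecture dataflow;
```

A change on `a` first triggers the second statement; the resulting event on `t` then triggers the first one, regardless of where each statement sits in the file.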
25. SEQUENTIAL AND CONCURRENT STATEMENTS in VHDL
• The sequential domain is represented by a process or subprogram that contains
sequential statements. These statements are executed in the order in which they
appear within the process or subprogram, as in programming languages.
• The concurrent domain is represented by an architecture that contains processes,
concurrent procedure calls, concurrent signal assignments, and component
instantiations
• A process is a sequence of statements that are executed in the specified
order. The process declaration delimits a sequential domain of the
architecture in which the declaration appears. Processes are used for
behavioral descriptions.
• The optional sensitivity list is the list of signals to which the process is
sensitive. Any event on any of the signals specified in the sensitivity list causes
the sequential instructions in the process to be executed, similar to the
instructions in a usual program.
[name:] process[(sensitivity_list)]
[type_declarations]
[constant_declarations]
[variable_declarations]
[subprogram_declarations]
begin
sequential_statements
end process [name];
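As an illustration of the template, here is a sketch of a process with a sensitivity list and a variable declaration. The entity `parity4` is hypothetical; the loop relies on the statements executing in order, because each iteration reads the value of `p` written by the previous one.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: odd-parity flag of a 4-bit input.
entity parity4 is
  port (
    d   : in  std_logic_vector(3 downto 0);
    odd : out std_logic  -- '1' when d contains an odd number of '1' bits
  );
end entity parity4;

architecture behavior of parity4 is
begin
  parity_proc : process (d)      -- runs on any event on d
    variable p : std_logic;      -- variables are updated immediately
  begin
    p := '0';
    for i in d'range loop
      p := p xor d(i);           -- uses the value of p from the previous iteration
    end loop;
    odd <= p;                    -- the signal is updated when the process suspends
  end process parity_proc;
end architecture behavior;
```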
26. Combinatorial Blocks – decoder 3 to 8 – Example 1
library IEEE;
use IEEE.std_logic_1164.all;
-- 3-to-8 decoder with active-low outputs: while en = '1', only the selected output is '0'.
entity dec3to8_alt is
port (
signal sel: in std_logic_vector(2 downto 0);
signal en: in std_logic;
signal y: out std_logic_vector(7 downto 0)
);
end dec3to8_alt;
architecture behavior of dec3to8_alt is
begin
y(0) <= '0' when (en = '1' and sel = "000") else '1';
y(1) <= '0' when (en = '1' and sel = "001") else '1';
y(2) <= '0' when (en = '1' and sel = "010") else '1';
y(3) <= '0' when (en = '1' and sel = "011") else '1';
y(4) <= '0' when (en = '1' and sel = "100") else '1';
y(5) <= '0' when (en = '1' and sel = "101") else '1';
y(6) <= '0' when (en = '1' and sel = "110") else '1';
y(7) <= '0' when (en = '1' and sel = "111") else '1';
end behavior;
27. Sequential blocks – decoder 3 to 8 (TB) – Example 1
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.Numeric_Std.all;
entity dec3to8_tb is
end dec3to8_tb;
architecture BEH of dec3to8_tb is
signal sel_tb: std_logic_vector(2 downto 0);
signal en_tb: std_logic;
signal y_tb: std_logic_vector(7 downto 0);
begin
TB: entity work.dec3to8_alt PORT MAP(
sel => sel_tb,
en => en_tb,
y => y_tb);
process
begin
en_tb <= '1';
sel_tb<="000"; wait for 20 ns;
sel_tb<="001"; wait for 20 ns;
sel_tb<="010"; wait for 20 ns;
sel_tb<="011"; wait for 20 ns;
sel_tb<="100"; wait for 20 ns;
sel_tb<="101"; wait for 20 ns;
sel_tb<="110"; wait for 20 ns;
sel_tb<="111"; wait for 20 ns;
wait;
end process;
end BEH;
28. Combinatorial Blocks – ALU (Concurrent) – Example 2
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;
entity ALU_conc is
  port (
    signal inp_a: in std_logic_vector(3 downto 0);
    signal inp_b: in std_logic_vector(3 downto 0);
    signal sel: in std_logic_vector(2 downto 0);
    signal out_y: out std_logic_vector(3 downto 0));
end ALU_conc;
architecture behavior of ALU_conc is
begin
  -- Arithmetic uses the numeric_std signed type; results are converted back to std_logic_vector.
  out_y <= std_logic_vector(signed(inp_a) + signed(inp_b)) when sel = "000" else
           std_logic_vector(signed(inp_a) - signed(inp_b)) when sel = "001" else
           std_logic_vector(signed(inp_a) - 1) when sel = "010" else
           std_logic_vector(signed(inp_a) + 1) when sel = "011" else
           inp_a and inp_b when sel = "100" else
           inp_a or inp_b when sel = "101" else
           not inp_a when sel = "110" else
           inp_a xor inp_b;
end behavior;
29. Combinatorial Blocks – ALU (Sequential)– Example 2
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;
entity ALU_sequential is
  port (
    signal inp_a: in std_logic_vector(3 downto 0);
    signal inp_b: in std_logic_vector(3 downto 0);
    signal sel: in std_logic_vector(2 downto 0);
    signal out_y: out std_logic_vector(3 downto 0));
end ALU_sequential;
architecture behavior of ALU_sequential is
begin
  process(inp_a, inp_b, sel)
  begin
    case sel is
      when "000" =>
        out_y <= std_logic_vector(signed(inp_a) + signed(inp_b)); --addition
      when "001" =>
        out_y <= std_logic_vector(signed(inp_a) - signed(inp_b)); --subtraction
      when "010" =>
        out_y <= std_logic_vector(signed(inp_a) - 1); --sub 1
      when "011" =>
        out_y <= std_logic_vector(signed(inp_a) + 1); --add 1
      when "100" =>
        out_y <= inp_a and inp_b; --AND gate
      when "101" =>
        out_y <= inp_a or inp_b; --OR gate
      when "110" =>
        out_y <= not inp_a; --NOT gate
      when "111" =>
        out_y <= inp_a xor inp_b; --XOR gate
      when others =>
        out_y <= (others => 'X'); --sel holds a non-binary value
    end case;
  end process;
end behavior;
31. Sequential blocks –D FlipFlop – Example 1
library IEEE;
use IEEE.std_logic_1164.all;
entity D_FF is
port (
D : IN STD_LOGIC;
ResetN : IN STD_LOGIC;
Clk : IN STD_LOGIC;
Q : OUT STD_LOGIC);
end D_FF;
architecture behavior of D_FF is
begin
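-- Synchronous reset: ResetN is sampled only on the rising clock edge,
-- so Clk alone is needed in the sensitivity list.
process(Clk)
begin
if Clk'Event and Clk ='1' then
-- Despite its name, ResetN is active high in this example.
if(ResetN='1' ) then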
Q<='0';
else
Q<=D;
end if;
end if;
end process;
end behavior;
32. Sequential blocks –D FlipFlop_TB – Example 1
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;
ENTITY D_FF_TB IS
END D_FF_TB;
ARCHITECTURE behavior OF D_FF_TB IS
constant CLK_PERIOD : time := 100 ns;
signal Clk : std_logic:='0';
signal ResetN : std_logic:='0';
signal D : std_logic:='0';
--Output
signal Q : std_logic;
BEGIN
-- Instantiate the Unit Under Test (UUT)
uut: entity work.D_FF PORT MAP (
Clk => Clk,
ResetN => ResetN,
D => D,
Q => Q
);
CLK_PROCESS: process
begin
Clk <= '0'; wait for CLK_PERIOD/2;
Clk <= '1'; wait for CLK_PERIOD/2;
end process;
STIM_PROCESS: process
begin
wait for CLK_PERIOD*10;
ResetN <='1';
wait for CLK_PERIOD*2;
ResetN <='0';
wait for CLK_PERIOD*20;
D <= '1';
wait for CLK_PERIOD*2; -- assumed hold time; the slide omits the delay between the two assignments
D <= '0'; wait;
end process;
end;
35. Sequential blocks – 4 bit Shift Register – Example 2
library IEEE;
use IEEE.std_logic_1164.all;
entity ShiftReg4 is
port (
InShift : IN STD_LOGIC;
ResetN : IN STD_LOGIC;
Clk : IN STD_LOGIC;
Out_Shift : INOUT STD_LOGIC_VECTOR(3 downto 0));
end ShiftReg4;
architecture structure of ShiftReg4 is
begin
U0: entity work.D_FF PORT MAP (
Clk => Clk,
ResetN => ResetN,
D => InShift,
Q => Out_Shift(0)
);
U1: entity work.D_FF PORT MAP (
Clk => Clk,
ResetN => ResetN,
D => Out_Shift(0),
Q => Out_Shift(1)
);
U2: entity work.D_FF PORT MAP (
Clk => Clk,
ResetN => ResetN,
D => Out_Shift(1),
Q => Out_Shift(2)
);
U3: entity work.D_FF PORT MAP (
Clk => Clk,
ResetN => ResetN,
D => Out_Shift(2),
Q => Out_Shift(3)
);
end structure;
36. Sequential blocks – 4 bit Shift Register_TB – Example 2
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;
ENTITY ShiftReg4_TB IS
END ShiftReg4_TB;
ARCHITECTURE behavior OF ShiftReg4_TB IS
constant CLK_PERIOD : time := 100 ns;
signal Clk : std_logic:='0';
signal ResetN : std_logic:='0';
signal InShift : std_logic:='0';
--Output
signal OutShift : std_logic_vector(3 downto 0) :=(others => '0');
BEGIN
-- Instantiate the Unit Under Test (UUT)
uut: entity work.ShiftReg4 PORT MAP (
Clk => Clk,
ResetN => ResetN,
InShift => InShift,
Out_Shift => OutShift
);
CLK_PROCESS: process
begin
Clk <= '0';
wait for CLK_PERIOD/2;
Clk <= '1';
wait for CLK_PERIOD/2;
end process;
STIM_PROCESS: process
begin
wait for CLK_PERIOD*10;
ResetN <='1';
wait for CLK_PERIOD*2;
ResetN <='0';
wait for CLK_PERIOD ;
InShift <='1';
wait for CLK_PERIOD ;
InShift <='0';
wait for CLK_PERIOD ;
InShift <='0';
wait for CLK_PERIOD ;
InShift <='1';
wait;
end process;
end ARCHITECTURE;
37. Sequential blocks –Up Down Counter– Example 3
• In this case the Reset Signal is asynchronous
38. Sequential blocks –Up Down Counter– Example 3
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
entity UpDownCounter is
port (
ResetN : IN STD_LOGIC;
Clk : IN STD_LOGIC;
Up : IN STD_LOGIC;
Down : IN STD_LOGIC;
Out_Count : INOUT STD_LOGIC_VECTOR(3 downto 0));
end UpDownCounter;
architecture behavior of UpDownCounter is
begin
process(ResetN, Clk)
begin
if(ResetN='1') then
Out_Count <="0000";
elsif(Clk'Event and Clk ='1' ) then
if(Up='1') then
Out_Count <= Out_Count +1;
elsif(Down='1') then
Out_Count <= Out_Count -1;
end if;
end if;
end process;
end architecture;
39. Sequential blocks –Up Down Counter_TB– Example 3
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;
ENTITY UpDownCounter_TB IS
END UpDownCounter_TB;
ARCHITECTURE behavior OF UpDownCounter_TB IS
constant CLK_PERIOD : time := 100 ns;
signal Clk_TB : std_logic:='0';
signal ResetN_TB : std_logic:='0';
signal Up_TB : std_logic:='0';
signal Down_TB : std_logic:='0';
--Output
signal Out_Count_TB : std_logic_vector(3 downto 0) :=(others => '0');
BEGIN
-- Instantiate the Unit Under Test (UUT)
uut: entity work.UpDownCounter PORT MAP (
Clk => Clk_TB,
ResetN => ResetN_TB,
Up => Up_TB,
Down => Down_TB,
Out_Count => Out_Count_TB
);
CLK_PROCESS: process
begin
Clk_TB <= '0';
wait for CLK_PERIOD/2;
Clk_TB <= '1';
wait for CLK_PERIOD/2;
end process;
STIM_PROCESS: process
begin
wait for CLK_PERIOD*2;
ResetN_TB <='1';
wait for CLK_PERIOD*2;
ResetN_TB <='0';
wait for CLK_PERIOD ;
Up_TB <='1';
wait for CLK_PERIOD*15 ;
Up_TB <='0';
wait for CLK_PERIOD ;
Down_TB <='1';
wait for CLK_PERIOD*15 ;
Down_TB <='0';
wait for CLK_PERIOD ;
end process;
end ARCHITECTURE;
40. Finite State Machine blocks – 3
• A,B,C,D are states
• P is an input Signal
• R is an output
41. Finite State Machine blocks – 3 (1/2)
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity SimpleFSM is
port (
CLK : IN STD_LOGIC;
ResetN : IN STD_LOGIC;
P : IN STD_LOGIC;
R_out : OUT STD_LOGIC
);
end SimpleFSM;
architecture RTL of SimpleFSM is
TYPE State_type IS (A, B, C, D); -- Define the states
SIGNAL State : State_Type; -- Signal that holds the current state
begin
process(CLK, ResetN)
begin
if(ResetN='1') then
State <= A;
elsif(rising_edge(CLK) ) then
CASE State IS
WHEN A =>
IF P='1' THEN
State <= B;
END IF;
WHEN B =>
IF P='1' THEN
State <= C;
END IF;
WHEN C =>
IF P='1' THEN
State <= D;
END IF;
WHEN D=>
IF P='1' THEN
State <= B;
ELSE
State <= A;
END IF;
WHEN others =>
State <= A;
END CASE;
end if;
end process;
42. Finite State Machine blocks – 3 (2/2)
process(State)
begin
case State IS
when A =>
R_out<='0';
when B =>
R_out<='1';
when C =>
R_out<='1';
when D =>
R_out<='1';
end case;
end process;
end RTL;
43. Finite State Machine blocks – 3 TB
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;
ENTITY SimpleFSM_TB IS
END SimpleFSM_TB;
ARCHITECTURE behavior OF SimpleFSM_TB IS
constant CLK_PERIOD : time := 100 ns;
--input
signal CLK_TB : std_logic:='0';
signal ResetN_TB : std_logic:='0';
signal P_TB : std_logic:='0';
--output
signal R_out_TB : std_logic:='0';
begin
uut: entity work.SimpleFSM PORT MAP (
CLK => CLK_TB,
ResetN => ResetN_TB,
P => P_TB,
R_out => R_out_TB
);
CLK_PROCESS: process
begin
CLK_TB <= '0';
wait for CLK_PERIOD/2; -- clk stays at '0' for half of the clock period
CLK_TB <= '1';
wait for CLK_PERIOD/2; -- clk stays at '1' for the next half of the clock period
end process;
STIM_PROCESS: process
begin
wait for CLK_PERIOD*2; -- wait for 2 clock cycles
ResetN_TB <='1'; -- then apply reset for 2 clock cycles
wait for CLK_PERIOD*2;
ResetN_TB <='0'; -- then release reset and wait for 2 clock cycles
wait for CLK_PERIOD*2 ;
P_TB <='1';
wait for CLK_PERIOD*9 ;
P_TB <='0';
wait for CLK_PERIOD*2 ;
wait;
end process;