2. FPGAs, an alternative to custom ICs, can be used to
implement an entire System on Chip (SoC).
The main advantage of an FPGA is that it can be reprogrammed. The
user can program an FPGA to implement a design after
the FPGA has been manufactured, hence the name “Field
Programmable.”
Custom ICs are expensive and take a long time to design, so they
are worthwhile only when produced in bulk.
FPGAs, by contrast, can be implemented in a short time with the
help of Computer Aided Design (CAD) tools (because there is
no physical layout process, no mask making, and no IC
manufacturing).
Some disadvantages of FPGAs are that they are slower than
custom ICs, cannot handle very complex designs as well, and
draw more power.
3. • The FPGA design flow is the process of
designing and implementing an FPGA-based
system.
• This typically involves creating a design in a
hardware description language (HDL) such as
VHDL or Verilog, synthesizing the design to
generate a gate-level netlist, and then
implementing the design on the FPGA using a
place-and-route tool.
4. FPGA Design flow
Design specification
→ Behavioural description
→ RTL description (HDL)
→ Functional verification and testing
→ Logic synthesis and technology mapping
→ Technology netlist
→ Placement & routing
→ Bit stream
→ Programming the FPGA by downloading the bit stream
6. • Design Entry
• There are different techniques for design
entry.
• Schematic based,
• Hardware Description Language (VHDL,
VERILOG)
• Combination of both.
7. • Synthesis
• The process that translates VHDL or Verilog
code into a device netlist format, i.e. a
complete circuit of logical elements (gates,
flip-flops, etc.) for the design.
8. • The synthesis process checks code syntax and
analyzes the hierarchy of the design, which
ensures that the design is optimized for the
target architecture.
• The resulting netlist(s) are saved to an NGC
(Native Generic Circuit) file (for Xilinx®
Synthesis Technology (XST)).
10. • The Translate process combines all the input
netlists and constraints into a logic design file.
This information is saved as an NGD (Native
Generic Database) file.
11. • The Map process divides the whole circuit of logical
elements into sub-blocks that fit
into the FPGA logic blocks. That is, the Map process
fits the logic defined by the NGD file into the
targeted FPGA elements (Configurable Logic Blocks
(CLBs), Input Output Blocks (IOBs)) and generates an
NCD (Native Circuit Description) file.
12. • Place and Route: the PAR program is used for this
process. The place and route process places
the sub-blocks from the Map process into logic
blocks according to the constraints and
connects the logic blocks.
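The implementation steps above can be summarized as an ordered list of stages and the files they write. This is only an illustrative sketch: `FLOW` and `output_of` are hypothetical names, and only the file types named on the slides are filled in.

```python
# Stages of the Xilinx flow as described on the slides, paired with the
# file type each stage writes. None means the slides do not name a file.
FLOW = [
    ("synthesis", "NGC"),        # Native Generic Circuit netlist (XST)
    ("translate", "NGD"),        # Native Generic Database
    ("map", "NCD"),              # Native Circuit Description
    ("place_and_route", None),   # output file not stated on the slides
]


def output_of(stage):
    """Return the file type written by a stage, or None if not stated."""
    for name, file_type in FLOW:
        if name == stage:
            return file_type
    raise KeyError(stage)


for name, file_type in FLOW:
    print(f"{name:16s} -> {file_type or '(not stated)'}")
```

Each stage consumes the file written by the stage before it, so the order of the list is the order in which the tools run.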
13. • Design Verification
• Verification can be done at different stages of
the process steps.
• 1. Behavioral Simulation (RTL Simulation)
• This simulation is performed before synthesis
process to verify RTL (behavioral) code
• In this process, signals and variables are
observed, procedures and functions are traced
and breakpoints are set
• This is a very fast simulation
14. • 2. Functional Simulation (Post-Translate
Simulation)
• Functional simulation gives
information about the logic operation of the
circuit. The designer can verify the functionality of
the design using this process after the Translate
process.
• 3. Static Timing Analysis
• This can be done after
the MAP or PAR processes. The post-MAP timing report
lists signal path delays of the design derived from
the design logic. The post-Place-and-Route timing
report incorporates timing delay information to
provide a comprehensive timing summary of the
design.
18. • The configurable logic block (CLB) is the basic component of an FPGA.
• It provides the basic logic and storage
functionality for a target application design.
• In principle, the basic component can be anything from a transistor
to an entire processor.
19. The switch box has Fs = 3, as each track incident on it is
connected to 3 tracks of adjacent routing channels.
The connection box has Fc(in) = 0.5, as each input of the
logic block is connected to 50% of the tracks of the
adjacent routing channel.
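The two flexibility parameters above turn into simple counts once a channel width is chosen. The sketch below is a rough illustration: the channel width `W = 8` and both helper names are assumptions, and the switch count further assumes that every switch is bidirectional and shared by the two track ends it joins.

```python
import math


def tracks_per_input(channel_width, fc_in):
    """Tracks of the adjacent channel that one logic-block input can reach."""
    return math.ceil(fc_in * channel_width)


def switches_per_switch_box(channel_width, fs):
    """Programmable switches in one switch box.

    Assumes each switch is bidirectional, so it is shared by the two
    track ends it joins and must not be counted twice.
    """
    track_ends = 4 * channel_width  # tracks arrive on four sides of the box
    return track_ends * fs // 2


W = 8  # assumed channel width (tracks per routing channel)
print(tracks_per_input(W, 0.5))       # 4 of the 8 tracks
print(switches_per_switch_box(W, 3))  # 48 switches
```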
20. • Directional routing tracks: if single-driver directional wiring is used
instead of bidirectional wiring, improvements of 25% in area, 9% in
delay and 32% in area-delay product can be achieved.
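The three figures above are consistent with one another if the area-delay figure is read as the product of area and delay, and each improvement as a relative reduction. That reading is an assumption, and the short check below only verifies the arithmetic under it.

```python
# Improvements quoted on the slide, read as relative reductions.
area_gain = 0.25
delay_gain = 0.09

# What remains of the area-delay product after both reductions.
remaining = (1 - area_gain) * (1 - delay_gain)

# The improvement in area-delay product is the complement of that.
area_delay_gain = 1 - remaining

print(round(area_delay_gain * 100))  # 32
```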
21. • Channel segment distribution
• In mesh-based FPGAs, multi-length wires are created to reduce
delay.
• Longer wires require fewer switches, reducing routing area
and delay.
• Modern commercial FPGAs commonly use a combination of
long and short wires to balance flexibility, area and delay of
the routing network.
25. • SRAM is the most widely used FPGA programming
technology. The configuration data is held in static memory cells on the FPGA.
• The output of each memory cell is directly
connected to another circuit, and the state of the
memory cell continuously controls the circuit
being configured.
• Each combinational logic element requires many
programming bits, and each programmable
interconnection point requires its own bit.
26. Advantages
• SRAM based FPGA can be easily programmed
• SRAM based FPGA can be reprogrammed
during the system operation, providing
dynamically reconfigurable systems.
• The circuits used in the SRAM based FPGA can
be fabricated with standard VLSI process.
27. Disadvantages
• Each SRAM cell requires 6 transistors, which is costly in
area.
• SRAM configuration memory burns a
noticeable amount of power, even when the
program is not changed.
• The bits in the SRAM are susceptible to theft.
• SRAM-based FPGAs are volatile and have to be configured
again every time power is applied.
28. Antifuse technology
• An antifuse is one-time programmable.
• Once programmed, the antifuses are permanently in place.
• The “anti” part of antifuse comes from its
programming method.
• Instead of breaking a metal connection by
passing a current through it, a link is
grown to make a connection.
• Programming is very slow because each
antifuse must be programmed separately.
29. • The programming element is an antifuse: high
impedance (open circuit) at low voltage and
low impedance (connection) once a high voltage has been applied.
• Small area
• Non-volatile
• Irreversible (design errors cannot be
corrected)
30. Advantages
• Antifuse technology is non-volatile. The design
remains intact even when the power is off.
• Small area
• Delays due to routing are very small.
• Antifuse FPGAs tend to require lower power.
• Lower on-resistance and parasitic capacitance
• Configuration theft is not a problem with antifuse
technology.
31. Disadvantages
• Antifuse technology requires a complex
fabrication process.
• The technology does not use a standard
CMOS process.
• An external programmer is required to program
or configure the design, after which the design
cannot be changed.
32. EPROM/EEPROM technology
• The switch is disabled by injecting charge onto the
gate using a high voltage between gate and
drain.
• The charge is removed by exposure to UV light,
which allows the device to be reprogrammed.
• Non volatile.
• Slower programming than SRAM.
33. Advantages
• more area efficient
• Non volatile in nature.
• No external permanent memory is
needed to program it at power up.
34. Disadvantages
• Flash-based devices cannot be
reconfigured/reprogrammed an unlimited
number of times. Also, flash-based technology
uses a non-standard CMOS process.
• Extra processing steps
• Static power loss due to pull up resistor high
resistance.
38. • Basic Architecture of ALTERA FLEX 8000
– Grid of Logic Array Blocks (LABs), each consisting of 8 independently
programmable Logic Elements (LEs)
– A chip contains 208–1296 LEs, totaling 2,500–16,000 usable gates
– LABs are arranged in rows & columns, connected by the FastTrack
Interconnect, with Input-Output Elements (IOEs) at the edges
– The ends of the interconnect are connected to IOEs, each having
• a bidirectional I/O buffer
• a flip-flop to register input or output
– Each Logic Element has
• a 4-input LUT
• a programmable register, with 4 low-skew global clock, clear or preset control
signals driven from a dedicated input, an I/O pin or an internal signal from the LAB local
interconnect
• dedicated carry & cascade chains
– 4 signals are common to each LE in a LAB
• 2 used as clocks
• 2 used for clear/preset control
39. Altera FLEX 8000 Logic Array Block
• LAB = 8 LEs, plus local interconnect, control signals,
carry & cascade chains
40. Altera FLEX 8000 Logic Element
• Each Logic Element (LE) contains:
– 4-input Look-Up Table (LUT)
• Can produce any function of 4 variables
– Programmable flip-flop
• Can be configured as D, T, JK, SR, or bypassed
• Has clock, clear, and preset signals that can come from
dedicated inputs, I/O pins, or other LEs
– Carry chain & cascade chain
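To see why a 4-input LUT can produce any function of 4 variables, it helps to model it as a 16-entry truth table indexed by the inputs. The class below is a behavioural sketch only; `LUT4` and `from_function` are hypothetical names and say nothing about how the FLEX 8000 stores its configuration bits.

```python
class LUT4:
    """4-input look-up table: 16 stored bits, one per input combination."""

    def __init__(self, truth_table):
        if len(truth_table) != 16:
            raise ValueError("a 4-input LUT needs exactly 16 bits")
        self.bits = [int(bool(b)) for b in truth_table]

    @classmethod
    def from_function(cls, fn):
        """Fill the table by evaluating fn(a, b, c, d) for all 16 inputs."""
        table = []
        for i in range(16):
            a, b, c, d = (i >> 3) & 1, (i >> 2) & 1, (i >> 1) & 1, i & 1
            table.append(fn(a, b, c, d))
        return cls(table)

    def __call__(self, a, b, c, d):
        # The inputs form the address of the stored bit.
        return self.bits[(a << 3) | (b << 2) | (c << 1) | d]


# Two unrelated functions, each implemented by one table.
xor4 = LUT4.from_function(lambda a, b, c, d: a ^ b ^ c ^ d)
majority = LUT4.from_function(lambda a, b, c, d: a + b + c + d >= 3)

print(xor4(1, 0, 1, 1))      # 1
print(majority(1, 1, 0, 1))  # 1
```

Because the table has one entry per input combination, changing the function only means loading 16 different bits; the hardware path from inputs to output stays the same.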
42. Altera FLEX 8000 Carry Chain (Ex: n-bit adder)
• Carry chain provides very fast (< 1 ns) carry-forward between
LEs
– Feeds both the LUT and the next part of the chain
– Good for high-speed adders & counters
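The n-bit adder example can be sketched as one logic element per bit, with the carry handed from each element to the next. This is a behavioural model under that assumption; `le_adder_bit` and `carry_chain_add` are hypothetical names and timing is not modelled.

```python
def le_adder_bit(a, b, carry_in):
    """One LE used as an adder bit: returns (sum bit, carry out)."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out


def carry_chain_add(x, y, n):
    """Add two n-bit numbers, one LE per bit, carry passed along the chain."""
    carry = 0
    result = 0
    for i in range(n):
        bit, carry = le_adder_bit((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result, carry


# 11 + 6 = 17, which is 1 with a carry out of the 4-bit chain.
print(carry_chain_add(11, 6, 4))  # (1, 1)
```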
43. Altera FLEX 8000 Cascade Chain
• Cascade chain provides wide fan-in
– Adjacent LEs’ LUTs can compute parts of the
function in parallel; the cascade chain
then serially connects the intermediate values
– Can use either a logical AND or a logical OR (using
De Morgan’s theorem) to connect outputs of
adjacent LEs
– Each additional LE provides 4 more inputs to the
width of the function
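The cascade idea can be illustrated with a wide AND: each LE evaluates 4 of the inputs, and the cascade chain ANDs that partial result with the result from the previous LE. The function names below are hypothetical and the model is purely behavioural.

```python
def les_needed(fan_in):
    """LEs required for a function of fan_in inputs, 4 inputs per LE."""
    return -(-fan_in // 4)  # ceiling division


def cascade_and(inputs):
    """Wide AND built from 4-input slices joined by an AND cascade."""
    result = 1
    for start in range(0, len(inputs), 4):
        chunk = inputs[start:start + 4]
        lut_out = int(all(chunk))  # this LE's LUT handles up to 4 inputs
        result &= lut_out          # cascade chain combines with earlier LEs
    return result


print(les_needed(16))                 # 4
print(cascade_and([1] * 16))          # 1
print(cascade_and([1] * 15 + [0]))    # 0
```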
48. • Each CLB consists of
– LUT
– Multiplexers
– Registers
– Path for control signal
• Each CLB contains 3 function generators (F, G, H)
– Each function generator is based on a LUT with a 5 ns delay,
independent of the function being implemented
– 2 function generators (F & G) can each generate any arbitrary function of 4
inputs, and the third (H) can generate any Boolean function of 3 inputs
– The H function block can get its inputs from the F & G LUTs or from
external inputs.
– The 3 function generators can be programmed to generate
• 3 different functions of 3 independent sets of variables (one function must be
registered within the CLB)
• An arbitrary function of 5 variables
• An arbitrary function of 4 variables together with some functions of 6 variables
• Some functions of 9 variables
49. • Each CLB has 2 storage devices that can be configured as edge-
triggered flip-flops with a common clock
– Storage elements get their inputs from the function generators or from
the DIN input
– The other element can get an external input from the H1 input
– Storage elements are driven by a global set/reset (SR) during power-up
• Function generators can also drive 2 outputs (X &
Y) directly and independently of the outputs of the storage elements
DEDICATED FAST CARRY & BORROW LOGIC
• F & G function generators have separate dedicated logic for
fast carry & borrow generation, with dedicated routing to link
the extra signal to the function in the adjacent CLB
• The prebuilt carry chain within a CLB can be used to add a pair
of 2-bit words in one CLB
50. • F generates a0+b0; G generates a1+b1
• The fast carry forwards the carry to the next CLB above or below
• Fast carry & borrow logic increases the efficiency and performance of
adders, subtractors, accumulators, comparators & counters
Distributed RAM:
• The 3 function generators can be used as RAM, either 16X2 dual-
port RAM or 32X1 single-port RAM
• There is no block RAM, but a group of CLBs can form an array of
memory
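The 16X2 and 32X1 organisations above are two ways of combining the same pair of 16-bit look-up tables: side by side for a wider word, or stacked for a deeper address space. The classes below are a behavioural sketch with hypothetical names; port structure and write timing are not modelled.

```python
class LutRam16x1:
    """One LUT used as a 16-word by 1-bit memory."""

    def __init__(self):
        self.cells = [0] * 16

    def write(self, addr, bit):
        self.cells[addr & 0xF] = bit & 1

    def read(self, addr):
        return self.cells[addr & 0xF]


class Ram16x2:
    """Two LUTs share one address; each holds one bit of a 2-bit word."""

    def __init__(self):
        self.low, self.high = LutRam16x1(), LutRam16x1()

    def write(self, addr, value):
        self.low.write(addr, value & 1)
        self.high.write(addr, (value >> 1) & 1)

    def read(self, addr):
        return self.low.read(addr) | (self.high.read(addr) << 1)


class Ram32x1:
    """A fifth address bit selects which of the two LUTs is accessed."""

    def __init__(self):
        self.banks = [LutRam16x1(), LutRam16x1()]

    def write(self, addr, bit):
        self.banks[(addr >> 4) & 1].write(addr, bit)

    def read(self, addr):
        return self.banks[(addr >> 4) & 1].read(addr)


wide = Ram16x2()
wide.write(5, 0b10)
deep = Ram32x1()
deep.write(21, 1)
print(wide.read(5), deep.read(21), deep.read(5))  # 2 1 0
```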
51. XILINX SPARTAN II FPGAs
• Contain several logic & memory resources that can support
15K-200K system gates & up to 57Kb block RAM storage
• Contain flexible input/output (I/O) interfaces
• Manufactured in 0.25/0.18 µm CMOS technology with 6 metal layers
for interconnection
• High performance & high system frequency of 200 MHz
• Provide advanced clock control with 4 dedicated Delay-Locked
Loops (DLLs)
• Support unlimited reprogrammability
52. SPARTAN II ARCHITECTURE
• Each of the four quadrants of CLBs is supported by a DLL
• Each quadrant is bounded by 4096-bit block RAM
• The periphery of the chip is lined with IOBs
• Each CLB contains four logic cells organised as a pair of slices
• Each logic cell has
• a 4-input LUT
• logic for carry and control
• a D flip-flop
53. • Each LUT can be configured as a 16X1 RAM (distributed), and the pair
of LUTs in a logic cell can be configured as 16X2-bit RAM or 32X1-bit RAM
• IOBs are individually programmable to support reference, output
voltage and termination voltages for high-speed memory & bus standards
• Each IOB has 3 registers functioning as D flip-flops or as level-sensitive
latches:
• One register (TFF) is used to register the signal that (synchronously)
controls the programmable output buffer
• A second register (OFF) is programmed to register a signal
from internal logic (alternatively, a signal from internal logic can
pass directly to the output buffer)
• A third register is used to register the signal coming from the I/O pad
54. • A common clock drives each register, but each register has an
independent clock enable
• A programmable delay element on the input path is used to eliminate
the pad-to-pad hold time
55. XILINX VIRTEX FPGAS
• Leading edge of Xilinx Technology
• Addresses 4 key factors influencing the solution to complex system-level
and system-on-chip (SoC) designs:
– Level of Integration
– Amount of embedded memory
– Performance(timing)
– Subsystem interfaces
ARCHITECTURE :
• The programmable device is comprised of input/output blocks (IOBs) and
internal configurable logic blocks (CLBs).
– Programmable I/O blocks provide the interface between package pins and the
internal configurable logic.
– leading-edge I/O standards are supported by the programmable IOBs.
56. The internal configurable logic includes four major elements organized in a
regular array:
• Configurable Logic Blocks (CLBs) provide
functional elements for combinatorial and
synchronous logic, including basic storage
elements. BUFTs (3-state buffers) associated
with each CLB element drive dedicated
segmentable horizontal routing resources.
• Block SelectRAM memory modules provide
large 18 Kbit storage elements of dual-port
RAM.
• Multiplier blocks are 18-bit x 18-bit dedicated
multipliers.
• DCM (Digital Clock Manager) blocks provide
self-calibrating, fully digital solutions for clock
distribution delay compensation, clock
multiplication and division, and coarse- and fine-grained
clock phase shifting.
57. • A new generation of programmable routing resources called Active
Interconnect Technology interconnects all of these elements.
• The general routing matrix (GRM) is an array of routing switches.
• Each programmable element is tied to a switch matrix, allowing multiple
connections to the general routing matrix
• All programmable elements, including the routing resources, are controlled by
values stored in static memory cells. These values are loaded in the memory
cells during configuration and can be reloaded to change the functions of the
programmable elements.
Virtex-II Features
• Input/Output Blocks (IOBs):
IOBs are programmable and can be categorized as follows:
– Input block with an optional single-data-rate or double-data-rate (DDR)
register
– Output block with an optional single-data-rate or DDR register, and an optional
3-state buffer, to be driven directly or through a single or DDR register
– Bidirectional block (any combination of input and output configurations)
58. • IOB blocks include six storage
elements, as shown in figure
• Each storage element can be
configured either as an edge-
triggered D-type flip-flop or as
a level-sensitive latch
• On the input, output, and 3-
state path, one or two DDR
registers can be used.
• Double data rate is directly
accomplished by the two
registers on each path,
clocked by the rising edges (or
falling edges) from two
different clock nets.
• The two clock signals are
generated by the DCM and
must be 180 degrees out of
phase
• There are two input, output,
and 3-state data signals, each
being alternately clocked out.
59. • These registers are either edge-triggered D-type flip-flops or level-sensitive latches.
• IOBs support the following single-ended I/O standards:
– LVTTL, LVCMOS (3.3V, 2.5V, 1.8V, and 1.5V)
– PCI-X compatible (133 MHz and 66 MHz) at 3.3V
– PCI compliant (66 MHz and 33 MHz) at 3.3V
– CardBus compliant (33 MHz) at 3.3V
– GTL and GTLP
• The IOB elements also support the following differential signaling I/O standards:
– LVDS
– BLVDS (Bus LVDS)
– ULVDS
– LDT
– LVPECL
• Two adjacent pads are used for each differential pair. Two or four IOB blocks connect to one switch
matrix to access the routing resources.
CLBs
• CLB resources include four slices and two 3-state buffers.
• Each slice is equivalent and contains:
– Two function generators (F & G)
– Two storage elements
– Arithmetic logic gates
– Large multiplexers
– Wide function capability
– Fast carry look-ahead chain
– Horizontal cascade chain (OR gate)
60. • The function generators F & G are configurable as 4-input look-up tables (LUTs), as 16-bit shift
registers, or as 16-bit distributed SelectRAM memory.
• The two storage elements are either edge-triggered D-type flip-flops or level-sensitive latches.
• Each CLB has internal fast interconnect and connects to a switch matrix to access general routing
resources.
Block SelectRAM Memory
The block SelectRAM memory resources are 18 Kb of dual-port RAM, programmable from 16K x 1 bit
to 512 x 36 bits, in various depth and width configurations.
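The depth and width trade-off of one 18 Kb block can be tabulated directly. Only the two end points (16K x 1 and 512 x 36) come from the slide; the intermediate aspect ratios in the list are an assumption included for illustration.

```python
KBIT = 1024
TOTAL_BITS = 18 * KBIT  # 18 Kb per block, as stated on the slide

# (depth, width) pairs. The first and last are the slide's end points;
# the four in between are assumed intermediate configurations.
ASPECT_RATIOS = [
    (16 * KBIT, 1),
    (8 * KBIT, 2),
    (4 * KBIT, 4),
    (2 * KBIT, 9),
    (1 * KBIT, 18),
    (512, 36),
]


def bits_used(depth, width):
    """Storage bits addressed by one depth x width configuration."""
    return depth * width


for depth, width in ASPECT_RATIOS:
    used = bits_used(depth, width)
    print(f"{depth:6d} x {width:2d} -> {used:5d} of {TOTAL_BITS} bits")
```

By this arithmetic the 512 x 36 end point addresses the full 18 Kb, while 16K x 1 addresses 16 Kb of it.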
63. Short channel effects
Five different physical phenomena have to be
considered in short-channel devices:
• Drain induced barrier lowering and Punchthrough
• Surface scattering
• Velocity saturation
• Impact ionization
• Hot electrons
64. Drain-induced barrier lowering (DIBL)
• The electrons (carriers) in the channel face a
potential barrier that blocks their flow.
• The potential barrier, in small-geometry
MOSFETs, is controlled by a two-dimensional
electric field vector (in other words by both
VGS and VDS).
• If the drain voltage is increased the potential
barrier in the channel decreases, leading to
Drain-Induced Barrier Lowering (DIBL)
65. Drain-induced barrier lowering (DIBL)
and Punchthrough
• Under DIBL conditions, electrons can flow between
the source and drain even if VGS < VT.
• The channel current that flows in this case is called
the subthreshold current.
• The DIBL phenomenon can be accompanied by
so-called punchthrough, which occurs when the
depletion region surrounding the drain extends to
the source.
• Punchthrough is minimized with thinner oxide,
larger substrate doping (and a longer channel!)
71. High-k Dielectric
• High-κ dielectric refers to a material with a
high dielectric constant (κ, kappa), as compared
to silicon dioxide
• High-κ dielectrics are used in semiconductor manufacturing processes,
usually to replace a silicon dioxide gate
dielectric or another dielectric layer of a device.
• As metal-oxide-semiconductor field-effect
transistors (MOSFETs) have decreased in size, the
thickness of the silicon dioxide gate dielectric has
steadily decreased to increase the gate capacitance (per
unit area) and thereby drive current (per device width),
raising device performance.
72. • As the thickness scales below 2 nm, leakage currents
due to tunneling increase drastically, leading to high
power consumption and reduced device reliability
• Replacing the silicon dioxide gate dielectric with a high-κ
material allows increased gate capacitance without the
associated leakage effects.
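The trade-off above can be made concrete with the parallel-plate relation C/A = κ·ε0/t: a high-κ film can be physically thicker than silicon dioxide and still give the same capacitance per unit area. The sketch below uses the standard κ ≈ 3.9 for silicon dioxide; the high-κ value of 20 and the 5 nm thickness are illustrative assumptions, not figures from the slides.

```python
EPS0 = 8.854e-12  # vacuum permittivity in F/m
K_SIO2 = 3.9      # dielectric constant of silicon dioxide


def capacitance_per_area(kappa, thickness_nm):
    """Parallel-plate gate capacitance per unit area, in F/m^2."""
    return kappa * EPS0 / (thickness_nm * 1e-9)


def equivalent_oxide_thickness(kappa, thickness_nm):
    """SiO2 thickness (nm) giving the same capacitance per unit area."""
    return thickness_nm * K_SIO2 / kappa


kappa_high = 20.0  # assumed, illustrative high-k value
t_physical = 5.0   # assumed physical film thickness in nm

eot = equivalent_oxide_thickness(kappa_high, t_physical)
print(f"equivalent oxide thickness: {eot:.3f} nm")
print(capacitance_per_area(kappa_high, t_physical))
print(capacitance_per_area(K_SIO2, eot))
```

Under these assumed numbers, a 5 nm high-κ layer behaves electrically like a silicon dioxide layer of about 1 nm while staying well above the 2 nm thickness at which tunneling leakage rises sharply.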
73. FINFET TECHNOLOGY
Basics of FinFET:
• A type of multigate MOSFET
• Widely used in place of the planar MOSFET
• The fin is the channel between source & drain
• Can have two, four or more fins in the same structure
• Advantages over the planar FET
Area of performance
Lower leakage power
Low-voltage operation
Lower retention voltage for SRAM
Better control over the current
80. ADVANTAGES
• Lower power Consumption
• Operates at low voltage
• Operating speed is high
• Static leakage current is reduced by up to 90%
• Compact