EC8095: VLSI Design Department of ECE 2020-2021
St.Joseph’s College of Engineering / St.Joseph’s Institute of Technology 1
UNIT – I
INTRODUCTION - BASIC MOS TRANSISTOR
The invention of the transistor by William B. Shockley, Walter H. Brattain and John Bardeen
of Bell Telephone Laboratories was followed by the development of the integrated circuit (IC).
The very first IC emerged at the beginning of 1960, and since that time there have already
been four generations of ICs:
1) SSI (Small Scale Integration)
2) MSI (Medium Scale Integration)
3) LSI (Large Scale Integration)
4) VLSI (Very Large Scale Integration)
Now we see the emergence of the 5th generation, ULSI (Ultra Large Scale Integration), which
is characterized by complexities in excess of 3 million devices on a single IC chip. Within the bounds
of MOS technology, the possible circuit realizations may be based on pMOS, nMOS, CMOS and now
BiCMOS devices. Although CMOS is the dominant technology, some of the examples used to
illustrate the design processes will be presented in nMOS form. The reasons are:
1) For nMOS technology, the design methodology and the design rules are easily learned, thus
providing a simple but excellent introduction to structured design for VLSI.
2) nMOS technology and design processes provide an excellent background for other
technologies. In particular some familiarity with nMOS allows a relatively easy transition to CMOS
technology and design.
3) For GaAs technology some arrangements in relation to logic design are similar to those
employed in nMOS technology. Therefore, understanding the basics of nMOS design will assist in the
layout of GaAs circuits.
BASIC MOS TRANSISTORS
nMOS devices are formed in a p-type substrate of moderate doping level. The source and drain
regions are formed by diffusing n-type impurities through suitable masks into these areas to give the
desired n-impurity concentration. This gives rise to depletion regions which extend mainly into the more
lightly doped p-region.
• Thus, source and drain are isolated from one another by two diodes.
• Connections to the source and drain are made by a deposited metal layer (Fig. a).
• A polysilicon gate is deposited on a layer of insulation over the region between source and drain.
• If the gate is connected to a suitable positive voltage with respect to the source, then the
electric field established between the gate and the substrate gives rise to a charge inversion region in
the substrate under the gate insulation, and a conducting path or channel is formed between source and
drain.
• In a depletion mode device, the channel is established so that it is present under the condition Vgs = 0 by
implanting suitable impurities in the region between source and drain, beneath the gate insulation (Fig. b).
• In a pMOS device, the substrate is of n-type material and the source and drain diffusions are consequently p-type (Fig. c).
ENHANCEMENT MODE TRANSISTOR ACTION:
• In order to establish the channel in the first place, a minimum voltage level, the threshold voltage Vt,
must be established between gate and source.
• Fig (a) indicates the conditions prevailing with the channel established but no current flowing between source and drain (Vds = 0).
• Now consider the condition when current flows in the channel because a voltage Vds is applied between drain and source.
• There is a corresponding IR drop = Vds along the channel.
• This results in the voltage between gate and channel varying with distance along the channel, with the voltage being a maximum of Vgs at the source end.
• Since the effective gate voltage is Vg = Vgs - Vt, there will be voltage available to invert the channel at the drain end so long as Vgs - Vt >= Vds.
• The limiting condition comes when Vds = Vgs - Vt.
• For all voltages Vds < Vgs - Vt, the device is in the non-saturated region of operation.
• When Vds > Vgs - Vt, an IR drop = Vgs - Vt takes place over less than the whole length of the channel, so that over part of the channel, near the drain, there is insufficient electric field available to give rise to an inversion layer to create the channel.
• Diffusion current completes the path from source to drain in this case, causing the channel to exhibit a high resistance. This region of operation is known as the saturation region.
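The conditions listed above can be summarised as a small decision procedure. The following Python sketch is illustrative only: the function name `mos_region` and the example voltages are assumptions that do not appear in the notes, and treating Vgs <= Vt as cut-off is a simplifying choice.

```python
def mos_region(vgs: float, vds: float, vt: float) -> str:
    """Classify an enhancement-mode nMOS operating point (first-order view)."""
    if vgs <= vt:
        # Gate drive has not exceeded the threshold: no channel is established.
        return "cutoff"
    if vds < vgs - vt:
        # The channel is still inverted at the drain end.
        return "non-saturated"
    # Vds >= Vgs - Vt: the channel is pinched off near the drain.
    return "saturated"


if __name__ == "__main__":
    # Hypothetical operating points with Vt = 1.0 V.
    for vgs, vds in [(0.5, 1.0), (3.0, 1.0), (3.0, 2.0), (3.0, 5.0)]:
        print(vgs, vds, mos_region(vgs, vds, vt=1.0))
```

The boundary case Vds = Vgs - Vt is reported as saturated, matching the statement that the limiting condition occurs at that point.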
DEPLETION MODE TRANSISTOR ACTION
• The channel is established, due to the implant, even when Vgs = 0, and to cause the channel to cease
to exist a negative voltage Vtd must be applied between gate and source.
• Vtd is typically < -0.8 Vdd, depending on the implant and substrate bias but, threshold voltage differences apart, the action is similar to that of the enhancement mode transistor.
Drain to source current Ids versus voltage Vds relationships
• The whole concept of the MOS transistor evolves from the use of a voltage on the gate to induce a
charge in the channel between source and drain, which may then be caused to move from source to
drain under the influence of an electric field created by voltage Vds applied between drain and source.
• Since the charge induced is dependent on the gate to source voltage Vgs, Ids is dependent on
both Vgs and Vds.
• Consider a structure in which electrons will flow from source to drain.
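$$I_{ds} = -I_{sd} = \frac{\text{charge induced in channel } (Q_c)}{\text{electron transit time } (\tau)} \qquad (1)$$

First, the transit time is

$$\tau_{sd} = \frac{\text{length of channel } (L)}{\text{velocity } (v)}$$

But velocity $v = \mu E_{ds}$, where μ = electron or hole mobility (surface) and Eds = electric field (drain to source).

Now $E_{ds} = V_{ds}/L$, so that $v = \mu V_{ds}/L$. Thus,

$$\tau_{sd} = \frac{L^2}{\mu V_{ds}} \qquad (2)$$

Typical values of μ at room temperature are μn = 650 cm²/V·s (surface) and μp = 240 cm²/V·s (surface).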
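As a rough numerical illustration of the transit time relation τ = L² / (μ Vds), the sketch below evaluates it for the two surface mobilities quoted above. The channel length (2 μm) and drain voltage (5 V) are hypothetical values chosen only for the example, and the function name is an assumption.

```python
MU_N = 650.0  # electron surface mobility, cm^2/(V*s), value quoted in the text
MU_P = 240.0  # hole surface mobility, cm^2/(V*s), value quoted in the text


def transit_time_s(length_cm: float, mobility: float, vds: float) -> float:
    """Source-to-drain transit time in seconds: tau = L^2 / (mu * Vds)."""
    if vds <= 0:
        raise ValueError("Vds must be positive for carriers to drift")
    return length_cm ** 2 / (mobility * vds)


if __name__ == "__main__":
    L_CM = 2e-4  # hypothetical 2 um channel, expressed in cm
    print("electrons:", transit_time_s(L_CM, MU_N, 5.0), "s")
    print("holes:    ", transit_time_s(L_CM, MU_P, 5.0), "s")
```

Because τ scales with L², halving the channel length cuts the transit time by a factor of four in this first-order picture.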
Non Saturated region:
• Charge induced in the channel due to the gate voltage is due to the voltage difference between the gate
and the channel, Vgs.
• Voltage along the channel varies linearly with distance X from the source due to the IR drop in the
channel.
• Assuming the device is not saturated, the average value is Vds/2.
• Effective gate voltage Vg = Vgs - Vt, where Vt is the threshold voltage needed to invert the charge
under the gate and establish the channel.
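Charge per unit area = $E_g \varepsilon_{ins} \varepsilon_0$. Thus the induced charge is

$$Q_c = E_g \varepsilon_{ins} \varepsilon_0 W L$$

where
Eg = average electric field gate to channel
εins = relative permittivity of insulation between gate and channel
ε0 = permittivity of free space = 8.85 × 10⁻¹⁴ F cm⁻¹

$$E_g = \frac{(V_{gs} - V_t) - V_{ds}/2}{D}$$

where D = oxide thickness. Thus

$$Q_c = \frac{W L \varepsilon_{ins} \varepsilon_0}{D}\left((V_{gs} - V_t) - \frac{V_{ds}}{2}\right) \qquad (3)$$

Combining equations (2) and (3) in (1), we have

$$I_{ds} = \frac{\varepsilon_{ins}\varepsilon_0 \mu}{D}\,\frac{W}{L}\left((V_{gs} - V_t) - \frac{V_{ds}}{2}\right)V_{ds}$$

or

$$I_{ds} = K\,\frac{W}{L}\left((V_{gs} - V_t)V_{ds} - \frac{V_{ds}^2}{2}\right) \qquad (4)$$

in the non-saturated or resistive region, where Vds < Vgs - Vt and

$$K = \frac{\varepsilon_{ins}\varepsilon_0 \mu}{D}$$

The factor W/L is of course contributed by the geometry, and it is common practice to write β = K · W/L, so that

$$I_{ds} = \beta\left((V_{gs} - V_t)V_{ds} - \frac{V_{ds}^2}{2}\right) \qquad (4a) \text{ (alternate form of Eqn 4)}$$

Gate/channel capacitance (parallel plate):

$$C_g = \frac{\varepsilon_{ins}\varepsilon_0 W L}{D}$$

Also $K = C_g \mu / (W L)$, so

$$I_{ds} = \frac{C_g \mu}{L^2}\left((V_{gs} - V_t)V_{ds} - \frac{V_{ds}^2}{2}\right) \qquad (4b)$$

Sometimes it is convenient to use gate capacitance per unit area Co rather than Cg. Noting that Cg = Co WL, we may also write

$$I_{ds} = C_o \mu \frac{W}{L}\left((V_{gs} - V_t)V_{ds} - \frac{V_{ds}^2}{2}\right) \qquad (4c)$$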
Saturated region:
Saturation begins when Vds = Vgs - Vt, since at this point the IR drop in the channel equals the
effective gate to channel voltage at the drain. We may assume that the current remains fairly
constant as Vds increases further.
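The saturation current follows from this assumption in one step. As a sketch, take the non-saturated expression $I_{ds} = K \frac{W}{L}\left((V_{gs} - V_t)V_{ds} - V_{ds}^2/2\right)$ with $\beta = K\,W/L$, and substitute the limiting condition $V_{ds} = V_{gs} - V_t$:

```latex
\begin{aligned}
I_{ds} &= K\,\frac{W}{L}\left((V_{gs} - V_t)(V_{gs} - V_t) - \frac{(V_{gs} - V_t)^2}{2}\right) \\
       &= K\,\frac{W}{L}\,\frac{(V_{gs} - V_t)^2}{2}
        = \frac{\beta}{2}\,(V_{gs} - V_t)^2
\end{aligned}
```

Under this first-order model the result no longer contains Vds, which is what "the current remains fairly constant" means in practice.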
Ideal I-V Characteristics
Drain current of MOS device in different operating regions.
MOS transistors have three regions of operation:
• Cutoff or sub-threshold region •Linear region • Saturation region
The long-channel model assumes that the current through an OFF transistor is 0. When a transistor
turns ON (Vgs > Vt), the gate attracts carriers (electrons) to form a channel. The electrons drift from
source to drain at a rate proportional to the electric field between these regions. Thus, we can
compute currents if we know the amount of charge in the channel and the rate at which it moves. We
know that the charge on each plate of a capacitor is Q = CV. Thus, the charge in the channel Qchannel
is

$$Q_{channel} = C_g (V_{gc} - V_t)$$

where Cg is the capacitance of the gate to the channel and Vgc - Vt is
the amount of voltage attracting charge to the channel beyond the minimum required to invert from
p to n. The gate voltage is referenced to the channel, which is not grounded. If the source is at Vs and
the drain is at Vd, the average is Vc = (Vs + Vd)/2 = Vs + Vds/2. Therefore, the mean difference between
the gate and channel potentials Vgc is Vg - Vc = Vgs - Vds/2, as shown in Figure 2.5. We can model the
gate as a parallel plate capacitor with capacitance proportional to area over thickness. If the gate has
length L and width W and the oxide thickness is tox, as shown in Figure 2.6, the capacitance is

$$C_g = k_{ox}\varepsilon_0 \frac{WL}{t_{ox}} = \varepsilon_{ox}\frac{WL}{t_{ox}} = C_{ox} W L$$

where ε0 is the permittivity of free space, 8.85 × 10⁻¹⁴ F/cm, and the permittivity of SiO2 is
kox = 3.9 times as great. Often, the εox/tox term is called Cox, the capacitance per unit area of the
gate oxide.
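To give a feel for the magnitudes involved, the following sketch evaluates Cox = kox ε0 / tox and the resulting gate capacitance. The constants come from the text; the oxide thickness (100 Å) and gate dimensions are hypothetical example values, and the function names are assumptions.

```python
EPS0_F_PER_CM = 8.85e-14  # permittivity of free space, F/cm (from the text)
K_OX = 3.9                # relative permittivity of SiO2 (from the text)


def cox_f_per_cm2(tox_cm: float) -> float:
    """Gate oxide capacitance per unit area, Cox = kox * eps0 / tox."""
    return K_OX * EPS0_F_PER_CM / tox_cm


def gate_capacitance_f(w_cm: float, l_cm: float, tox_cm: float) -> float:
    """Parallel-plate gate capacitance, Cg = Cox * W * L."""
    return cox_f_per_cm2(tox_cm) * w_cm * l_cm


if __name__ == "__main__":
    tox = 100e-8  # hypothetical 100 Angstrom oxide, in cm
    print("Cox =", cox_f_per_cm2(tox), "F/cm^2")
    # Hypothetical gate: W = 1 um, L = 0.5 um, both converted to cm.
    print("Cg  =", gate_capacitance_f(1e-4, 0.5e-4, tox), "F")
```

Keeping every length in centimetres avoids unit mistakes, since ε0 is quoted in F/cm.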
Some nanometer processes use a different gate dielectric with a higher dielectric constant. In these
processes, tox is the equivalent oxide thickness (EOT), the thickness of a layer of SiO2 that has the
same Cox. In this case, tox is thinner than the actual dielectric. Each carrier in the channel is
accelerated to an average velocity, v, proportional to the lateral electric field, i.e., the field between
source and drain. The constant of proportionality μ is called the mobility:

$$v = \mu E$$

The electric field E is the voltage difference between drain and source Vds divided by the channel length:

$$E = \frac{V_{ds}}{L}$$

The time required for carriers to cross the channel is the channel length divided by the carrier
velocity: L/v. Therefore, the current between source and drain is the total amount of charge in the
channel divided by the time required to cross:

$$I_{ds} = \frac{Q_{channel}}{L/v} = \mu C_{ox}\frac{W}{L}\left(V_{gs} - V_t - \frac{V_{ds}}{2}\right)V_{ds} \qquad (2.5)$$
The term Vgs - Vt arises so often that it is convenient to abbreviate it as VGT. Equation (2.5) describes the
linear region of operation, for Vgs > Vt, but Vds relatively small. It is called linear or resistive
because when Vds << VGT, Ids increases almost linearly with Vds, just like an ideal resistor. The
geometry and technology-dependent parameters are sometimes merged into a single factor β = μ Cox W/L.
If Vds > Vdsat ≡ VGT, the channel is no longer inverted in the vicinity of the drain; we say it is pinched
off. Beyond this point, called the drain saturation voltage, increasing the drain voltage has no further
effect on current. Substituting Vds = Vdsat at this point of maximum current into Eq. (2.5), we find an
expression for the saturation current that is independent of Vds:

$$I_{ds} = \frac{\mu C_{ox}}{2}\frac{W}{L} V_{GT}^2 = \frac{\beta}{2} V_{GT}^2$$

This expression is valid for Vgs > Vt and Vds > Vdsat. Thus, long-channel MOS transistors are said to
exhibit square-law behavior in saturation.
Two key figures of merit for a transistor are Ion and Ioff. Ion (also called Idsat) is the ON current,
Ids, when Vgs = Vds = VDD. Ioff is the OFF current when Vgs = 0 and Vds = VDD. According to the
long-channel model, Ioff = 0 and

$$I_{on} = I_{dsat} = \frac{\beta}{2}(V_{DD} - V_t)^2$$

Figure 2.7(a) shows the I-V characteristics for the transistor. According to the first-order model, the current
is zero for gate voltages below Vt. For higher gate voltages, current increases linearly with Vds for
small Vds. As Vds reaches the saturation point Vdsat=VGT, current rolls off and eventually becomes
independent of Vds when the transistor is saturated. pMOS transistors behave in the same way, but
with the signs of all voltages and currents reversed. The I-V characteristics are in the third quadrant,
as shown in Figure 2.7(b).
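The three regions of the long-channel model can be collected into one piecewise function. The sketch below is a minimal illustration for an nMOS device: it assumes β = μ Cox W/L is supplied directly, and the parameter values in the usage lines (β = 100 μA/V², Vt = 0.4 V) are hypothetical.

```python
def long_channel_ids(vgs: float, vds: float, vt: float, beta: float) -> float:
    """First-order (long-channel) nMOS drain current in amperes."""
    vgt = vgs - vt                      # overdrive voltage VGT = Vgs - Vt
    if vgt <= 0:
        return 0.0                      # cutoff: the model assumes zero current
    if vds < vgt:
        return beta * (vgt - vds / 2) * vds   # linear (resistive) region
    return 0.5 * beta * vgt ** 2        # saturation: independent of Vds


if __name__ == "__main__":
    BETA, VT = 100e-6, 0.4  # hypothetical device parameters
    for vds in (0.1, 0.3, 0.6, 1.0):
        print(vds, long_channel_ids(1.0, vds, VT, BETA))
```

The two non-zero branches give the same value at Vds = VGT, so the modeled curve is continuous at the saturation point.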
Non -Ideal I-V Effects
The saturation current increases less than quadratically with increasing Vgs . This is caused
by two effects: velocity saturation and mobility degradation.
• At high lateral field strengths (Vds/L), carrier velocity ceases to increase linearly with field
strength. This is called velocity saturation and results in lower Ids than expected at high Vds.
• At high vertical field strengths (Vgs/tox), the carriers scatter off the oxide interface more
often, slowing their progress. This mobility degradation effect also leads to less current than
expected at high Vgs.
• The saturation current of the nonideal transistor increases somewhat with Vds. This is caused
by channel length modulation, in which higher Vds increases the size of the depletion region
around the drain and thus effectively shortens the channel.
• Increasing the potential between the source and body raises the threshold through the body
effect. Increasing the drain voltage lowers the threshold through drain-induced barrier
lowering. Increasing the channel length raises the threshold through the short channel effect.
• When Vgs < Vt, the current drops off exponentially rather than abruptly becoming zero. This is
called subthreshold conduction. The current into the gate Ig is ideally 0. However, as the
thickness of gate oxides reduces to only a small number of atomic layers, electrons tunnel through
the gate, causing some gate leakage current. The source and drain diffusions are typically reverse-biased
diodes and also experience junction leakage into the substrate or well.
Both mobility and threshold voltage decrease with rising temperature. The mobility effect
tends to dominate for strongly ON transistors, resulting in lower Ids at high temperature. The
threshold effect is most important for OFF transistors, resulting in higher leakage current at high
temperature. In summary, MOS characteristics degrade with temperature.
Mobility Degradation and Velocity Saturation
• Carrier drift velocity, and hence current, is proportional to the lateral electric field Elat = Vds/L
between source and drain. The constant of proportionality is called the carrier mobility, μ. The long-channel
model assumed that carrier mobility is independent of the applied fields.
• A high voltage at the gate of the transistor attracts the carriers to the edge of the channel, causing
collisions with the oxide interface that slow the carriers. This is called mobility degradation.
• Carriers approach a maximum velocity vsat when high fields are applied. This phenomenon is
called velocity saturation.
Channel Length Modulation
Ideally, Ids is independent of Vds for a transistor in saturation, making the transistor a perfect
current source. The p–n junction between the drain and body forms a depletion region with a width Ld
that increases with Vdb. The depletion region effectively shortens the channel length to Leff = L - Ld.
Assume the source voltage is close to the body voltage so Vdb = Vds. Hence, increasing Vds
decreases the effective channel length. Shorter channel length results in higher current; thus, Ids
increases with Vds in saturation. This can be crudely modeled by multiplying EQ (2.10) by a factor of
(1 + Vds / VA), where VA is called the Early voltage. In the saturation region, this gives

$$I_{ds} = \frac{\beta}{2} V_{GT}^2\left(1 + \frac{V_{ds}}{V_A}\right)$$
As channel length gets shorter, the effect of the channel length modulation becomes relatively more
important. Hence, VA is proportional to channel length. This channel length modulation model is a
gross oversimplification of nonlinear behavior and is more useful for conceptual understanding than
for accurate device modeling.
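A small numerical sketch shows how the (1 + Vds/VA) factor tilts the otherwise flat saturation curve. It assumes the square-law saturation current (β/2)(Vgs - Vt)² as the baseline; the device values (β = 100 μA/V², Vt = 0.4 V, VA = 10 V) are hypothetical.

```python
def ids_sat_with_clm(vgs: float, vds: float, vt: float,
                     beta: float, va: float) -> float:
    """Saturation current with the crude channel length modulation factor."""
    vgt = vgs - vt
    if vgt <= 0:
        return 0.0
    # Ideal square-law current scaled by (1 + Vds / VA).
    return 0.5 * beta * vgt ** 2 * (1.0 + vds / va)


if __name__ == "__main__":
    BETA, VT, VA = 100e-6, 0.4, 10.0  # hypothetical parameters
    i1 = ids_sat_with_clm(1.0, 1.0, VT, BETA, VA)
    i2 = ids_sat_with_clm(1.0, 2.0, VT, BETA, VA)
    # The slope dIds/dVds acts as a finite output conductance.
    print(i1, i2, (i2 - i1) / 1.0)
```

A larger VA (longer channel) makes the slope smaller, which brings the device closer to the ideal current source described above.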
Threshold Effects
So far, we have treated the threshold voltage as a constant. However, Vt increases with the source
voltage, decreases with the body voltage, decreases with the drain voltage, and increases with channel
length. This section models each of these effects.
Body Effect
The body is an implicit fourth terminal. When a voltage Vsb is applied between the source and body,
it increases the amount of charge required to invert the channel, hence, it increases the threshold
voltage. The threshold voltage can be modeled as

$$V_t = V_{t0} + \gamma\left(\sqrt{\phi_s + V_{sb}} - \sqrt{\phi_s}\right)$$

where Vt0 is the threshold voltage when the source is at the body potential, ϕs is the surface potential
at threshold, and γ is the body effect coefficient, typically in the range 0.4 to 1 V^(1/2).
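The body effect is commonly written as Vt = Vt0 + γ(√(ϕs + Vsb) − √ϕs). The sketch below evaluates that form; the numbers used (Vt0 = 0.3 V, γ = 0.4 V^(1/2), ϕs = 0.64 V) are hypothetical values picked so the arithmetic is easy to follow.

```python
import math


def threshold_with_body_effect(vt0: float, gamma: float,
                               phi_s: float, vsb: float) -> float:
    """Vt = Vt0 + gamma * (sqrt(phi_s + Vsb) - sqrt(phi_s))."""
    if phi_s + vsb < 0:
        raise ValueError("phi_s + Vsb must be non-negative")
    return vt0 + gamma * (math.sqrt(phi_s + vsb) - math.sqrt(phi_s))


if __name__ == "__main__":
    # Hypothetical parameters; Vsb = 0 should simply return Vt0.
    print(threshold_with_body_effect(0.3, 0.4, 0.64, 0.0))
    print(threshold_with_body_effect(0.3, 0.4, 0.64, 0.36))
```

With Vsb = 0.36 V the square roots evaluate to 1.0 and 0.8, so the threshold rises by 0.4 × 0.2 = 0.08 V.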
Drain-Induced Barrier Lowering (DIBL)
The drain voltage Vds creates an electric field that affects the threshold voltage. This drain-induced
barrier lowering (DIBL) effect is especially pronounced in short-channel transistors.
• It can be modeled as Vt = Vt0 - ηVds, where η is the DIBL coefficient, typically on the order
of 0.1 (often expressed as 100 mV/V).
Drain-induced barrier lowering causes Ids to increase with Vds in saturation, in much the same way as
channel length modulation does. This effect can be lumped into a smaller Early voltage VA.
Short Channel Effects
The threshold voltage typically increases with channel length. This phenomenon is especially
pronounced for small L where the source and drain depletion regions extend into a significant portion
of the channel, and hence is called the short channel effect or Vt rolloff.
Leakage
• Even when transistors are nominally OFF, they leak small amounts of current. Leakage
mechanisms include subthreshold conduction between source and drain, gate leakage from the
gate to body, and junction leakage from source to body and drain to body.
• Subthreshold conduction is caused by thermal emission of carriers over the potential barrier set by
the threshold. Gate leakage is a quantum-mechanical effect caused by tunneling through the
extremely thin gate dielectric. Junction leakage is caused by current through the p-n junction
between the source/drain diffusions and the body.
Subthreshold Leakage
• The long-channel transistor I-V model assumes current only flows from source to drain when
Vgs > Vt. In real transistors, current does not abruptly cut off below threshold, but rather drops off
exponentially.
• When the gate voltage is high, the transistor is strongly ON. When the gate falls below Vt, the
exponential decline in current appears as a straight line on the logarithmic scale. This regime of
Vgs < Vt is called weak inversion.
• The subthreshold leakage current increases significantly with Vds because of drain-induced
barrier lowering. There is a lower limit on Ids set by drain junction leakage that is exacerbated by
the negative gate voltage.
• Subthreshold leakage current is described by EQ (2.42). Ids0 is the current at threshold and is
dependent on process and device geometry.
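EQ (2.42) itself is not reproduced in these notes. A commonly used form of the subthreshold current, assumed here, is Ids = Ids0 · exp((Vgs − Vt0 + ηVds) / (n·vT)) · (1 − exp(−Vds / vT)), with the body-effect term dropped (Vsb = 0). The sketch below evaluates that assumed form; the slope factor n, the thermal voltage vT = 26 mV and every numeric value are illustrative assumptions rather than data from the text.

```python
import math


def subthreshold_ids(vgs: float, vds: float, ids0: float, vt0: float,
                     eta: float, n: float, v_thermal: float = 0.026) -> float:
    """Assumed weak-inversion current model with DIBL, Vsb taken as 0."""
    # Exponential dependence on gate drive, with the threshold lowered by eta*Vds.
    gate_term = math.exp((vgs - vt0 + eta * vds) / (n * v_thermal))
    # This factor forces the current to zero when Vds = 0.
    drain_term = 1.0 - math.exp(-vds / v_thermal)
    return ids0 * gate_term * drain_term


def subthreshold_slope_v_per_decade(n: float, v_thermal: float = 0.026) -> float:
    """Gate voltage change needed for a 10x change in current: n * vT * ln(10)."""
    return n * v_thermal * math.log(10.0)


if __name__ == "__main__":
    args = dict(ids0=1e-6, vt0=0.4, eta=0.1, n=1.5)  # hypothetical values
    print(subthreshold_ids(0.0, 1.0, **args))        # OFF current at Vds = 1 V
    print(subthreshold_slope_v_per_decade(1.5))      # roughly 0.09 V/decade
```

On a logarithmic current axis this model is a straight line in Vgs, which is the behaviour described for weak inversion.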
Gate Leakage
According to quantum mechanics, the electron cloud surrounding an atom has a probabilistic spatial
distribution. For gate oxides thinner than 15–20 Å, there is a nonzero probability that an electron in the
gate will find itself on the wrong side of the oxide, where it will get whisked away
through the channel. This effect of carriers crossing a thin barrier is called tunneling, and results in
leakage current through the gate.
Two physical mechanisms for gate tunneling are called Fowler-Nordheim (FN) tunneling and
direct tunneling. FN tunneling is most important at high voltage and moderate oxide thickness and is
used to program EEPROM memories. Direct tunneling is most important at lower voltage with thin
oxides and is the dominant leakage component. The direct gate tunneling current can be estimated as

$$I_{gate} = W A \left(\frac{V_{DD}}{t_{ox}}\right)^2 e^{-B\,t_{ox}/V_{DD}}$$

where A and B are technology constants.
Junction Leakage
The p–n junctions between diffusion and the substrate or well form diodes. The well-to-
substrate junction is another diode. The substrate and well are tied to GND or VDD to ensure these
diodes do not become forward biased in normal operation. However, reverse-biased diodes still
conduct a small amount of current ID:

$$I_D = I_S\left(e^{V_D/v_T} - 1\right)$$

where IS depends on doping levels and on the area and perimeter of the diffusion region and VD is the
diode voltage (e.g., -Vsb or -Vdb). When a junction is reverse biased by significantly
more than the thermal voltage vT, the leakage is just -IS, generally in the 0.1–0.01 fA/μm² range, which
is negligible compared to other leakage mechanisms.
More significantly, heavily doped drains are subject to band-to-band tunneling (BTBT) and
gate-induced drain leakage (GIDL).
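The reverse-bias limit of junction leakage is easy to check numerically with the ideal diode relation ID = IS(exp(VD/vT) − 1). In this sketch the saturation current (1 fA) and thermal voltage (26 mV) are assumed example values; BTBT and GIDL are not modeled.

```python
import math


def junction_current(vd: float, i_s: float, v_thermal: float = 0.026) -> float:
    """Ideal diode current I_D = I_S * (exp(V_D / vT) - 1)."""
    # math.expm1 computes exp(x) - 1 accurately, including for small x.
    return i_s * math.expm1(vd / v_thermal)


if __name__ == "__main__":
    I_S = 1e-15  # hypothetical saturation current, amperes
    for vd in (0.0, -0.026, -0.1, -1.0):
        print(vd, junction_current(vd, I_S))
```

Once the reverse bias is several thermal voltages, the exponential term vanishes and the current settles at −IS, as stated above.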
Temperature Dependence
Transistor characteristics are influenced by temperature. Carrier mobility decreases with temperature.
An approximate relation is

$$\mu(T) = \mu(T_r)\left(\frac{T}{T_r}\right)^{-k_\mu}$$

where T is the absolute temperature, Tr is room temperature, and kμ is a fitting parameter with a
typical value of about 1.5. vsat also decreases with temperature, dropping by about 20% from 300 to
400 K. The magnitude of the threshold voltage decreases nearly linearly with temperature and may
be approximated by

$$V_t(T) = V_t(T_r) - k_{vt}(T - T_r)$$

where kvt is typically about 1–2 mV/K. Ion at high VDD decreases with temperature. Subthreshold leakage
increases exponentially with temperature.
Conversely, circuit performance can be improved by cooling. Operating at low temperature has several advantages:
• Subthreshold leakage is exponentially dependent on temperature, so lower threshold voltages can
be used.
• Velocity saturation occurs at higher fields, providing more current.
• As mobility is also higher, these fields are reached at a lower power supply, saving power.
• Depletion regions become wider, resulting in less junction capacitance.
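The temperature trends can be made concrete with the usual approximations μ(T) = μ(Tr)(T/Tr)^(−kμ) and Vt(T) = Vt(Tr) − kvt(T − Tr). The sketch below assumes those forms with kμ = 1.5 and kvt = 1 mV/K, both within the typical ranges quoted; the room-temperature mobility and threshold are hypothetical.

```python
def mobility_at(mu_room: float, temp_k: float,
                temp_room_k: float = 300.0, k_mu: float = 1.5) -> float:
    """Mobility scaled by the power law (T / Tr) ** (-k_mu)."""
    return mu_room * (temp_k / temp_room_k) ** (-k_mu)


def threshold_at(vt_room: float, temp_k: float,
                 temp_room_k: float = 300.0, k_vt: float = 1e-3) -> float:
    """Threshold magnitude falling linearly by k_vt volts per kelvin."""
    return vt_room - k_vt * (temp_k - temp_room_k)


if __name__ == "__main__":
    for t in (250.0, 300.0, 400.0):
        # Hypothetical room-temperature values: 650 cm^2/(V*s) and 0.4 V.
        print(t, mobility_at(650.0, t), threshold_at(0.4, t))
```

Heating from 300 K to 400 K cuts the modeled mobility to roughly 65% of its room-temperature value, while cooling raises it, which is the basis for the low-temperature advantages listed above.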
Geometry Dependence
• The layout designer draws transistors with width and length Wdrawn and Ldrawn. The actual gate
dimensions may differ by some factors XW and XL.
• Moreover, the source and drain tend to diffuse laterally under the gate by LD, producing a shorter effective
channel length that the carriers must traverse between source and drain. Similarly, WD accounts
for other effects that shrink the transistor width. Putting these factors together, the effective dimensions are

$$L_{eff} = L_{drawn} + X_L - 2L_D, \qquad W_{eff} = W_{drawn} + X_W - 2W_D$$

The factors of two come from lateral diffusion on both sides of the channel.
• Therefore, a transistor drawn twice as long may have an effective length that is more than twice as
great. Similarly, two transistors differing in drawn widths by a factor of two may differ in
saturation current by more than a factor of two.
• Threshold voltages also vary with transistor dimensions because of the short and narrow channel
effects.
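The claim that doubling the drawn length can more than double the effective length follows directly from Leff = Ldrawn + XL − 2LD. The sketch below works in nanometres; the drawn length (250 nm), XL = 0 and LD = 25 nm are hypothetical values chosen for the illustration.

```python
def effective_length_nm(l_drawn: int, x_l: int, l_d: int) -> int:
    """Leff = Ldrawn + XL - 2*LD, all in nanometres."""
    return l_drawn + x_l - 2 * l_d


if __name__ == "__main__":
    X_L, L_D = 0, 25                      # hypothetical process offsets
    short = effective_length_nm(250, X_L, L_D)
    long_ = effective_length_nm(500, X_L, L_D)
    # The drawn ratio is exactly 2; the effective ratio is larger.
    print(short, long_, long_ / short)
```

Because the 2·LD loss is a fixed amount, it removes a larger fraction of a short device than of a long one.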
Combining threshold changes, effective channel lengths, channel length modulation, and
velocity saturation effects, Idsat does not scale exactly as 1/L. In general, when currents must be
precisely matched (e.g., in sense amplifiers or A/D converters), it is best to use the same width and
length for each device. Current ratios can be produced by tying several identical transistors in parallel.
CMOS TECHNOLOGIES
CMOS provides an inherently low power static circuit technology that has the capability of
providing a lower power-delay product than comparable design-rule nMOS or pMOS technologies. The
four dominant CMOS technologies are:
p-well process
n-well process
twin-tub process
Silicon on insulator (SOI) process
nMOS FABRICATION
• Processing is carried out on a thin wafer cut from a single crystal of silicon of high purity into
which the required p-impurities are introduced as the crystal is grown.
• A layer of silicon dioxide (SiO2), typically 1 μm thick, is grown all over the surface of the wafer
to protect the surface, act as a barrier to dopants during processing and provide a generally
insulating substrate on to which other layers may be deposited and patterned.
• The surface is now covered with a photoresist which is deposited onto the wafer and spun to
achieve an even distribution of the required thickness.
• The photoresist layer is then exposed to ultraviolet light through a mask which defines those
regions into which diffusion is to take place together with transistor channels.
• These areas are subsequently readily etched away together with the underlying silicon dioxide so
that the wafer surface is exposed in the window defined by the mask.
• Remaining photoresist is removed and a thin layer of SiO2 is grown over the entire chip surface
and then polysilicon is deposited on top of this to form the gate structure. The polysilicon layer consists of
heavily doped polysilicon deposited by chemical vapor deposition (CVD).
• Photoresist coating and masking allows the polysilicon to be patterned, and then the thin oxide is
removed to expose areas into which n-type impurities are to be diffused.
• Thick oxide is grown over all again and is then masked with photoresist and etched to expose
selected areas of the polysilicon gate and the drain and source areas where connections are to be
made.
• The whole chip then has metal (Al) deposited over its surface to a thickness typically of 1 μm.
This metal layer is then masked and etched to form the required interconnection pattern.
CMOS FABRICATION
• The p-well process is widely used in practice, and the n-well process is also popular.
P-well process
• The diffusion must be carried out with special care since the p-well doping concentration and depth
will affect the threshold voltages as well as the breakdown voltages of the n-transistors.
• To achieve low threshold voltages (0.6 to 1.0 V) we need either deep well diffusion or high well
resistivity.
• But deep wells require larger spacing due to lateral diffusion and therefore a larger chip area.
• The p-wells act as substrates for the n-devices within the parent n-substrate and, provided that voltage
polarity restrictions are observed, the two areas are electrically isolated.
Layout Design rules
Layout design rules describe how small features can be and how closely they can be reliably
packed in a particular manufacturing process. Industrial design rules are usually specified in
microns. This makes migrating from one process to a more advanced process or a different foundry's
process difficult because not all rules scale in the same way.
Mead and Conway popularized scalable design rules based on a single parameter, λ, that
characterizes the resolution of the process. λ is generally half of the minimum drawn transistor
channel length. This length is the distance between the source and drain of a transistor and is set by
the minimum width of a polysilicon wire. Designers often describe a process by its feature size.
Feature size refers to minimum transistor length, so λ is half the feature size.
For example, a 180 nm process has a minimum polysilicon
width (and hence transistor length) of 0.18 μm and uses design rules with λ = 0.09 μm. Lambda-based
rules are necessarily conservative because they round up dimensions to an integer multiple of λ.
A conservative but easy-to-use set of design rules for layouts with two metal layers in an n-well
process is as follows:
• Metal and diffusion have minimum width and spacing of 4 λ.
• Contacts are 2 λ × 2 λ and must be surrounded by 1 λ on the layers above and below.
• Polysilicon uses a width of 2 λ.
• Polysilicon overlaps diffusion by 2 λ where a transistor is desired and has a spacing
of 1 λ away where no transistor is desired.
• Polysilicon and contacts have a spacing of 3 λ from other polysilicon or contacts.
• N-well surrounds pMOS transistors by 6 λ and avoids nMOS transistors by 6 λ.
Transistor dimensions are often specified by their Width/Length (W/L) ratio. For example, the
nMOS transistor in Figure 1.39 formed where polysilicon crosses n-diffusion has a W/L of 4/2. In a
0.6 μm process, this corresponds to an actual width of 1.2 μm and a length of 0.6 μm. Such a
minimum-width contacted transistor is often called a unit transistor.
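Converting λ-based dimensions to physical dimensions is a one-line calculation, sketched below. It assumes λ is half the feature size, as stated for scalable rules; the function names are illustrative, not from any layout tool.

```python
def lambda_um(feature_size_um: float) -> float:
    """Lambda is half the minimum drawn channel length (feature size)."""
    return feature_size_um / 2.0


def to_microns(n_lambda: float, feature_size_um: float) -> float:
    """Convert a dimension given in multiples of lambda to microns."""
    return n_lambda * lambda_um(feature_size_um)


if __name__ == "__main__":
    # Unit transistor with W/L = 4/2 in a 0.6 um process.
    print("W =", to_microns(4, 0.6), "um")
    print("L =", to_microns(2, 0.6), "um")
    # Lambda for a 180 nm process.
    print("lambda =", lambda_um(0.18), "um")
```

For the 4/2 unit transistor in a 0.6 μm process this reproduces the 1.2 μm width and 0.6 μm length quoted above.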
pMOS transistors are often wider than nMOS transistors because holes move more slowly than
electrons so the transistor has to be wider to deliver the same current. Figure 1.40(a) shows a unit
inverter layout with a unit nMOS transistor and a double-sized pMOS transistor. Figure 1.40(b)
shows a schematic for the inverter annotated with Width/ Length for each transistor. In digital
systems, transistors are typically chosen to have the minimum possible length because short-channel
transistors are faster, smaller, and consume less power. Figure 1.40(c) shows a shorthand we will
often use, specifying multiples of unit width and assuming minimum length.
Gate layouts
Line of Diffusion based style consists of four horizontal strips:
Metal ground at the bottom of the cell, n-diffusion, p-diffusion, and metal power at the top.
The power and ground lines are often called supply rails. Polysilicon lines run vertically to form
transistor gates. Metal wires within the cell connect the transistors appropriately.
Figure 1.41(a) shows such a layout for an inverter. The input A can be connected from the
top, bottom, or left in polysilicon. The output Y is available at the right side of the cell in metal.
Recall that the p-substrate and n-well must be tied to ground and power, respectively.
Figure 1.41(b) shows the same inverter with well and substrate taps placed under the power
and ground rails, respectively. Figure 1.42 shows a 3-input NAND gate. Notice how the nMOS
transistors are connected in series while the pMOS transistors are connected in parallel. Power and
ground extend 2 λ on each side so if two gates were abutted the contents would be separated by 4 λ,
satisfying design rules. The height of the cell is 36 λ, or 40 λ if the 4 λ space between the cell and
another wire above it is counted. All these examples use transistors of width 4 λ.
UNIT II COMBINATIONAL CIRCUIT DESIGN
DESIGN PRINCIPLE OF STATIC CMOS DESIGN
Digital CMOS circuits are implemented using either static or dynamic design
techniques. In static CMOS, the output is tied to VDD or ground via a low resistance path
(except during switching), which leads to robust circuit implementations with good noise
immunity. In static CMOS design any function can be realized as a sum of products (SOP) or
a product of sums (POS). If an SOP function pulls the output high, then an SOP-BAR function
will pull the output low. A POS function can pull the output high, while a POS-BAR function
can pull the output low, as shown in the figure.
Important properties of static CMOS design:
At any instant of time, the output of the gate is directly connected to Vss or VDD. All
functions are composed of either AND'ed or OR'ed sub-functions. The AND function is
composed of NMOS transistors in series. The OR function is composed of NMOS transistors
in parallel. A gate contains a pull-up network (PUP) and a pull-down network (PDN). PUP networks
consist of PMOS transistors. PDN networks consist of NMOS transistors. Each network is
the dual of the other network. The output of the complementary gate is inverted.
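The duality of the two networks can be checked with a switch-level sketch. The example below models a 2-input NAND as an assumed illustration: series nMOS transistors in the PDN and parallel pMOS transistors in the PUP. The function names are hypothetical, and the model ignores all electrical detail.

```python
from itertools import product


def pdn_conducts(a: int, b: int) -> bool:
    """Series nMOS switches: the path to Vss closes only if both gates are 1."""
    return bool(a) and bool(b)


def pup_conducts(a: int, b: int) -> bool:
    """Parallel pMOS switches: the path to VDD closes if either gate is 0."""
    return (not a) or (not b)


def gate_output(a: int, b: int) -> int:
    """Output of the complementary gate, checking that exactly one network is on."""
    up, down = pup_conducts(a, b), pdn_conducts(a, b)
    if up == down:
        raise RuntimeError("networks are not complementary for this input")
    return 1 if up else 0


if __name__ == "__main__":
    for a, b in product((0, 1), repeat=2):
        print(a, b, gate_output(a, b))
```

For every input combination exactly one network conducts, so the output is always driven, and the function realised is the inverted AND of the inputs.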
Advantages of static CMOS design:
• Robust in construction.
• Good noise immunity.
• Static logic has no minimum clock rate; the clock can be paused indefinitely.
• Low power consumption.
• For low operating frequencies, CMOS static logic is used to obtain a relatively small
die size.
Limitations of static CMOS design:
The main limitation of static circuits is slower-speed as compared to dynamic circuits. The
reasons are
1. Increased gate capacitance due to the presence of both PMOS and NMOS transistors.
2. Output depends on the previous cycle inputs due to charges that may be present at internal
inputs.
3. Multiple switching of the output within a cycle depending on the input switching pattern
MOSFETS as Switches
The gate controls the passage of current between the source and the drain. CMOS uses
positive logic: VDD is logic '1' and Vss is logic '0'. We turn a transistor on or off using the
gate terminal. There are two kinds of CMOS transistors, n-channel transistors and p-channel
transistors. An n-channel transistor requires a logic '1' on the gate to make the switch
conducting (to turn the transistor on). A p-channel transistor requires a logic '0' on the gate
to make the switch conducting (to turn the transistor on). The conventional schematic icon
representation along with the switch characteristics is shown.
Basic CMOS Gates
In this section, the basic gate implementations in static CMOS are presented.
AND Gate
If two N-switches are placed in series, the composite switch constructed by this action is
closed (or ON) only if both switches are connected to logic '1'. If any one of the switches is at logic
'0', the circuit is in the open (or OFF) state. This yields an 'AND' function. The switch logic
of the AND function is shown in the figure.
OR Gate
If two N-switches are placed in parallel, the composite switch constructed by this action is
closed (or ON) if any one of the switches is connected to logic '1'. This yields an 'OR' function.
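The series and parallel switch compositions described above can be sketched as ideal
Boolean switches. This is only a logic-level illustration; the function names series and
parallel are hypothetical and no electrical behaviour is modelled.

```python
def series(*switches: bool) -> bool:
    """N-switches in series conduct only when every switch is closed (AND)."""
    return all(switches)


def parallel(*switches: bool) -> bool:
    """N-switches in parallel conduct when at least one switch is closed (OR)."""
    return any(switches)


if __name__ == "__main__":
    # Print the truth table of both composite switches.
    print("A B | series parallel")
    for a in (False, True):
        for b in (False, True):
            print(int(a), int(b), "|", int(series(a, b)), "     ", int(parallel(a, b)))
```

The same two helpers extend to any number of inputs, which mirrors how wider gates are built
from longer series chains or more parallel branches.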
Bubble Pushing
CMOS stages are inherently inverting, so AND and OR functions must be built from
NAND and NOR gates. DeMorgan's law helps with this conversion:
A NAND gate is equivalent to an OR of inverted inputs. A NOR gate is equivalent to
an AND of inverted inputs. The same relationship applies to gates with more inputs.
Switching between these representations is easy to do on a whiteboard and is often called
bubble pushing.
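The two equivalences used in bubble pushing can be checked exhaustively with a short
script. This is a minimal sketch; the helper names nand, nor and demorgan_holds are
introduced here for illustration only.

```python
from itertools import product


def nand(*xs: bool) -> bool:
    return not all(xs)


def nor(*xs: bool) -> bool:
    return not any(xs)


def demorgan_holds(n_inputs: int) -> bool:
    """Check both bubble-pushing identities for every input combination."""
    for bits in product((False, True), repeat=n_inputs):
        inverted = [not b for b in bits]
        # NAND is an OR of inverted inputs.
        if nand(*bits) != any(inverted):
            return False
        # NOR is an AND of inverted inputs.
        if nor(*bits) != all(inverted):
            return False
    return True


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, "inputs:", demorgan_holds(n))
```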
Compound Gates:
• Static CMOS also efficiently handles compound gates computing various inverting
combinations of AND/OR functions in a single stage.
• The logical effort of each input is the ratio of the input capacitance of that input to the
input capacitance of the inverter.
For the AOI21 gate, this means the logical effort is slightly lower for the OR terminal (C)
than for the two AND terminals (A, B).
The parasitic delay is crudely estimated from the total diffusion capacitance on the output
node by summing the sizes of the transistors attached to the output.
Input Ordering Delay Effect
The logical effort and parasitic delay of different gate inputs are often different. Other
gates, like NANDs and NORs, are nominally symmetric but actually have slightly different
logical effort and parasitic delays for the different inputs.
Figure shows a 2-input NAND gate annotated with diffusion parasitics. Consider the
falling output transition occurring when one input holds a stable 1 value and the other rises
from 0 to 1. If input B rises last, node x will initially be at VDD – Vt ≈ VDD because it was
pulled up through the nMOS transistor on input A.
The Elmore delay is (R/2)(2C) + R(6C) = 7RC. On the other hand, if input A
rises last, node x will initially be at 0 V because it was discharged through the nMOS
transistor on input B. No charge must be delivered to node x, so the Elmore delay is simply
R(6C) = 6RC.
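The two Elmore delays quoted above can be reproduced with a small helper that treats the
pulldown stack as an RC ladder. This is a sketch under the values stated in the text (each
series nMOS contributes R/2, node x carries 2C and the output carries 6C); the function
name elmore_delay is hypothetical, and setting the capacitance of node x to zero is only a
shortcut for "already discharged, no charge to remove".

```python
from fractions import Fraction


def elmore_delay(ladder):
    """Elmore delay of an RC ladder, in units of RC.

    ladder is a list of (segment_resistance, node_capacitance) pairs ordered
    from the ground rail towards the output. Each node contributes its
    capacitance times the total resistance between it and ground.
    """
    total = Fraction(0)
    path_resistance = Fraction(0)
    for r_segment, c_node in ladder:
        path_resistance += Fraction(r_segment)
        total += path_resistance * Fraction(c_node)
    return total


HALF_R = Fraction(1, 2)

# Outer input B rises last: node x (2C) and the output (6C) both discharge.
outer_last = elmore_delay([(HALF_R, 2), (HALF_R, 6)])

# Inner input A rises last: node x is already at 0 V, so only the output discharges.
inner_last = elmore_delay([(HALF_R, 0), (HALF_R, 6)])

if __name__ == "__main__":
    print("outer input last:", outer_last, "RC")
    print("inner input last:", inner_last, "RC")
```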
In general, we define the outer input to be the input closer to the supply rail (e.g., B)
and the inner input to be the input closer to the output (e.g., A). The parasitic delay is smallest
when the inner input switches last because the intermediate nodes have already been
discharged. Therefore, if one signal is known to arrive later than the others, the gate is fastest
when that signal is connected to the inner input.
The inner input has a lower parasitic delay. The logical efforts are lower than
initial estimates might predict because of velocity saturation. Interestingly, the inner input has
a slightly higher logical effort because the intermediate node x tends to rise and cause
negative feedback when the inner input turns ON.
This effect is seldom significant to the designer because the inner input remains faster
over the range of fan-outs used in reasonable circuits. When one input is far less critical than
another, even nominally symmetric gates can be made asymmetric to favor the late input at
the expense of the early one.
For example, consider the path in Figure. Under ordinary conditions, the path acts as a
buffer between A and Y. When reset is asserted, the path forces the output low.
If reset only occurs under exceptional circumstances and can take place slowly, the
circuit should be optimized for input-to-output delay at the expense of reset.
The pulldown resistance is R/4 +R/ (4/3) = R, so the gate still offers the same driver
as a unit inverter. However, the capacitance on input A is only 10/3, so the logical effort is
10/9. This is better than 4/3, which is normally associated with a NAND gate. In the limit of
an infinitely large reset transistor and unit-sized nMOS transistor for input A, the logical
effort approaches 1, just like an inverter.
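The arithmetic for the asymmetric gate can be retraced as follows. The transistor widths
are assumptions chosen to match the figures quoted in the text (reset nMOS of width 4,
nMOS on A of width 4/3, pMOS on A of width 2, and a unit inverter with input capacitance 3);
the missing figure may label them differently.

```python
from fractions import Fraction

# Assumed widths, in units of a unit nMOS transistor (not taken from the figure).
W_NMOS_RESET = Fraction(4)
W_NMOS_A = Fraction(4, 3)
W_PMOS_A = Fraction(2)
UNIT_INVERTER_CIN = Fraction(3)  # unit nMOS (1) + pMOS (2)


def series_resistance(widths):
    """Resistance of series transistors in units of R; each contributes R / width."""
    return sum(Fraction(1) / w for w in widths)


# Pulldown path: R/4 + R/(4/3) = R, the same drive as a unit inverter.
pulldown_resistance = series_resistance([W_NMOS_RESET, W_NMOS_A])

# Input A drives one nMOS and one pMOS gate.
cin_a = W_NMOS_A + W_PMOS_A

# Logical effort: input capacitance relative to the unit inverter.
g_a = cin_a / UNIT_INVERTER_CIN

if __name__ == "__main__":
    print("pulldown resistance:", pulldown_resistance, "R")
    print("input capacitance on A:", cin_a)
    print("logical effort of A:", g_a)
```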
The improvement in logical effort of input A comes at the cost of much higher effort
on the reset input. Note that the pMOS transistor on the reset input is also shrunk. This
reduces its diffusion capacitance and parasitic delay at the expense of slower response to
reset.
Skewed Gates
In other cases, one input transition is more important than the other. We define HI-skew
gates to favor the rising output transition and LO-skew gates to favor the falling output
transition. This favoring can be done by decreasing the size of the noncritical transistor.
The logical efforts for the rising (up) and falling (down) transitions are called gu and gd,
respectively, and are the ratio of the input capacitance of the skewed gate to the input
capacitance of an unskewed inverter with equal drive for that transition.
Figure (a) shows how a HI-skew inverter is constructed by downsizing the nMOS
transistor. This maintains the same effective resistance for the critical transition while
reducing the input capacitance relative to the unskewed inverter of Figure (b), thus reducing
the logical effort on that critical transition to gu = 2.5/3 = 5/6.
Of course, the improvement comes at the expense of the effort on the
noncritical transition. The logical effort for the falling transition is estimated by comparing
the inverter to a smaller unskewed inverter with equal pulldown current, shown in Figure (c),
giving a logical effort of gd = 2.5/1.5 = 5/3.
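The two logical efforts of the HI-skew inverter can be recomputed from transistor widths.
This sketch assumes a pMOS of width 2 and an nMOS of width 1/2 (total input capacitance 2.5,
as in the text) and a mobility ratio of 2 for the unskewed reference inverters; the function
name is hypothetical.

```python
from fractions import Fraction


def skewed_inverter_efforts(w_p, w_n, mobility_ratio=2):
    """Return (g_u, g_d) for an inverter with pMOS width w_p and nMOS width w_n.

    Each effort compares the skewed input capacitance with that of an
    unskewed inverter giving the same drive for that transition.
    """
    cin = w_p + w_n
    # Unskewed inverter with the same pullup: keep w_p, nMOS = w_p / ratio.
    cin_same_pullup = w_p + w_p / mobility_ratio
    # Unskewed inverter with the same pulldown: keep w_n, pMOS = ratio * w_n.
    cin_same_pulldown = w_n + mobility_ratio * w_n
    return cin / cin_same_pullup, cin / cin_same_pulldown


g_u, g_d = skewed_inverter_efforts(Fraction(2), Fraction(1, 2))

if __name__ == "__main__":
    print("g_u =", g_u)
    print("g_d =", g_d)
```

Swapping the roles (downsizing the pMOS instead) gives a LO-skew inverter with the same
helper.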
The degree of skewing (e.g., the ratio of effective resistance for the fast transition
relative to the slow transition) impacts the logical efforts and noise margins; a factor of two is
common. Figure catalogs HI-skew and LO-skew gates with a skew factor of two. Skewed
gates are sometimes denoted with an H or an L on their symbol in a schematic.
P/N Ratios
The pMOS transistors in the unskewed gate are enormous in order to provide
equal rise delay. They contribute input capacitance for both transitions, while only helping
the rising delay. By accepting a slower rise delay, the pMOS transistors can be downsized to
reduce input capacitance and average delay significantly.
Reducing the pMOS size from 2 to √2 for the inverter gives the theoretical fastest
average delay, but this delay improvement is only 3%. However, this significantly reduces
the pMOS transistor area.
It also reduces input capacitance, which in turn reduces power consumption.
Unfortunately, it leads to unequal delay between the outputs. Some paths can be slower than
average if they trigger the worst edge of each gate.
Excessively slow rising outputs can also cause hot electron degradation. And
reducing the pMOS size also moves the switching point lower and reduces the inverter's
noise margin. In summary, the P/N ratio of a library of cells should be chosen on the basis of
area, power, and reliability, not average delay.
For NOR gates, reducing the size of the pMOS transistors significantly improves
both delay and area. In most standard cell libraries, the pitch of the cell determines the P/N
ratio that can be achieved in any particular gate. Ratios of 1.5–2 are commonly used for
inverters.
Multiple Threshold Voltages
Some CMOS processes offer two or more threshold voltages. Transistors with lower
threshold voltages produce more ON current, but also leak exponentially more OFF current.
Libraries can provide both high and low threshold versions of gates. The low-threshold
gates can be used sparingly to reduce the delay of critical paths. Skewed gates can use low
threshold devices on only the critical network of transistors.
Delay estimation:
Estimation of the delay of a Boolean function from its functional description is an
important step towards design exploration at the register transfer level (RTL). The problem
is to estimate the delay of certain optimal multi-level implementations of combinational
circuits, given only their functional description.
tpdr: rising propagation delay. From input to rising output crossing VDD/2
tpdf: falling propagation delay. From input to falling output crossing VDD/2
tpd: average propagation delay. tpd = (tpdr + tpdf)/2
tr: rise time. From output crossing 20% to 80% VDD
tf: fall time. From output crossing 80% to 20% VDD
tcdr: rising contamination delay. Minimum from input to rising output crossing VDD/2
tcdf: falling contamination delay. Minimum from input to falling output crossing VDD/2
tcd: average contamination delay. tcd = (tcdr + tcdf)/2
Use RC delay models to estimate delay.
C = total capacitance on the output node. Use effective resistance R. Therefore tpd = RC.
Transistors are characterized by finding their effective R.
Transistor sizing:
• Not all gates need to have the same delay.
• Not all inputs to a gate need to have the same delay.
• Adjust transistor sizes to achieve desired delay.
Logical effort
Logical effort is a gate delay model that takes transistor sizes into account. It allows us
to optimize transistor sizes over combinational networks. It isn't as accurate for circuits with
reconvergent fanout.
Logical effort gate delay model
• Express delays in a process-independent unit.
• Gate delay is measured in units of minimum-size inverter delay τ: d = dabs / τ.
τ = 3RC ≈ 12 ps in a 180 nm process, 40 ps in a 0.6 µm process.
• Gate delay formula: d = f + p.
• Effort delay f is related to the gate's load.
• Parasitic delay p depends on the gate's structure. It represents the delay of the gate driving
no load and is set by internal parasitic capacitance.
Effort delay
• Effort delay has two components: f = gh.
• Electrical effort h is determined by the gate's load: h = Cout/Cin. It is sometimes called fanout.
• Logical effort g is determined by the gate's structure. It measures the relative ability of the
gate to deliver current; g ≡ 1 for an inverter.
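The delay model d = gh + p can be evaluated directly. In the sketch below, the value
τ = 12 ps is the 180 nm figure quoted above, while the parasitic delays p = 1 for the
inverter and p = 2 for a 2-input NAND are assumed normalizations that the text does not
state; the function name gate_delay is hypothetical.

```python
from fractions import Fraction

TAU_PS_180NM = 12  # from the text: tau is about 12 ps in a 180 nm process


def gate_delay(g, h, p):
    """Normalized gate delay d = f + p, with effort delay f = g * h."""
    return g * h + p


# Inverter driving a fanout of 4 (g = 1 by definition, p = 1 assumed).
d_inverter_fo4 = gate_delay(g=1, h=4, p=1)

# 2-input NAND driving a fanout of 3 (g = 4/3, p = 2 assumed).
d_nand2 = gate_delay(g=Fraction(4, 3), h=3, p=2)

if __name__ == "__main__":
    print("inverter FO4:", d_inverter_fo4, "tau =", d_inverter_fo4 * TAU_PS_180NM, "ps")
    print("NAND2, h = 3:", d_nand2, "tau =", d_nand2 * TAU_PS_180NM, "ps")
```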
Delay plots:
Computing Logical Effort
Logical effort is the ratio of the input capacitance of a gate to the input capacitance of an
inverter delivering the same output current. Measure it from delay vs. fanout plots, or estimate
it by counting transistor widths.
Circuit families and their comparison:
The method of logical effort does not apply to arbitrary transistor networks, but only
to logic gates. A logic gate has one or more inputs and one output, subject to the following
restrictions:
The gate of each transistor is connected to an input, a power supply, or the output; and
Inputs are connected only to transistor gates.
The first condition rules out multiple logic gates masquerading as one, and the second
keeps inputs from being connected to transistor sources or drains, as in transmission gates
without explicit drivers.
Pseudo-NMOS circuits
Static CMOS gates are slowed because an input must drive both NMOS and PMOS
transistors. In any transition, either the pullup or pulldown network is activated, meaning the
input capacitance of the inactive network loads the input. Moreover, PMOS transistors have
poor mobility and must be sized larger to achieve comparable rising and falling delays,
further increasing input capacitance.
Pseudo-NMOS and dynamic gates offer improved speed by removing the PMOS
transistors from loading the input. Pseudo-NMOS gates resemble static gates, but replace the
slow PMOS pullup stack with a single grounded PMOS transistor which acts as a pullup
resistor. The effective pullup resistance should be large enough that the NMOS transistors
can pull the output to near ground, yet low enough to rapidly pull the output high.
Figure shows several pseudo-NMOS gates ratioed such that the pulldown transistors
are about four times as strong as the pullup. The logical effort follows from considering the
output current and input capacitance compared to the reference inverter from Figure. Sized as
shown, the PMOS transistors produce 1/3 of the current of the reference inverter and the
NMOS transistor stacks produce 4/3 of the current of the reference inverter.
For falling transitions, the output current is the pulldown current minus the pullup
current which is fighting the pulldown, 4/3 - 1/3 = 1. For rising transitions, the output current
is just the pullup current, 1/3. The inverter and NOR gate have an input capacitance of 4/3.
Gate type    Logical effort g
             Rising    Falling    Average
2-NAND       8/3       8/9        16/9
3-NAND       4         4/3        8/3
4-NAND       16/3      16/9       32/9
n-NOR        4/3       4/9        8/9
n-mux        8/3       8/9        16/9
The average logical effort of the NOR gate is g = (4/9 + 4/3)/2 = 8/9. This is independent of
the number of inputs, explaining why pseudo-NMOS is a way to build fast wide NOR gates.
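The pseudo-NMOS logical efforts can be recomputed from the currents and capacitances
quoted above. This is a sketch assuming a reference inverter with input capacitance 3 and
unit output current; the function name pseudo_nmos_efforts is hypothetical.

```python
from fractions import Fraction

REFERENCE_CIN = Fraction(3)      # assumed reference inverter input capacitance
PULLUP_CURRENT = Fraction(1, 3)  # relative to the reference inverter
PULLDOWN_CURRENT = Fraction(4, 3)


def pseudo_nmos_efforts(cin):
    """Return (rising, falling, average) logical effort for input capacitance cin.

    Logical effort is the input capacitance relative to a reference inverter
    scaled to deliver the same output current.
    """
    falling_current = PULLDOWN_CURRENT - PULLUP_CURRENT  # pullup fights the pulldown
    rising_current = PULLUP_CURRENT
    g_fall = (cin / REFERENCE_CIN) / falling_current
    g_rise = (cin / REFERENCE_CIN) / rising_current
    return g_rise, g_fall, (g_rise + g_fall) / 2


nor_efforts = pseudo_nmos_efforts(Fraction(4, 3))    # inverter and n-input NOR
nand2_efforts = pseudo_nmos_efforts(Fraction(8, 3))  # 2-input NAND

if __name__ == "__main__":
    print("NOR   (rise, fall, avg):", *nor_efforts)
    print("NAND2 (rise, fall, avg):", *nand2_efforts)
```

Because a NOR input always drives a single nMOS transistor of the same width, its input
capacitance, and therefore its logical effort, does not grow with the number of inputs.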
Pass Transistor Logic :
A pass transistor is a MOS transistor whose gate is driven by a control signal. The drain
is connected to a constant or variable voltage potential (in) and the source is the output (out).
When the control signal is high, the input is passed to the output; when the control signal is
low, the output is floating. Circuits built on this topology are called pass transistor circuits.
Pass transistor logic reduces the number of transistors needed to implement logic
by using the primary inputs to drive gate terminals as well as source and drain terminals. In
complementary CMOS logic, primary inputs are allowed to drive only gate terminals.
Figure shows the implementation of the AND function using only NMOS pass transistors. In
this gate, if the B input is high, the left NMOS is turned ON and copies the input A to the
output F. When B is low, the right NMOS pass transistor is turned ON and passes a '0' to the
output F. This satisfies the truth table of the AND gate reproduced in Table below for
verification.
'OR' gate using pass transistor logic
The truth table of the 'OR' gate is as shown in Table below. Figure below shows the
implementation of the OR function using NMOS transistors only. In this gate, if the B input is
high, the right NMOS is turned ON and copies logic 1 to F; this operation is not
affected by the 'A' input. When B is low, the left NMOS is turned ON and the logic of 'A' is
copied to the output F.
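The behaviour of the two pass-transistor gates can be checked against the AND and OR truth
tables with a logic-level model. This is a sketch only: it ignores the threshold-voltage drop
on a passed '1', and the function names pass_and and pass_or are hypothetical.

```python
def pass_and(a: int, b: int) -> int:
    """B high: the left nMOS passes A. B low: the right nMOS passes '0'."""
    return a if b else 0


def pass_or(a: int, b: int) -> int:
    """B high: the right nMOS passes '1'. B low: the left nMOS passes A."""
    return 1 if b else a


if __name__ == "__main__":
    print("A B | AND OR")
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, "|", pass_and(a, b), "  ", pass_or(a, b))
```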
Advantage:
• Fewer transistors are required to implement a given function.
• Lower capacitance because of the reduced number of transistors.
• There is no path from VDD to GND, so no standby power (static power dissipation) is
dissipated.
Drawback:
As discussed, NMOS devices are effective in passing a strong '0' but are poor at
pulling a node to VDD. Hence, when the pass transistor pulls a node to high logic, the output
only rises to VDD – VTh. This is the major disadvantage of pass transistors.
Pass transistor logic (PTL) circuits are often superior to standard CMOS circuits in
terms of layout density, circuit delay and power consumption.
Transmission Gate Logic:
The transmission gate logic is used to solve the voltage drop problem of the pass
transistor logic. This technique uses the complementary properties of NMOS and PMOS
transistors, i.e. NMOS devices pass a strong '0' but a weak '1', while PMOS transistors
pass a strong '1' but a weak '0'. The transmission gate combines the best of the two devices
by placing an NMOS transistor in parallel with a PMOS transistor as shown in Figure below.
The control signals to the transmission gate, C and ~C, are complementary to each
other. The transmission gate is mainly a bi-directional switch enabled by the gate signal 'C'.
When C = 1, both MOSFETs are ON and the signal passes through the gate, i.e. A = B if C = 1,
whereas C = 0 cuts off both MOSFETs, creating an open circuit between nodes A and B.
Basic Structure :
The basic structure of the transmission gate is shown in Figure below, which consists of
NMOS and PMOS transistors. Here, VG is applied to the NMOS, and (VDD - VG) is applied to
the PMOS.
The transmission gate works as a voltage-controlled switch. When VG is high, the NMOS and
PMOS are conducting, hence the switch is closed and a conduction path exists between the
left and right sides. When VG is low, the MOSFETs are in cutoff and the switch is open.
Therefore, there is no direct relationship between VA and VB. Figure below shows the
symbol of the transmission gate controlled by switching signals X and X* that are applied to
the gates of the NMOS and PMOS respectively.
The circuit is constructed from the parallel connection of a PMOS and an NMOS with
shorted drain and source terminals. The gate terminals use two select signals, s and its
complement. When s is high, the transmission gate passes the signal on the input. The main
advantage of the transmission gate is that it eliminates the threshold voltage drop. It can be
used as a multiplexing element or path selector, a latch element, an analog switch, or a
voltage-controlled resistor connecting the input and output.
2 : 1 MUX using transmission gate :
A 2:1 multiplexer is shown in Figure below. This gate selects either input A or B on the basis
of the value of the control signal 'C'. When control signal C is logic high, the output is equal
to the input A, and when control signal C is logic low, the output is equal to the input B.
A 2 : 1 multiplexer can be implemented using transmission gates. Figure below shows the
connection diagram of the 2 : 1 multiplexer using transmission gates.
The 2 : 1 MUX selects either A or B depending upon the control signal C. This is
equivalent to implementing the Boolean function F = (A · C + B · ~C). When the control
signal C is high, the upper transmission gate is ON and passes A through it, so that
output = A.
When the control signal C is low, the upper transmission gate turns OFF and does not
allow A to pass through it; at the same time, the lower transmission gate is 'ON' and allows
B to pass through it, so the output = B.
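The multiplexer can be modelled at the logic level by treating each transmission gate as a
switch that either passes its input or leaves its output floating. The sketch follows the
Boolean function F = A · C + B · ~C given above; the names transmission_gate and mux2 are
hypothetical, and None is used here to stand for a high-impedance output.

```python
from typing import Optional


def transmission_gate(data: int, c: int) -> Optional[int]:
    """Pass data when the control c is 1; otherwise the output floats (None)."""
    return data if c else None


def mux2(a: int, b: int, c: int) -> int:
    """2:1 mux: upper gate is controlled by C, lower gate by its complement."""
    upper = transmission_gate(a, c)
    lower = transmission_gate(b, 1 - c)
    driven = [value for value in (upper, lower) if value is not None]
    # Complementary controls guarantee exactly one gate drives the output.
    if len(driven) != 1:
        raise RuntimeError("output must be driven by exactly one gate")
    return driven[0]


if __name__ == "__main__":
    print("A B C | F")
    for a in (0, 1):
        for b in (0, 1):
            for c in (0, 1):
                print(a, b, c, "|", mux2(a, b, c))
```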
DYNAMIC CMOS LOGIC
Ratioed circuits reduce the input capacitance by replacing the pMOS
transistors connected to the inputs with a single resistive pullup. The drawbacks of ratioed
circuits include slow rising transitions, contention on the falling transitions, static power
dissipation, and a nonzero VOL.
Dynamic circuits circumvent these drawbacks by using a clocked pullup transistor
rather than a pMOS that is always ON. Figure compares (a) static CMOS, (b) pseudo-nMOS,
and (c) dynamic inverters. Dynamic circuit operation is divided into two modes, precharge
and evaluation, as shown in Figure.
Dynamic circuits are the fastest commonly used circuit family because they have
lower input capacitance and no contention during switching. They also have zero static power
dissipation. However, they require careful clocking, consume significant dynamic power, and
are sensitive to noise during evaluation.
In Figure, if the input A is 1 during precharge, contention will take place because both
the pMOS and nMOS transistors will be ON.
When the input cannot be guaranteed to be 0 during precharge, an extra clocked evaluation
transistor can be added to the bottom of the nMOS stack to avoid contention as shown in
Figure. The extra transistor is sometimes called a foot.
Figure estimates the falling logical effort of both footed and unfooted dynamic gates.
As usual, the pulldown transistors' widths are chosen to give unit resistance. Precharge
occurs while the gate is idle and often may take place more slowly. Therefore, the precharge
transistor width is chosen for twice unit resistance.
This reduces the capacitive load on the clock and the parasitic capacitance at the
expense of greater rising delays. We see that the logical efforts are very low. Footed gates
have higher logical effort than their unfooted counterparts but are still an improvement over
static logic. In practice, the logical effort of footed gates is better than predicted because
velocity saturation means series nMOS transistors have less resistance than we have
estimated.
The size of the foot can be increased relative to the other nMOS transistors to reduce
logical effort of the other inputs at the expense of greater clock loading. Like pseudo-nMOS
gates, dynamic gates are particularly well suited to wide NOR functions or multiplexers
because the logical effort is independent of the number of inputs.
A fundamental difficulty with dynamic circuits is the monotonicity
requirement. While a dynamic gate is in evaluation, the inputs must be monotonically rising.
That is, the input can start LOW and remain LOW, start LOW and rise HIGH, start HIGH
and remain HIGH, but not start HIGH and fall LOW.
Figure shows waveforms for a footed dynamic inverter in which the input violates
monotonicity. During precharge, the output is pulled HIGH. When the clock rises, the input
is HIGH so the output is discharged LOW through the pulldown network, as you would want
to have happen in an inverter. The input later falls LOW, turning off the pulldown network.
However, the precharge transistor is also OFF during evaluation, so the output floats and
stays LOW rather than rising as it would in a static inverter.
The output of a dynamic gate begins HIGH and monotonically falls LOW during
evaluation. This monotonically falling output X is not a suitable input to a second dynamic
gate expecting monotonically rising signals.
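The monotonicity requirement can be stated as a simple check on the sequence of logic
levels an input takes during evaluation. This is an illustrative sketch on sampled 0/1
values, not a timing tool; the function name is hypothetical.

```python
def is_monotonically_rising(samples):
    """True if the sampled logic levels never go from HIGH (1) back to LOW (0)."""
    return all(later >= earlier for earlier, later in zip(samples, samples[1:]))


if __name__ == "__main__":
    waveforms = {
        "stays LOW": [0, 0, 0],
        "LOW then HIGH": [0, 0, 1],
        "stays HIGH": [1, 1, 1],
        "HIGH then LOW": [1, 1, 0],  # violates monotonicity
    }
    for name, samples in waveforms.items():
        print(name, "->", is_monotonically_rising(samples))
```

The output of a dynamic gate starts HIGH and can only fall, so the same check applied to it
fails whenever the gate actually evaluates, which is why it cannot feed another dynamic gate
directly.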
CMOS Domino Logic
The monotonicity problem can be solved by placing a static CMOS inverter between
dynamic gates, as shown in Figure. This converts the monotonically falling output into a
monotonically rising signal suitable for the next gate, as shown in Figure.
The dynamic-static pair together is called a domino gate because precharge
resembles setting up a chain of dominos and evaluation causes the gates to fire like dominos
tipping over, each triggering the next.
A single clock can be used to precharge and evaluate all the logic gates within the
chain. The dynamic output is monotonically falling during evaluation, so the static inverter
output is monotonically rising. Therefore, the static inverter is usually a HI-skew gate to
favor this rising output.
In general, more complex inverting static CMOS gates such as NANDs or NORs can
be used in place of the inverter. This mixture of dynamic and static logic is called compound
domino.
Domino gates are inherently noninverting, while some functions like XOR gates
necessarily require inversion. Three methods of addressing this problem include pushing
inversions into static logic, delaying clocks, and using dual-rail domino logic.
A second approach is to directly cascade dynamic gates without the static CMOS
inverter, delaying the clock to the later gates to ensure the inputs are monotonic during
evaluation.
Domino circuits
Pseudo-NMOS gates eliminate the bulky PMOS transistors loading the inputs, but pay
the price of quiescent power dissipation and contention between the pullup and pulldown
transistors. Dynamic gates offer even better logical effort and lower power consumption by
using a clocked precharge transistor instead of a pullup that is always conducting.
The dynamic gate is precharged HIGH then may evaluate LOW through an NMOS
stack. Unfortunately, if one dynamic inverter directly drives another, a race can corrupt the
result. When the clock rises, both outputs have been precharged HIGH.
The HIGH input to the first gate causes its output to fall, but the second gate's output
also falls in response to its initial HIGH input. The circuit therefore produces an incorrect
result because the second output will never rise during evaluation, as shown in Figure 10.3.
Domino circuits solve this problem by using inverting static gates between dynamic gates so
that the input to each dynamic gate is initially LOW. The falling dynamic output and rising
static output ripple through a chain of gates like a chain of toppling dominos.
In summary, domino logic runs 1.5 to 2 times faster than static CMOS logic because
dynamic gates present a much lower input capacitance for the same output current and have a
lower switching threshold, and because the inverting static gate can be skewed to favor the
critical monotonically rising evaluation edges. Figure shows some domino gates. Each
domino gate consists of a dynamic gate followed by an inverting static gate.
The static gate is often but not always an inverter. Since the dynamic gate's output
falls monotonically during evaluation, the static gate should be skewed high to favor its
monotonically rising output.
A dynamic gate may be designed with or without a clocked evaluation transistor; the
extra transistor slows the gate but eliminates any path between power and ground during
precharge when the inputs are still high.
Dual-Rail Domino Logic:
Dual-rail domino gates encode each signal with a pair of wires. The input and output
signal pairs are denoted with sig_h and sig_l, respectively. Table summarizes the encoding.
The sig_h wire is asserted to indicate that the output of the gate is "high" or 1. The sig_l wire
is asserted to indicate that the output of the gate is "low" or 0.
When the gate is precharged, neither sig_h nor sig_l is asserted. The pair of lines
should never be both asserted simultaneously during correct operation.
Dual-rail domino gates accept both true and complementary inputs and compute both
true and complementary outputs, as shown in Figure. Observe that this is identical to static
CVSL circuits from Figure except that the cross-coupled pMOS transistors are instead
connected to the precharge clock. Therefore, dual-rail domino can be viewed as a dynamic
form of CVSL, sometimes called DCVS.
Figure shows a dual-rail AND/NAND gate and Figure shows a dual-rail XOR/XNOR
gate. The gates are shown with clocked evaluation transistors, but can also be unfooted. Dual-
rail domino is a complete logic family in that it can compute all inverting and non inverting
logic functions.
However, it requires more area, wiring, and power. Dual rail structures also lose the
efficiency of wide dynamic NOR gates because they require complementary tall dynamic
NAND stacks.
Dual-rail domino signals not only the result of a computation but also when
the computation is done. Before computation completes, both rails are precharged. When the
computation completes, one rail will be asserted. A NAND gate can be used for completion
detection, as shown in Figure. This is particularly useful for asynchronous circuits.
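The dual-rail encoding and completion detection can be sketched at the logic level. The
helper names decode, dual_rail_and and is_done are hypothetical. Completion is modelled
here as the OR of the two asserted rails, which by DeMorgan's law is the same function as a
NAND of the two active-low dynamic nodes; where exactly the detector is attached in the
figure is an assumption.

```python
from typing import Optional, Tuple

Rail = Tuple[int, int]  # (sig_h, sig_l)


def decode(signal: Rail) -> Optional[int]:
    """Map a rail pair to 1, 0, or None while still precharged."""
    sig_h, sig_l = signal
    if sig_h and sig_l:
        raise ValueError("both rails asserted: illegal during correct operation")
    if sig_h:
        return 1
    if sig_l:
        return 0
    return None


def dual_rail_and(a: Rail, b: Rail) -> Rail:
    """Dual-rail AND/NAND: the true rail needs both inputs high,
    the complementary rail needs either input low."""
    a_h, a_l = a
    b_h, b_l = b
    return (int(a_h and b_h), int(a_l or b_l))


def is_done(signal: Rail) -> bool:
    """Completion detection: exactly one rail asserted means the result is valid."""
    sig_h, sig_l = signal
    return bool(sig_h or sig_l)


if __name__ == "__main__":
    ONE, ZERO, PRECHARGED = (1, 0), (0, 1), (0, 0)
    for a in (ZERO, ONE):
        for b in (ZERO, ONE):
            y = dual_rail_and(a, b)
            print(decode(a), decode(b), "->", decode(y), "done:", is_done(y))
    print("precharged inputs ->", decode(dual_rail_and(PRECHARGED, PRECHARGED)))
```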
Keepers
Dynamic circuits also suffer from charge leakage on the dynamic node. If a dynamic
node is precharged high and then left floating, the voltage on the dynamic node will drift over
time due to subthreshold, gate, and junction leakage. The time constants tend to be in the
millisecond to nanosecond range, depending on process and temperature. This problem is
analogous to leakage in dynamic RAMs.
Moreover, dynamic circuits have poor input noise margins. If the input rises above
Vt while the gate is in evaluation, the input transistors will turn on weakly and can incorrectly
discharge the output. Both leakage and noise margin problems can be addressed by adding a
keeper circuit.
Figure shows a conventional keeper on a domino buffer. The keeper is a weak
transistor that holds, or staticizes, the output at the correct level when it would otherwise
float. When the dynamic node X is high, the output Y is low and the keeper is ON to prevent
X from floating. When X falls, the keeper initially opposes the transition so it must be much
weaker than the pulldown network. Eventually Y rises, turning the keeper OFF and avoiding
static power dissipation.
The keeper must be strong (i.e., wide) enough to compensate for any leakage current
drawn when the output is floating and the pulldown stack is OFF. Strong keepers also
improve the noise margin because when the inputs are slightly above Vt the keeper can
supply enough current to hold the output high.
NP and Zipper Domino
Another variation on domino is shown in Figure. The HI-skew inverting static gates
are replaced with predischarged dynamic gates using pMOS logic.
For example, a footed dynamic p-logic NAND gate is shown in Figure. When Φ is 0,
the first and third stages precharge high while the second stage predischarges low. When Φ
rises, all the stages evaluate. Domino connections are possible, as shown in Figure. The
design style is called NP Domino or NORA Domino (NO RAce).
NORA has two major drawbacks. The logical effort of footed p-logic gates is
generally worse than that of HI-skew gates (e.g., 2 vs. 3/2 for NOR2 and 4/3 vs. 1 for
NAND2). Secondly, NORA is extremely susceptible to noise.
In an ordinary dynamic gate, the input has a low noise margin (about Vt), but is
strongly driven by a static CMOS gate.
The floating dynamic output is more prone to noise from coupling and charge sharing,
but drives another static CMOS gate with a larger noise margin. In NORA, however, the
sensitive dynamic inputs are driven by noise prone dynamic outputs. Given these drawbacks
and the extra clock phase required, there is little reason to use NORA.
Zipper domino is a closely related technique that leaves the precharge transistors slightly ON
during evaluation by using precharge clocks that swing between 0 and VDD – |Vtp| for the
pMOS precharge and Vtn and VDD for the nMOS precharge. This plays much the same role
as a keeper.
THE STATIC AND DYNAMIC POWER DISSIPATION IN CMOS CIRCUITS
Static CMOS gates are very power-efficient because they dissipate nearly zero power
while idle. For much of the history of CMOS design, power was a secondary consideration
behind speed and area for many chips. As transistor counts and clock frequencies have
increased, power consumption has skyrocketed and now is a primary design constraint.
The instantaneous power P(t) drawn from the power supply is proportional to the
supply current iDD(t) and the supply voltage VDD:
$$P(t) = i_{DD}(t) \, V_{DD}$$
The energy consumed over some time interval T is the integral of the instantaneous power:
$$E = \int_0^T i_{DD}(t) \, V_{DD} \, dt$$
The average power over this interval is
$$P_{avg} = \frac{E}{T} = \frac{1}{T} \int_0^T i_{DD}(t) \, V_{DD} \, dt$$
Power dissipation in CMOS circuits comes from two components:
Static dissipation due to
• subthreshold conduction through OFF transistors
• tunneling current through gate oxide
• leakage through reverse-biased diodes
• contention current in ratioed circuits
Dynamic dissipation due to
• charging and discharging of load capacitances
• "short circuit" current while both pMOS and nMOS networks are partially ON
Ptotal = Pstatic + Pdynamic
Static Dissipation
Considering the static CMOS inverter shown in Figure, if the input = '0', the
associated nMOS transistor is OFF and the pMOS transistor is ON. The output voltage is
VDD or logic '1'.
When the input = '1', the associated nMOS transistor is ON and the pMOS transistor is
OFF. The output voltage is 0 volts (GND). Note that one of the transistors is always OFF
when the gate is in either of these logic states.
Ideally, no current flows through the OFF transistor so the power dissipation is zero
when the circuit is quiescent, i.e., when no transistors are switching. Zero quiescent power
dissipation is a principal advantage of CMOS over competing transistor technologies.
However, secondary effects including subthreshold conduction, tunneling, and
leakage lead to small amounts of static current flowing through the OFF transistor. Assuming
the leakage current is constant so instantaneous and average power are the same, the static
power dissipation is the product of total leakage current and the supply voltage.
Pstatic = Istatic VDD
OFF transistors still conduct a small amount of subthreshold current. As subthreshold current
is exponentially dependent on threshold voltage, it is increasing dramatically as threshold
voltages have scaled down. There is also some small static dissipation due to reverse biased
diode leakage between diffusion regions, wells, and the substrate. In modern processes, diode
leakage is generally much smaller than the subthreshold or gate leakage and may be
neglected.
Dynamic Dissipation
Over any given interval of time T, the load will be charged and discharged Tfsw times.
Current flows from VDD to the load to charge it. Current then flows from the load to GND
during discharge. In one complete charge/discharge cycle, a total charge of Q = CVDD is
thus transferred from VDD to GND. The average dynamic power dissipation is
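$P_{dynamic} = \frac{1}{T} \int_0^T i_{DD}(t)\, V_{DD}\, dt = \frac{V_{DD}}{T} \int_0^T i_{DD}(t)\, dt$
$P_{dynamic} = \frac{V_{DD}}{T} \left[ T f_{sw} C V_{DD} \right] = C V_{DD}^2 f_{sw}$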
Because most gates do not switch every clock cycle, it is often more convenient to express
the switching frequency fsw as an activity factor α times the clock frequency f, i.e. fsw = αf.
Now the dynamic power dissipation may be rewritten as
$P_{dynamic} = \alpha C V_{DD}^2 f$
A clock has an activity factor of α = 1, because it rises and falls every cycle. Most data
has a maximum activity factor of 0.5 because it transitions only once each cycle.
 Static CMOS logic has been empirically determined to have activity factors closer to
0.1 because some gates maintain one output state more often than another.
 Because the input rise/fall time is greater than zero, both nMOS and pMOS
transistors will be ON for a short period of time while the input is between Vtn and VDD - |Vtp|.
This results in an additional "short circuit" current pulse from VDD to GND and typically
increases power dissipation by about 10%.
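The dynamic power expression is easy to evaluate for a few what-if cases. The sketch below is a minimal illustration, assuming the relation Pdynamic = α C VDD² f and treating the roughly 10% short-circuit contribution mentioned above as an optional multiplicative overhead. The function name and all numeric values are hypothetical.

```python
def dynamic_power(alpha, c_load, v_dd, f_clk, short_circuit_fraction=0.0):
    """Dynamic power in watts: alpha * C * VDD^2 * f.

    short_circuit_fraction optionally adds the short-circuit overhead
    as a fraction of the switching power (e.g. 0.10 for about 10%).
    """
    switching = alpha * c_load * v_dd ** 2 * f_clk
    return switching * (1.0 + short_circuit_fraction)


# Hypothetical block: 1 nF total switched capacitance, 1 GHz clock, alpha = 0.1
p_base = dynamic_power(0.1, 1e-9, 1.0, 1e9)        # VDD = 1.0 V
p_low = dynamic_power(0.1, 1e-9, 0.5, 1e9)         # VDD halved
p_sc = dynamic_power(0.1, 1e-9, 1.0, 1e9, 0.10)    # with 10% short-circuit overhead
print(p_base, p_low, p_sc)
```

Because the supply voltage enters quadratically, halving VDD cuts the switching power to one quarter, which is why supply reduction is such an effective lever.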
Methods to reduce dynamic power dissipation
1. Reducing the product of capacitance and its switching frequency.
2. Eliminate logic switching that is not necessary for computation.
3. Reduce activity factor.
4. Reduce supply voltage.
Methods to reduce static power dissipation
1. By using multiple threshold voltages: low-Vt transistors on critical circuit paths
for performance, and high-Vt transistors on other paths to reduce leakage.
2. By using two operating modes, active and standby, for each functional block.
3. By adjusting the body bias (i.e) adjusting FBB (Forward Body Bias) in active
mode to increase performance and RBB (Reverse Body Bias) in standby mode
to reduce leakage.
4. By using sleep transistors to isolate the supply from the block to achieve
significant leakage power savings.
UNIT III: SEQUENTIAL LOGIC CIRCUITS
Static & Dynamic Latches and Registers, Pipelining
 In sequential logic circuits, the output not only depends upon the current values of
the inputs, but also upon preceding input values. In other words, a sequential circuit
remembers some of the past history of the system; it has memory.
 Figure shows a block diagram of a generic finite state machine (FSM) that consists
of combinational logic and registers, which hold the system state. The system
depicted here belongs to the class of synchronous sequential systems, in which all
registers are under control of a single global clock. The outputs of the FSM are a
function of the current Inputs and the Current State. The Next State is determined
based on the Current State and the current Inputs and is fed to the inputs of
registers.
 On the rising edge of the clock, the Next State bits are copied to the outputs of the
registers (after some propagation delay), and a new cycle begins. The register then
ignores changes in the input signals until the next rising edge. In general, registers
can be positive edge- triggered (where the input data is copied on the positive edge
of the clock) or negative edge- triggered (where the input data is copied on the
negative edge, as is indicated by a small circle at the clock input).
Block diagram of a finite state machine using positive edge-triggered registers.
Timing Metrics for Sequential Circuits
There are three important timing parameters associated with a register as illustrated in
Figure.
1. The set-up time (tsu) is the time that the data input (D input) must be valid before
the clock transition (that is, the 0 to 1 transition for a positive edge-triggered
register).
2. The hold time (thold) is the time the data input must remain valid after the clock
edge.
3. Assuming that the set-up and hold times are met, the data at the D input is copied to
the Q output after a worst-case propagation delay (with reference to the clock edge)
denoted by tc-q.
Given the timing information for the registers and the combinational logic, some
system-level timing constraints can be derived. Assume that the worst-case propagation
delay of the logic equals tplogic, while its minimum delay (also called the contamination
delay) is tcd. The minimum clock period T required for proper operation of the
sequential circuit is given by
$T \geq t_{c-q} + t_{plogic} + t_{su}$
The hold time of the register imposes an extra constraint for proper operation,
$t_{cdregister} + t_{cd} \geq t_{hold}$
where tcdregister is the minimum propagation delay (or contamination delay) of the register.
It is important to minimize the values of the timing parameters associated with the register, as
these directly affect the rate at which a sequential circuit can be clocked. In fact, modern
high-performance systems are characterized by a very-low logic depth, and the register
propagation delay and set-up times account for a significant portion of the clock period.
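The two register timing constraints can be checked with a few lines of code. The sketch below assumes the usual forms T ≥ tc-q + tplogic + tsu for the clock period and tcdregister + tcd ≥ thold for the hold condition. The function names and the delay values (in picoseconds) are made up for illustration.

```python
def min_clock_period(t_cq, t_plogic, t_su):
    """Smallest clock period that satisfies the set-up constraint."""
    return t_cq + t_plogic + t_su


def hold_constraint_met(t_cd_register, t_cd_logic, t_hold):
    """True if the fastest path cannot corrupt data before the hold time ends."""
    return t_cd_register + t_cd_logic >= t_hold


# Hypothetical delays in picoseconds
t_min = min_clock_period(t_cq=100, t_plogic=700, t_su=50)
hold_ok = hold_constraint_met(t_cd_register=40, t_cd_logic=30, t_hold=60)
hold_bad = hold_constraint_met(t_cd_register=40, t_cd_logic=10, t_hold=60)
print(t_min, hold_ok, hold_bad)
```

Note that the hold check does not involve the clock period at all: a hold violation cannot be fixed by slowing the clock, only by adding delay to the fast path.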
Classification of Memory Elements
Foreground versus Background Memory
Memory that is embedded into logic is foreground memory (internal memory), and is most
often organized as individual registers or register banks. Large amounts of centralized
memory core are referred to as background memory (external memory).
Static versus Dynamic Memory
 Static memories preserve the state as long as the power is turned on.
 Built using positive feedback or regeneration, where the circuit topology consists of
intentional connections between the output and the input of a combinational circuit.
 Static memories are most useful when the register won't be updated for extended
periods of time. E.g. configuration data, loaded at power-up time.
 This condition also holds for most processors that use conditional clocking (i.e.,
gated clocks) where the clock is turned off for unused modules. In that case, there
are no guarantees on how frequently the registers will be clocked, and static
memories are needed to preserve the state information.
 Memory based on positive feedback falls under the class of elements called
multivibrator circuits. The bistable element is its most popular representative, but
other elements such as monostable and astable circuits are also frequently used.
 Dynamic memories store state for a short period of time (on the order of
milliseconds). They are based on the principle of temporary charge storage on
parasitic capacitors associated with MOS devices. The capacitors have to be refreshed
periodically to compensate for charge leakage.
 Dynamic memories tend to be simpler, resulting in significantly higher performance
and lower power dissipation. They are most useful in datapath circuits that require
high performance levels and are periodically clocked.
Latches versus Registers
A latch is an essential component in the construction of an edge-triggered register. It is a
level-sensitive circuit that passes the D input to the Q output when the clock signal is high.
This latch is said to be in transparent mode. When the clock is low, the input data sampled
on the falling edge of the clock is held stable at the output for the entire phase, and the latch
is in hold mode. The inputs must be stable for a short period around the falling edge of the
clock to meet set-up and hold requirements. A latch operating under the above conditions is
a positive latch. Similarly, a negative latch passes the D input to the Q output when the
clock signal is low.
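The transparent and hold modes of a positive latch can be captured in a tiny behavioral model. This is only a logic-level sketch (no set-up or hold timing is modeled), and the class name and the input sequence are invented for illustration.

```python
class PositiveLatch:
    """Level-sensitive latch: transparent while clk is 1, holds while clk is 0."""

    def __init__(self, q=0):
        self.q = q

    def step(self, clk, d):
        if clk == 1:      # transparent mode: Q follows D
            self.q = d
        return self.q     # hold mode: Q keeps the last sampled value


latch = PositiveLatch()
# (clk, d) pairs: D changes while the clock is low are ignored
stimulus = [(1, 1), (1, 0), (0, 1), (0, 1), (1, 1)]
outputs = [latch.step(clk, d) for clk, d in stimulus]
print(outputs)
```

A negative latch is obtained from the same model by making it transparent when `clk` is 0 instead of 1.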
Timing of positive and negative latches
Static Latches and Registers
The Bistability Principle
Static memories use positive feedback to create a bistable circuit — a circuit having two
stable states that represent 0 and 1. The basic idea is shown in Figure a, which shows two
inverters connected in cascade along with a voltage-transfer characteristic typical of such a
circuit. Assume now that the output of the second inverter Vo2 is connected to the input of
the first Vi1, as shown by the dotted lines in Figure a.
The resulting circuit has only three possible operation points (A, B, and C). Under the
condition that the gain of the inverter in the transient region is larger than 1, only A and B
are stable operation points, and C is a metastable operation point. Suppose that the cross-
coupled inverter pair is biased at point C. A small deviation from this bias point, possibly
caused by noise, is amplified and regenerated around the circuit loop. This is a
consequence of the gain around the loop being larger than 1.
On the other hand, A and B are stable operation points. In these points, the loop gain is
much smaller than unity. Hence the cross-coupling of two inverters results in a
bistable circuit, which serves as a memory, storing either a 1 or a 0 (corresponding to
positions A and B). In order to change the stored value, we must be able to bring the circuit
from state A to B and vice-versa. This is generally done by applying a trigger pulse at Vi1
or Vi2. The width of the trigger pulse need be only a little larger than the total propagation
delay around the circuit loop, which is twice the average propagation delay of the
inverters.
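The regenerative behavior around the loop can be illustrated numerically. The sketch below is a toy model, not a circuit simulation: each inverter is represented by an assumed tanh-shaped voltage-transfer characteristic whose slope at the midpoint has magnitude `gain`, and the loop is iterated starting from a voltage near the metastable point. VDD = 1 V and gain = 4 are arbitrary choices.

```python
import math


def inverter(v_in, vdd=1.0, gain=4.0):
    """Assumed smooth inverter VTC; slope at v_in = vdd/2 is -gain."""
    return 0.5 * vdd * (1.0 - math.tanh(gain * (v_in - vdd / 2) / (vdd / 2)))


def settle(v_start, passes=20):
    """Propagate a voltage repeatedly around the two-inverter loop."""
    v = v_start
    for _ in range(passes):
        v = inverter(inverter(v))
    return v


high = settle(0.51)   # slightly above the metastable point
low = settle(0.49)    # slightly below the metastable point
meta = settle(0.50)   # exactly at the metastable point
print(high, low, meta)
```

A small deviation in either direction is amplified until the loop rests near one of the two rails, while the exact midpoint stays put only in the absence of any disturbance, which matches the description of the metastable point.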
SR Flip-Flops
The SR (set-reset) flip-flop circuit is similar to the cross-coupled inverter pair, with NOR
gates replacing the inverters. The second input of each NOR gate is connected to a trigger
input (S or R), which makes it possible to force the outputs Q and Q' to a given state. These
outputs are complementary (except for the SR = 11 state). When both S and R are 0, the
flip-flop is in a quiescent state and both outputs retain their value. If a positive (or 1) pulse
is applied to the S input, the Q output is forced into the 1 state (with Q' going to 0). Vice
versa, a 1 pulse on R resets the flip-flop and the Q output goes to 0.
When both S and R are high, both Q and Q' are forced to zero. This is forbidden. An
additional problem with this condition is that when the input triggers return to their zero
levels, the resulting state of the latch is unpredictable and depends on whatever input is last to
go low.
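The set, reset, hold and forbidden cases of the NOR-based SR flip-flop can be reproduced with a small gate-level model. The sketch below is a simplified illustration: it re-evaluates the two cross-coupled NOR gates until the outputs stop changing and ignores real gate delays. The function names are hypothetical.

```python
def nor(a, b):
    return int(not (a or b))


def sr_latch(s, r, q, q_bar):
    """Evaluate the cross-coupled NOR pair until the outputs are stable."""
    for _ in range(4):                    # a few passes are enough to settle
        new_q = nor(r, q_bar)
        new_q_bar = nor(s, q)
        if (new_q, new_q_bar) == (q, q_bar):
            break
        q, q_bar = new_q, new_q_bar
    return q, q_bar


state = (0, 1)                            # start with Q = 0, Q' = 1
after_set = sr_latch(1, 0, *state)        # pulse on S
after_hold = sr_latch(0, 0, *after_set)   # S = R = 0 keeps the value
after_reset = sr_latch(0, 1, *after_hold) # pulse on R
forbidden = sr_latch(1, 1, *after_reset)  # S = R = 1 forces both outputs low
print(after_set, after_hold, after_reset, forbidden)
```

In the forbidden case both outputs are 0, so they are no longer complementary, which is exactly the problem described above.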
CMOS clocked SR flip-flop
One possible realization of a clocked SR flip-flop (a level-sensitive positive latch) is
shown in Figure. It consists of a cross-coupled inverter pair, plus 4 extra transistors to drive
the flip-flop from one state to another and to provide clocked operation.
Multiplexer-Based Latches
Advantage: the sizing of devices only affects performance and is not critical to the
functionality. For a negative latch, when the clock signal is low, the input 0 of the
multiplexer is selected, and the D input is passed to the output. When the clock signal is
high, the input 1 of the multiplexer, which connects to the output of the latch, is selected.
The feedback holds the output stable while the clock signal is high.
A transistor level implementation of a positive latch based on multiplexers is shown in
Figure.
 When CLK is high, the bottom transmission gate is on and the latch is transparent -
that is, the D input is copied to the Q output.
 The feedback does not have to be overridden to write the memory and hence sizing of
transistors is not critical for realizing correct functionality. The number of transistors
that the clock touches is important since it has an activity factor of 1.
 This design is not efficient by this metric, as it presents a load of 4 transistors to the
CLK signal.
The clock load can be reduced to 2 transistors by using NMOS-only pass transistors, as shown
in Figure.
Advantages:
 Reduced clock load of only two NMOS devices.
 Simple circuit.
Disadvantage:
Results in passing of a degraded high voltage of VDD - VTn to the input of the first inverter.
This impacts both noise margin and the switching performance, especially in the case of
low values of VDD and high values of VTn. It also causes static power dissipation in first
inverter. Since the maximum input-voltage to the inverter equals VDD-VTn, the PMOS
device of the inverter is never turned off, resulting in a static current flow.
Master-Slave Edge-Triggered Register
 The register consists of cascading a negative latch (master stage) with a positive
latch (slave stage).
 On the low phase of the clock, the master stage is transparent, and the D input is passed
to the master stage output, QM. During this period, the slave stage is in the hold mode,
keeping its previous value using feedback.
 On the rising edge of the clock, the master stage stops sampling the input, and the slave
stage starts sampling. During the high phase of the clock, the slave stage samples the
output of the master stage (QM), while the master stage remains in a hold mode. Since
QM is constant during the high phase of the clock, the output Q makes only one
transition per cycle.
 The value of Q is the value of D right before the rising edge of the clock, achieving the
positive edge-triggered effect. A negative edge-triggered register can be constructed
using the same principle by simply switching the order of the positive and negative
latch (that is, placing the positive latch first).
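The master-slave behavior described in the bullets above can be sketched with two level-sensitive latch models in cascade. This is a purely logical illustration with ideal, non-overlapping clock phases. The class names and the stimulus are invented for the example.

```python
class Latch:
    """Level-sensitive latch, transparent when clk equals transparent_level."""

    def __init__(self, transparent_level):
        self.level = transparent_level
        self.q = 0

    def step(self, clk, d):
        if clk == self.level:
            self.q = d
        return self.q


class MasterSlaveRegister:
    """Negative latch (master) followed by a positive latch (slave)."""

    def __init__(self):
        self.master = Latch(transparent_level=0)
        self.slave = Latch(transparent_level=1)

    def step(self, clk, d):
        qm = self.master.step(clk, d)     # master samples D while clk is low
        return self.slave.step(clk, qm)   # slave passes QM while clk is high


reg = MasterSlaveRegister()
# (clk, d) pairs: D falls while clk is high, but Q only updates at the next rising edge
stimulus = [(0, 1), (1, 1), (1, 0), (0, 0), (1, 0)]
q_trace = [reg.step(clk, d) for clk, d in stimulus]
print(q_trace)
```

The change of D during the high phase is not visible at Q, because the master is in hold mode then; Q takes the new value only after the following rising edge.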
A complete transistor-level implementation of the master-slave positive edge-triggered
register is shown in Figure below.
Drawback of the transmission gate register: the high capacitive load presented to the clock
signal. The clock load per register is important, since it directly impacts the power
dissipation of the clock network. Each register has a clock load of 8 transistors. One
approach to reduce the clock load at the cost of robustness is to make the circuit ratioed.
Figure below shows that the feedback transmission gate can be eliminated by directly cross
coupling the inverters.
Another problem with this scheme is reverse conduction, that is, the second stage can
affect the state of the first latch. When the slave stage is on (Figure above), it is possible for
the combination of T2 and I4 to influence the data stored in I1-I2 latch. As long as I4 is a
weak device, this is fortunately not a major problem.
Non-ideal clock signals
Variations can exist in the wires used to route the two clock signals, or the load
capacitances can vary based on data stored in the connecting latches. This effect, known as
clock skew, is a major problem and causes the two clock signals to overlap, as is shown in
Figure 7.20b. Clock overlap can cause two types of failures, as illustrated for the NMOS-
only negative master-slave register.
 When the clock goes high, the slave stage should stop sampling the master stage
output and go into a hold mode. However, since CLK and CLK bar are both high for
a short period of time (the overlap period), both sampling pass transistors conduct
and there is a direct path from the D input to the Q output. As a result, data at the
output can change on the rising edge of the clock. This is a race condition in which
the value of the output Q is a function of whether the input D arrives at node X
before or after the falling edge of CLK bar. If node X is sampled in the metastable state,
the output will switch to a value determined by noise in the system.
 The primary advantage of the multiplexer-based register is that the feedback loop is
open during the sampling period, and therefore sizing of devices is not critical to
functionality. However, if there is clock overlap between CLK bar and CLK, node A
can be driven by both D and B, resulting in an undefined state.
Those problems can be avoided by using two non-overlapping clocks PHI1 and PHI2
instead, and by keeping the non-overlap time tnon_overlap between the clocks large
enough such that no overlap occurs even in the presence of clock-routing delays.
Dynamic Latches and Registers
Dynamic latches and registers are the class of circuits based on temporary storage of
charge on parasitic capacitors. Charge stored on a capacitor can be used to represent a
logic signal. The absence of charge denotes a 0, while its presence stands for a stored 1.
Because the stored charge leaks away, a periodic refresh of its value is necessary.
Hence the name dynamic storage.
Dynamic Transmission-Gate Edge-triggered Registers:
A fully dynamic positive edge-triggered register based on the master-slave concept is
shown in Figure below.
 When CLK = 0, the input data is sampled on storage node 1, which has an equivalent
capacitance of C1 consisting of the gate capacitance of I1, the junction capacitance
of T1, and the overlap gate capacitance of T1.
 During this period, the slave stage is in a hold mode, with node 2 in a high-
impedance (floating) state.
 On the rising edge of clock, the transmission gate T2 turns on, and the value sampled
on node 1 right before the rising edge propagates to the output Q
 Node 2 now stores the inverted version of node 1.
This register is very efficient, as it requires only 8 transistors. The sampling switches
can be implemented using NMOS-only pass transistors (6-transistor implementation).
The set-up time of this circuit is simply the delay of the transmission gate, and corresponds
to the time it takes node 1 to sample the D input. The hold time is approximately zero, since
the transmission gate is turned off on the clock edge and further input changes are ignored.
The propagation delay (tc-q) is equal to two inverter delays plus the delay of the
transmission gate T2.
Race Condition and Preventive Measures
Clock overlap is an important concern for this dynamic register. Consider the clock
waveforms shown in Figure below. During the 0-0 overlap period, the PMOS of T1 and
the PMOS of T2 are simultaneously on, creating a direct path for data to flow from the D
input of the register to the Q output. This is known as a race condition, in which the
value of the output Q is a function of whether the input D arrives at node X before or
after the rising edge of CLK. If the overlap period is large, the output Q can change on
the falling edge of the clock, which is undesirable for a positive edge-triggered register.
The same is true for the 1-1 overlap region, where an input-output path exists through
the NMOS of T1 and the NMOS of T2. The latter case is taken care of by enforcing a hold
time constraint: the data must be stable during the high-high overlap period. The former
situation (0-0 overlap) can be addressed by making sure that there is enough delay
between the D input and node 2, ensuring that new data sampled by the master stage does
not propagate through to the slave stage. Generally the built-in single inverter delay
should be sufficient, and the overlap period constraint is given as:
$t_{overlap0-0} < t_{T1} + t_{I1} + t_{T2}$
Similarly, the constraint for the 1-1 overlap is given as:
$t_{hold} > t_{overlap1-1}$
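The two overlap conditions can be turned into a quick sanity check. The sketch below assumes the commonly quoted forms of the constraints: the 0-0 overlap must be shorter than the delay through T1, I1 and T2, and the hold time must exceed the 1-1 overlap. The function name and all delay values (in picoseconds) are hypothetical.

```python
def overlap_constraints_met(t_overlap_00, t_overlap_11, t_t1, t_i1, t_t2, t_hold):
    """Check both race conditions of the dynamic transmission-gate register."""
    # 0-0 overlap: new data must not reach node 2 before the overlap ends
    ok_00 = t_overlap_00 < t_t1 + t_i1 + t_t2
    # 1-1 overlap: the input must be held stable for longer than the overlap
    ok_11 = t_hold > t_overlap_11
    return ok_00 and ok_11


# Hypothetical delays in picoseconds
safe = overlap_constraints_met(t_overlap_00=40, t_overlap_11=30,
                               t_t1=20, t_i1=25, t_t2=20, t_hold=35)
racy = overlap_constraints_met(t_overlap_00=80, t_overlap_11=30,
                               t_t1=20, t_i1=25, t_t2=20, t_hold=35)
print(safe, racy)
```

In the second call the 0-0 overlap (80 ps) exceeds the 65 ps path delay, so data could race through both stages.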
Impact of overlapping clocks.
C²MOS: A Clock-Skew Insensitive Approach (Method to Prevent Race Condition)
Figure below shows an ingenious positive edge-triggered register, based on a master-slave
concept insensitive to clock overlap. This circuit is called the C²MOS (Clocked CMOS)
register, and operates in two phases.
1. CLK = 0 (CLK bar = 1): The first tri-state driver is turned on, and the master stage
acts as an inverter sampling the inverted version of D on the internal node X. The
master stage is in the evaluation mode. Meanwhile, the slave section is in a high-
impedance mode, or in a hold mode. Both transistors M7 and M8 are off, decoupling
the output from the input. The output Q retains its previous value stored on the
output capacitor CL2.
2. The roles are reversed when CLK = 1: The master stage section is in hold mode
(M3-M4 off), while the second section evaluates (M7-M8 on). The value stored on
CL1 propagates to the output node through the slave stage, which acts as an inverter.
In the (0-0) overlap case, both PMOS devices are on during this period. New data is
sampled on node X through the series PMOS devices M2-M4, and node X can make a 0-to-1
transition during the overlap period. However, this data cannot propagate to the output
since the NMOS device M7 is turned off. At the end of the overlap period, CLK bar = 1 and
both M7 and M8 turn off, putting the slave stage in the hold mode.
In the (1-1) overlap case, both NMOS devices M3 and M7 are turned on. If the D input
changes during the overlap period, node X can make a 1-to-0 transition, but this cannot
propagate to the output. However, as soon as the overlap period is over, the PMOS M8 is
turned on and the 0 propagates to the output. This effect is not desirable.
The problem is fixed by imposing a hold time constraint on the input data, D, or, in other
words, the data D should be stable during the overlap period.
Pipelining: An approach to optimize sequential circuits
Pipelining is a popular design technique often used to accelerate the operation of the
datapaths in digital processors. The idea is easily explained with the example of
Figure (a). The goal of the presented circuit is to compute log(|a + b|), where both a and b
represent streams of numbers, that is, the computation must be performed on a large set of
input values.
The minimal clock period Tmin necessary to ensure correct evaluation is given as:
$T_{min} = t_{c-q} + t_{pd,logic} + t_{su}$
where tc-q and tsu are the propagation delay and the set-up time of the register, respectively.
We assume that the registers are edge-triggered D registers. The term tpd,logic stands for
the worst-case delay path through the combinational network, which consists of the adder,
absolute value, and logarithm functions. In conventional systems, the latter delay is
generally much larger than the delays associated with the registers and dominates the
circuit performance. Assume that each logic module has an equal propagation delay. We
note that each logic module is then active for only 1/3 of the clock period (if the delay of
the register is ignored). For example, the adder unit is active during the first third of the
period and remains idle (that is, it does no useful computation) during the other 2/3 of
the period.
Figure: datapath for computing log(|a + b|): (a) reference circuit, (b) pipelined version.
Pipelining is a technique to improve the resource utilization, and increase the functional
throughput. Assume that we introduce registers between the logic blocks, as shown in
Figure b. This causes the computation for one set of input data to spread over a number of
clock periods, as shown in Table. The advantage of pipelined operation becomes apparent
when examining the minimum clock period of the modified circuit. The combinational
circuit block has been partitioned into three sections, each of which has a smaller
propagation delay than the original function. This effectively reduces the value of the
minimum allowable clock period:
$T_{min,pipe} = t_{c-q} + \max(t_{pd,add}, t_{pd,abs}, t_{pd,log}) + t_{su}$
Suppose that all logic blocks have approximately the same propagation delay, and that the
register overhead is small with respect to the logic delays. The pipelined network
outperforms the original circuit by a factor of three under these assumptions, or
Tmin,pipe = Tmin/3. The increased performance comes at the relatively small cost of two
additional registers, and an increased latency.
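The gain from pipelining can be estimated with a short calculation. The sketch below assumes the unpipelined period is the register overhead plus the sum of the stage delays, while the pipelined period is the register overhead plus the slowest stage. The delay values (in picoseconds) and function names are made up.

```python
def t_min_unpipelined(t_cq, stage_delays, t_su):
    """All logic stages lie in one register-to-register path."""
    return t_cq + sum(stage_delays) + t_su


def t_min_pipelined(t_cq, stage_delays, t_su):
    """A register after every stage: the slowest stage sets the period."""
    return t_cq + max(stage_delays) + t_su


# Hypothetical adder, absolute-value and logarithm delays in picoseconds
stages = [300, 300, 300]
t_ref = t_min_unpipelined(t_cq=20, stage_delays=stages, t_su=10)
t_pipe = t_min_pipelined(t_cq=20, stage_delays=stages, t_su=10)
speedup = t_ref / t_pipe
ideal = t_min_unpipelined(0, stages, 0) / t_min_pipelined(0, stages, 0)
print(t_ref, t_pipe, round(speedup, 2), ideal)
```

With zero register overhead the speedup is exactly three. With realistic overhead it is somewhat less, and it drops further if the stage delays are unbalanced.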
Latch- vs. Register-Based Pipelines
Consider the pipelined circuit of Figure below. The pipeline system is implemented based
on pass-transistor-based positive and negative latches instead of edge-triggered registers.
Latch-based systems give significantly more flexibility in implementing a pipelined
system, and often offer higher performance. When the clocks CLK and CLK bar are non-
overlapping, correct pipeline operation is obtained. Input data is sampled on C1 at the
negative edge of CLK and the computation of logic block F starts; the result of the logic
block F is stored on C2 on the falling edge of CLK bar, and the computation of logic block G
starts. The non-overlapping of the clocks ensures correct operation. The value stored on
C2 at the end of the CLK low phase is the result of passing the previous input (stored on
the falling edge of CLK on C1) through the logic function F.
NORA-CMOS—A Logic Style for Pipelined Structures
The latch-based pipeline circuit can also be implemented using C²MOS latches, as shown
in Figure below. This topology has one additional, important property: a C²MOS-based
pipelined circuit is race-free as long as all the logic functions F between the latches are
non-inverting.
The reasoning for the above argument is similar to the argument made in the construction
of a C²MOS register. During a (0-0) overlap between CLK and CLK bar, all C²MOS latches
simplify to pure pull-up networks (see Figure 7.27).
The only way a signal can race from stage to stage under this condition is when the logic
function F is inverting, as illustrated in Figure above, where F is replaced by a single,
static CMOS inverter. Similar considerations are valid for the (1-1) overlap.
Sources of Clock Skew and Jitter
A perfect clock is defined as a perfectly periodic signal that is simultaneously triggered at
various memory elements on the chip. However, due to a variety of process and
environmental variations, clocks are not ideal. To illustrate the sources of skew and jitter,
consider the simplistic view of clock generation and distribution as shown in Figure below.
Typically, a high frequency clock is either provided from off chip or generated on-chip.
From a central point, the clock is distributed using multiple matched paths to low-level
memory elements (registers). Here two paths are shown. The clock paths include wiring and
the associated distributed buffers required to drive interconnects and loads. A key point to
realize in clock distribution is that the absolute delay through a clock distribution path is
not important; what matters is the relative arrival time between the outputs of each path
at the register points.
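The point that only relative arrival times matter can be shown with a small calculation. The sketch below is purely illustrative: the register names and arrival times (in picoseconds) are invented, and skew is taken here as the spread between the latest and earliest clock arrival.

```python
def clock_skew(arrival_times):
    """Spread between the latest and earliest clock arrival (same units as input)."""
    values = list(arrival_times.values())
    return max(values) - min(values)


# Hypothetical clock arrival times at two registers, in picoseconds
arrivals = {"reg_a": 410, "reg_b": 435}
skew = clock_skew(arrivals)

# Adding the same 100 ps of absolute delay to every path leaves the skew unchanged
shifted = {name: t + 100 for name, t in arrivals.items()}
skew_shifted = clock_skew(shifted)
print(skew, skew_shifted)
```

The skew stays at 25 ps in both cases, which is why clock distribution networks are designed for matched paths rather than minimum absolute delay.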
The sources of clock uncertainty can be classified in several ways. Systematic errors are
nominally identical from chip to chip, and are typically predictable (e.g., variation in total
load capacitance of each clock path). In principle, such errors can be modeled and
corrected at design time given sufficiently good models and simulators. Random errors are
due to manufacturing variations (e.g., dopant fluctuations that result in threshold
variations) that are difficult to model and eliminate. Mismatch may also be characterized as
static or time-varying. Below, the various sources of skew and jitter, introduced in Figure
10.14, are described in detail.
 Clock-Signal Generation (1)
The generation of the clock signal itself causes jitter. A typical on-chip clock
generator takes a low-frequency reference clock signal, and produces a high-
frequency global reference for the processor. The core of such a generator is a
Voltage-Controlled Oscillator (VCO). A major problem is coupling from the surrounding
noisy digital circuitry through the substrate. These noise sources cause temporal
variations of the clock signal that propagate unfiltered through the clock drivers to
the flip-flops.
 Manufacturing Device Variations (2)
Distributed buffers are integral components of the clock distribution networks, as
they are required to drive both the register loads and the global and local
interconnects. The matching of devices in the buffers along multiple clock paths is
critical to minimizing timing uncertainty. Device parameters in the buffers vary
along different paths, resulting in static skew. There are many sources of variations,
including oxide variations (that affect the gain and threshold), dopant variations,
and lateral dimension (width and length) variations.
 Interconnect Variations (3)
Vertical and lateral dimension variations cause the interconnect capacitance and
resistance to vary across a chip. Since this variation is static, it causes skew between
different paths. One important source of interconnect variation is variation in the
Inter-level Dielectric (ILD) thickness. Other interconnect variations include deviations
in the width of the wires and in line spacing, which result from photolithography and
etch dependencies.
 Environmental Variations (4 and 5)
The two major sources are temperature and power supply. Temperature gradients
across the chip are a result of variations in power dissipation across the die (chip). This
is an issue with clock gating, where some parts of the chip may be idle while other
parts of the chip might be active. Since the device parameters (such as threshold,
mobility, etc.) depend strongly on temperature, the buffer delay of a clock distribution
network along one path can differ drastically from that along another path. The delay
through buffers is a very strong function of power supply, as it directly affects the drive
of the transistors. As with temperature, the power supply voltage is a strong function of
the switching activity. Power supply variations can be classified into static (or slow) and
high-frequency variations. Static power supply variations may result from fixed
currents drawn from various modules, while high-frequency variations result from
instantaneous IR drops along the power grid due to fluctuations in switching activity.
 Capacitive Coupling (6 and 7)
The variation in capacitive load also contributes to timing uncertainty. There are two
major sources of capacitive load variations: coupling between the clock lines and
adjacent signal wires and variation in gate capacitance. Any coupling between the
clock wire and adjacent signal results in timing uncertainty leading to clock jitter.
Another major source of clock uncertainty is variation in the gate capacitance related
to the sequential elements. The load capacitance is highly non-linear and depends on
the applied voltage.
Timing Issues in Digital Circuits, Clock Distribution Techniques, Synchronous and
Asynchronous Design
All sequential circuits have one property in common—a well-defined ordering of the
switching events must be imposed if the circuit is to operate correctly. If this were not the
case, wrong data might be written into the memory elements, resulting in a functional
failure. The synchronous system approach, in which all memory elements in the system are
simultaneously updated using a globally distributed periodic synchronization signal (that
is, a global clock signal), represents an effective and popular way to enforce this ordering.
Functionality is ensured by imposing some strict constraints on the generation of the clock
signals and their distribution to the memory elements distributed over the chip; non-
compliance often leads to malfunction.
We analyze the impact of spatial variations of the clock signal, called clock skew, and
temporal variations of the clock signal, called clock jitter, and introduce techniques to cope
with them. These variations fundamentally limit the performance that can be achieved using a
conventional design methodology.
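The distinction between skew (spatial) and jitter (temporal) can be made concrete with a minimal sketch. It assumes clock edge arrival times are available as integers in picoseconds; the function names and sample values are hypothetical, and the jitter measure used here (largest deviation of an observed period from the nominal period) is only one of several possible definitions.

```python
def clock_skew(arrival_at_a_ps, arrival_at_b_ps):
    """Spatial variation: arrival-time difference of the same clock edge
    at two different points on the chip."""
    return arrival_at_b_ps - arrival_at_a_ps


def observed_periods(edge_times_ps):
    """Periods between consecutive clock edges seen at one point."""
    return [t1 - t0 for t0, t1 in zip(edge_times_ps, edge_times_ps[1:])]


def peak_period_jitter(edge_times_ps, nominal_period_ps):
    """Temporal variation: largest deviation of an observed period
    from the nominal period at a single point."""
    return max(abs(p - nominal_period_ps) for p in observed_periods(edge_times_ps))


# Hypothetical measurements, all in picoseconds.
skew = clock_skew(120, 150)                      # same edge, two registers
jitter = peak_period_jitter([0, 1000, 2010, 2995], 1000)
print(skew, jitter)
```

Skew compares one edge at two locations, so it is constant from cycle to cycle; jitter compares successive edges at one location, so it changes from cycle to cycle.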
At the other end of the design spectrum is an approach called asynchronous design,
which avoids the problem of clock uncertainty altogether by eliminating the need for
globally distributed clocks. After discussing the basics of the asynchronous design approach,
we analyze the associated overhead and identify some practical applications. The important
issue of synchronization, which is required when interfacing different clock domains
or when sampling an asynchronous signal, also deserves some in-depth treatment. Finally,
the fundamentals of on-chip clock generation using feedback are introduced along with
trends in timing.
Timing Classification Of Digital Systems
In digital systems, signals can be classified depending on how they are related to a local
clock. Signals that transition only at predetermined periods in time can be classified as
synchronous, mesochronous, or plesiochronous with respect to a system clock. A signal that
can transition at arbitrary times is considered asynchronous.
 Synchronous Interconnect: A signal with exactly the same frequency as, and a known fixed
phase offset with respect to, the local clock.
 Mesochronous Interconnect: A signal with the same frequency as the local clock but an unknown
phase offset with respect to it.
 Plesiochronous Interconnect: A signal whose frequency is nominally the same as, but slightly
different from, that of the local clock.
 Asynchronous Interconnect: Asynchronous signals can transition at any arbitrary
time, and are not slaved to any local clock.
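The four classes above can be summarized as a small decision procedure. This is a sketch under stated assumptions: frequencies are given as integers in hertz so that "same frequency" can be tested exactly, and the `nominal_tolerance` parameter that decides what counts as "nominally the same" is hypothetical, not a value from these notes.

```python
def classify_interconnect(tied_to_a_clock, signal_freq_hz, local_freq_hz,
                          phase_offset_known, nominal_tolerance=1e-3):
    """Classify a signal relative to the local clock.

    tied_to_a_clock    : False if the signal may transition at arbitrary times
    phase_offset_known : True if the phase offset to the local clock is fixed and known
    nominal_tolerance  : assumed relative frequency difference still treated
                         as "nominally the same" (hypothetical threshold)
    """
    if not tied_to_a_clock:
        return "asynchronous"
    if signal_freq_hz == local_freq_hz:
        return "synchronous" if phase_offset_known else "mesochronous"
    relative_difference = abs(signal_freq_hz - local_freq_hz) / local_freq_hz
    if relative_difference <= nominal_tolerance:
        return "plesiochronous"
    # The classification in the text does not name this case.
    raise ValueError("frequencies differ by more than the nominal tolerance")


# Hypothetical 100 MHz local clock.
print(classify_interconnect(True, 100_000_000, 100_000_000, True))
print(classify_interconnect(True, 100_000_000, 100_000_000, False))
print(classify_interconnect(True, 100_000_100, 100_000_000, False))
print(classify_interconnect(False, 0, 100_000_000, False))
```

The order of the checks mirrors the definitions: first whether the signal is tied to any clock at all, then whether the frequencies match exactly, and only then whether the phase is known.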
Synchronous Design:
Synchronous Timing Basics
Virtually all systems designed today use a periodic synchronization signal or clock. The generation
and distribution of a clock has a significant impact on performance and power dissipation.
In the ideal world, assuming the clock paths from a central distribution point to each
vlsi.pdf important qzn answer for ece department

  • 1. EC8095: VLSI Design Department of ECE 2020-2021 St.Joseph’s College of Engineering / St.Joseph’s Institute of Technology 1 UNIT – I INTRODUCTION - BASIC MOS TRANSISTOR The invention of the transistor by William B. Shockley, Walter H. Brattain and John Bardeen of Bell Telephone laboratories was followed by the development of the Integrated circuit (IC) The very first IC emerged at the beginning of 1960 and since that time there have already been 4 generations of ICs 1) SSI ( Small Scale Integration) 2) MSI ( Medium Scale Integration) 3) LSI ( Large Scale Integration) 4) VLSI ( Very Large Scale Integration) Now we see the emergence of the 5th generation, ULSI ( Ultra Large Scale Integration) which is characterized by complexities in excess of 3 million devices on a single IC chip.Within the bounds of MOS technology, the possible circuit realizations may be based on pMOS, nMOS, CMOS and now BiCMOS devices. Although CMOS is the dominant technology, some of the examples used to illustrate the design processes will be presented in nMOS form. The reasons are : 1) For NMOS technology, the design methodology and the design rules are easily learned, thus providing a simple but excellent introduction to structured design for VLSI. 2) nMOS technology and design processes provide an excellent background for other technologies. In particular some familiarity with nMOS allows a relatively easy transition to CMOS technology and design. 3) For GaAs technology some arrangements in relation to logic design are similar to those employed in nMOS technology. Therefore, understanding the basics of nMOS design will assist in the layout of GaAs circuits. BASIC MOS TRANSISTORS nMOS devices are formed in a p-type substrate of moderate doping level. 
The source and drain regions are formed by diffusing n-type impurities through suitable masks into 3 areas to give the desired n-impurity concentration and give rise to depletion regions which extend mainly in the more lightly doped p-region.  Thus, source and drain are isolated from one another by 2 diodes.  Connections to the source and drain are made by a deposited metal layer. . ( Fig a)  A polysilicon gate is deposited on a layer of insulation over the region between source and drain  If the gate is connected to a suitable positive voltage with respect to the source, then the electric field established between the gate and the substrate gives rise to a charge inversion region in the substrate under the gate insulation and a conducting path or channel is formed between source and drain.  Channel may also be established so that it is present under the condition Vgs = 0 by implanting suitable impurities in the region between the insulation and the gate. (fig b)  Substrate is of n-type material and the source and drain diffusions are consequently p-type.(fig c)
  • 2. EC8095: VLSI Design Department of ECE 2020-2021 St.Joseph’s College of Engineering / St.Joseph’s Institute of Technology 2 ENHANCEMENT MODE TRANSISTOR ACTION:  In order to establish the channel in the first place a min. voltage level of threshold voltage Vt must be established between gate and source.  Fig (a) indicates the conditions prevailing with the channel established but no current flowing between source and drain (Vds = 0)  Condition: When current flows in the channel by applying a voltage Vds between drain and source.  Corresponding IR drop = Vds along the channel.  This results in the voltage between gate and channel varying with distance along the channel with the voltage being a max. ofVgs at the source end.  Effective voltage Vg = Vgs-Vt, there will be voltage available to invert the channel at the drain end so long as Vgs – Vt>= Vds.  Limiting condition comes when Vds = Vgs – Vt.  For all voltages Vds<Vgs – Vt, the device is in the non-saturated region of operation.  IR drop = Vgs –Vt takes place over less than the whole length of the channel so that over part of the channel, near the drain, there is insufficient electric field available to give rise to inversion layer to create the channel.  Diffusion current completes the path from source to drain causing the channel to exhibit a high resistance known as saturation region. DEPLETION MODE TRANSISTOR ACTION  The channel is established, due to the implant, even when Vgs = 0 and to cause the channel to cease to exist a –ve voltage Vtd must be applied between gate and source. Vtd is typically < -0.8Vdd, depending on the implant and substrate bias, but threshold voltage differences apart. 
Drain to source current Ids versus voltage Vds relationships  The whole concept of the MOS transistor evolves from the use of a voltage on the gate to induce a charge in the channel between source and drain, which may then be caused to move from source to drain under the influence of an electric field created by voltage Vds applied between source and drain.  Since the charge induced is dependent on the gate to source voltage Vgs then Ids is independent on both Vgs and Vds.  Consider a structure in which electrons will flow from source to drain. = , First, transit time ζ sd But velocity ,Where μ = electron or hole mobility (surface) Eds = electric field (drain to source) ; Now , So that , Thus, Typical values of μ at room temp. areμn = 650 cm2 /Vsec ( surface) μp = 240 cm2 /Vsec (surface) Non Saturated region:
  • 3. EC8095: VLSI Design Department of ECE 2020-2021 St.Joseph’s College of Engineering / St.Joseph’s Institute of Technology 3  Charge induced in channel due to gate voltage is due to to the voltage difference between the gate and the channel Vgs  Voltage along the channel varies linearly with distance X from source due to the IR drop in the channel.  Assuming the device is not saturated then the average value is Vds/2  Effective gate voltage Vg = Vgs-Vt, Where Vt is the threshold voltage needed to invert the charge under the gate and establish the channel. , Thus induced charge , Where Eg= avg. electric field gate to channel εins = relative permittivity of insulation between gate and channel ε0 = permittivity of free space = 8.85x10-14 Fcm-1 Where D = oxide thickness Thus 3 Combine eqn 2 & 3 in 1 , we have or in the non saturated or resistive region where Vds<Vgs - Vtand /D The factor W/L is of course contributed by the geometry and it is a common practice to write  = K. W/L so that Ids =    2 / ) ( 2 ds V Vds Vt Vgs    4a ( Alternate form of Eqn 4) Gate/Channel Capacitance (parallel plate) Also , so Sometimes it is convenient to use gate capacitance per unit area Co rather than Cg. Noting that Cg = Co WL We may also write , Ids = Co W/L  2 / ) ( 2 ds V Vds Vt Vgs   4c Saturated region: Saturation begins when Vds = Vgs - Vt. Since at this point the IR drop in the channel equals the effective gate to channel voltage at the drain and we may assume that the current remains fairly constant as Vds increases further. Ideal I-V Characteristics Drain current of MOS device in different operating regions. MOS transistors have three regions of operation: • Cutoff or sub-threshold region •Linear region • Saturation region The long-channel model assumes that the current through an OFF transistor is 0.When a transistor turns ON (Vgs>Vt),the gate attracts carriers(electrons) to form a channel. 
The electrons drift from source to drain at a rate proportional to the electric field between these regions. Thus, we can compute currents if we know the amount of charge in the channel and the rate at which it moves. We know that the charge on each plate of a capacitor is Q=CV. Thus, the charge in the channel Qchannel is where Cg is the capacitance of the gate to the channel and Vgc-Vt is the amount of voltage attracting charge to the channel beyond the minimum required to invert from pton. The gate voltage is referenced to the channel, which is not grounded. If the source is at Vs and
  • 4. EC8095: VLSI Design Department of ECE 2020-2021 St.Joseph’s College of Engineering / St.Joseph’s Institute of Technology 4 the drain is at Vd, the average is Vc=(Vs+Vd)/2= Vs+Vds/2. Therefore, the mean difference between the gate and channel potentials Vgc is Vg–Vc=Vgs–Vds /2,as shown in Figure 2.5. We can model the gate as a parallel plate capacitor with capacitance proportional to area over thickness. If the gate has length L and width W and the oxide thickness is tox, as shown in Figure2.6, the capacitance is Where ε0 is the permittivity of frees pace,8.85×10–14F/cm,andthepermittivityofSiO2is kox=3.9times as great. Often, the εox/tox term is called Cox, the capacitance per unit area of the gate oxide. Some nanometer processes use a different gate dielectric with a higher dielectric constant. In these processes, tox the equivalent oxide thickness (EOT), the thickness of a layer of SiO2 that has the same Cox. In this case, tox is thinner than the actual dielectric. Each carrier in the channel is accelerated to an average velocity, v, proportional to the lateral electric field, i.e., the field between source and drain. The constant of proportionality μ is called the mobility. The electric field E is the voltage difference between drain and source Vds divided by the channel length . The time required for carriers to cross the channel is the channel length divided by the carrier velocity: L/v. Therefore, the current between source and drain is the total amount of charge in the channel divided by the time required to cross The term Vgs–Vt arises so often that it is convenient to abbreviate it as VGT. Equation describes the linear region of operation, for Vgs>Vt, but Vds relatively small. It is called linear or resistive because when Vds<<VGT, Ids increases almost linearly with Vds, just like an ideal resistor. The geometry and technology- dependent parameters are sometimes merged into a single factor ᵝ . 
If Vds>Vdsat-VGT, the channel is no longer inverted in the vicinity of the drain; we say it is pinched off. Beyond this point, called the drain saturation voltage, increasing the drain voltage has no further effect on current. Substituting Vds=Vdsat at this point of maximum current into Eq(2.5),we find an expression for the saturation current that is independent of Vds. … This expression is valid for Vgs>Vt and Vds>Vdsat. Thus, long-channel MOS transistors are said to exhibit square-law behavior in saturation. Two key figures of merit for a transistor are Ion and Ioff. Ion (also called Idsat) is the ON current, Ids, when Vgs=Vds=VDD. Ioff is the OFF current when Vgs=0 and Vds=VDD. According to the long-channel model, Ioff=0and . Figure 2.7(a) showsthe I-Vcharacteristicsforthe transistor.Accordingtothefirst-ordermodel,the current
  • 5. EC8095: VLSI Design Department of ECE 2020-2021 St.Joseph’s College of Engineering / St.Joseph’s Institute of Technology 5 is zero for gate voltages below Vt. For higher gate voltages, current increases linearly with Vds for small Vds. As Vds reaches the saturation point Vdsat=VGT, current rolls off and eventually becomes independent of Vds when the transistor is saturated. pMOS transistors behave in the same way, but with the signs of all voltages and currents reversed. The I-V characteristics are in the third quadrant, as shown in Figure2.7 (b). Non -Ideal I-V Effects The saturation current increases less than quadratically with increasing Vgs . This is caused by two effects: velocity saturation and mobility degradation.  At high lateral field strengths (Vds /L), carrier velocity ceases to increase linearly with field strength. This is called velocity saturation and results in lower Ids than expected at high Vds .  At high vertical field strengths (Vgs /tox ), the carriers scatter off the oxide interface more often, slowing their progess. This mobility degradation effect also leads to less current than expected at high Vgs .  The saturation current of the nonideal transistor increases somewhat with Vds . This is caused by channel length modulation, in which higher Vds increases the size of the depletion region around the drain and thus effectively shortens the channel.  Increasing the potential between the source and body raises the threshold through the body effect. Increasing the drain voltage lowers the threshold through drain-induced barrier lowering. Increasing the channel length raises the threshold through the short channel effect.  When Vgs<Vt , the current drops off exponentially rather than abruptly becoming zero. This is called subthreshold conduction. The current into the gate Ig is ideally 0. 
However, as the thickness of gate oxides reduces to only a small number of atomic layers, electrons tunnel through the gate, causing some gate leakage current. The source and drain diffusions are typically reverse- biased diodes and also experience junction leakage into the substrate or well. Both mobility and threshold voltage decrease with rising temperature. The mobility effect tends to dominate for strongly ON transistors, resulting in lower Ids at high temperature. The threshold effect is most important for OFF transistors, resulting in higher leakage current at high temperature. In summary, MOS characteristics degrade with temperature.
  • 6. Mobility Degradation and Velocity Saturation
- Carrier drift velocity, and hence current, is proportional to the lateral electric field Elat = Vds/L between source and drain. The constant of proportionality is called the carrier mobility, μ. The long-channel model assumed that carrier mobility is independent of the applied fields.
- A high voltage at the gate of the transistor attracts the carriers to the edge of the channel, causing collisions with the oxide interface that slow the carriers. This is called mobility degradation.
- Carriers approach a maximum velocity vsat when high fields are applied. This phenomenon is called velocity saturation.
Channel Length Modulation
Ideally, Ids is independent of Vds for a transistor in saturation, making the transistor a perfect current source. In practice, the p–n junction between the drain and body forms a depletion region with a width Ld that increases with Vdb. The depletion region effectively shortens the channel length to
Leff = L - Ld
Assume the source voltage is close to the body voltage so Vdb = Vds. Hence, increasing Vds decreases the effective channel length. Shorter channel length results in higher current; thus, Ids increases with Vds in saturation. This can be crudely modeled by multiplying EQ (2.10) by a factor of (1 + Vds/VA), where VA is called the Early voltage. In the saturation region, writing Idsat for the ideal saturation current of EQ (2.10),
$$I_{ds} = I_{dsat}\left(1 + \frac{V_{ds}}{V_A}\right)$$
As channel length gets shorter, the effect of the channel length modulation becomes relatively more important. Hence, VA is proportional to channel length. This channel length modulation model is a gross oversimplification of nonlinear behavior and is more useful for conceptual understanding than for accurate device modeling.
Threshold Effects
So far, we have treated the threshold voltage as a constant.
However, Vt increases with the source voltage, decreases with the body voltage, decreases with the drain voltage, and increases with channel length. This section models each of these effects.
Body Effect
The body is an implicit fourth terminal. When a voltage Vsb is applied between the source and body, it increases the amount of charge required to invert the channel; hence, it increases the threshold voltage. The threshold voltage can be modeled as
$$V_t = V_{t0} + \gamma\left(\sqrt{\phi_s + V_{sb}} - \sqrt{\phi_s}\right)$$
where Vt0 is the threshold voltage when the source is at the body potential, ϕs is the surface potential at threshold and γ is the body effect coefficient, typically in the range 0.4 to 1 V^(1/2).
i. Drain-Induced Barrier Lowering (DIBL)
The drain voltage Vds creates an electric field that affects the threshold voltage. This drain-induced barrier lowering (DIBL) effect is especially pronounced in short-channel transistors.
- It can be modeled as Vt = Vt0 - ηVds, where η is the DIBL coefficient, typically on the order of 0.1 (often expressed as 100 mV/V).
Drain-induced barrier lowering causes Ids to increase with Vds in saturation, in much the same way as channel length modulation does. This effect can be lumped into a smaller Early voltage VA.
Short Channel Effects
The threshold voltage typically increases with channel length. This phenomenon is especially pronounced for small L where the source and drain depletion regions extend into a significant portion of the channel, and hence is called the short channel effect or Vt rolloff.
ii. Leakage
- Even when transistors are nominally OFF, they leak small amounts of current. Leakage mechanisms include subthreshold conduction between source and drain, gate leakage from the gate to body, and junction leakage from source to body and drain to body.
- Subthreshold conduction is caused by thermal emission of carriers over the potential barrier set by the threshold. Gate leakage is a quantum-mechanical effect caused by tunneling through the
  • 7. extremely thin gate dielectric. Junction leakage is caused by current through the p–n junction between the source/drain diffusions and the body.
Subthreshold Leakage
- The long-channel transistor I-V model assumes current only flows from source to drain when Vgs > Vt. In real transistors, current does not abruptly cut off below threshold, but rather drops off exponentially.
- When the gate voltage is high, the transistor is strongly ON. When the gate falls below Vt, the exponential decline in current appears as a straight line on a logarithmic scale. This regime of Vgs < Vt is called weak inversion.
- The subthreshold leakage current increases significantly with Vds because of drain-induced barrier lowering. There is a lower limit on Ids set by drain junction leakage that is exacerbated by the negative gate voltage.
- Subthreshold leakage current is described by EQ (2.42). Ids0 is the current at threshold and is dependent on process and device geometry.
Gate Leakage
According to quantum mechanics, the electron cloud surrounding an atom has a probabilistic spatial distribution. For gate oxides thinner than 15–20 Å, there is a nonzero probability that an electron in the gate will find itself on the wrong side of the oxide, where it will get whisked away through the channel. This effect of carriers crossing a thin barrier is called tunneling, and results in leakage current through the gate. Two physical mechanisms for gate tunneling are called Fowler-Nordheim (FN) tunneling and direct tunneling. FN tunneling is most important at high voltage and moderate oxide thickness and is used to program EEPROM memories. Direct tunneling is most important at lower voltage with thin oxides and is the dominant leakage component. The direct gate tunneling current can be estimated as
$$I_{gate} = W A \left(\frac{V_{DD}}{t_{ox}}\right)^{2} e^{-B\,t_{ox}/V_{DD}}$$
where A and B are technology constants.
Junction Leakage
The p–n junctions between diffusion and the substrate or well form diodes.
The well-to-substrate junction is another diode. The substrate and well are tied to GND or VDD to ensure these diodes do not become forward biased in normal operation. However, reverse-biased diodes still conduct a small amount of current ID:
$$I_D = I_S\left(e^{V_D/v_T} - 1\right)$$
where IS depends on doping levels and on the area and perimeter of the diffusion region, VD is the diode voltage (e.g., –Vsb or –Vdb), and vT is the thermal voltage. When a junction is reverse biased by significantly more than the thermal voltage, the leakage is just –IS, generally in the 0.1–0.01 fA/μm² range, which is negligible compared to other leakage mechanisms. More significantly, heavily doped drains are subject to band-to-band tunneling (BTBT) and gate-induced drain leakage (GIDL).
Temperature Dependence
Transistor characteristics are influenced by temperature. Carrier mobility decreases with temperature. An approximate relation is
$$\mu(T) = \mu(T_r)\left(\frac{T}{T_r}\right)^{-k_\mu}$$
where T is the absolute temperature, Tr is room temperature, and kμ is a fitting parameter with a
  • 8. typical value of about 1.5. vsat also decreases with temperature, dropping by about 20% from 300 to 400 K. The magnitude of the threshold voltage decreases nearly linearly with temperature and may be approximated by
$$V_t(T) = V_t(T_r) - k_{vt}\,(T - T_r)$$
where kvt is typically about 1–2 mV/K. Ion at high VDD decreases with temperature. Subthreshold leakage increases exponentially with temperature. Conversely, circuit performance can be improved by cooling:
- Subthreshold leakage is exponentially dependent on temperature, so lower threshold voltages can be used. Velocity saturation occurs at higher fields, providing more current.
- As mobility is also higher, these fields are reached at a lower power supply, saving power. Depletion regions become wider, resulting in less junction capacitance.
Geometry Dependence
- The layout designer draws transistors with width and length Wdrawn and Ldrawn. The actual gate dimensions may differ by some factors XW and XL.
- The source and drain tend to diffuse laterally under the gate by LD, producing a shorter effective channel length that the carriers must traverse between source and drain. Similarly, WD accounts for other effects that shrink the transistor width. The effective dimensions are
$$L_{eff} = L_{drawn} + X_L - 2L_D, \qquad W_{eff} = W_{drawn} + X_W - 2W_D$$
The factors of two come from lateral diffusion on both sides of the channel.
- Therefore, a transistor drawn twice as long may have an effective length that is more than twice as great. Similarly, two transistors differing in drawn widths by a factor of two may differ in saturation current by more than a factor of two.
- Threshold voltages also vary with transistor dimensions because of the short and narrow channel effects. Combining threshold changes, effective channel lengths, channel length modulation, and velocity saturation effects, Idsat does not scale exactly as 1/L.
In general, when currents must be precisely matched (e.g., in sense amplifiers or A/D converters), it is best to use the same width and length for each device.
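The claim that a transistor drawn twice as long can end up more than twice as long electrically follows directly from the constant offsets in the effective-dimension relation. The sketch below assumes the usual form Leff = Ldrawn + XL − 2·LD; the numbers (XL = 0, LD = 10 nm) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def effective_dimension(drawn, x, d):
    """Effective length or width: drawn + X - 2*D.

    x is the drawn-to-actual offset (XL or XW) and d is the per-side
    shrink (LD or WD); the factor of two covers both sides of the channel.
    """
    return drawn + x - 2 * d


# Hypothetical process values in nanometres (assumptions for illustration).
X_L = 0
L_D = 10

l_short = effective_dimension(100, X_L, L_D)   # drawn 100 nm
l_long = effective_dimension(200, X_L, L_D)    # drawn 200 nm

print("drawn ratio     :", 200 / 100)
print("effective ratio :", l_long / l_short)
```

Because the same 2·LD is subtracted from both devices, the ratio of effective lengths exceeds the ratio of drawn lengths, which is one reason matched currents are built from identical devices rather than from scaled ones.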
Current ratios can be produced by tying several identical transistors in parallel.
CMOS TECHNOLOGIES
CMOS provides an inherently low-power static circuit technology that has the capability of providing a lower power-delay product than comparable design-rule nMOS or pMOS technologies. The four dominant CMOS technologies are:
- p-well process
- n-well process
- twin-tub process
- silicon-on-insulator (SOI) process
nMOS FABRICATION
- Processing is carried out on a thin wafer cut from a single crystal of silicon of high purity into which the required p-impurities are introduced as the crystal is grown.
- A layer of silicon dioxide (SiO2), typically 1 μm thick, is grown all over the surface of the wafer to protect the surface, act as a barrier to dopants during processing and provide a generally insulating substrate onto which other layers may be deposited and patterned.
- The surface is now covered with a photoresist which is deposited onto the wafer and spun to achieve an even distribution of the required thickness.
- The photoresist layer is then exposed to ultraviolet light through a mask which defines those regions into which diffusion is to take place together with transistor channels.
- These areas are subsequently readily etched away together with the underlying silicon dioxide so that the wafer surface is exposed in the window defined by the mask.
- Remaining photoresist is removed and a thin layer of SiO2 is grown over the entire chip surface and then polysilicon is deposited on top of this to form the gate structure. This layer consists of heavily doped polysilicon deposited by chemical vapor deposition (CVD).
- Photoresist coating and masking allows the polysilicon to be patterned and then the thin oxide is removed to expose areas into which n-type impurities are to be diffused.
- Thick oxide is grown over all again and is then masked with photoresist and etched to expose selected areas of the polysilicon gate and the drain and source areas where connections are to be made.
- The whole chip then has metal (Al) deposited over its surface to a thickness typically of 1 μm. This metal layer is then masked and etched to form the required interconnection pattern.
  • 9. CMOS FABRICATION
- The p-well process is widely used in practice, and the n-well process is also popular.
P-well process
- The diffusion must be carried out with special care since the p-well doping concentration and depth will affect the threshold voltages as well as the breakdown voltages of the n-transistor.
- To achieve low threshold voltages (0.6 to 1.0 V) we need either deep well diffusion or high well resistivity.
- But deep wells require larger spacing due to lateral diffusion and therefore a larger chip area.
  • 10.
- The p-wells act as substrates for the n-devices within the parent n-substrate and, provided that voltage polarity restrictions are observed, the two areas are electrically isolated.
Layout Design rules
Layout design rules describe how small features can be and how closely they can be reliably packed in a particular manufacturing process. Industrial design rules are usually specified in microns. This makes migrating from one process to a more advanced process or a different foundry's process difficult because not all rules scale in the same way. Mead and Conway popularized scalable design rules based on a single parameter, λ, that characterizes the resolution of the process. λ is generally half of the minimum drawn transistor channel length. This length is the distance between the source and drain of a transistor and is set by the minimum width of a polysilicon wire. Designers often describe a process by its feature size. Feature size refers to minimum transistor length, so λ is half the feature size. For example, a 180 nm process has a minimum polysilicon width (and hence transistor length) of 0.18 μm and uses design rules with λ = 0.09 μm. Lambda-based rules are necessarily conservative because they round up dimensions to an integer multiple of λ.
A conservative but easy-to-use set of design rules for layouts with two metal layers in an n-well process is as follows:
- Metal and diffusion have minimum width and spacing of 4λ.
- Contacts are 2λ × 2λ and must be surrounded by 1λ on the layers above and below.
- Polysilicon uses a width of 2λ.
- Polysilicon overlaps diffusion by 2λ where a transistor is desired and has a spacing of 1λ away where no transistor is desired.
- Polysilicon and contacts have a spacing of 3λ from other polysilicon or contacts.
- N-well surrounds pMOS transistors by 6λ and avoids nMOS transistors by 6λ.
Transistor dimensions are often specified by their Width/Length (W/L) ratio. For example, the nMOS transistor in Figure 1.39 formed where polysilicon crosses n-diffusion has a W/L of 4/2. In a 0.6 μm process, this corresponds to an actual width of 1.2 μm and a length of 0.6 μm. Such a
  • 11. minimum-width contacted transistor is often called a unit transistor. pMOS transistors are often wider than nMOS transistors because holes move more slowly than electrons, so the transistor has to be wider to deliver the same current. Figure 1.40(a) shows a unit inverter layout with a unit nMOS transistor and a double-sized pMOS transistor. Figure 1.40(b) shows a schematic for the inverter annotated with Width/Length for each transistor. In digital systems, transistors are typically chosen to have the minimum possible length because short-channel transistors are faster, smaller, and consume less power. Figure 1.40(c) shows a shorthand we will often use, specifying multiples of unit width and assuming minimum length.
Gate layouts
The line-of-diffusion layout style consists of four horizontal strips: metal ground at the bottom of the cell, n-diffusion, p-diffusion, and metal power at the top. The power and ground lines are often called supply rails. Polysilicon lines run vertically to form transistor gates. Metal wires within the cell connect the transistors appropriately. Figure 1.41(a) shows such a layout for an inverter. The input A can be connected from the top, bottom, or left in polysilicon. The output Y is available at the right side of the cell in metal. Recall that the p-substrate and n-well must be tied to ground and power, respectively. Figure 1.41(b) shows the same inverter with well and substrate taps placed under the power and ground rails, respectively. Figure 1.42 shows a 3-input NAND gate. Notice how the nMOS transistors are connected in series while the pMOS transistors are connected in parallel. Power and ground extend 2λ on each side, so if two gates were abutted the contents would be separated by 4λ, satisfying design rules.
The height of the cell is 36 λ, or 40 λ if the 4 λ space between the cell and another wire above it is counted. All these examples use transistors of width 4 λ.
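Because λ-based rules are just multiples of half the feature size, converting a layout from λ units to physical dimensions is a one-line calculation. The sketch below reproduces the 0.6 μm example from the text and then expresses a few of the quoted rules for a 180 nm process; the function names and the dictionary of rules are illustrative helpers, not part of any layout tool.

```python
import math

# A few of the rules quoted in the notes, in units of lambda.
RULES_LAMBDA = {
    "metal/diffusion width and spacing": 4,
    "contact size": 2,
    "polysilicon width": 2,
    "poly/contact spacing": 3,
    "n-well surround of pMOS": 6,
}


def lambda_um(feature_size_um):
    """Lambda is half the feature size (minimum transistor length)."""
    return feature_size_um / 2


def to_um(n_lambda, feature_size_um):
    """Convert a dimension given in lambda to microns."""
    return n_lambda * lambda_um(feature_size_um)


# W/L = 4/2 unit transistor in a 0.6 um process.
width = to_um(4, 0.6)
length = to_um(2, 0.6)
print(f"W = {width:.2f} um, L = {length:.2f} um")

# The same rule set expressed for a 180 nm process (lambda = 0.09 um).
for name, n in RULES_LAMBDA.items():
    print(f"{name}: {n} lambda = {to_um(n, 0.18):.2f} um")
```

The first line of output matches the 1.2 μm by 0.6 μm transistor described above; the 36λ cell height can be converted the same way.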
  • 13. UNIT II
COMBINATIONAL CIRCUIT DESIGN
DESIGN PRINCIPLE OF STATIC CMOS DESIGN
Digital CMOS circuits are implemented using either static or dynamic design techniques. In static CMOS, the output is tied to VDD or ground via a low-resistance path (except during switching), which leads to robust circuit implementations with good noise immunity. In static CMOS design any function can be realized as a sum of products (SOP) or a product of sums (POS). If an SOP function pulls the output high, then an SOP-BAR function will pull the output low. A POS function can pull the output high, while a POS-BAR function can pull the output low, as shown in the figure.
Important properties of static CMOS design:
- At any instant of time, the output of the gate is directly connected to Vss or VDD.
- All functions are composed of either AND'ed or OR'ed sub-functions.
- The AND function is composed of NMOS transistors in series.
- The OR function is composed of NMOS transistors in parallel.
- The gate contains a pull-up network (PUP) and a pull-down network (PDN).
- PUP networks consist of PMOS transistors.
- PDN networks consist of NMOS transistors.
- Each network is the dual of the other network.
- The output of the complementary gate is inverted.
Advantages of static CMOS design:
- Robust in construction.
- Good noise immunity.
- Static logic has no minimum clock rate; the clock can be paused indefinitely.
- Low power consumption.
- For low operating frequencies, CMOS static logic is used to obtain a relatively small die size.
Limitations of static CMOS design:
The main limitation of static circuits is slower speed compared to dynamic circuits. The reasons are:
1. Increased gate capacitance due to the presence of both PMOS and NMOS transistors.
2. Output depends on the previous cycle inputs due to charges that may be present at internal nodes.
3. Multiple switching of the output within a cycle, depending on the input switching pattern.
MOSFETS as Switches
The gate controls the passage of current between the source and the drain. CMOS uses positive logic: VDD is logic '1' and Vss is logic '0'. We turn a transistor on or off using the gate terminal. There are two kinds of CMOS transistors, n-channel transistors and p-channel transistors. An n-channel transistor requires a logic '1' on the gate to make the switch conducting (to turn the transistor on). A p-channel transistor requires a logic '0' on the gate to make the switch conducting (to turn the transistor on). The conventional schematic icon representation along with the switch characteristics is shown.
Basic CMOS Gates
In this section, the basic gate implementations in static CMOS are presented.
AND Gate
If two N-switches are placed in series, the composite switch constructed by this action is closed (or ON) if both switches are connected to logic '1'. If any one of the switches is at logic
  • 14. '0' the circuit is said to be in the open (or OFF) state; this yields an 'AND' function. The switch logic of the AND function is shown in the figure.
OR Gate
If two N-switches are placed in parallel, the composite switch constructed by this action is closed (or ON) if any one of the switches is connected to logic '1'.
Bubble Pushing
CMOS stages are inherently inverting, so AND and OR functions must be built from NAND and NOR gates. DeMorgan's law helps with this conversion: a NAND gate is equivalent to an OR of inverted inputs, and a NOR gate is equivalent to an AND of inverted inputs. The same relationship applies to gates with more inputs. Switching between these representations is easy to do on a whiteboard and is often called bubble pushing.
Compound Gates:
- Static CMOS also efficiently handles compound gates computing various inverting combinations of AND/OR functions in a single stage.
- The logical effort of each input is the ratio of the input capacitance of that input to the input capacitance of the inverter. For the AOI21 gate, this means the logical effort is slightly lower for the OR terminal (C) than for the two AND terminals (A, B).
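The two DeMorgan identities behind bubble pushing are easy to confirm exhaustively. The following sketch checks, for every input combination, that a NAND equals an OR of inverted inputs and a NOR equals an AND of inverted inputs; the helper names are illustrative only.

```python
from itertools import product


def nand(*inputs):
    return not all(inputs)


def nor(*inputs):
    return not any(inputs)


def or_of_inverted(*inputs):
    return any(not x for x in inputs)


def and_of_inverted(*inputs):
    return all(not x for x in inputs)


def identities_hold(n_inputs):
    """Check both bubble-pushing identities for every input combination."""
    return all(
        nand(*bits) == or_of_inverted(*bits)
        and nor(*bits) == and_of_inverted(*bits)
        for bits in product([False, True], repeat=n_inputs)
    )


for n in (2, 3, 4):
    print(n, "inputs:", identities_hold(n))
```

The check passes for 2, 3 and 4 inputs, consistent with the statement that the same relationship applies to gates with more inputs.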
  • 15. The parasitic delay is crudely estimated from the total diffusion capacitance on the output node by summing the sizes of the transistors attached to the output.
Input Ordering Delay Effect
The logical effort and parasitic delay of different gate inputs are often different. Other gates, like NANDs and NORs, are nominally symmetric but actually have slightly different logical effort and parasitic delays for the different inputs. The figure shows a 2-input NAND gate annotated with diffusion parasitics. Consider the falling output transition occurring when one input holds a stable 1 value and the other rises from 0 to 1. If input B rises last, node x will initially be at VDD – Vt ≈ VDD because it was pulled up through the nMOS transistor on input A. The Elmore delay is (R/2)(2C) + R(6C) = 7RC. On the other hand, if input A rises last, node x will initially be at 0 V because it was discharged through the nMOS transistor on input B. No charge must be delivered to node x, so the Elmore delay is simply R(6C) = 6RC.
In general, we define the outer input to be the input closer to the supply rail (e.g., B) and the inner input to be the input closer to the output (e.g., A). The parasitic delay is smallest when the inner input switches last because the intermediate nodes have already been discharged. Therefore, if one signal is known to arrive later than the others, the gate is fastest when that signal is connected to the inner input. The inner input has a lower parasitic delay. The logical efforts are lower than initial estimates might predict because of velocity saturation. Interestingly, the inner input has a slightly higher logical effort because the intermediate node x tends to rise and cause negative feedback when the inner input turns ON. This effect is seldom significant to the designer because the inner input remains faster
  • 16. over the range of fan-outs used in reasonable circuits. When one input is far less critical than another, even nominally symmetric gates can be made asymmetric to favor the late input at the expense of the early one. For example, consider the path in the figure. Under ordinary conditions, the path acts as a buffer between A and Y. When reset is asserted, the path forces the output low. If reset only occurs under exceptional circumstances and can take place slowly, the circuit should be optimized for input-to-output delay at the expense of reset. The pulldown resistance is R/4 + R/(4/3) = R, so the gate still offers the same drive as a unit inverter. However, the capacitance on input A is only 10/3, so the logical effort is 10/9. This is better than 4/3, which is normally associated with a NAND gate. In the limit of an infinitely large reset transistor and unit-sized nMOS transistor for input A, the logical effort approaches 1, just like an inverter. The improvement in logical effort of input A comes at the cost of much higher effort on the reset input. Note that the pMOS transistor on the reset input is also shrunk. This reduces its diffusion capacitance and parasitic delay at the expense of slower response to reset.
Skewed Gates
In other cases, one input transition is more important than the other. We define HI-skew gates to favor the rising output transition and LO-skew gates to favor the falling output transition. This favoring can be done by decreasing the size of the noncritical transistor. The logical efforts for the rising (up) and falling (down) transitions are called gu and gd, respectively, and are the ratio of the input capacitance of the skewed gate to the input capacitance of an unskewed inverter with equal drive for that transition. Figure (a) shows how a HI-skew inverter is constructed by downsizing the nMOS transistor.
This maintains the same effective resistance for the critical transition while reducing the input capacitance relative to the unskewed inverter of Figure (b), thus reducing the logical effort on that critical transition to gu = 2.5/3 = 5/6. Of course, the improvement comes at the expense of the effort on the noncritical transition. The logical effort for the falling transition is estimated by comparing the inverter to a smaller unskewed inverter with equal pulldown current, shown in Figure (c), giving a logical effort of gd = 2.5/1.5 = 5/3. The degree of skewing (e.g., the ratio of effective resistance for the fast transition relative to the slow transition) impacts the logical efforts and noise margins; a factor of two is common. The figure catalogs HI-skew and LO-skew gates with a skew factor of two. Skewed gates are sometimes denoted with an H or an L on their symbol in a schematic.
P/N Ratios
The pMOS transistors in the unskewed gate are enormous in order to provide equal rise delay. They contribute input capacitance for both transitions, while only helping the rising delay. By accepting a slower rise delay, the pMOS transistors can be downsized to reduce input capacitance and average delay significantly.
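The skewed-inverter efforts quoted above (gu = 5/6 and gd = 5/3) follow from simple capacitance ratios. The sketch below assumes the HI-skew inverter uses a pMOS of width 2 and an nMOS of width 1/2, which is consistent with the total input capacitance of 2.5 used in the text; the widths of the two comparison inverters are likewise assumptions inferred from the quoted capacitances of 3 and 1.5.

```python
from fractions import Fraction as F


def input_cap(p_width, n_width):
    """Input capacitance in units of unit-transistor gate capacitance."""
    return p_width + n_width


# HI-skew inverter: full-size pMOS, nMOS downsized by the skew factor of two.
hi_skew = input_cap(F(2), F(1, 2))                  # 5/2

# Unskewed inverter with the same pull-up strength (pMOS 2, nMOS 1).
same_pullup = input_cap(F(2), F(1))                 # 3

# Smaller unskewed inverter with the same pull-down strength (pMOS 1, nMOS 1/2).
same_pulldown = input_cap(F(1), F(1, 2))            # 3/2

g_u = hi_skew / same_pullup      # effort for the critical rising transition
g_d = hi_skew / same_pulldown    # effort for the noncritical falling transition

print("g_u =", g_u)
print("g_d =", g_d)
```

Exact fractions are used so the results can be compared directly with the values in the text.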
  • 17. Reducing the pMOS size from 2 to √2 for the inverter gives the theoretical fastest average delay, but this delay improvement is only 3%. However, this significantly reduces the pMOS transistor area. It also reduces input capacitance, which in turn reduces power consumption. Unfortunately, it leads to unequal delay between the outputs. Some paths can be slower than average if they trigger the worst edge of each gate. Excessively slow rising outputs can also cause hot electron degradation. And reducing the pMOS size also moves the switching point lower and reduces the inverter's noise margin. In summary, the P/N ratio of a library of cells should be chosen on the basis of area, power, and reliability, not average delay. For NOR gates, reducing the size of the pMOS transistors significantly improves both delay and area. In most standard cell libraries, the pitch of the cell determines the P/N ratio that can be achieved in any particular gate. Ratios of 1.5–2 are commonly used for inverters.
Multiple Threshold Voltages
Some CMOS processes offer two or more threshold voltages. Transistors with lower threshold voltages produce more ON current, but also leak exponentially more OFF current. Libraries can provide both high and low threshold versions of gates. The low-threshold gates can be used sparingly to reduce the delay of critical paths. Skewed gates can use low threshold devices on only the critical network of transistors.
Delay estimation:
Estimation of the delay of a Boolean function from its functional description is an important step towards design exploration at the register transfer level (RTL). This paper addresses the problem of estimating the delay of certain optimal multi-level implementations of combinational circuits, given only their functional description.
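The P/N ratio trade-off described earlier on this slide can be checked with a very simple RC model. The sketch assumes an inverter with nMOS width 1 and pMOS width k driving an identical inverter, an electron-to-hole mobility ratio of 2, and delay proportional to driver resistance times load gate capacitance; all of these are modelling assumptions rather than values from the notes.

```python
import math

MU = 2.0  # assumed electron/hole mobility ratio


def avg_delay(k, mu=MU):
    """Average of rise and fall delay, in units of R*C.

    nMOS width is 1 (resistance R); pMOS width is k (resistance mu*R/k).
    The load is the gate capacitance of an identical inverter, 1 + k.
    """
    load = 1 + k
    t_fall = load                 # pull-down through the nMOS
    t_rise = (mu / k) * load      # pull-up through the pMOS
    return (t_rise + t_fall) / 2


# Coarse grid search for the pMOS width with the lowest average delay.
k_best = min((i / 1000 for i in range(500, 4001)), key=avg_delay)
improvement = 1 - avg_delay(k_best) / avg_delay(2.0)

print(f"best pMOS width     : {k_best:.3f} (sqrt(2) = {math.sqrt(2):.3f})")
print(f"average delay gain  : {improvement * 100:.1f} % relative to width 2")
```

Under these assumptions the optimum lands at about √2 and the average delay improves by roughly 3%, in line with the figures quoted in the text; the small gain is why area, power and reliability end up driving the choice.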
- tpdr: rising propagation delay, from input to rising output crossing VDD/2
- tpdf: falling propagation delay, from input to falling output crossing VDD/2
- tpd: average propagation delay, tpd = (tpdr + tpdf)/2
- tr: rise time, from output crossing 20% to 80% VDD
- tf: fall time, from output crossing 80% to 20% VDD
- tcd: average contamination delay, tcd = (tcdr + tcdf)/2
- tcdr: rising contamination delay, minimum time from input to rising output crossing VDD/2
- tcdf: falling contamination delay, minimum time from input to falling output crossing VDD/2
Use RC delay models to estimate delay. C is the total capacitance on the output node and R is the effective resistance, so tpd = RC. Transistors are characterized by finding their effective R.
Transistor sizing:
- Not all gates need to have the same delay.
- Not all inputs to a gate need to have the same delay.
- Adjust transistor sizes to achieve the desired delay.
Logical effort
Logical effort is a gate delay model that takes transistor sizes into account. It allows us to optimize transistor sizes over combinational networks. It isn't as accurate for circuits with reconvergent fanout.
Logical effort gate delay model
  • 18.
- Express delays in a process-independent unit.
- Gate delay is measured in units of the minimum-size inverter delay τ: d = dabs/τ, where τ = 3RC ≈ 12 ps in a 180 nm process and 40 ps in a 0.6 µm process.
- Gate delay formula: d = f + p.
- Effort delay f is related to the gate's load. Parasitic delay p depends on the gate's structure; it represents the delay of the gate driving no load and is set by internal parasitic capacitance.
Effort delay
- Effort delay has two components: f = gh.
- Electrical effort h is determined by the gate's load: h = Cout/Cin. It is sometimes called fanout.
- Logical effort g is determined by the gate's structure. It measures the relative ability of the gate to deliver current; g ≡ 1 for an inverter.
Delay plots:
Computing Logical Effort
Logical effort is the ratio of the input capacitance of a gate to the input capacitance of an inverter delivering the same output current. It can be measured from delay vs. fanout plots or estimated by counting transistor widths.
Circuit families and their comparison:
The method of logical effort does not apply to arbitrary transistor networks, but only to logic gates. A logic gate has one or more inputs and one output, subject to the following restrictions: the gate of each transistor is connected to an input, a power supply, or the output; and inputs are connected only to transistor gates. The first condition rules out multiple logic gates masquerading as one, and the second keeps inputs from being connected to transistor sources or drains, as in transmission gates without explicit drivers.
Pseudo-NMOS circuits
Static CMOS gates are slowed because an input must drive both NMOS and PMOS transistors. In any transition, either the pullup or pulldown network is activated, meaning the input capacitance of the inactive network loads the input.
Moreover, PMOS transistors have poor mobility and must be sized larger to achieve comparable rising and falling delays, further increasing input capacitance. Pseudo-NMOS and dynamic gates offer improved speed by removing the PMOS transistors from loading the input. Pseudo-NMOS gates resemble static gates, but replace the slow PMOS pullup stack with a single grounded PMOS transistor which acts as a pullup resistor. The effective pullup resistance should be large enough that the NMOS transistors can pull the output to near ground, yet low enough to rapidly pull the output high. The figure shows several pseudo-NMOS gates ratioed such that the pulldown transistors are about four times as strong as the pullup. The logical effort follows from considering the output current and input capacitance compared to the reference inverter from the figure. Sized as shown, the PMOS transistors produce 1/3 of the current of the reference inverter and the NMOS transistor stacks produce 4/3 of the current of the reference inverter.
  • 19. For falling transitions, the output current is the pulldown current minus the pullup current which is fighting the pulldown, 4/3 − 1/3 = 1. For rising transitions, the output current is just the pullup current, 1/3. The inverter and NOR gate have an input capacitance of 4/3. Dividing by the input capacitance (3) of a reference inverter with the same output current gives a falling logical effort of (4/3)/3 = 4/9; because the rising current is only one third as large, the rising logical effort is three times greater, 4/3. The average logical effort is g = (4/9 + 4/3)/2 = 8/9. This is independent of the number of inputs, explaining why pseudo-NMOS is a way to build fast wide NOR gates.
Gate type    Logical effort g
             Rising   Falling   Average
2-NAND       8/3      8/9       16/9
3-NAND       4        4/3       8/3
4-NAND       16/3     16/9      32/9
n-NOR        4/3      4/9       8/9
n-mux        8/3      8/9       16/9
Pass Transistor Logic:
A pass transistor is a MOS transistor whose gate is driven by a control signal, whose drain (in) is connected to a constant or variable voltage potential, and whose source is the output (out). When the control signal is high, the input is passed to the output; when the control signal is low, the output is floating. Circuits built with this topology are called pass-transistor circuits. Pass-transistor logic reduces the number of transistors needed to implement a function by allowing the primary inputs to drive source and drain terminals as well as gate terminals. In complementary CMOS logic, primary inputs are allowed to drive only gate terminals. The figure shows an implementation of the AND function using only NMOS pass transistors. In this gate, if the B input is high the left NMOS is turned ON and copies the input A to the output F. When B is low the right NMOS pass transistor is turned ON and passes a '0' to the output F. This satisfies the truth table of the AND gate reproduced in the table below for verification.
'OR' gate using pass transistor logic
The truth table of the 'OR' gate is shown in the table below. The figure below shows the implementation of the OR function using NMOS transistors only.
In this gate, if the B input is high, the right NMOS is turned ON and copies logic 1 to F; this operation is not affected by the 'A' input. When B is low, the left NMOS is turned ON and the logic value of 'A' is copied to the output F.
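The two pass-transistor gates described above can be checked against their truth tables with a small behavioural sketch. It treats each NMOS as an ideal switch and ignores the degraded high level discussed later; the function names are hypothetical.

```python
def pass_and(a, b):
    """AND gate: B = 1 turns on the NMOS that passes A, B = 0 turns on the NMOS that passes 0."""
    return a if b == 1 else 0


def pass_or(a, b):
    """OR gate: B = 1 turns on the NMOS that passes 1, B = 0 turns on the NMOS that passes A."""
    return 1 if b == 1 else a


# Print both truth tables side by side.
print("A B | AND OR")
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "|", pass_and(a, b), " ", pass_or(a, b))
```

In each gate exactly one of the two pass transistors conducts for any value of B, so the output is always driven.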
Advantages:
• Fewer transistors are required to implement a given function.
• Lower capacitance because of the reduced number of transistors.
• There is no direct path from VDD to GND, so no standby (static) power is dissipated.

Drawback:
As discussed, NMOS devices are effective at passing a strong '0' but poor at pulling a node to VDD. Hence, when the pass transistor pulls a node to logic high, the output only rises to VDD - VTh. This is the major disadvantage of pass transistors. Even so, pass transistor logic (PTL) circuits are often superior to standard CMOS circuits in terms of layout density, circuit delay and power consumption.

Transmission Gate Logic:
Transmission gate logic is used to solve the voltage drop problem of pass transistor logic. This technique uses the complementary properties of NMOS and PMOS transistors: NMOS devices pass a strong '0' but a weak '1', while PMOS transistors pass a strong '1' but a weak '0'. The transmission gate combines the best of the two devices by placing an NMOS transistor in parallel with a PMOS transistor, as shown in Figure below. The control signals to the transmission gate, C and ~C, are complementary to each other. The transmission gate is essentially a bi-directional switch enabled by the gate signal 'C'. When C = 1, both MOSFETs are ON and the signal passes through the gate, i.e. A = B if C = 1. C = 0 cuts off both MOSFETs, creating an open circuit between nodes A and B.

Basic Structure:
The basic structure of the transmission gate is shown in Figure below; it consists of an NMOS and a PMOS transistor. Here, VG is applied to the NMOS and (VDD - VG) is applied to the PMOS. The transmission gate works as a voltage-controlled switch. When VG is high, both the NMOS and the PMOS conduct, so the switch is closed and a conduction path exists between the left and right sides.
When VG is low, then the MOSFETs are in cutoff and switch is open. Therefore, there is no direct relationship between VA and VB. Figure below shows the symbol of transmission gate controlled by switching signals X and X* that are applied to the gates of NMOS and PMOS respectively.
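The difference between a single NMOS pass transistor and a transmission gate can be illustrated with a first-order voltage model. This is a rough sketch under stated assumptions: the supply and threshold values are made-up example numbers, body effect is ignored, and all function names are hypothetical.

```python
VDD = 1.8   # assumed supply voltage in volts
VTN = 0.4   # assumed NMOS threshold voltage in volts
VTP = 0.4   # assumed magnitude of the PMOS threshold voltage in volts


def nmos_pass(v_in, on):
    """NMOS pass transistor: passes a strong 0 but clips highs at VDD - VTN."""
    if not on:
        return None  # switch open, output floats
    return min(v_in, VDD - VTN)


def pmos_pass(v_in, on):
    """PMOS pass transistor: passes a strong 1 but clips lows at |VTP|."""
    if not on:
        return None
    return max(v_in, VTP)


def transmission_gate(v_in, c):
    """NMOS and PMOS in parallel: whichever device is not clipped sets the output."""
    if not c:
        return None
    n_out = nmos_pass(v_in, True)
    p_out = pmos_pass(v_in, True)
    return n_out if n_out == v_in else p_out


print("NMOS only, passing VDD:", nmos_pass(VDD, True))
print("Transmission gate, passing VDD:", transmission_gate(VDD, True))
print("Transmission gate, passing 0 V:", transmission_gate(0.0, True))
```

The NMOS alone loses one threshold voltage on a high input, while the parallel pair passes both rails unchanged.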
The circuit is constructed by connecting a PMOS and an NMOS transistor in parallel, with their drain and source terminals shorted together. The gate terminals are driven by two complementary select signals, s and ~s; when s is high, the transmission gate passes the signal at its input. The main advantage of the transmission gate is that it eliminates the threshold voltage drop.

Transmission gates are used as:
• a multiplexing element or path selector
• a latch element
• an analog switch
• a voltage-controlled resistor connecting the input and output

2 : 1 MUX using transmission gate:
A 2 : 1 multiplexer is shown in Figure below. This gate selects either input A or input B on the basis of the value of the control signal 'C'. When the control signal C is logic high the output is equal to the input A, and when C is logic low the output is equal to the input B. A 2 : 1 multiplexer can be implemented using transmission gates. Figure below shows the connection diagram of the 2 : 1 multiplexer using transmission gates. This is equivalent to implementing the Boolean function
F = (A · C + B · ~C)
When the control signal C is high, the upper transmission gate is ON and passes A, so the output = A. When C is low, the upper transmission gate turns OFF and blocks A; at the same time the lower transmission gate is ON and passes B, so the output = B.

DYNAMIC CMOS LOGIC
Ratioed circuits reduce the input capacitance by replacing the pMOS transistors connected to the inputs with a single resistive pullup. The drawbacks of ratioed circuits include slow rising transitions, contention on the falling transitions, static power dissipation, and a nonzero VOL.
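Dynamic circuits circumvent these drawbacks by using a clocked pullup transistor rather than a pMOS that is always ON. Figure compares (a) static CMOS, (b) pseudo-nMOS, and (c) dynamic inverters. Dynamic circuit operation is divided into two modes, precharge and evaluation, as shown in Figure. During precharge the clock is low, so the clocked pMOS is ON and pulls the output HIGH. During evaluation the clock is high, the clocked pMOS is OFF, and the output either stays HIGH or is discharged LOW through the pulldown network.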
Dynamic circuits are the fastest commonly used circuit family because they have lower input capacitance and no contention during switching. They also have zero static power dissipation. However, they require careful clocking, consume significant dynamic power, and are sensitive to noise during evaluation.

In Figure, if the input A is 1 during precharge, contention will take place because both the pMOS and nMOS transistors will be ON. When the input cannot be guaranteed to be 0 during precharge, an extra clocked evaluation transistor can be added to the bottom of the nMOS stack to avoid contention, as shown in Figure. The extra transistor is sometimes called a foot.

Figure estimates the falling logical effort of both footed and unfooted dynamic gates. As usual, the pulldown transistors' widths are chosen to give unit resistance. Precharge occurs while the gate is idle and often may take place more slowly. Therefore, the precharge transistor width is chosen for twice unit resistance. This reduces the capacitive load on the clock and the parasitic capacitance at the expense of greater rising delays. We see that the logical efforts are very low. Footed gates have higher logical effort than their unfooted counterparts but are still an improvement over static logic. In practice, the logical effort of footed gates is better than predicted because velocity saturation means series nMOS transistors have less resistance than we have estimated. The size of the foot can be increased relative to the other nMOS transistors to reduce the logical effort of the other inputs at the expense of greater clock loading. Like pseudo-nMOS gates, dynamic gates are particularly well suited to wide NOR functions or multiplexers because the logical effort is independent of the number of inputs.
A fundamental difficulty with dynamic circuits is the monotonicity requirement. While a dynamic gate is in evaluation, the inputs must be monotonically rising. That is, the input can start LOW and remain LOW, start LOW and rise HIGH, or start HIGH and remain HIGH, but it must not start HIGH and fall LOW. Figure shows waveforms for a footed dynamic inverter in which the input violates monotonicity. During precharge, the output is pulled HIGH. When the clock rises, the input is HIGH, so the output is discharged LOW through the pulldown network, as expected of an inverter. The input later falls LOW, turning off the pulldown network. Because the precharge transistor is also OFF, the output floats and stays LOW instead of rising as it would in a static inverter.
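The monotonicity violation described above can be reproduced with a sample-by-sample behavioural model of a footed dynamic inverter. This is a simplified sketch: it uses ideal logic levels with no delay or leakage, and the function name and waveforms are hypothetical.

```python
def footed_dynamic_inverter(clk, a):
    """Return the output waveform of a footed dynamic inverter.

    clk and a are equal-length lists of 0/1 samples.
    """
    out = []
    node = 1  # assume the dynamic node starts precharged
    for c, x in zip(clk, a):
        if c == 0:
            node = 1   # precharge: clocked pMOS pulls the node HIGH
        elif x == 1:
            node = 0   # evaluation with input HIGH: pulldown discharges the node
        # evaluation with input LOW: node floats and keeps its previous value
        out.append(node)
    return out


clk = [0, 1, 1, 1, 0]
rising_input = [0, 0, 1, 1, 1]    # monotonically rising during evaluation
falling_input = [1, 1, 0, 0, 0]   # violates monotonicity

print("rising input :", footed_dynamic_inverter(clk, rising_input))
print("falling input:", footed_dynamic_inverter(clk, falling_input))
```

With the falling input, the output stays LOW for the rest of the evaluation phase even though the input has returned LOW, which is not what an inverter should produce.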
The output of a dynamic gate begins HIGH and monotonically falls LOW during evaluation. This monotonically falling output X is not a suitable input to a second dynamic gate expecting monotonically rising signals.

CMOS Domino Logic
The monotonicity problem can be solved by placing a static CMOS inverter between dynamic gates, as shown in Figure. This converts the monotonically falling output into a monotonically rising signal suitable for the next gate, as shown in Figure. The dynamic-static pair together is called a domino gate because precharge resembles setting up a chain of dominos and evaluation causes the gates to fire like dominos tipping over, each triggering the next. A single clock can be used to precharge and evaluate all the logic gates within the chain. The dynamic output is monotonically falling during evaluation, so the static inverter output is monotonically rising. Therefore, the static inverter is usually a HI-skew gate to favor this rising output. In general, more complex inverting static CMOS gates such as NANDs or NORs can be used in place of the inverter. This mixture of dynamic and static logic is called compound domino.

Domino gates are inherently noninverting, while some functions like XOR gates necessarily require inversion. Three methods of addressing this problem include pushing inversions into static logic, delaying clocks, and using dual-rail domino logic. The delayed-clock approach directly cascades dynamic gates without the static CMOS inverter, delaying the clock to the later gates to ensure their inputs are monotonic during evaluation.

Domino circuits
Pseudo-NMOS gates eliminate the bulky PMOS transistors loading the inputs, but pay the price of quiescent power dissipation and contention between the pullup and pulldown transistors. Dynamic gates offer even better logical effort and lower power consumption by using a clocked precharge transistor instead of a pullup that is always conducting. The dynamic gate is precharged HIGH and then may evaluate LOW through an NMOS stack.

Unfortunately, if one dynamic inverter directly drives another, a race can corrupt the result. When the clock rises, both outputs have been precharged HIGH. The HIGH input to the first gate causes its output to fall, but the second gate's output also falls in response to its initial HIGH input. The circuit therefore produces an incorrect result because the second output will never rise during evaluation, as shown in Figure 10.3. Domino circuits solve this problem by using inverting static gates between dynamic gates so that the input to each dynamic gate is initially LOW. The falling dynamic output and rising static output ripple through a chain of gates like a chain of toppling dominos.
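The race between directly cascaded dynamic inverters, and the way the static inverter removes it, can be seen in a unit-delay behavioural sketch. It is a toy model: every gate is assumed to take one time step, logic levels are ideal, and the function names are hypothetical.

```python
def dynamic_step(node, inp):
    """One evaluation step of a dynamic inverter node.

    A HIGH input discharges the node; once LOW it cannot rise again
    until the next precharge.
    """
    return 0 if (inp == 1 or node == 0) else 1


def cascade_direct(a, steps=6):
    """Two dynamic inverters in series, both precharged HIGH, no static gate between."""
    x, y = 1, 1
    for _ in range(steps):
        x, y = dynamic_step(x, a), dynamic_step(y, x)
    return x, y


def cascade_domino(a, steps=6):
    """Two domino stages: each dynamic node is followed by a static inverter."""
    x, y = 1, 1   # dynamic nodes, precharged HIGH
    w, z = 0, 0   # static inverter outputs, LOW after precharge
    for _ in range(steps):
        x, w, y, z = dynamic_step(x, a), 1 - x, dynamic_step(y, w), 1 - y
    return w, z


# Direct cascade with A = 1: the second output should end HIGH but is stuck LOW.
print("direct, A=1:", cascade_direct(1))
# Domino chain: each stage output follows A (noninverting).
print("domino, A=1:", cascade_domino(1))
print("domino, A=0:", cascade_domino(0))
```

The domino chain computes a noninverting function, consistent with the remark that domino gates are inherently noninverting.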
In summary, domino logic runs 1.5 to 2 times faster than static CMOS logic because dynamic gates present a much lower input capacitance for the same output current and have a lower switching threshold, and because the inverting static gate can be skewed to favor the critical monotonically rising evaluation edges. Figure shows some domino gates. Each domino gate consists of a dynamic gate followed by an inverting static gate. The static gate is often but not always an inverter. Since the dynamic gate's output falls monotonically during evaluation, the static gate should be skewed high to favor its monotonically rising output. A dynamic gate may be designed with or without a clocked evaluation transistor; the extra transistor slows the gate but eliminates any path between power and ground during precharge when the inputs are still high.

Dual-Rail Domino Logic:
Dual-rail domino gates encode each signal with a pair of wires. The two wires of a signal pair are denoted sig_h and sig_l, respectively. Table summarizes the encoding. The sig_h wire is asserted to indicate that the output of the gate is "high" or 1. The sig_l wire is asserted to indicate that the output of the gate is "low" or 0.
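The dual-rail encoding can be written out as a small decoder, together with a behavioural model of a dual-rail AND gate. This is only an illustrative sketch: the function names are hypothetical, and the AND gate is modelled as a series pulldown for the true rail and a parallel pulldown for the complementary rail.

```python
def encode_dual_rail(value):
    """Encode a logic value as the pair (sig_h, sig_l)."""
    return (1, 0) if value else (0, 1)


def decode_dual_rail(sig_h, sig_l):
    """Interpret a (sig_h, sig_l) pair."""
    if sig_h and sig_l:
        return "invalid"      # both rails asserted: must not occur
    if sig_h:
        return 1
    if sig_l:
        return 0
    return "precharged"       # neither rail asserted yet


def dual_rail_and(a, b):
    """Dual-rail AND: inputs and output are (sig_h, sig_l) pairs."""
    a_h, a_l = a
    b_h, b_l = b
    y_h = a_h & b_h   # output is 1 only when both inputs are 1
    y_l = a_l | b_l   # output is 0 as soon as either input is 0
    return (y_h, y_l)


for a in (0, 1):
    for b in (0, 1):
        y = dual_rail_and(encode_dual_rail(a), encode_dual_rail(b))
        print(a, b, "->", y, decode_dual_rail(*y))
```

If both input pairs are still unasserted, the model returns an unasserted output pair, so an unevaluated stage never produces a spurious result.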
When the gate is precharged, neither sig_h nor sig_l is asserted. The pair of lines should never both be asserted simultaneously during correct operation. Dual-rail domino gates accept both true and complementary inputs and compute both true and complementary outputs, as shown in Figure. Observe that this is identical to static CVSL circuits from Figure except that the cross-coupled pMOS transistors are instead connected to the precharge clock. Therefore, dual-rail domino can be viewed as a dynamic form of CVSL, sometimes called DCVS. Figure shows a dual-rail AND/NAND gate and Figure shows a dual-rail XOR/XNOR gate. The gates are shown with clocked evaluation transistors, but can also be unfooted.

Dual-rail domino is a complete logic family in that it can compute all inverting and noninverting logic functions. However, it requires more area, wiring, and power. Dual-rail structures also lose the efficiency of wide dynamic NOR gates because they require complementary tall dynamic NAND stacks.

Dual-rail domino signals not only the result of a computation but also when the computation is done. Before computation completes, both rails are precharged. When the computation completes, one rail will be asserted. A NAND gate can be used for completion detection, as shown in Figure. This is particularly useful for asynchronous circuits.

Keepers
Dynamic circuits also suffer from charge leakage on the dynamic node. If a dynamic node is precharged high and then left floating, the voltage on the dynamic node will drift over time due to subthreshold, gate, and junction leakage. The time constants tend to be in the millisecond to nanosecond range, depending on process and temperature. This problem is analogous to leakage in dynamic RAMs.
Moreover, dynamic circuits have poor input noise margins. If the input rises above Vt while the gate is in evaluation, the input transistors will turn on weakly and can incorrectly discharge the output. Both leakage and noise margin problems can be addressed by adding a keeper circuit.

Figure shows a conventional keeper on a domino buffer. The keeper is a weak transistor that holds, or staticizes, the output at the correct level when it would otherwise float. When the dynamic node X is high, the output Y is low and the keeper is ON to prevent X from floating. When X falls, the keeper initially opposes the transition, so it must be much weaker than the pulldown network. Eventually Y rises, turning the keeper OFF and avoiding static power dissipation. The keeper must be strong (i.e., wide) enough to compensate for any leakage current drawn when the output is floating and the pulldown stack is OFF. Strong keepers also improve the noise margin because when the inputs are slightly above Vt the keeper can supply enough current to hold the output high.

NP and Zipper Domino
Another variation on domino is shown in Figure. The HI-skew inverting static gates are replaced with predischarged dynamic gates using pMOS logic. For example, a footed dynamic p-logic NAND gate is shown in Figure. When Φ is 0, the first and third stages precharge high while the second stage predischarges low. When Φ rises, all the stages evaluate. Domino connections are possible, as shown in Figure. The design style is called NP Domino or NORA (NO RAce) Domino.

NORA has two major drawbacks. First, the logical effort of footed p-logic gates is generally worse than that of HI-skew gates (e.g., 2 vs. 3/2 for NOR2 and 4/3 vs. 1 for NAND2). Secondly, NORA is extremely susceptible to noise.
In an ordinary dynamic gate, the input has a low noise margin (about Vt ), but is strongly driven by a static CMOS gate. The floating dynamic output is more prone to noise from coupling and charge sharing, but drives another static CMOS gate with a larger noise margin. In NORA, however, the
sensitive dynamic inputs are driven by noise prone dynamic outputs. Given these drawbacks and the extra clock phase required, there is little reason to use NORA.

Zipper domino is a closely related technique that leaves the precharge transistors slightly ON during evaluation by using precharge clocks that swing between 0 and VDD - |Vtp| for the pMOS precharge and between Vtn and VDD for the nMOS precharge. This plays much the same role as a keeper.

THE STATIC AND DYNAMIC POWER DISSIPATION IN CMOS CIRCUITS
Static CMOS gates are very power-efficient because they dissipate nearly zero power while idle. For much of the history of CMOS design, power was a secondary consideration behind speed and area for many chips. As transistor counts and clock frequencies have increased, power consumption has skyrocketed and is now a primary design constraint.

The instantaneous power P(t) drawn from the power supply is proportional to the supply current iDD(t) and the supply voltage VDD:
$P(t) = i_{DD}(t) V_{DD}$
The energy consumed over some time interval T is the integral of the instantaneous power:
$E = \int_0^T i_{DD}(t) V_{DD} \, dt$
The average power over this interval is
$P_{avg} = \frac{E}{T} = \frac{1}{T} \int_0^T i_{DD}(t) V_{DD} \, dt$

Power dissipation in CMOS circuits comes from two components.
Static dissipation due to
• subthreshold conduction through OFF transistors
• tunneling current through gate oxide
• leakage through reverse-biased diodes
• contention current in ratioed circuits
Dynamic dissipation due to
• charging and discharging of load capacitances
• "short circuit" current while both pMOS and nMOS networks are partially ON
$P_{total} = P_{static} + P_{dynamic}$

Static Dissipation
Considering the static CMOS inverter shown in Figure, if the input = '0', the associated nMOS transistor is OFF and the pMOS transistor is ON. The output voltage is VDD, or logic '1'. When the input = '1', the associated nMOS transistor is ON and the pMOS transistor is OFF.
The output voltage is 0 volts (GND). Note that one of the transistors is always OFF when the gate is in either of these logic states. Ideally, no current flows through the OFF transistor, so the power dissipation is zero when the circuit is quiescent, i.e., when no transistors are switching. Zero quiescent power dissipation is a principal advantage of CMOS over competing transistor technologies. However, secondary effects including subthreshold conduction, tunneling, and leakage lead to small amounts of static current flowing through the OFF transistor. Assuming the leakage current is constant, so that instantaneous and average power are the same, the static power dissipation is the product of the total leakage current and the supply voltage.
$P_{static} = I_{static} V_{DD}$
OFF transistors still conduct a small amount of subthreshold current. As subthreshold current is exponentially dependent on threshold voltage, it is increasing dramatically as threshold voltages have scaled down. There is also some small static dissipation due to reverse-biased diode leakage between diffusion regions, wells, and the substrate. In modern processes, diode
leakage is generally much smaller than the subthreshold or gate leakage and may be neglected.

Dynamic Dissipation
Over any given interval of time T, the load will be charged and discharged T·fsw times. Current flows from VDD to the load to charge it. Current then flows from the load to GND during discharge. In one complete charge/discharge cycle, a total charge of Q = C·VDD is thus transferred from VDD to GND. The average dynamic power dissipation is
$P_{dynamic} = \frac{1}{T} \int_0^T i_{DD}(t) V_{DD} \, dt = \frac{V_{DD}}{T} \int_0^T i_{DD}(t) \, dt$
$P_{dynamic} = \frac{V_{DD}}{T} \left[ T f_{sw} C V_{DD} \right] = C V_{DD}^2 f_{sw}$
Because most gates do not switch every clock cycle, it is often more convenient to express the switching frequency fsw as an activity factor α times the clock frequency f. The dynamic power dissipation may then be rewritten as
$P_{dynamic} = \alpha C V_{DD}^2 f$
• A clock has an activity factor of α = 1, because it rises and falls every cycle. Most data has a maximum activity factor of 0.5 because it transitions only once each cycle.
• Static CMOS logic has been empirically determined to have activity factors closer to 0.1 because some gates maintain one output state more often than another.
• Because the input rise/fall time is greater than zero, both nMOS and pMOS transistors will be ON for a short period of time while the input is between Vtn and VDD - |Vtp|. This results in an additional "short circuit" current pulse from VDD to GND and typically increases power dissipation by about 10%.

Methods to reduce dynamic power dissipation
1. Reduce the product of capacitance and its switching frequency.
2. Eliminate logic switching that is not necessary for computation.
3. Reduce the activity factor.
4. Reduce the supply voltage.

Methods to reduce static power dissipation
1. Use multiple threshold voltages: low-Vt transistors on critical paths for performance, and high-Vt transistors on the other paths to limit leakage.
2. Use two operating modes, active and standby, for each functional block.
3. Adjust the body bias, i.e. apply FBB (Forward Body Bias) in active mode to increase performance and RBB (Reverse Body Bias) in standby mode to reduce leakage.
4. Use sleep transistors to isolate the supply from the block to achieve significant leakage power savings.
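The power relations above can be tried out numerically. The sketch below evaluates P_dynamic = α C VDD² f and P_static = I_static VDD; the operating-point numbers are arbitrary assumptions chosen only for illustration, and the function names are hypothetical.

```python
def dynamic_power(alpha, c_load, vdd, f_clk):
    """Average switching power: activity factor * capacitance * VDD^2 * clock frequency."""
    return alpha * c_load * vdd ** 2 * f_clk


def static_power(i_static, vdd):
    """Static power: total leakage current * supply voltage."""
    return i_static * vdd


# Assumed example operating point (illustrative values, not from the notes).
ALPHA = 0.1        # activity factor typical of static CMOS logic
C_TOTAL = 1e-9     # total switched capacitance in farads
VDD = 1.0          # supply voltage in volts
F_CLK = 1e9        # clock frequency in hertz
I_LEAK = 10e-3     # total leakage current in amperes

p_dyn = dynamic_power(ALPHA, C_TOTAL, VDD, F_CLK)
p_stat = static_power(I_LEAK, VDD)
print("dynamic power (W):", p_dyn)
print("static power  (W):", p_stat)
print("total power   (W):", p_dyn + p_stat)

# Halving the supply voltage cuts dynamic power to one quarter.
ratio = dynamic_power(ALPHA, C_TOTAL, VDD / 2, F_CLK) / p_dyn
print("dynamic power ratio at VDD/2:", ratio)
```

The quadratic dependence on VDD is why reducing the supply voltage appears among the methods for lowering dynamic power.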
UNIT III: SEQUENTIAL LOGIC CIRCUITS
Static & Dynamic Latches and Registers, Pipelining
• In sequential logic circuits, the output depends not only on the current values of the inputs, but also on preceding input values. In other words, a sequential circuit remembers some of the past history of the system: it has memory.
• Figure shows a block diagram of a generic finite state machine (FSM) that consists of combinational logic and registers, which hold the system state. The system depicted here belongs to the class of synchronous sequential systems, in which all registers are under control of a single global clock. The outputs of the FSM are a function of the current Inputs and the Current State. The Next State is determined based on the Current State and the current Inputs and is fed to the inputs of the registers.
• On the rising edge of the clock, the Next State bits are copied to the outputs of the registers (after some propagation delay), and a new cycle begins. The register then ignores changes in the input signals until the next rising edge. In general, registers can be positive edge-triggered (where the input data is copied on the positive edge of the clock) or negative edge-triggered (where the input data is copied on the negative edge, as is indicated by a small circle at the clock input).
Figure: Block diagram of a finite state machine using positive edge-triggered registers.

Timing Metrics for Sequential Circuits
There are three important timing parameters associated with a register, as illustrated in Figure.
1. The set-up time (tsu) is the time that the data inputs (D input) must be valid before the clock transition (that is, the 0 to 1 transition for a positive edge-triggered register).
2. The hold time (thold) is the time the data input must remain valid after the clock edge.
3. Assuming that the set-up and hold times are met, the data at the D input is copied to the Q output after a worst-case propagation delay (with reference to the clock edge) denoted by tc-q.

Given the timing information for the registers and the combinational logic, some system-level timing constraints can be derived. Assume that the worst-case propagation delay of the logic equals tplogic, while its minimum delay (also called the contamination delay) is tcdlogic. The minimum clock period T required for proper operation of the sequential circuit is given by
$T \ge t_{c-q} + t_{plogic} + t_{su}$
The hold time of the register imposes an extra constraint for proper operation,
$t_{cdregister} + t_{cdlogic} \ge t_{hold}$
where tcdregister is the minimum propagation delay (or contamination delay) of the register. It is important to minimize the values of the timing parameters associated with the register, as these directly affect the rate at which a sequential circuit can be clocked. In fact, modern high-performance systems are characterized by a very low logic depth, and the register propagation delay and set-up times account for a significant portion of the clock period.
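The two constraints, a minimum clock period set by the worst-case path and a hold condition set by the fastest path, can be checked with a few lines of code. This is a sketch with made-up delay values in picoseconds; the function names and the numbers are assumptions for illustration only.

```python
def min_clock_period(t_cq, t_plogic, t_su):
    """Smallest clock period: clock-to-Q delay + worst-case logic delay + set-up time."""
    return t_cq + t_plogic + t_su


def hold_constraint_met(t_cd_register, t_cd_logic, t_hold):
    """Hold check: the fastest path must not change the data before the hold time expires."""
    return t_cd_register + t_cd_logic >= t_hold


# Assumed example delays in picoseconds (illustrative values).
T_CQ, T_PLOGIC, T_SU = 100, 600, 50
T_CD_REGISTER, T_CD_LOGIC, T_HOLD = 40, 30, 60

t_min = min_clock_period(T_CQ, T_PLOGIC, T_SU)
f_max_ghz = 1000.0 / t_min  # a period in ps converts to a frequency in GHz
print("minimum clock period (ps):", t_min)
print("maximum clock frequency (GHz):", round(f_max_ghz, 3))
print("hold constraint met:", hold_constraint_met(T_CD_REGISTER, T_CD_LOGIC, T_HOLD))
```

Note that the hold check does not depend on the clock period, so a hold violation cannot be repaired by slowing the clock.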
Classification of Memory Elements

Foreground versus Background Memory
Memory that is embedded into logic is foreground memory (internal memory), and is most often organized as individual registers or register banks. Large amounts of centralized memory core are referred to as background memory (external memory).

Static versus Dynamic Memory
• Static memories preserve the state as long as the power is turned on.
• They are built using positive feedback or regeneration, where the circuit topology consists of intentional connections between the output and the input of a combinational circuit.
• Static memories are most useful when the register will not be updated for extended periods of time, e.g. configuration data loaded at power-up time.
• This condition also holds for most processors that use conditional clocking (i.e., gated clocks), where the clock is turned off for unused modules. In that case, there are no guarantees on how frequently the registers will be clocked, and static memories are needed to preserve the state information.
• Memories based on positive feedback fall under the class of elements called multivibrator circuits. The bistable element is its most popular representative, but other elements such as monostable and astable circuits are also frequently used.
• Dynamic memories store state for a short period of time, on the order of milliseconds. They are based on the principle of temporary charge storage on parasitic capacitors associated with MOS devices. The capacitors have to be refreshed periodically to compensate for charge leakage.
• Dynamic memories tend to be simpler, resulting in significantly higher performance and lower power dissipation. They are most useful in datapath circuits that require high performance levels and are periodically clocked.
Latches versus Registers
A latch is an essential component in the construction of an edge-triggered register. It is a level-sensitive circuit that passes the D input to the Q output when the clock signal is high. The latch is then said to be in transparent mode. When the clock is low, the input data sampled on the falling edge of the clock is held stable at the output for the entire phase, and the latch is in hold mode. The inputs must be stable for a short period around the falling edge of the clock to meet set-up and hold requirements. A latch operating under the above conditions is a positive latch. Similarly, a negative latch passes the D input to the Q output when the clock signal is low.
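The transparent and hold modes of positive and negative latches can be captured in a short behavioural model. This is a sketch at the logic level only, with no set-up or hold timing; the class name and the sample waveforms are hypothetical.

```python
class Latch:
    """Level-sensitive latch: transparent on one clock level, holding on the other."""

    def __init__(self, positive=True, q=0):
        self.positive = positive  # True: transparent when clk = 1, False: when clk = 0
        self.q = q                # stored output value

    def update(self, clk, d):
        transparent = (clk == 1) if self.positive else (clk == 0)
        if transparent:
            self.q = d            # transparent mode: Q follows D
        return self.q             # hold mode: Q keeps its last value


clk = [1, 1, 0, 0, 1]
d = [0, 1, 1, 0, 0]

pos = Latch(positive=True)
neg = Latch(positive=False)
pos_q = [pos.update(c, x) for c, x in zip(clk, d)]
neg_q = [neg.update(c, x) for c, x in zip(clk, d)]
print("positive latch Q:", pos_q)
print("negative latch Q:", neg_q)
```

While a latch is transparent, every change on D appears at Q, which is the property that distinguishes it from an edge-triggered register.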
Figure: Timing of positive and negative latches.

Static Latches and Registers

The Bistability Principle
Static memories use positive feedback to create a bistable circuit, that is, a circuit having two stable states that represent 0 and 1. The basic idea is shown in Figure a, which shows two inverters connected in cascade along with a voltage-transfer characteristic typical of such a circuit. Assume now that the output of the second inverter Vo2 is connected to the input of the first Vi1, as shown by the dotted lines in Figure a. The resulting circuit has only three possible operation points (A, B, and C). Under the condition that the gain of the inverter in the transient region is larger than 1, only A and B are stable operation points, and C is a metastable operation point. Suppose that the cross-coupled inverter pair is biased at point C. A small deviation from this bias point, possibly caused by noise, is amplified and regenerated around the circuit loop. This is a consequence of the gain around the loop being larger than 1. On the other hand, A and B are stable operation points. At these points, the loop gain is much smaller than unity.

Hence the cross-coupling of two inverters results in a bistable circuit, which serves as a memory, storing either a 1 or a 0 (corresponding to positions A and B). In order to change the stored value, we must be able to bring the circuit from state A to B and vice versa. This is generally done by applying a trigger pulse at Vi1 or Vi2. The width of the trigger pulse need be only a little larger than the total propagation delay around the circuit loop, which is twice the average propagation delay of the inverters.

SR Flip-Flops
The SR (set-reset) flip-flop circuit is similar to the cross-coupled inverter pair, with NOR gates replacing the inverters.
The second inputs of the NOR gates are connected to the trigger inputs (S and R), which make it possible to force the outputs Q and Q' to a given state. These outputs are complementary (except for the SR = 11 state). When both S and R are 0, the flip-flop is in a quiescent state and both outputs retain their value. If a positive (or 1) pulse is applied to the S input, the Q output is forced into the 1 state (with Q' going to 0). Vice versa, a 1 pulse on R resets the flip-flop and the Q output goes to 0.
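The set, reset, and hold behaviour just described can be reproduced by iterating a pair of cross-coupled NOR gates until the outputs settle. This is a minimal logic-level sketch that assumes both gates update in lock-step; the function names are hypothetical.

```python
def nor(a, b):
    """Two-input NOR gate on 0/1 values."""
    return int(not (a or b))


def sr_flip_flop(s, r, q, q_bar):
    """Apply S and R to a cross-coupled NOR pair and return the settled (Q, Q')."""
    for _ in range(4):  # a few passes around the loop are enough to settle
        q_next = nor(r, q_bar)
        q_bar_next = nor(s, q)
        if (q_next, q_bar_next) == (q, q_bar):
            break
        q, q_bar = q_next, q_bar_next
    return q, q_bar


state = (0, 1)                      # start in the reset state
state = sr_flip_flop(1, 0, *state)  # pulse on S sets the flip-flop
print("after set  :", state)
state = sr_flip_flop(0, 0, *state)  # S = R = 0 holds the stored value
print("after hold :", state)
state = sr_flip_flop(0, 1, *state)  # pulse on R resets the flip-flop
print("after reset:", state)
```

The sketch deliberately leaves out the S = R = 1 input combination, which needs separate treatment.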
When both S and R are high, both Q and Q' are forced to zero. Because the outputs are then no longer complementary, this input combination is forbidden. An additional problem with this condition is that when the input triggers return to their zero levels, the resulting state of the latch is unpredictable and depends on whichever input is last to go low.

CMOS clocked SR flip-flop
One possible realization of a clocked SR flip-flop, a level-sensitive positive latch, is shown in Figure. It consists of a cross-coupled inverter pair, plus 4 extra transistors to drive the flip-flop from one state to another and to provide clocked operation.

Multiplexer-Based Latches
Advantage: the sizing of devices only affects performance and is not critical to the functionality.
For a negative latch, when the clock signal is low, input 0 of the multiplexer is selected, and the D input is passed to the output. When the clock signal is high, input 1 of the multiplexer, which connects to the output of the latch, is selected. The feedback holds the output stable while the clock signal is high. A transistor-level implementation of a positive latch based on multiplexers is shown in Figure.
• When CLK is high, the bottom transmission gate is on and the latch is transparent, that is, the D input is copied to the Q output.
• The feedback does not have to be overridden to write the memory, and hence the sizing of transistors is not critical for realizing correct functionality. The number of transistors that the clock touches is important since the clock has an activity factor of 1.
• This latch is not efficient by this metric, as it presents a load of 4 transistors to the CLK signal.
The clock load can be reduced to 2 transistors by using NMOS-only pass transistors, as shown in Figure.
Advantages:
• Reduced clock load of only two NMOS devices.
• Simple circuit.
Disadvantage: the NMOS pass transistor passes a degraded high voltage of VDD - VTn to the input of the first inverter. This impacts both noise margin and switching performance, especially in the case of low values of VDD and high values of VTn. It also causes static power dissipation in the first inverter: since the maximum input voltage to the inverter equals VDD - VTn, the PMOS device of the inverter is never fully turned off, resulting in a static current flow.

Master-Slave Edge-Triggered Register
• The register consists of a negative latch (master stage) cascaded with a positive latch (slave stage).
• On the low phase of the clock, the master stage is transparent, and the D input is passed to the master stage output, QM. During this period, the slave stage is in hold mode, keeping its previous value using feedback.
• On the rising edge of the clock, the master stage stops sampling the input, and the slave stage starts sampling. During the high phase of the clock, the slave stage samples the output of the master stage (QM), while the master stage remains in hold mode. Since QM is constant during the high phase of the clock, the output Q makes only one transition per cycle.
• The value of Q is the value of D right before the rising edge of the clock, achieving the positive edge-triggered effect.
A negative edge-triggered register can be constructed using the same principle by simply switching the order of the positive and negative latches (that is, placing the positive latch first). A complete transistor-level implementation of the master-slave positive edge-triggered register is shown in Figure below.

Drawback of the transmission gate register: the high capacitive load presented to the clock signal. The clock load per register is important, since it directly impacts the power dissipation of the clock network.
Each register has a clock load of 8 transistors. One approach to reduce the clock load at the cost of robustness is to make the circuit ratioed.
  • 34. Figure below shows that the feedback transmission gate can be eliminated by directly cross-coupling the inverters. Another problem with this scheme is reverse conduction; that is, the second stage can affect the state of the first latch. When the slave stage is on (Figure above), it is possible for the combination of T2 and I4 to influence the data stored in the I1-I2 latch. As long as I4 is a weak device, this is fortunately not a major problem.
Non-ideal clock signals
Variations can exist in the wires used to route the two clock signals, or the load capacitances can vary based on data stored in the connecting latches. This effect, known as clock skew, is a major problem, and causes the two clock signals to overlap as is shown in Figure 7.20b. Clock overlap can cause two types of failures, as illustrated for the NMOS-only negative master-slave register.
    • When the clock goes high, the slave stage should stop sampling the master stage output and go into a hold mode. However, since CLK and CLK bar are both high for a short period of time (the overlap period), both sampling pass transistors conduct and there is a direct path from the D input to the Q output. As a result, data at the output can change on the rising edge of the clock. This is a race condition in which the value of the output Q is a function of whether the input D arrives at node X before or after the falling edge of CLK bar. If node X is sampled in the metastable state, the output will switch to a value determined by noise in the system.
    • The primary advantage of the multiplexer-based register is that the feedback loop is open during the sampling period, and therefore sizing of devices is not critical to functionality. However, if there is clock overlap between CLK bar and CLK, node A can be driven by both D and B, resulting in an undefined state.
These problems can be avoided by using two non-overlapping clocks PHI1 and PHI2 instead, and by keeping the non-overlap time t_non_overlap between the clocks large enough that no overlap occurs even in the presence of clock-routing delays.
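The effect of clock overlap on a master-slave register can be illustrated with a small behavioural model. The sketch below is not a circuit simulation: it assumes ideal level-sensitive latches, discrete time steps, and the positive edge-triggered arrangement described earlier (master transparent on CLK low, slave transparent on CLK high). The names `Latch`, `master_slave`, `phi_master` and `phi_slave` are hypothetical and chosen only for this example.

```python
class Latch:
    """Level-sensitive latch: transparent while enabled, holds otherwise."""

    def __init__(self, q=0):
        self.q = q

    def step(self, enable, d):
        if enable:
            self.q = d  # transparent: output follows the input
        return self.q   # otherwise the stored value is kept


def master_slave(phi_master, phi_slave, d_stream, q0=0):
    """Simulate a master latch feeding a slave latch, one sample per time step.

    phi_master / phi_slave are the enable waveforms of the two latches.
    Returns the register output Q at every time step.
    """
    master, slave = Latch(q0), Latch(q0)
    q_out = []
    for en_m, en_s, d in zip(phi_master, phi_slave, d_stream):
        qm = master.step(en_m, d)            # QM, the master output
        q_out.append(slave.step(en_s, qm))   # Q, the register output
    return q_out


d = [1, 1, 0, 0, 0, 0, 1, 1]
clk = [0, 0, 1, 1, 0, 0, 1, 1]

# Non-overlapping phases: master enabled on CLK low, slave on CLK high.
ideal = master_slave([1 - c for c in clk], clk, d)

# Overlap: the master enable stays high for one step after CLK has risen,
# so both latches are transparent at t = 2 and D races through to Q.
overlap = master_slave([1, 1, 1, 0, 1, 1, 0, 0], clk, d)

print(ideal)    # [0, 0, 1, 1, 1, 1, 0, 0]
print(overlap)  # [0, 0, 0, 0, 0, 0, 0, 0]
```

With non-overlapping phases, Q takes the value D had just before each rising edge. With the one-step overlap, the first rising edge captures the value D takes after the edge (0 instead of 1), which is the race described above.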
  • 35. Dynamic Latches and Registers
This class of circuits is based on temporary storage of charge on parasitic capacitors. Charge stored on a capacitor can be used to represent a logic signal: the absence of charge denotes a 0, while its presence stands for a stored 1. Because the stored charge leaks away over time, a periodic refresh of its value is necessary; hence the name dynamic storage.
Dynamic Transmission-Gate Edge-Triggered Registers
A fully dynamic positive edge-triggered register based on the master-slave concept is shown in Figure below.
    • When CLK = 0, the input data is sampled on storage node 1, which has an equivalent capacitance of C1 consisting of the gate capacitance of I1, the junction capacitance of T1, and the overlap gate capacitance of T1.
    • During this period, the slave stage is in a hold mode, with node 2 in a high-impedance (floating) state.
    • On the rising edge of the clock, the transmission gate T2 turns on, and the value sampled on node 1 right before the rising edge propagates to the output Q.
    • Node 2 now stores the inverted version of node 1.
The circuit is very efficient: it requires only 8 transistors. The sampling switches can be implemented using NMOS-only pass transistors (6-transistor implementation). The set-up time of this circuit is simply the delay of the transmission gate, and corresponds to the time it takes node 1 to sample the D input. The hold time is approximately zero, since the transmission gate is turned off on the clock edge and further input changes are ignored. The propagation delay (tc-q) is equal to two inverter delays plus the delay of the transmission gate T2.
Race Condition and Preventive Measures
Clock overlap is an important concern for this dynamic register. Consider the clock waveforms shown in Figure below.
During the 0-0 overlap period, the PMOS of T1 and the PMOS of T2 are simultaneously on, creating a direct path for data to flow from the D input of the register to the Q output. As a result, data at the output can change on the falling edge of the clock, which is undesired for a positive edge-triggered register. This is known as a race condition, in which the value of the output Q is a function of whether the input D arrives at node X before or after the rising edge of CLK. The effect appears only if the overlap period is large. The same is true for the 1-1 overlap region, where an input-output path exists through the NMOS of T1 and the NMOS of T2. The latter case is taken care of by enforcing a hold time constraint; that is, the data must be stable during the high-high overlap period. The former situation (0-0 overlap) can be addressed by making sure that there is enough delay between the D input and node 2, ensuring that new data sampled by the master stage does not propagate through to the slave stage. Generally the built-in single inverter delay should be sufficient, and the overlap period constraint is given as:
$t_{overlap0-0} < t_{T1} + t_{I1} + t_{T2}$
Similarly, the constraint for the 1-1 overlap is given as:
$t_{hold} > t_{overlap1-1}$
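The two overlap constraints can be checked numerically for a given set of delays. The sketch below assumes the standard form of the constraints (the 0-0 overlap must be shorter than the delay from D to node 2 through T1, I1 and T2, and the hold time must exceed the 1-1 overlap). The function name `race_free` and all numeric values are hypothetical and serve only as an illustration.

```python
def race_free(t_overlap_00, t_overlap_11, t_t1, t_i1, t_t2, t_hold):
    """Return (ok_00, ok_11) for the two clock-overlap constraints.

    ok_00: the 0-0 overlap is shorter than the delay from D to node 2
           (through T1, I1 and T2), so new data cannot race to the slave.
    ok_11: the input is held stable for longer than the 1-1 overlap.
    All arguments are times in the same unit (picoseconds in the example).
    """
    ok_00 = t_overlap_00 < t_t1 + t_i1 + t_t2
    ok_11 = t_hold > t_overlap_11
    return ok_00, ok_11


# Hypothetical delays in picoseconds, for illustration only.
safe = race_free(t_overlap_00=40, t_overlap_11=30,
                 t_t1=20, t_i1=25, t_t2=20, t_hold=35)
racy = race_free(t_overlap_00=80, t_overlap_11=30,
                 t_t1=20, t_i1=25, t_t2=20, t_hold=35)

print(safe)  # (True, True)
print(racy)  # (False, True)
```

In the second call the 0-0 overlap (80 ps) exceeds the 65 ps path delay through T1, I1 and T2, so the first constraint fails while the hold constraint is still met.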
  • 36. Impact of overlapping clocks.
C²MOS: A Clock-Skew Insensitive Approach (method to prevent race condition)
Figure below shows an ingenious positive edge-triggered register, based on a master-slave concept, that is insensitive to clock overlap. This circuit is called the C²MOS (Clocked CMOS) register, and operates in two phases.
1. CLK = 0 (CLK bar = 1): The first tri-state driver is turned on, and the master stage acts as an inverter sampling the inverted version of D on the internal node X. The master stage is in the evaluation mode. Meanwhile, the slave section is in a high-impedance mode, or in a hold mode. Both transistors M7 and M8 are off, decoupling the output from the input. The output Q retains its previous value stored on the output capacitor CL2.
2. The roles are reversed when CLK = 1: The master stage section is in hold mode (M3-M4 off), while the second section evaluates (M7-M8 on). The value stored on CL1 propagates to the output node through the slave stage, which acts as an inverter.
In the (0-0) overlap case, both PMOS devices are on during the overlap period. New data is sampled on node X through the series PMOS devices M2-M4, and node X can make a 0-to-1 transition during the overlap period. However, this data cannot propagate to the output since the NMOS device M7 is turned off. At the end of the overlap period, CLK bar = 1 and both M7 and M8 turn off, putting the slave stage in the hold mode.
In the (1-1) overlap case, both NMOS devices M3 and M7 are turned on. If the D input changes during the overlap period, node X can make a 1-to-0 transition, but this cannot propagate to the output. However, as soon as the overlap period is over, the PMOS M8 is turned on and the 0 propagates to the output. This effect is not desirable.
The problem is fixed by imposing a hold time constraint on the input data, D; in other words, the data D should be stable during the overlap period.
Pipelining: An approach to optimize sequential circuits
Pipelining is a popular design technique often used to accelerate the operation of the datapaths in digital processors. The idea is easily explained with the example of Figure (a). The goal of the presented circuit is to compute log(|a + b|), where both a and b represent streams of numbers; that is, the computation must be performed on a large set of input values. The minimal clock period Tmin necessary to ensure correct evaluation is given as:
$T_{min} = t_{c-q} + t_{pd,logic} + t_{su}$
where $t_{c-q}$ and $t_{su}$ are the propagation delay and the set-up time of the register, respectively. We assume that the registers are edge-triggered D registers. The term $t_{pd,logic}$ stands for the worst-case delay path through the combinational network, which consists of the adder, absolute value, and logarithm functions. In conventional systems, the latter delay is
  • 37. generally much larger than the delays associated with the registers and dominates the circuit performance. Assume that each logic module has an equal propagation delay. We note that each logic module is then active for only 1/3 of the clock period (if the delay of the register is ignored). For example, the adder unit is active during the first third of the period and remains idle (that is, it does no useful computation) during the other 2/3 of the period.
Figure: (a) non-pipelined datapath, (b) pipelined datapath.
Pipelining is a technique to improve resource utilization and increase the functional throughput. Assume that we introduce registers between the logic blocks, as shown in Figure (b). This causes the computation for one set of input data to spread over a number of clock periods, as shown in Table. The advantage of pipelined operation becomes apparent when examining the minimum clock period of the modified circuit. The combinational circuit block has been partitioned into three sections, each of which has a smaller propagation delay than the original function. This effectively reduces the value of the minimum allowable clock period:
$T_{min,pipe} = t_{c-q} + \max(t_{pd,add}, t_{pd,abs}, t_{pd,log}) + t_{su}$
Suppose that all logic blocks have approximately the same propagation delay, and that the register overhead is small with respect to the logic delays. The pipelined network outperforms the original circuit by a factor of three under these assumptions, or $T_{min,pipe} = T_{min}/3$. The increased performance comes at the relatively small cost of two additional registers and an increased latency.
Latch- vs. Register-Based Pipelines
Consider the pipelined circuit of Figure below. The pipeline system is implemented based on pass-transistor-based positive and negative latches instead of edge-triggered registers. Latch-based systems give significantly more flexibility in implementing a pipelined system, and often offer higher performance.
When the clocks CLK and CLK bar are non-overlapping, correct pipeline operation is obtained. Input data is sampled on C1 at the negative edge of CLK and the computation of logic block F starts; the result of logic block F is stored on C2 on the falling edge of CLK bar, and the computation of logic block G starts. The non-overlapping of the clocks ensures correct operation. The value stored on C2 at the end of the CLK low phase is the result of passing the previous input (stored on the falling edge of CLK on C1) through the logic function F.
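Returning to the register-based pipeline for log(|a + b|), the clock-period arithmetic can be worked through with a few lines of code. This is a minimal sketch: the unpipelined period is taken as the register overhead plus the sum of the block delays, and the pipelined period as the register overhead plus the slowest block. The function names and the delay values (in picoseconds) are hypothetical.

```python
def t_min(t_cq, t_su, stage_delays):
    """Minimum clock period when all logic sits between two registers."""
    return t_cq + sum(stage_delays) + t_su


def t_min_pipe(t_cq, t_su, stage_delays):
    """Minimum clock period with a register after every logic block."""
    return t_cq + max(stage_delays) + t_su


# Hypothetical block delays in picoseconds: adder, absolute value, logarithm.
delays = [300, 300, 300]

# Ignoring register overhead, three equal stages give exactly a 3x speed-up.
ideal_speedup = t_min(0, 0, delays) / t_min_pipe(0, 0, delays)

# With register overhead (t_cq = 20 ps, t_su = 10 ps) the gain is smaller.
real_speedup = t_min(20, 10, delays) / t_min_pipe(20, 10, delays)

print(t_min(20, 10, delays), t_min_pipe(20, 10, delays))  # 930 330
print(ideal_speedup)           # 3.0
print(round(real_speedup, 2))  # 2.82
```

The factor of three quoted in the text is the limit in which register delays are negligible; any non-zero register overhead, or unequal block delays, pulls the speed-up below that value.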
  • 38. NORA-CMOS: A Logic Style for Pipelined Structures
The latch-based pipeline circuit can also be implemented using C²MOS latches, as shown in Figure below. This topology has one additional, important property: a C²MOS-based pipelined circuit is race-free as long as all the logic functions F between the latches are non-inverting. The reasoning is similar to the argument made in the construction of a C²MOS register. During a (0-0) overlap between CLK and CLK bar, all C²MOS latches simplify to pure pull-up networks (see Figure 7.27). The only way a signal can race from stage to stage under this condition is when the logic function F is inverting, as illustrated in Figure above, where F is replaced by a single, static CMOS inverter. Similar considerations are valid for the (1-1) overlap.
Sources of Clock Skew and Jitter
A perfect clock is defined as a perfectly periodic signal that is simultaneously triggered at various memory elements on the chip. However, due to a variety of process and environmental variations, clocks are not ideal. To illustrate the sources of skew and jitter, consider the simplistic view of clock generation and distribution shown in Figure below. Typically, a high-frequency clock is either provided from off chip or generated on-chip. From a central point, the clock is distributed using multiple matched paths to the low-level memory elements (registers). Here two paths are shown. The clock paths include wiring and the associated distributed buffers required to drive interconnects and loads. A key point to realize in clock distribution is that the absolute delay through a clock distribution path is not important; what matters is the relative arrival time between the outputs of each path at the register points.
  • 39. The sources of clock uncertainty can be classified in several ways. Systematic errors are nominally identical from chip to chip, and are typically predictable (e.g., variation in total load capacitance of each clock path). In principle, such errors can be modeled and corrected at design time given sufficiently good models and simulators. Random errors are due to manufacturing variations (e.g., dopant fluctuations that result in threshold variations) that are difficult to model and eliminate. Mismatch may also be characterized as static or time-varying. Below, the various sources of skew and jitter, introduced in Figure 10.14, are described in detail.
    • Clock-Signal Generation (1): The generation of the clock signal itself causes jitter. A typical on-chip clock generator takes a low-frequency reference clock signal and produces a high-frequency global reference for the processor. The core of such a generator is a Voltage-Controlled Oscillator (VCO). A major problem is coupling from the surrounding noisy digital circuitry through the substrate. These noise sources cause temporal variations of the clock signal that propagate unfiltered through the clock drivers to the flip-flops.
    • Manufacturing Device Variations (2): Distributed buffers are integral components of the clock distribution networks, as they are required to drive both the register loads and the global and local interconnects. The matching of devices in the buffers along multiple clock paths is critical to minimizing timing uncertainty. Device parameters in the buffers vary along different paths, resulting in static skew. There are many sources of variation, including oxide variations (which affect the gain and threshold), dopant variations, and lateral dimension (width and length) variations.
    • Interconnect Variations (3): Vertical and lateral dimension variations cause the interconnect capacitance and resistance to vary across a chip. Since this variation is static, it causes skew between different paths. One important source of interconnect variation is Inter-level Dielectric (ILD) thickness variation. Other interconnect variations include deviations in the width of the wires and in line spacing, which result from photolithography and etch dependencies.
    • Environmental Variations (4 and 5): The two major sources are temperature and power supply. Temperature gradients across the chip are a result of variations in power dissipation across the die. This is an issue with clock gating, where some parts of the chip may be idle while other parts are active. Since the device parameters (such as threshold, mobility, etc.) depend strongly on temperature, the buffer delay of a clock distribution network along one path can differ drastically from that along another path. The delay through buffers is a very strong function of power supply, as it directly affects the drive of the transistors. As with temperature, the power supply voltage is a strong function of the switching activity. Power supply variations can be classified into static (or slow) and high-frequency variations. Static power supply variations may result from fixed currents drawn from various modules, while high-frequency variations result from
  • 40. instantaneous IR drops along the power grid due to fluctuations in switching activity.
    • Capacitive Coupling (6 and 7): The variation in capacitive load also contributes to timing uncertainty. There are two major sources of capacitive load variation: coupling between the clock lines and adjacent signal wires, and variation in gate capacitance. Any coupling between the clock wire and adjacent signals results in timing uncertainty, leading to clock jitter. Another major source of clock uncertainty is variation in the gate capacitance related to the sequential elements. The load capacitance is highly non-linear and depends on the applied voltage.
Timing Issues in Digital Circuits, Clock Distribution Techniques, Synchronous and Asynchronous Design
All sequential circuits have one property in common: a well-defined ordering of the switching events must be imposed if the circuit is to operate correctly. If this were not the case, wrong data might be written into the memory elements, resulting in a functional failure. The synchronous system approach, in which all memory elements in the system are simultaneously updated using a globally distributed periodic synchronization signal (that is, a global clock signal), represents an effective and popular way to enforce this ordering. Functionality is ensured by imposing strict constraints on the generation of the clock signals and their distribution to the memory elements distributed over the chip; non-compliance often leads to malfunction. We analyze the impact of spatial variations of the clock signal, called clock skew, and temporal variations of the clock signal, called clock jitter, and introduce techniques to cope with them. These variations fundamentally limit the performance that can be achieved using a conventional design methodology.
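The distinction between skew (spatial variation) and jitter (temporal variation) can be made concrete with measured clock-edge arrival times. The sketch below is an illustration under stated assumptions: skew is taken as the per-edge difference in arrival time between two registers, and jitter as the worst-case deviation of the measured period from its nominal value at one register. Other definitions of jitter are in use; the function names and the edge times (in picoseconds) are hypothetical.

```python
def skew(edges_a, edges_b):
    """Per-edge arrival-time difference between two clock sinks."""
    return [b - a for a, b in zip(edges_a, edges_b)]


def period_jitter(edges, nominal_period):
    """Worst-case deviation of the measured period from nominal at one sink."""
    periods = [t1 - t0 for t0, t1 in zip(edges, edges[1:])]
    return max(abs(p - nominal_period) for p in periods)


# Hypothetical rising-edge arrival times in picoseconds at two registers.
reg1 = [0, 1000, 2010, 2995, 4000]
reg2 = [50, 1050, 2060, 3045, 4050]

print(skew(reg1, reg2))           # [50, 50, 50, 50, 50]
print(period_jitter(reg1, 1000))  # 15
```

In this example the second register always sees the edge 50 ps later than the first (a static skew), while the period seen at the first register wanders by up to 15 ps around the nominal 1000 ps (jitter).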
At the other end of the design spectrum is an approach called asynchronous design, which avoids the problem of clock uncertainty altogether by eliminating the need for globally distributed clocks. After discussing the basics of the asynchronous design approach, we analyze the associated overhead and identify some practical applications. The important issue of synchronization, which is required when interfacing different clock domains or when sampling an asynchronous signal, also deserves some in-depth treatment. Finally, the fundamentals of on-chip clock generation using feedback are introduced, along with trends in timing.
Timing Classification of Digital Systems
In digital systems, signals can be classified depending on how they are related to a local clock. Signals that transition only at predetermined periods in time can be classified as synchronous, mesochronous, or plesiochronous with respect to a system clock. A signal that can transition at arbitrary times is considered asynchronous.
    • Synchronous Interconnect: A signal with exactly the same frequency as the local clock, and a known fixed phase offset with respect to it.
    • Mesochronous Interconnect: A signal with the same frequency as the local clock but an unknown phase offset with respect to it.
    • Plesiochronous Interconnect: A signal whose frequency is nominally the same as, but slightly different from, that of the local clock.
    • Asynchronous Interconnect: Asynchronous signals can transition at any arbitrary time, and are not slaved to any local clock.
Synchronous Design: Synchronous Timing Basics
All systems designed today use a periodic synchronization signal or clock. The generation and distribution of a clock has a significant impact on performance and power dissipation. In the ideal world, assuming the clock paths from a central distribution point to each