ELEC516/10 Lecture 9
ELEC 516 VLSI System Design
and Design Automation Spring
2010
Lecture 9 - Low Power Digital
CMOS Design
Reading Assignment:
Rabaey: Chapter 4 and 7,11
Note: some of the figures in this slide set are adapted from the slide set
of “Digital Integrated Circuits” by Rabaey, Copyright UCB 2002
Why worry about power?
-- Heat Dissipation
Besides low power
required for portable
applications !!!!
Power is a very big concern in today’s advanced technologies.
Power produces heat on the chip which has to be carried off
through the chip socket, which requires expensive packaging solutions
Motivation for low power design
• Packaging costs
• Power supply rail design
• Chip and system cooling costs
• Noise immunity and system reliability
• Battery life (in portable systems)
• Environmental concerns
– ICT equipment accounted for 10% of total US
commercial energy usage in 2010 and may reach 20%
by 2020
– Energy Star compliant systems
Why worry about power — Portability
Multimedia Terminals
Laptop Computers
Digital Cellular Telephony
[Figure: nominal battery capacity (Watt-hours/lb, 0 to 50) versus year
(1965 to 1995) for Nickel-Cadmium, Ni-Metal Hydride, and rechargeable
lithium batteries; a battery is labeled "40+ lbs"]
Expected Battery Lifetime increase
over next 5 years: 30-40%
Why worry about power? -- Standby Power
• Drain leakage will increase as VT decreases to maintain noise margins and
meet frequency demands, leading to excessive battery-draining standby
power consumption.
[Figure: standby power as a percentage of total power (0% to 50%) versus
year (2000 to 2008); bars labeled 12W, 88W, 400W, 1.7KW, and 8KW.
Source: Borkar, De, Intel]
Year                   2002   2005   2008   2011   2014
Power supply Vdd (V)   1.5    1.2    0.9    0.7    0.6
Threshold VT (V)       0.4    0.4    0.35   0.3    0.25
…and phones leaky!
Power and Energy Figures of Merit
• Power consumption in Watts
– determines battery life in hours
• Peak power
– determines power ground wiring designs
– sets packaging limits
– impacts signal noise margin and reliability analysis
• Energy efficiency in Joules
– energy is power integrated over time (power is the rate at which
energy is consumed)
• Energy = power * delay
– Joules = Watts * seconds
– lower energy number means less power to perform a
computation at the same frequency
Power versus Energy
[Figure: two plots of power (Watts) versus time, each comparing
Approach 1 and Approach 2]
Power is the height of the curve: a lower power design could simply be slower
Energy is the area under the curve: the two approaches require the same energy
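The distinction between power and energy can be illustrated with a small sketch. The power levels and durations below are hypothetical numbers chosen only for illustration; each approach is modeled as a constant-power interval, so the energy is the area of a rectangle under the power curve.

```python
def energy_joules(power_watts, duration_s):
    """Energy of a constant-power interval: area under the power curve."""
    return power_watts * duration_s

# Hypothetical numbers, not taken from the slides.
approach_1 = energy_joules(power_watts=2.0, duration_s=1.0)  # higher power, faster
approach_2 = energy_joules(power_watts=1.0, duration_s=2.0)  # half the power, twice as slow

print(approach_1, approach_2)  # same energy for both approaches
```

Halving the power while doubling the run time leaves the energy unchanged, which is why power alone is not a sufficient figure of merit.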
PDP and EDP
• Power-delay product (PDP) = $P_{av} \cdot t_p = C_L V_{DD}^2 / 2$
– PDP is the average energy consumed per switching event
(Watts * sec = Joule)
– lower power design could simply be a slower design
• Energy-delay product (EDP) = $PDP \cdot t_p = P_{av} \cdot t_p^2$
– EDP is the average energy consumed multiplied by the
computation time required
– takes into account that one can trade increased delay
for lower energy/operation (e.g., via supply voltage
scaling that increases delay, but decreases energy
consumption)
– allows one to understand tradeoffs better
[Figure: normalized energy, delay, and energy-delay (0 to 15) versus
Vdd (0.5 to 2.5 V)]
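A minimal sketch of the PDP/EDP tradeoff, assuming the simple delay model $t_p = C_L V_{DD} / I$ with $I = k (V_{DD} - V_T)^2$. The load capacitance, threshold voltage, and drive constant `K_DRIVE` are hypothetical values chosen for illustration.

```python
def switching_energy(c_load, vdd):
    """PDP: average energy per switching event, C_L * Vdd^2 / 2 (joules)."""
    return c_load * vdd ** 2 / 2.0

def gate_delay(c_load, vdd, vt, k_drive):
    """Assumed delay model: t_p = C_L * Vdd / I, with I = k_drive * (Vdd - Vt)^2."""
    return c_load * vdd / (k_drive * (vdd - vt) ** 2)

def energy_delay_product(c_load, vdd, vt, k_drive):
    """EDP = PDP * t_p."""
    return switching_energy(c_load, vdd) * gate_delay(c_load, vdd, vt, k_drive)

# Hypothetical device parameters (assumptions, not from the slides).
C_LOAD = 10e-15   # 10 fF
VT = 0.5          # volts
K_DRIVE = 1e-4    # A/V^2

# Sweep Vdd from 0.60 V to 2.50 V in 10 mV steps.
vdd_grid = [v / 100 for v in range(60, 251)]
best_vdd = min(vdd_grid, key=lambda v: energy_delay_product(C_LOAD, v, VT, K_DRIVE))
print(best_vdd)  # for this model the EDP minimum falls at Vdd = 3 * Vt
```

Energy alone keeps falling as the supply is lowered, whereas EDP has an interior minimum because the delay grows rapidly as Vdd approaches VT.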
Where Does Power Go in CMOS?
• Dynamic power
– Charge-discharge current:
• Dominant source of power ($C V^2$ per transition)
– Short circuit current (both NMOS and PMOS on during a transition)
• <10% of charge/discharge current if transitions are fast
• Subthreshold leakage (transistors not OFF completely)
– Becoming important: 10-30% of active power in <0.18um technologies
– Diode leakage from reverse-biased source and drain diodes (negligible)
– Gate leakage (no longer negligible due to very thin gate oxide)
Dynamic Power Consumption
[Figure: CMOS inverter with input Vin, output Vout, supply Vdd, and
load capacitance CL]
Energy/transition = $C_L V_{dd}^2$
Power = Energy/transition * f = $C_L V_{dd}^2 f$
Need to reduce CL, Vdd, and f to reduce power.
Not a function of transistor sizes!
Dynamic Power Consumption - Revisited
Power = Energy/transition * transition rate
$= C_L V_{dd}^2 f_{0 \to 1}$
$= C_L V_{dd}^2 P_{0 \to 1} f$
$= C_{EFF} V_{dd}^2 f$
where $C_{EFF} = C_L P_{0 \to 1}$ is the effective capacitance
Power dissipation is data dependent: it is a function of switching activity
Power Consumption is Data Dependent
Example: Static 2 Input NOR Gate
Assume:
P(A=1) = 1/2
P(B=1) = 1/2
Then:
P(Out=1) = 1/4
$P_{0 \to 1}$ = P(Out=0) · P(Out=1) = 3/4 × 1/4 = 3/16
CEFF = 3/16 * CL
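The NOR calculation generalizes to other static gates. The following is a minimal sketch assuming independent inputs; the function names are illustrative only.

```python
from fractions import Fraction

def p_out_one(gate, pa, pb):
    """Probability that a 2-input gate outputs 1, assuming independent inputs."""
    if gate == "AND":
        return pa * pb
    if gate == "NAND":
        return 1 - pa * pb
    if gate == "OR":
        return 1 - (1 - pa) * (1 - pb)
    if gate == "NOR":
        return (1 - pa) * (1 - pb)
    raise ValueError(f"unknown gate: {gate}")

def static_p01(p1):
    """0->1 transition probability of a static gate: P(Out=0) * P(Out=1)."""
    return (1 - p1) * p1

half = Fraction(1, 2)
p1_nor = p_out_one("NOR", half, half)
print(p1_nor, static_p01(p1_nor))  # 1/4 and 3/16, matching the example
```

Multiplying the result by CL gives the effective capacitance CEFF for the gate.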
Transition Probabilities for Basic Gates
Transition Probability of 2-input NOR Gate
Inter-signal Correlations
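[Figure: inputs A and B (signal probability 0.5 each) drive a gate with
output X; X and B reconverge at a second gate with output Z]
Transition probability at X: (1-0.5)(1-0.5) x (1-(1-0.5)(1-0.5)) = 3/16
Transition probability at Z, treating X and B as independent:
(1 - 3/16 x 0.5) x (3/16 x 0.5) = 0.085
With reconvergent fan-out: P(Z=1) = P(B=1) · P(A=1 | B=1)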
• Determining switching activity is complicated by the
fact that signals exhibit correlation in space and time
– reconvergent fan-out
• Have to use conditional probabilities
Logic Restructuring
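• Logic restructuring: changing the topology of a logic
network to reduce transitions
[Figure: F = A·B·C·D built from 2-input AND gates as a chain and as a
tree; every input has signal probability 0.5]
AND: P01 = P0 x P1 = (1 - PAPB) x PAPB, e.g. (1-0.25)*0.25 = 3/16
Chain: transition probabilities 3/16, 7/64, 15/256 from first gate to output
Tree: transition probabilities 3/16, 3/16 at the first level, 15/256 at the output
Chain implementation has a lower overall switching activity than
the tree implementation for random inputs
Ignores glitching effects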
Input Ordering
Beneficial to postpone the introduction of signals with a high
transition rate (signals with signal probability close to 0.5)
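[Figure: F = A·B·C built from two 2-input AND gates with internal node X;
P(A=1) = 0.5, P(B=1) = 0.2, P(C=1) = 0.1]
Ordering 1 (X = A·B, then C): P01 at X = (1-0.5x0.2)x(0.5x0.2) = 0.09
Ordering 2 (X = B·C, then A): P01 at X = (1-0.2x0.1)x(0.2x0.1) = 0.0196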
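A small sketch of the input-ordering comparison, assuming independent inputs to a 2-input AND gate. Exact fractions are used so the printed values match the hand calculation.

```python
from fractions import Fraction

def and_p01(pa, pb):
    """0->1 transition probability at the output of a 2-input AND gate."""
    p1 = pa * pb
    return (1 - p1) * p1

p_a, p_b, p_c = Fraction(1, 2), Fraction(1, 5), Fraction(1, 10)

x_ab_first = and_p01(p_a, p_b)  # high-activity input A introduced early
x_bc_first = and_p01(p_b, p_c)  # high-activity input A postponed

print(float(x_ab_first), float(x_bc_first))
```

Postponing the input whose signal probability is closest to 0.5 lowers the activity of the internal node by more than a factor of four in this example.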
How about Dynamic Circuits?
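[Figure: dynamic gate with precharge transistor Mp, evaluate transistor
Me, pull-down network (PDN) with inputs In1, In2, In3, clock φ, supply
VDD, and output Out]
Power is Only Dissipated when Out=0!
CEFF = P(Out=0) · CL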
2-input NOR Gate
Example: Dynamic 2 Input NOR Gate
Assume:
P(A=1) = 1/2
P(B=1) = 1/2
P(Out=0) = 3/4
Then:
CEFF = 3/4 * CL
Switching Activity Is Always Higher in Dynamic Circuits
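The static and dynamic versions of the 2-input NOR gate can be compared directly. This sketch assumes independent inputs and a precharged dynamic gate, whose output must be recharged every cycle in which it evaluated to 0.

```python
from fractions import Fraction

def nor_p1(pa, pb):
    """P(Out=1) of a 2-input NOR gate with independent inputs."""
    return (1 - pa) * (1 - pb)

half = Fraction(1, 2)
p1 = nor_p1(half, half)

static_activity = (1 - p1) * p1  # static CMOS: P(Out=0) * P(Out=1)
dynamic_activity = 1 - p1        # precharged dynamic gate: P(Out=0)

print(static_activity, dynamic_activity)  # 3/16 versus 3/4
```

Since P(Out=1) is at most 1, the dynamic activity P(Out=0) is never lower than the static activity P(Out=0) · P(Out=1).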
Transition Probabilities for Dynamic Gates
Switching Activity for Precharged Dynamic Gates
P01 = P0
Glitching in Static CMOS Networks
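[Figure: two-gate network with inputs A, B, C, internal node X, and
output Z; waveforms for the input change ABC = 101 to 000 under a
unit-delay model]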
• Gates have a nonzero propagation delay resulting in spurious
transitions or glitches (dynamic hazards)
– glitch: node exhibits multiple transitions in a single cycle
before settling to the correct logic value
Example : Adder Circuit
[Figure: 16-bit ripple-carry adder (Add0 ... Add15, carry-in Cin, sum
outputs S0 ... S15); plot of sum output voltage (0 to 4 Volts) versus
time (0 to 10 ns) for Cin and several sum bits]
How to Cope with Glitching?
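[Figure: two networks of gates F1, F2, F3 annotated with signal arrival
times: unbalanced paths (0, 0, 0, 0, 1, 2) versus balanced paths
(0, 0, 0, 0, 1, 1)]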
Equalize Lengths of Timing Paths Through Design
Balanced Delay Paths to Reduce Glitching
So equalize the lengths of timing paths through logic
[Figure: gates F1, F2, F3 with arrival times 0, 0, 0, 0, 1, 2
(unbalanced) and 0, 0, 0, 0, 1, 1 (balanced)]
• Glitching is due to a mismatch in the path lengths in
the logic network; if all input signals of a gate change
simultaneously, no glitching occurs
Glitch Reduction by Pipelining
• Glitches depend on the logic depth of the circuit - gates
deeper in the logic network are more prone to glitching
– arrival times of the gate inputs are more spread due to
delay imbalances
– usually affected more by primary input switching
• Reduce logic depth by adding pipeline registers
– additional energy used by the clock and pipeline registers
[Figure: five-stage pipeline (Fetch, Decode, Execute, Memory, WriteBack)
with PC, instruction cache I$, data cache D$, MAR, MDR, and clocked (clk)
pipeline stage isolation registers]
Short Circuit Power Consumption
Finite slope of the input signal causes a direct
current path between VDD and GND for a short
period of time during switching when both the
NMOS and PMOS transistors are conducting.
[Figure: CMOS inverter (Vin, Vout, load CL) with short-circuit current Isc]
Short Circuit Current Determinants
• Duration and slope of the input signal, tsc
• Ipeak determined by
– the saturation current of the P and N transistors which
depend on their sizes, process technology, temperature,
etc.
– strong function of the ratio between input and output
slopes
• a function of CL
$E_{sc} = t_{sc} V_{DD} I_{peak} P_{0 \to 1}$
$P_{sc} = t_{sc} V_{DD} I_{peak} f_{0 \to 1}$
Impact of CL on Psc
[Figure: two CMOS inverters (Vin, Vout, CL), one with Isc ≈ 0 and one
with Isc ≈ Imax]
Large capacitive load: output fall time significantly larger than
the input rise time, so Isc ≈ 0.
Small capacitive load: output fall time substantially smaller than
the input rise time, so Isc ≈ Imax.
Ipeak as a Function of CL
[Figure: short-circuit current (axis scaled by 10^-4, from -0.5 to 2.5)
versus time (0 to 6 x 10^-10 sec) for CL = 20 fF, 100 fF, and 500 fF
with a 500 psec input slope]
When load capacitance is small, Ipeak is large.
Short circuit dissipation is minimized by matching the rise/fall
times of the input and output signals (slope engineering).
Static Power Consumption
[Figure: gate with input Vin = 5V, output Vout, load CL, and static
current Istat drawn from Vdd]
$P_{stat} = P(In=1) \cdot V_{dd} \cdot I_{stat}$
• Dominates over dynamic consumption
• Not a function of switching frequency
Leakage (Static) Power Consumption
Sub-threshold current is the dominant factor.
All increase exponentially with temperature!
[Figure: gate with supply VDD and output Vout showing the leakage
current Ileakage and its components: drain junction leakage,
sub-threshold current, and gate leakage]
Power consideration
– leakage current
$I_{sub} = K \, e^{(V_{gs} - V_{th}) q / (n k T)} \left(1 - e^{-V_{ds} q / (k T)}\right)$
K: technology constant; q: electronic charge; k: Boltzmann constant
n: nonlinearity constant (between 1 and 2); T: temperature
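The sub-threshold model $I_{sub} = K e^{(V_{gs} - V_{th}) q / (n k T)} (1 - e^{-V_{ds} q / (k T)})$ can be explored numerically. The sketch below assumes n = 1.5 and T = 300 K, and uses an arbitrary technology constant; these values are assumptions made for illustration only.

```python
import math

Q = 1.602176634e-19   # electronic charge (C)
K_B = 1.380649e-23    # Boltzmann constant (J/K)

def subthreshold_current(k_tech, vgs, vth, vds, n=1.5, temp_k=300.0):
    """I_sub = K * exp((Vgs - Vth) q / (n k T)) * (1 - exp(-Vds q / (k T)))."""
    v_thermal = K_B * temp_k / Q
    return (k_tech * math.exp((vgs - vth) / (n * v_thermal))
            * (1.0 - math.exp(-vds / v_thermal)))

def swing_mv_per_decade(n=1.5, temp_k=300.0):
    """Threshold shift (mV) that changes I_sub by one order of magnitude."""
    return n * (K_B * temp_k / Q) * math.log(10) * 1000.0

swing = swing_mv_per_decade()
print(round(swing, 1))  # roughly 89 mV/decade for n = 1.5 at 300 K

# Raising Vth by one swing cuts the leakage by a factor of 10.
K_TECH = 1e-6  # hypothetical technology constant (A)
i_low_vth = subthreshold_current(K_TECH, vgs=0.0, vth=0.3, vds=1.0)
i_high_vth = subthreshold_current(K_TECH, vgs=0.0, vth=0.3 + swing / 1000.0, vds=1.0)
ratio = i_low_vth / i_high_vth
print(round(ratio, 3))
```

Because the thermal voltage kT/q grows with temperature, the same model also shows why leakage rises steeply as the chip heats up.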
Leakage as a Function of VT
[Figure: ID (A), log scale from 10^-12 to 10^-2, versus VGS (0 to 1 V)
for VT = 0.4V and VT = 0.1V]
• Continued scaling of supply voltage and the subsequent
scaling of threshold voltage will make subthreshold
conduction a dominant component of power dissipation.
• With a 90mV/decade VT roll-off, each 255mV increase in
VT gives nearly 3 orders of magnitude reduction in leakage
(255/90 ≈ 2.8 decades), but adversely affects performance
Sub-Threshold in MOS
[Figure: ID versus VGS for VT = 0.6 and VT = 0.2]
Lower Bound on Threshold to Prevent Leakage
Low Power Design Space
• The dynamic power consumption equation reveals
the three degrees of freedom inherent in the low
power design space:
– Voltage
– Physical capacitance
– Data activity
• Optimization for power entails an attempt to reduce
one or more of these factors. Interactions among
these factors complicate the optimization problem.
• Deep sub-micron design - need to minimize leakage
and sub-threshold current
Power Reduction Strategy (I)
• Voltage Reduction
– 5V -> 3.3V -> 2.5V -> 1.8V -> 1.0V
– Mixed supplies in the system and/or on chip, using the
minimum voltage for different chips or functions
within a chip, together with on-chip voltage
converters if required.
• Low-voltage circuit techniques are required to give
good performance even at low voltages
• Less noisy structures and better signal integrity
handling are required
• A lower Vth process is required to maintain good
transistor speed
Reducing Vdd
$P \times t_d = E_t = C_L V_{dd}^2$
$\frac{E(V_{dd}=2)}{E(V_{dd}=5)} = \frac{C_L \cdot 2^2}{C_L \cdot 5^2}$, so $E(V_{dd}=2) \approx 0.16\, E(V_{dd}=5)$
Strong function of voltage ($V^2$ dependence).
Relatively independent of logic function and style.
[Figure: normalized power-delay product (0.03 to 1.5, log scale) versus
Vdd (1 to 5 volts) for a 51 stage ring oscillator and an 8-bit adder,
showing quadratic dependence]
Power-delay product improves with lowering VDD.
Lower Vdd Increases Delay
$T_d = \frac{C_L V_{dd}}{I}$, with $I \sim (V_{dd} - V_t)^2$
$\frac{T_d(V_{dd}=2)}{T_d(V_{dd}=5)} = \frac{2 \cdot (5 - 0.7)^2}{5 \cdot (2 - 0.7)^2} \approx 4$
Relatively independent of logic function and style.
[Figure: normalized delay (1.00 to 7.50) versus Vdd (2 to 6 volts) for
an adder (SPICE), microcoded DSP chip, multiplier, adder, ring
oscillator, and clock generator; 2.0 µm technology]
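The energy and delay ratios for a 5 V to 2 V supply reduction can be reproduced with a short sketch. It assumes energy proportional to $V_{dd}^2$ and delay proportional to $V_{dd} / (V_{dd} - V_t)^2$ with $V_t$ = 0.7 V; the function names are illustrative.

```python
def energy_ratio(v_new, v_old):
    """E(v_new) / E(v_old) for energy proportional to Vdd^2."""
    return (v_new / v_old) ** 2

def delay_ratio(v_new, v_old, vt):
    """Td(v_new) / Td(v_old) for delay proportional to Vdd / (Vdd - Vt)^2."""
    def relative_delay(vdd):
        return vdd / (vdd - vt) ** 2
    return relative_delay(v_new) / relative_delay(v_old)

e_ratio = energy_ratio(2.0, 5.0)
d_ratio = delay_ratio(2.0, 5.0, vt=0.7)
print(round(e_ratio, 2))  # about 0.16
print(round(d_ratio, 1))  # about 4.4, i.e. roughly 4x slower
```

The energy saving is paid for with a delay penalty of roughly 4x, which is what architectural techniques such as parallelism and pipelining are meant to recover.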
Lowering the Threshold
Reduces the Speed Loss, But Increases Leakage
[Figure: ID versus VGS for Vt = 0.2 and Vt = 0; delay versus Vdd with
2Vt marked on the Vdd axis]
Interesting Design Approach: DESIGN FOR PLeakage == PDynamic
What Threshold Voltage to Use?
• Energy vs. Vt for a fixed throughput
• An “optimal” Vt/Vdd point trades switching and
leakage energy
Stack Effect
• Leakage is a function of the circuit topology and the value
of the inputs
$V_T = V_{T0} + \gamma\left(\sqrt{|-2\phi_F + V_{SB}|} - \sqrt{|-2\phi_F|}\right)$
where VT0 is the threshold voltage at VSB = 0; VSB is the source-bulk
(substrate) voltage; γ is the body-effect coefficient
[Figure: two-input gate with inputs A and B, output Out, and internal
node VX between the stacked transistors]
A   B   VX            ISUB
0   0   VT ln(1+n)    VGS = VBS = -VX
0   1   0             VGS = VBS = 0
1   0   VDD-VT        VGS = VBS = 0
1   1   0             VSG = VSB = 0
• Leakage is least when A = B = 0
• Leakage reduction due to stacked
transistors is called the stack effect
Leakage as a Function of Design Time VT
• Reducing the VT increases
the sub-threshold leakage
current (exponentially)
– 90mV reduction in VT
increases leakage by an
order of magnitude
• But, reducing VT decreases
gate delay (increases
performance)
[Figure: ID (A) versus VGS (0 to 1 V) for VT = 0.4V and VT = 0.1V]
• Determine the critical path(s) at design time and use low VT
devices on the transistors on those paths for speed. Use a high
VT on the other logic for leakage control.
– A careful assignment of VT’s can reduce the leakage by as
much as 80%
Dual-Thresholds Inside a Logic Block
• Minimum energy consumption is achieved if all logic
paths are critical (have the same delay)
• Use lower threshold on timing-critical paths
– Assignment can be done on a per gate or transistor basis;
no clustering of the logic is needed
– No level converters are needed
Variable VT (ABB) at Run Time
• $V_T = V_{T0} + \gamma\left(\sqrt{|-2\phi_F + V_{SB}|} - \sqrt{|-2\phi_F|}\right)$
• For an n-channel device, the substrate is normally tied
to ground (VSB = 0)
• A negative bias on VSB causes VT to increase
• Adjusting the substrate bias at run time is called
adaptive body-biasing (ABB)
– Requires a dual well fab process
[Figure: VT (0.4 to 0.9) versus VSB (-2.5 to 0 V)]
Techniques for Burst Mode Computation
Multiple VT Technology: disable high VT devices during idle periods,
e.g. [Sakata-93], [Mutoh-93]
Substrate Bias Controlled Variable VT Devices: increase VT during idle
periods, e.g. [Seta-95]
[Figure: left, low VT logic placed between high VT devices controlled
by SLEEP signals; right, inverter (Vin, Vout, Vdd) whose substrate
biases are switched between on and standby, with Vp > 0 and Vn < 0]
Multiple-Threshold Circuits
• Previous approaches cannot reduce leakage power
during active mode.
• Use dual Vt CMOS logic: low Vt for critical paths and
high Vt for non-critical paths. Problem: large
threshold swing
• Triple Vt CMOS circuit to reduce the sub-threshold
swing [Fujii et al., ISSCC-98]
– for high speed low power active mode, low and
medium Vt are used for critical and non-critical paths,
respectively.
– For standby mode, high Vt MOSFET is inserted
between the supply rail and the virtual supply rail.
Multiple-Threshold Circuits
Power considerations – reduce leakage
• Leakage proportional to device width
– Use the smallest devices on non-critical paths.
• Leakage drops with stacked devices (drain voltage
divider)
– Use stacked transistors on non-critical paths.
• Leakage drops with increasing channel length
– Slightly increase L on non-critical paths
• Use a dual VT process providing two threshold voltages
– Use high VT transistors on non-critical paths
Power consideration
– reduce leakage
• Switch off critical path
transistors when not needed.
• Stand-by mode between supply
and virtual supply lines
• Stand-by vectors
– Apply input vector which
minimizes leakage.
– Achieved using Mux
Power gating transistor Sizing
• The effect of power gating transistor size
– As the size decreases, logic performance also
decreases.
– As the size increases, leakage current and chip
area also increase.
– Proper sizing is very important.
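– The power gating transistor should be sized so that performance
degradation stays within 2%.
$V_{op} = V_{DD} - \Delta V$, where ΔV is the voltage drop across the power
gating transistor; the sizing must keep the degradation caused by ΔV
within 2%.
[Figure: low Vt logic between VDD and a high Vt switch to GND, driven
by a control signal]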
Power Reduction Strategy (II)
• Reducing capacitance
– Process scaling and better integration, with smaller
capacitances in more aggressive processes
– Improved devices and interconnect technology
– Efficient clock generation and distribution
– Good memory hierarchy
– In-place optimization using a library containing ranges
of gates with different strengths, through replacement
of gates to use the optimum drive in the critical paths
and minimum drive elsewhere
Reducing Effective Capacitance
Global bus architecture Local bus architecture
Shared Resources incur Switching Overhead
Power consideration – Reduce
Capacitance
• Reduce switched capacitance C
– Careful transistor sizing, transistor ordering, tighter and
more compact layout
– Hierarchical architecture and add TG to isolate buses
– Segmented structures
Shared bus driven by A or B when
Sending values to C
Insert switch to isolate bus segment
when B sending to C
Power Reduction Strategy (III)
• Reducing activities
– lowering operating frequency
– Using power management strategies, such as Gated
clocks, Power-Down of non-operational units
– Reduce switching activities: Power = Energy/transition *
transition rate
$= C_L V_{dd}^2 f_{0 \to 1} = C_L V_{dd}^2 P_{0 \to 1} f = C_{eff} V_{dd}^2 f$
• Power dissipation is data dependent and hence is a
function of switching activity.
• P0->1 is the switching probability
• Effective switching capacitance = Ceff = CL * P0->1
Factors Affecting Power
Consumption - Revisited
• Degree of freedom for low power design space:
– Voltage
– Physical capacitance
– Data activity
• Power minimization approaches:
– Run at minimum allowable voltage
– Minimize effective switching capacitances
System Level - Power Down Techniques
Operating States:
Active or Full-On
(fastest clock)
Standby
(slow clock)
Suspend or Sleep
(slow clock or shut down)
Micro
Processor
Activity Analyzer
Dynamic Power as a Function of VDD
• Decreasing the VDD
decreases dynamic
energy consumption
(quadratically)
• But, increases gate
delay (decreases
performance)
[Figure: curve plotted against VDD (0.8 to 2.4 V), vertical axis from
1 to 5.5]
• Determine the critical path(s) at design time and use high
VDD for the transistors on those paths for speed. Use a
lower VDD on the other gates, especially those that drive
large capacitances (as this yields the largest energy
benefits).
Multiple VDD Considerations
• How many VDD? – Two is becoming common
– Many chips already have two supplies (one for core and one for
I/O)
• When combining multiple supplies, level converters are required
whenever a module at the lower supply drives a gate at the higher
supply (step-up)
– If a gate supplied with VDDL drives a gate at VDDH, the PMOS
never turns off
• The cross-coupled PMOS transistors
do the level conversion
• The NMOS transistors operate on a
reduced supply
– Level converters are not needed
for a step-down change in voltage
– Overhead of level converters can be mitigated by doing
conversions at register boundaries and embedding the
level conversion inside the flipflop
[Figure: level converter with cross-coupled PMOS transistors on VDDH,
input Vin from the VDDL domain, and output Vout]
Dual-Supply Inside a Logic Block
• Minimum energy consumption is achieved if all logic paths
are critical (have the same delay)
• Clustered voltage-scaling
– Each path starts with VDDH and switches to VDDL (gray logic
gates) when delay slack is available
– Level conversion is done in the flipflops at the end of the
paths
Power Conscious Behavioral Design
• Run functional units at the minimum allowed voltage
while satisfying the timing constraints
• Parallelize or pipeline data-path, memory and controllers
to compensate for throughput loss due to reduced
supply voltage
• Power down functional units which are not in use; put in
“dynamic power management” capability
• Avoid centralized resources (controllers, functional
blocks, global busses, etc.) as much as possible
• Map functions to hardware so that inter-chip
communication is minimized
• Schedule and bind operations to functional units so as
to reduce the activity of the input operands
• Reorder operands to reduce switching activity; Keep the
inputs to an idle unit unchanged
Low Power Datapath Design - reducing
the supply voltage
• Reducing supply voltage has quadratic effect on
power saving, but a negative effect on performance.
• Performance can be gained back by logical and
architectural optimizations, e.g. lookahead adder
instead of ripple-carry adder, using parallelism to
increase performance
Power Considerations – reduce f and VDD
• Reducing frequency does not save energy, just reduces
rate at which it is consumed
– Power is lower but system must run longer
• Reducing supply voltage is very effective (scaling the
voltage by a factor of 0.5 scales energy/transition by 0.25).
• Dropping the voltage will result in reduced
performance in terms of speed (need to recover the
performance using parallelism).
• Trade surplus performance for lower energy by
reducing the supply voltage until performance is as
required
Power considerations
– Dynamic voltage Scaling
Can run at a lower voltage, which reduces
power consumption quadratically.
– 8-bit adder/comparator: (Chandrakasan et al.)
• Consumes Pref at 40MHz, 5V with Area=0.53mm2
– Two parallel interleaved architecture:
• Consumes 0.36Pref at 20MHz, 2.9V with Area=1.80mm2
– One Pipelined architecture:
• Consumes 0.39Pref at 40MHz, 2.9V with Area=0.69mm2
– Pipelined and parallel
• Consumes 0.2Pref at 20MHz, 2.0V with Area=1.96mm2
Minimizing the power consumption
using parallelism
• Reference Design
Critical path = 25ns, clock frequency = 40MHz
Supply voltage = 5V
$P_{ref} = C_{ref} V_{ref}^2 f_{ref}$
• Parallel implementation of the reference design
New critical path = 50nsec,
Cpar = 2.15 Cref, Vpar = 2.9V, fpar = 0.5 fref
$P_{par} = C_{par} V_{par}^2 f_{par} = (2.15\,C_{ref})(0.58\,V_{ref})^2 \left(\frac{f_{ref}}{2}\right) \approx 0.36\,P_{ref}$
Pipelined Data Path
• Critical path delay is reduced to max[Tadder, Tcomparator].
• Keeping the clock rate constant (fpipe = fref), the voltage can be
dropped to Vpipe = Vref/1.7 while maintaining the original
throughput
• Capacitance is slightly higher: Cpipe = 1.15 Cref
• $P_{pipe} = (1.15\,C_{ref})(V_{ref}/1.7)^2 f_{ref} \approx 0.39\,P_{ref}$
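The parallel and pipelined power figures both follow from $P = C V^2 f$ expressed relative to the reference design. A minimal sketch, where the helper `relative_power` is a name introduced only for this example:

```python
def relative_power(c_scale, v_scale, f_scale):
    """P / Pref when capacitance, voltage, and frequency are scaled
    relative to the reference design (P = C * V^2 * f)."""
    return c_scale * v_scale ** 2 * f_scale

V_REF = 5.0  # reference supply voltage (V)

# Parallel: 2.15x capacitance, 2.9 V supply, half the clock frequency.
parallel = relative_power(c_scale=2.15, v_scale=2.9 / V_REF, f_scale=0.5)

# Pipelined: 1.15x capacitance, supply reduced by a factor of 1.7, same clock.
pipelined = relative_power(c_scale=1.15, v_scale=1 / 1.7, f_scale=1.0)

print(round(parallel, 2), round(pipelined, 2))
```

Both results land close to the quoted 0.36 Pref and 0.39 Pref; the small differences come from rounding in the voltage ratios.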
How low a voltage can be used
• Capacitance overhead starts to dominate at “high”
levels of parallelism and results in an optimum
voltage.
Voltage as a design variable
Adapting Voltage to Workload yields cubic reduction
Multiple supply voltages – Filter Example
[Figure: scheduled data-flow graph of a filter over control steps 1 to
10, with multiplications (*) and additions (+) assigned to 5V, 3V, and
2.4V supplies]
Power(5V) / Power(5V, 3V, 2.4V) = 1.5 [Raje95]
Using Multiple Voltages
[Figure: blocks C1, C2, C3 supplied from Vdd1, Vdd2, Vdd3; critical
path and non-critical logic assigned to different supplies (Vdd1, Vdd2)]
[Horowitz-95]
Processor Usage Model
• System Optimizations:
– Maximize Peak Throughput
– Minimize Average Energy/operation
– Maximize computation per battery life
[Figure: desired throughput versus time for a single-user system that is
not always computing; compute-intensive and low-latency processes reach
the ceiling set by the top speed of the processor, alongside background
and high-latency processes]
Scale Supply Voltage with fclk
Dynamic Voltage Scaling
Implementation
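• VCO: ring oscillator which matches the µP critical path
• DC-DC: performs D/A, converts the battery voltage to a regulated
Vdd
• Provides both voltage regulation and clock
generation.
[Figure: feedback loop in which a frequency detector compares Mfref with
fout and drives a loop filter, a DC-DC converter fed by the battery, and
a VCO; the regulated Vdd supplies the µP]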
Dynamic Voltage Scaling in Practice
Fixed Throughput, Energy/operation
Occasionally Demand Peak Throughput
Throughput = 8MIPS
Energy/ops = 0.24nJ/inst.
1.92mW
Throughput = 100MIPS
Energy/ops = 2.2nJ/inst.
220mW
Peak Throughput = 100MIPS
Average Energy/op. = 0.24nJ/inst (~1.8mW)
DVS only advantageous when a majority of
computation is performed at low throughput
Low-voltage Switching Regulator
• Arbitrary Vdd (<Vin) generated using the Buck
converter
– $V_{dd} = V_{in} \times$ (duty cycle at node X)
• Chief sources of inefficiencies:
– Conduction loss ($I^2 R$)
– Switching loss ($C_x V_{in}^2 f_s$ and $L_s I^2 f_s$)
– Gate-drive loss ($C_g V_{in}^2 f_s$)
[Stratakos-94]
Adaptive Power Supply Voltage
• Exploit data dependent computation times to vary
the supply [Nielsen94]
[Figure: self-timed processor with input and output FIFOs and registers
(REG); a control block adjusts the power supply output Vdd(t)]
Variable Supply Voltage Control Scheme
Voltage Scheduling for variable
workload system
• Voltage scheduling under timing constraints
• Example [Ishihara-98]
– Energy consumption of a processor:
• 10nJ/cycle at 2.5V
• 25nJ/cycle at 4 V
• 40nJ/cycle at 5V
– maximum clock frequencies:
• 50MHz at 5V, 40MHz at 4V, 25MHz at 2.5V
– The application needs 1000M cycles to finish, and
the timing constraint is 25 sec.
Different Voltage Schedules
[Figure: three plots of Vdd^2 (energy consumption per cycle) versus time
(0 to 25 sec), one per schedule, with the timing constraint at 25 sec]
Schedule   Cycles and clock                                      Energy
(A)        1000Mcycles at 50MHz (5.0V)                           40J
(B)        750Mcycles at 50MHz (5.0V) + 250Mcycles at 25MHz (2.5V)   32.5J
(C)        1000Mcycles at 40MHz (4.0V)                           25J
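The three schedules can be checked with a short sketch. It uses the per-cycle energies (10, 25, 40 nJ) and maximum clock frequencies (25, 40, 50 MHz) at 2.5 V, 4 V, and 5 V; the `MODES` table and `evaluate` helper are names introduced only for this example.

```python
# Supply voltage -> (energy per cycle in nJ, maximum clock in MHz)
MODES = {5.0: (40, 50), 4.0: (25, 40), 2.5: (10, 25)}

def evaluate(schedule):
    """schedule: list of (vdd, mega_cycles). Returns (time in s, energy in J)."""
    # Mcycles / MHz = seconds; Mcycles * nJ = 1e-3 J.
    time_s = sum(mcycles / MODES[vdd][1] for vdd, mcycles in schedule)
    energy_j = sum(mcycles * MODES[vdd][0] / 1000 for vdd, mcycles in schedule)
    return time_s, energy_j

schedule_a = evaluate([(5.0, 1000)])             # full speed, then idle
schedule_b = evaluate([(5.0, 750), (2.5, 250)])  # two voltages
schedule_c = evaluate([(4.0, 1000)])             # single lowest feasible voltage

for name, (t, e) in zip("ABC", (schedule_a, schedule_b, schedule_c)):
    print(name, t, "sec", e, "J")
```

Schedule (A) finishes early at 20 sec but spends the most energy; (B) and (C) both finish exactly at the 25 sec constraint, and (C) uses the least energy.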
Reduce Power Further by Buffering
Two samples processed every two sample periods
-> increased latency
Example of Buffering
[Figure: without buffering, Blocks 1 to 4 are processed one per sample
period Tsample with Vdd = 5, 2.5, 5, 2.5; with buffering, Blocks 1, 2
and Blocks 3, 4 are processed together with Vdd = 3.75]
Without buffering: $\text{Energy/sample} = \frac{(5)^2 C + \frac{1}{2}(2.5)^2 C}{2} \approx 14C$
With buffering: $\text{Energy/sample} = \frac{2\,(0.75)(3.75)^2 C}{2} \approx 10.5C$
Voltage Island Concept
• Trade off power for delay by running
functional blocks at different voltages
• Can use a mix of low and high Vt to
balance performance and leakage
• Switch off inactive blocks to reduce
leakage power
• Requires IP standards for power
management, clock gating, etc.
[Figure: Delay vs. Voltage: Δdelay (ps), 0 to 30, versus voltage Vdd
(0.7 to 1.3) for standard Vt and low Vt]
[Figure: power management unit controlling switches on Vdd1 and Vdd2
for blocks IP1 and IP2 (logic and low VT logic), with supply Vddo]
E.g.: Telecom ASIC with 1.0/1.2 V islands saved:
16% active power
50% standby power
Source: Bergamaschi
*Slide from Prof Kyung of KAIST
Power Management
[Figure: chip floorplan with multiple voltage islands surrounded by
I/O’s, VReg, and Gnd: memory arrays (Vdd 4, high Vt device arrays, and
Vdd 3, low Vt device arrays, each labeled as optimized for low active
power), microcontroller (Vdd 2), DSP (Vdd 2), ROM (Vdd 1), monitor
logic (Vdd 4), RLM 1, RLM 2, RLM 3, and analog (Vdd 5)]
• Independently controlled domain power switches
• Multiple on-chip voltage islands
• On-chip voltage regulators
*Slide from Prof Kyung of KAIST
Controlling VDD and VTH for low power
                Active         Stand-by
Multiple VTH    Dual-VTH       MTCMOS
Variable VTH    VTH hopping    VTCMOS
Multiple VDD    Dual-VDD       Boosted gate MOS
Variable VDD    VDD hopping
Software-hardware cooperation
Technology-circuit cooperation
• MTCMOS : Multi-Threshold CMOS
• VTCMOS : Variable Threshold CMOS
• Multiple : spatial assignment
• Variable : temporal assignment
*Slide from Prof Kyung of KAIST
RTL-level optimization-Reducing
effective switching activity
• General Principle: Avoid Waste.
– Application-specific processing
– Resource sharing/Locality of reference
– Data representations
– Preservation of Data correlations
– architectural restructuring
– Distributed processing
– Demand-driven/data-driven computation
Application Specific Processing
Application Specific Processing Reduces
“Implementation Overhead”
Eliminating Redundant Computation
• Dynamically vary the number of operations per
sample.
• Trade power consumption and filter quality [Ludwig-96]
[Figure: chain of filter stages clocked at fs with output Out; stages
can be powered down]
Eliminating Redundant Computation
Switched capacitance reduction ≈
(peak number of operations) / (average number of operations) ≈ 2
Strong function of signal statistics
Reducing switching activity
• Multiplexing multiple operations on a single
hardware unit can have detrimental effect on the
power consumption, because the switching activity
may be increased, e.g. shared bus
Bus Multiplexing
• Share long data buses with time multiplexing (S1 uses
even cycles, S2 odd)
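[Figure: dedicated buses from source S1 to destination D1 and from S2 to
D2, versus one shared bus time-multiplexed between S1/D1 and S2/D2]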
• Buses are a significant source of power dissipation due
to high switching activities and large capacitive loading
– 15% of total power in Alpha 21064
– 30% of total power in Intel 80386
• But what if data samples are correlated (e.g., sign bits)?
Reducing the Effective Capacitance
• Circuit and Logic Style - select a circuit style with low
capacitance and/or switching activity, e.g. 8-bit adder
Reducing Glitching Activity
• Some circuit structures can be the cause of
spurious transients, e.g. a 16-bit ripple carry adder
• Glitches can be reduced
by selecting structures
that have balanced
signal paths, e.g. tree.
• The Brent-Kung
lookahead adder and
Wallace tree multiplier
both have this
property and are therefore
more attractive for low power.
Data Representation
Two’s complement Sign Magnitude
• Sign-extension activity significantly reduced using sign
magnitude representation.
• An accumulator example: sign magnitude datapath switches
30% less capacitance for uniformly distributed inputs
Data Representation –
Accumulator Example
Sign magnitude datapath switches 30% less
capacitance for uniformly distributed inputs
Two’s Complement vs.
Sign-Magnitude
Two’s complement datapath has a significantly
higher switching activity
Bus encoding to reduce switching
activity
• Minimizing temporal bit transition activity by data
representation
• Bit encoding:
– Active high encoding
• high-level voltage for 1, low-level voltage for 0
– Transition-based encoding
• voltage change identifies logic 1, no voltage change
identifies logic 0
• Word encoding
– assign patterns of 1’s and 0’s to each word of
information
– Non-redundant codes vs. redundant codes
• Example of low power coding
– Limited-weight code
– Gray code
– One-hot code
– Bus-invert code
Example of Bit Encoding
• Reducing the average no. of switching events on a data bus
• Transition-based encoding may limit the no. of transitions
for non-equiprobable input lines
• Let p(0) > p(1) and no temporal correlation exist. If
active-high encoding is used, the average no. of
transitions is
$p(0)\,p(1) + p(1)\,p(0) = 2\,(1 - p(1))\,p(1)$
• For transition-based encoding, it is simply p(1). If p(1) <<
p(0), transition-based is better than active-high encoding.
• To reduce transitions, the input patterns are transformed
in such a way that the p(0) and p(1) prob. of each input
line become as different as possible, and then a
transition-based encoding is applied before the data is
transmitted.
Word encoding
• One-hot coding
•Gray coding - good for sequential data, e.g.
addressing for microprocessor [Su-94]
•Disadvantage of Gray code - only good for
address bus, not for data bus, additional
conversion circuitry is needed
Word encoding (cont.)
• Bus-invert encoding - use redundancy to save power.
• Given an n-bit data bus with 2^n patterns to be represented, assume
all patterns are equiprobable with no temporal correlation, so
p(0) = p(1) = 0.5. The average no. of transitions is
n/2 per cycle, while the worst case is n.
• Bus-invert coding - add an extra line, I, to the bus and
compare consecutive patterns before transmission. Two
cases:
– If the Hamming distance between the two patterns is ≤ n/2, the
current pattern is transmitted as it is and I is set to 0.
– If the Hamming distance between the two patterns is > n/2, the
current pattern is first inverted and then transmitted, and I is set
to 1.
• Max. no. of transitions on the data lines is limited to n/2, and the
average is reduced by up to 25%
• Drawback - extra line I to indicate whether a pattern has been
inverted.
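The two cases above can be sketched as a short encoder/decoder pair. This is a minimal illustration, not a reference implementation: the function names are hypothetical, the data lines are assumed to start at all zeros, and the Hamming distance is taken between the new word and the value currently on the data lines.

```python
import random


def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def bus_invert_encode(words, n):
    """Encode n-bit words as (data_lines, invert_line) pairs."""
    mask = (1 << n) - 1
    bus = 0  # assumed initial state of the data lines
    encoded = []
    for w in words:
        if hamming(w, bus) > n / 2:
            bus, invert = (~w) & mask, 1   # send the complement, set I = 1
        else:
            bus, invert = w, 0             # send as is, set I = 0
        encoded.append((bus, invert))
    return encoded


def bus_invert_decode(encoded, n):
    """Recover the original words from (data_lines, invert_line) pairs."""
    mask = (1 << n) - 1
    return [(~data) & mask if invert else data for data, invert in encoded]


def data_line_transitions(encoded):
    """Per-cycle flips on the data lines, starting from the all-zero bus."""
    prev = 0
    flips = []
    for data, _ in encoded:
        flips.append(hamming(prev, data))
        prev = data
    return flips


if __name__ == "__main__":
    n = 8
    rng = random.Random(1)  # fixed seed, chosen arbitrarily
    words = [rng.randrange(1 << n) for _ in range(10_000)]
    enc = bus_invert_encode(words, n)
    flips = data_line_transitions(enc)
    print("worst-case data-line flips per cycle:", max(flips))
    print("average data-line flips per cycle   :", sum(flips) / len(flips))
```

The invert line I itself can also toggle, so a full accounting of bus activity would add its flips to the data-line count.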
Conditional Inversion Coding
Bus encoding for cross coupling cap.
• Cross-coupling cap. dominates in very deep sub-micron
technology, e.g. < 70nm
• Bus model
– Stand-alone cap. Cs
– Cross-coupling cap. Cc
• Stand-alone switching
– Applies to a single bit line
– 0-1 transition
• Cross-coupling switching
– Occurs on adjacent wires
– Four types of coupling transition

             Type 1    Type 2    Type 3    Type 4
Wire 1       H --> L   L --> H   H --> L   H --> L
Wire 2       H --> L   L --> L   L --> H   H --> L
Cc factor    0         1         2         0

[Figure: bus model with bit lines bit 0, bit 1 and bit 2, a stand-alone cap. Cs on each line and a cross-coupling cap. Cc between adjacent lines]
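The bus model above can be turned into a rough activity count. The sketch below assumes a coupling factor of 0 when two adjacent wires switch in the same direction or neither switches, 1 when only one of them switches, and 2 when they switch in opposite directions; the function names and the unit weights for Cs and Cc are hypothetical, and the result is only a relative measure, not an energy in joules.

```python
def bit(word, i):
    """Value (0 or 1) of bit line i in a bus word."""
    return (word >> i) & 1


def self_transitions(prev, cur, n):
    """Number of lines making a 0 -> 1 transition, which charges Cs."""
    return sum(1 for i in range(n) if bit(prev, i) == 0 and bit(cur, i) == 1)


def coupling_factor(prev, cur, n):
    """Sum of coupling factors over all adjacent wire pairs.

    Each wire's change is -1, 0 or +1; the voltage swing across Cc is
    proportional to the difference of the two changes.
    """
    total = 0
    for i in range(n - 1):
        d_lo = bit(cur, i) - bit(prev, i)
        d_hi = bit(cur, i + 1) - bit(prev, i + 1)
        total += abs(d_lo - d_hi)
    return total


def relative_energy(words, n, cs=1.0, cc=1.0):
    """Relative switching cost of a word sequence (assumed weights cs, cc)."""
    cost = 0.0
    for prev, cur in zip(words, words[1:]):
        cost += cs * self_transitions(prev, cur, n)
        cost += cc * coupling_factor(prev, cur, n)
    return cost


if __name__ == "__main__":
    trace = [0b010, 0b101, 0b010, 0b111]  # assumed 3-bit example trace
    print("relative cost with Cc = 3 * Cs:", relative_energy(trace, 3, cs=1.0, cc=3.0))
```

Raising `cc` relative to `cs` shows how opposite-direction switching on neighbouring wires comes to dominate the cost, which is the motivation for coupling-aware bus codes.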
Permutation Based Address code
• Rearrange the physical order of the bit lines
• Works efficiently on the address bus (40%), but not on
the instruction bus (4%)
– Correlation is lower
– Permutation is fixed
[Figure: sender and receiver]
Dynamic Reconfigurable Bus Encoding Scheme for Instruction Bus
Overview of the Scheme
• Decoding information is stored as a header
• Decoding information is loaded into the LUT first
• Instruction is called
• Decoding information stored in the LUT controls
the Cross_bar
• Instructions enter the Cross_bar to be decoded
[Figure: block diagram. Encoding: computer, encoding during compilation, bit-line reordering, Mem. Decoding: target processor with MEM, Mem_bus (B1 ... Bn), decoder (Cross_bar), look-up table, Address_bus, processing element containing CACHE, Cache_bus (B1 ... Bm) and CPU core.]
Demand/Data-driven operation
• Clocking strategy
– Gated clocking
• System Power Down
• Computing Paradigm
Logic Level Power Down Technique
• Activity driven - precomputation-based sequential logic
optimization
– Selectively precompute the output logic values of the circuit
one clock cycle before they are required, and use the
precomputed values to reduce internal switching activity in
the succeeding cycle
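– It is required that
g1 = 1 -> f = 1
g2 = 1 -> f = 0
[Figure: original circuit (register R1, logic block A, register R2, output f) and modified circuit (adds precomputation logic g1 and g2, flip-flops FF and a load-enable signal LE)]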
An example: n-bit comparator
• This circuit compares two n-bit numbers A and B and
computes the function A > B
• In general, precomputation works best when there are a
small number of complex functions corresponding to
the logic block A
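A behavioural sketch of precomputation applied to the comparator, assuming (as one common choice, not confirmed by the slide) that the precomputation functions look only at the most significant bits: g1 = 1 when the MSBs already prove A > B, and g2 = 1 when they prove the opposite. In either case the lower n − 1 bits need not be loaded, so the rest of the comparator sees no new activity. All function names are hypothetical.

```python
def msb(x, n):
    """Most significant bit of an n-bit number."""
    return (x >> (n - 1)) & 1


def precompute(a, b, n):
    """Return (g1, g2) computed from the MSBs only.

    g1 = 1 implies f = 1 (a > b); g2 = 1 implies f = 0.
    """
    g1 = msb(a, n) & (1 - msb(b, n))
    g2 = (1 - msb(a, n)) & msb(b, n)
    return g1, g2


def compare(a, b, n):
    """Return (f, low_bits_loaded) for f = (a > b).

    low_bits_loaded is False when the precomputed values decide the
    result and the lower n - 1 input bits can stay disabled.
    """
    g1, g2 = precompute(a, b, n)
    if g1:
        return 1, False
    if g2:
        return 0, False
    low_mask = (1 << (n - 1)) - 1
    return int((a & low_mask) > (b & low_mask)), True


if __name__ == "__main__":
    n = 4
    pairs = [(a, b) for a in range(1 << n) for b in range(1 << n)]
    loaded = sum(1 for a, b in pairs if compare(a, b, n)[1])
    print("fraction of cycles that load the low bits:", loaded / len(pairs))
```

With uniformly distributed inputs the MSBs differ half of the time, so under these assumptions the lower-bit registers are disabled in about half of the cycles.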

5006278.ppt

  • 1. ELEC516/10 Lecture 9 1 ELEC 516 VLSI System Design and Design Automation Spring 2010 Lecture 9 - Low Power Digital CMOS Design Reading Assignment: Rabaey: Chapter 4 and 7,11 Note: some of the figures in this slide set are adapted from the slide set of “ Digital Integrated Circuits” by Rabaey, Copyright UCB 2002
  • 2. ELEC516/10 Lecture 9 2 Why worry about power? -- Heat Dissipation Besides low power required for portable applications !!!! Power is a very big concern in today’s advanced technologies. Power produces heat on the chip which has to be carried off through the chip socket  expensive packaging solutions
  • 3. ELEC516/10 Lecture 9 3 Motivation for low power design • Packaging costs • Power supply rail design • Chip and system cooling costs • Noise immunity and system reliability • Battery life (in portable systems) • Environmental concerns – ICT equipment accounted for 10% of total US commercial energy usage in 2010 and may reach 20% by 2020 – Energy Star compliant systems
  • 4. ELEC516/10 Lecture 9 4 Why worry about power — Portability Multimedia Terminals Laptop Computers Digital Cellular Telephony BATTERY (40+ lbs) Year Nominal Capacity (Watt-hours / lb) Nickel-Cadium Ni-Metal Hydride 65 70 75 80 85 90 95 0 10 20 30 40 50 Rechargable Lithium Expected Battery Lifetime increase over next 5 years: 30-40%
  • 5. ELEC516/10 Lecture 9 5 Why worry about power? -- Standby Power  Drain leakage will increase as VT decreases to maintain noise margins and meet frequency demands, leading to excessive battery draining standby power consumption. 8KW 1.7KW 400W 88W 12W 0% 10% 20% 30% 40% 50% 2000 2002 2004 2006 2008 Standby Power Source: Borkar, De Intel Year 2002 2005 2008 2011 2014 Power supply Vdd (V) 1.5 1.2 0.9 0.7 0.6 Threshold VT (V) 0.4 0.4 0.35 0.3 0.25 …and phones leaky!
  • 6. ELEC516/10 Lecture 9 6 Power and Energy Figures of Merit • Power consumption in Watts – determines battery life in hours • Peak power – determines power ground wiring designs – sets packaging limits – impacts signal noise margin and reliability analysis • Energy efficiency in Joules – rate at which power is consumed over time • Energy = power * delay – Joules = Watts * seconds – lower energy number means less power to perform a computation at the same frequency
  • 7. ELEC516/10 Lecture 9 7 Power versus Energy Watts time Power is height of curve Watts time Approach 1 Approach 2 Approach 2 Approach 1 Energy is area under curve Lower power design could simply be slower Two approaches require the same energy
  • 8. ELEC516/10 Lecture 9 8 PDP and EDP • Power-delay product (PDP) = Pav * tp = (CLVDD 2)/2 – PDP is the average energy consumed per switching event (Watts * sec = Joule) – lower power design could simply be a slower design l allows one to understand tradeoffs better 0 5 10 15 0.5 1 1.5 2 2.5 Vdd (V) Energy-Delay (normalized) energy-delay energy delay  Energy-delay product (EDP) = PDP * tp = Pav * tp 2 l EDP is the average energy consumed multiplied by the computation time required l takes into account that one can trade increased delay for lower energy/operation (e.g., via supply voltage scaling that increases delay, but decreases energy consumption)
  • 9. ELEC516/10 Lecture 9 9 • Dynamic power – Charge-discharge current: • Dominant source of power (CV2 per transition) – Short circuit current (both NMOS and PMOS on during transit) • <10% of c/d current if transitions are fast • Subthreshold leakage (transistors not OFF completely) – Becoming important 10-30% active power in <0.18um techn – Diode leakage from reverse source and drain diodes (neglig) – Gate leakage (no longer negligible due to very thin gate oxide) Where Does Power Go in CMOS?
  • 10. ELEC516/10 Lecture 9 10 Dynamic Power Consumption Vin Vout CL Energy/transition = C L * V dd 2 Power = Energy/transition * f = C L * V dd 2 * f Need to reduce C L , V dd, and f to reduce power. Vdd Not a function of transistor sizes!
  • 11. ELEC516/10 Lecture 9 11 Dynamic Power Consumption - Revisited Power = Energy/transition * transition rate = CL * V dd 2 * f 0 1 = C L * V dd 2 * P 0 1* f = C EFF * V dd 2 * f Power Dissipation is Data Dependent Function of Switching Activity CEFF = Effective Capacitance = C L * P0 1
  • 12. ELEC516/10 Lecture 9 12 Power Consumption is Data Dependent Example: Static 2 Input NOR Gate Assume: P(A=1) = 1/2 P(B=1) = 1/2 P(Out=1) = 1/4 P(01) = 3/4X 1/4 = 3/16 Then: = P(Out=0).P(Out=1) CEFF = 3/16 * CL
  • 13. ELEC516/10 Lecture 9 13 Transition Probabilities for Basic Gates
  • 14. ELEC516/10 Lecture 9 14 Transition Probability of 2-input NOR Gate
  • 15. ELEC516/10 Lecture 9 16 Inter-signal Correlations B A Z X P(Z=1) = P(B=1) & P(A=1 | B=1) 0.5 0.5 (1-0.5)(1-0.5)x(1-(1-0.5)(1-0.5)) = 3/16 (1- 3/16 x 0.5) x (3/16 x 0.5) = 0.085 Reconvergent • Determining switching activity is complicated by the fact that signals exhibit correlation in space and time – reconvergent fan-out • Have to use conditional probabilities
  • 16. ELEC516/10 Lecture 9 17 Logic Restructuring Chain implementation has a lower overall switching activity than the tree implementation for random inputs Ignores glitching effects  Logic restructuring: changing the topology of a logic network to reduce transitions A B C D F A B C D Z F W X Y 0.5 0.5 (1-0.25)*0.25 = 3/16 0.5 0.5 0.5 0.5 0.5 0.5 7/64 15/256 3/16 3/16 15/256 AND: P01 = P0 x P1 = (1 - PAPB) x PAPB
  • 17. ELEC516/10 Lecture 9 19 Input Ordering Beneficial to postpone the introduction of signals with a high transition rate (signals with signal probability close to 0.5) A B C X F 0.5 0.2 0.1 B C A X F 0.2 0.1 0.5 (1-0.5x0.2)x(0.5x0.2)=0.09 (1-0.2x0.1)x(0.2x0.1)=0.0196
  • 18. ELEC516/10 Lecture 9 20 How about Dynamic Circuits? Power is Only Dissipated when Out=0! Mp Me VDD PDN f In1 In2 In3 Out f CEFF = P(Out=0).C L
  • 19. ELEC516/10 Lecture 9 21 2-input NOR Gate Example: Dynamic 2 Input NOR Gate Assume: P(A=1) = 1/2 P(B=1) = 1/2 P(Out=0) = 3/4 Then: CEFF = 3/4 * C L Switching Activity Is Always Higher in Dynamic Circuits
  • 20. ELEC516/10 Lecture 9 22 Transition Probabilities for Dynamic Gates Switching Activity for Precharged Dynamic Gates P01 = P0
  • 21. ELEC516/10 Lecture 9 24 Glitching in Static CMOS Networks ABC X Z 101 000 Unit Delay A B X Z C • Gates have a nonzero propagation delay resulting in spurious transitions or glitches (dynamic hazards) – glitch: node exhibits multiple transitions in a single cycle before settling to the correct logic value
  • 22. ELEC516/10 Lecture 9 25 Example : Adder Circuit 0 5 10 0.0 2.0 4.0 Time, ns Sum Output Voltage, Volts Cin S15 S10 6 5 4 3 2 S1 Add0 Add1 Add2 Add14 Add15 S0 S1 S2 S14 S15 Cin
  • 23. ELEC516/10 Lecture 9 26 How to Cope with Glitching? F1 F2 F3 F1 F3 F2 0 0 0 0 1 2 0 0 0 0 1 1 Equalize Lengths of Timing Paths Through Design
  • 24. ELEC516/10 Lecture 9 27 Balanced Delay Paths to Reduce Glitching So equalize the lengths of timing paths through logic F1 F2 F3 0 0 0 0 1 2 F1 F2 F3 0 0 0 0 1 1  Glitching is due to a mismatch in the path lengths in the logic network; if all input signals of a gate change simultaneously, no glitching occurs
  • 25. ELEC516/10 Lecture 9 28 Glitch Reduction by Pipelining • Glitches depend on the logic depth of the circuit - gates deeper in the logic network are more prone to glitching – arrival times of the gate inputs are more spread due to delay imbalances – usually affected more by primary input switching • Reduce logic depth by adding pipeline registers – additional energy used by the clock and pipeline registers PC Fetch Decode Execute Memory WriteBack Instruction MAR MDR I$ D$ clk pipeline stage isolation register
  • 26. ELEC516/10 Lecture 9 29 Short Circuit Power Consumption Finite slope of the input signal causes a direct current path between VDD and GND for a short period of time during switching when both the NMOS and PMOS transistors are conducting. Vin Vout CL Isc
  • 27. ELEC516/10 Lecture 9 30 Short Circuit Currents Determinates • Duration and slope of the input signal, tsc • Ipeak determined by – the saturation current of the P and N transistors which depend on their sizes, process technology, temperature, etc. – strong function of the ratio between input and output slopes • a function of CL Esc = tsc VDD Ipeak P01 Psc = tsc VDD Ipeak f01
  • 28. ELEC516/10 Lecture 9 31 Impact of CL on Psc Vin Vout CL Isc  0 Vin Vout CL Isc  Imax Large capacitive load Output fall time significantly larger than input rise time. Small capacitive load Output fall time substantially smaller than the input rise time.
  • 29. ELEC516/10 Lecture 9 32 Ipeak as a Function of CL -0.5 0 0.5 1 1.5 2 2.5 0 2 4 6 time (sec) x 10-10 x 10-4 CL = 20 fF CL = 100 fF CL = 500 fF 500 psec input slope Short circuit dissipation is minimized by matching the rise/fall times of the input and output signals - slope engineering. When load capacitance is small, Ipeak is large.
  • 30. ELEC516/10 Lecture 9 33 Static Power Consumption Vin=5V Vout CL Vdd Istat Pstat = P(In=1).Vdd . Istat • Dominates over dynamic consumption • Not a function of switching frequency
  • 31. ELEC516/10 Lecture 9 34 Leakage (Static) Power Consumption Sub-threshold current is the dominant factor. All increase exponentially with temperature! VDD Ileakage Vout Drain junction leakage Sub-threshold current Gate leakage
  • 32. ELEC516/10 Lecture 9 35 Power consideration – leakage current ) 1 ( / / ) ( kT q V nkT q V V sub ds th gs e e K I         K: technology constant; q: electronic charge; k: Boltzman constant N: nonlinearity constant (between 1 and 2); T: Temperature
  • 33. ELEC516/10 Lecture 9 36 Leakage as a Function of VT 0 0.2 0.4 0.6 0.8 1 VGS (V) ID (A) VT=0.4V VT=0.1V 10-2 10-12 10-7  Continued scaling of supply voltage and the subsequent scaling of threshold voltage will make subthreshold conduction a dominate component of power dissipation.  An 90mV/decade VT roll-off - so each 255mV increase in VT gives 3 orders of magnitude reduction in leakage (but adversely affects performance)
  • 34. ELEC516/10 Lecture 9 37 Sub-Threshold in MOS VT=0.6 VT=0.2 ID VGS Lower Bound on Threshold to Prevent Leakage
  • 35. ELEC516/10 Lecture 9 38 Low Power Design Space • The dynamic power consumption equation reveals the three degrees of freedom inherent in the low power design space: – Voltage – Physical capacitance – Data activity • Optimization for power entails an attempt to reduce one or more of these factors. Interactions among these factors complicate the optimization problem. • Deep sub-micron design - need to minimize leakage and sub-threshold current
  • 36. ELEC516/10 Lecture 9 39 Power Reduction Strategy (I) • Voltage Reduction – 5V -> 3.3V -> 2.5V -> 1.8V -> 1.0V – Mixed supplies in system and/or on chip, by using the minimum voltage for different chips or functions within a chip, together with on-chip voltage converters if required. • Low-voltage circuit techniques are required to give good performance even with low voltages • Less noisy structures and better signal integrity handling are required • Lower Vth process is required to maintain good transistor speed performance
  • 37. ELEC516/10 Lecture 9 40 Reducing Vdd $P \times t_d = E_t = C_L V_{dd}^2$, so $\frac{E(V_{dd}=2)}{E(V_{dd}=5)} = \frac{C_L (2)^2}{C_L (5)^2}$ and $E(V_{dd}=2) \approx 0.16\, E(V_{dd}=5)$. Strong function of voltage ($V^2$ dependence). Relatively independent of logic function and style. [Plot: normalized power-delay product (0.03 to 1.5, log scale) versus Vdd (1 to 5 volts) for a 51-stage ring oscillator and an 8-bit adder, showing quadratic dependence] Power-delay product improves with lowering VDD.
  • 38. ELEC516/10 Lecture 9 41 Lower Vdd Increases Delay $T_d = \frac{C_L V_{dd}}{I}$ with $I \sim (V_{dd} - V_t)^2$, so $\frac{T_d(V_{dd}=2)}{T_d(V_{dd}=5)} = \frac{(2)(5 - 0.7)^2}{(5)(2 - 0.7)^2} \approx 4$. Relatively independent of logic function and style. [Plot: normalized delay (1.00 to 7.50) versus Vdd (2 to 6 volts) for an adder (SPICE), microcoded DSP chip, multiplier, adder, ring oscillator and clock generator in a 2.0 µm technology]
  • 39. ELEC516/10 Lecture 9 42 Lowering the Threshold Reduces the speed loss, but increases leakage. Interesting design approach: design for PLeakage == PDynamic [Plots: ID versus VGS for Vt = 0 and Vt = 0.2; delay versus Vdd, with 2Vt marked on the Vdd axis]
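The two ratios quoted on these slides (energy at Vdd = 2 V versus 5 V, and the corresponding delay penalty) can be reproduced numerically. This is a minimal sketch that assumes the simple models stated on the slides, energy proportional to Vdd^2 and delay proportional to Vdd/(Vdd - Vt)^2 with Vt = 0.7 V; the function names are illustrative.

```python
def energy_ratio(vdd_new, vdd_ref):
    """E(vdd_new) / E(vdd_ref) for E = CL * Vdd^2 (CL cancels)."""
    return (vdd_new / vdd_ref) ** 2


def delay_ratio(vdd_new, vdd_ref, vt=0.7):
    """Td(vdd_new) / Td(vdd_ref) for Td = CL * Vdd / I, I ~ (Vdd - Vt)^2."""
    def relative_delay(vdd):
        return vdd / (vdd - vt) ** 2
    return relative_delay(vdd_new) / relative_delay(vdd_ref)


if __name__ == "__main__":
    print(f"Energy ratio 2V vs 5V: {energy_ratio(2.0, 5.0):.2f}")
    print(f"Delay ratio  2V vs 5V: {delay_ratio(2.0, 5.0):.2f}")
```

The delay model is only a first-order square-law approximation, so the result should be read as "about 4x slower", as the slide does.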
  • 40. ELEC516/10 Lecture 9 43 What Threshold Voltage to Use? • Energy vs. Vt for a fixed throughput • An “optimal” Vt/Vdd point trades switching and leakage energy
  • 41. ELEC516/10 Lecture 9 44 Stack Effect • Leakage is a function of the circuit topology and the value of the inputs: $V_T = V_{T0} + \gamma\left(\sqrt{|-2\phi_F + V_{SB}|} - \sqrt{|-2\phi_F|}\right)$ where VT0 is the threshold voltage at VSB = 0; VSB is the source-bulk (substrate) voltage; γ is the body-effect coefficient [Circuit: two-input gate with inputs A and B, output Out and internal stack node VX]
      A  B  VX            ISUB
      0  0  VT ln(1+n)    VGS = VBS = -VX
      0  1  0             VGS = VBS = 0
      1  0  VDD - VT      VGS = VBS = 0
      1  1  0             VSG = VSB = 0
    • Leakage is least when A = B = 0 • Leakage reduction due to stacked transistors is called the stack effect
  • 42. ELEC516/10 Lecture 9 45 Leakage as a Function of Design Time VT • Reducing the VT increases the sub-threshold leakage current (exponentially) – 90mV reduction in VT increases leakage by an order of magnitude • But, reducing VT decreases gate delay (increases performance) [Plot: ID (A) versus VGS (0 to 1 V) for VT = 0.4V and VT = 0.1V] • Determine the critical path(s) at design time and use low VT devices on the transistors on those paths for speed. Use a high VT on the other logic for leakage control. – A careful assignment of VT’s can reduce the leakage by as much as 80%
  • 43. ELEC516/10 Lecture 9 46 Dual-Thresholds Inside a Logic Block • Minimum energy consumption is achieved if all logic paths are critical (have the same delay) • Use lower threshold on timing-critical paths – Assignment can be done on a per gate or transistor basis; no clustering of the logic is needed – No level converters are needed
  • 44. ELEC516/10 Lecture 9 47 Variable VT (ABB) at Run Time • $V_T = V_{T0} + \gamma\left(\sqrt{|-2\phi_F + V_{SB}|} - \sqrt{|-2\phi_F|}\right)$ [Plot: VT (0.4 to 0.9 V) versus substrate bias (-2.5 to 0 V)] • For an n-channel device, the substrate is normally tied to ground (VSB = 0) • A negative bias on the substrate (so that VSB > 0) causes VT to increase • Adjusting the substrate bias at run time is called adaptive body-biasing (ABB) – Requires a dual well fab process
  • 45. ELEC516/10 Lecture 9 48 Techniques for Burst Mode Computation • Multiple VT Technology (disable high VT devices during idle periods), e.g. [Sakata-93], [Mutoh-93] [Circuit: low VT logic between high VT devices controlled by SLEEP] • Substrate Bias Controlled Variable VT Devices (increase VT during idle periods), e.g. [Seta-95] [Circuit: inverter (Vin, Vout, Vdd) with substrate biases Vp > 0 and Vn < 0 switched between on and standby]
  • 46. ELEC516/10 Lecture 9 49 Multiple-Threshold Circuits • Previous approaches cannot reduce leakage power during active mode. • Use dual Vt CMOS logic, low Vt for critical paths and high Vt for non-critical paths; problem: large threshold swing • Triple Vt CMOS circuit to reduce the sub-threshold swing [Fujii et al. ISSCC-98] – For high speed low power active mode, low and medium Vt are used for critical and non-critical paths, respectively. – For standby mode, a high Vt MOSFET is inserted between the supply rail and the virtual supply rail.
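The body-effect equation on this slide can be evaluated directly to see how much VT moves with substrate bias. The sketch below is illustrative only: the default values VT0 = 0.45 V, γ = 0.4 V^0.5 and φF = -0.3 V are assumed typical textbook numbers for an NMOS device, not values given in the lecture, and VSB is taken as positive when the substrate is biased below the source.

```python
import math


def threshold_voltage(vsb, vt0=0.45, gamma=0.4, phi_f=-0.3):
    """VT = VT0 + gamma * (sqrt(|-2*phi_F + VSB|) - sqrt(|-2*phi_F|)).

    vt0, gamma and phi_f are assumed example values, not lecture data.
    """
    return vt0 + gamma * (math.sqrt(abs(-2.0 * phi_f + vsb))
                          - math.sqrt(abs(-2.0 * phi_f)))


if __name__ == "__main__":
    for vsb in (0.0, 0.5, 1.0, 1.5, 2.0, 2.5):
        print(f"VSB = {vsb:.1f} V -> VT = {threshold_voltage(vsb):.3f} V")
```

With these assumed parameters, VT rises from 0.45 V at zero bias to roughly 0.84 V at VSB = 2.5 V, which is the kind of range an adaptive body-biasing scheme would exploit.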
  • 48. ELEC516/10 Lecture 9 51 Power considerations – reduce leakage • Leakage proportional to device width – Use smallest devices for non-critical paths. • Leakage drops with stacked devices (drain voltage divider) – Use stacked transistors for non-critical paths. • Leakage drops with increasing channel length – Slightly increase L for non-critical paths • Use dual VT process providing two threshold voltages – Use high VT transistors for non-critical paths
  • 49. ELEC516/10 Lecture 9 52 Power consideration – reduce leakage • Switch off critical path transistors when not needed. • Stand-by mode between supply and virtual supply lines • Stand-by vectors – Apply input vector which minimizes leakage. – Achieved using Mux
  • 50. ELEC516/10 Lecture 9 53 Power gating transistor Sizing • The effect of power gating transistor size – As the size decreases, logic performance also decreases. – As the size increases, leakage current and chip area also increase. – Proper sizing is very important. – Power gating transistor size should be chosen so that performance degradation stays within 2%. Vop = VDD - ΔV; the transistor must be sized so that ΔV causes no more than 2% performance degradation. [Circuit: low Vt logic between VDD and a high Vt switch to GND, driven by Switch Control]
  • 51. ELEC516/10 Lecture 9 54 Power Reduction Strategy (II) • Reducing capacitance – Process scaling and better integration, with smaller capacitances in more aggressive processes – Improved devices and interconnect technology – Efficient clock generation and distribution – Good memory hierarchy – In-place optimization using a library containing ranges of gates with different strengths, through replacement of gates to use the optimum drive in the critical paths and minimum drive elsewhere
  • 52. ELEC516/10 Lecture 9 55 Reducing Effective Capacitance Global bus architecture Local bus architecture Shared Resources incur Switching Overhead
  • 53. ELEC516/10 Lecture 9 56 Power consideration – Reduce Capacitance • Reduce switched capacitance C – Careful transistor sizing, transistor ordering, tighter and more compact layout – Hierarchical architecture and add TG to isolate buses – Segmented structures Shared bus driven by A or B when Sending values to C Insert switch to isolate bus segment when B sending to C
  • 54. ELEC516/10 Lecture 9 57 Power Reduction Strategy (III) • Reducing activities – Lowering operating frequency – Using power management strategies, such as gated clocks, power-down of non-operational units – Reduce switching activities, Power = Energy/transition * transition rate = $C_L V_{dd}^2 f_{0 \to 1} = C_L V_{dd}^2 P_{0 \to 1} f = C_{eff} V_{dd}^2 f$ • Power dissipation is data dependent and hence is a function of switching activity. • P0->1 is the switching probability • Effective switching capacitance = Ceff = CL * P0->1
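As a quick numerical illustration of the switching-power expression above, the following sketch computes the effective capacitance and dynamic power for one node. The load, supply, clock and activity values in the usage example are made up for illustration.

```python
def effective_capacitance(c_load, p_switch):
    """Ceff = CL * P(0->1)."""
    return c_load * p_switch


def dynamic_power(c_load, vdd, f_clk, p_switch):
    """P = Ceff * Vdd^2 * f."""
    return effective_capacitance(c_load, p_switch) * vdd ** 2 * f_clk


if __name__ == "__main__":
    # Hypothetical node: 50 fF load, 2.5 V supply, 100 MHz clock, P(0->1) = 0.25
    power = dynamic_power(50e-15, 2.5, 100e6, 0.25)
    print(f"Dynamic power: {power * 1e6:.3f} uW")
```

Halving the switching probability halves the power, which is why the later slides focus on data representation and bus encoding.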
  • 55. ELEC516/10 Lecture 9 58 Factors Affecting Power Consumption - Revisited • Degree of freedom for low power design space: – Voltage – Physical capacitance – Data activity • Power minimization approaches: – Run at minimum allowable voltage – Minimize effective switching capacitances
  • 56. ELEC516/10 Lecture 9 59 System Level - Power Down Techniques Operating States: Active or Full-On (fastest clock) Standby (slow clock) Suspend or Sleep (slow clock or shut down) Micro Processor Activity Analyzer
  • 57. ELEC516/10 Lecture 9 60 Dynamic Power as a Function of VDD • Decreasing the VDD decreases dynamic energy consumption (quadratically) • But, increases gate delay (decreases performance) [Plot: curve versus VDD (0.8 to 2.4 V), vertical axis from 1 to 5.5] • Determine the critical path(s) at design time and use high VDD for the transistors on those paths for speed. Use a lower VDD on the other gates, especially those that drive large capacitances (as this yields the largest energy benefits).
  • 58. ELEC516/10 Lecture 9 61 Multiple VDD Considerations • How many VDD? – Two is becoming common – Many chips already have two supplies (one for core and one for I/O) • When combining multiple supplies, level converters are required whenever a module at the lower supply drives a gate at the higher supply (step-up) – If a gate supplied with VDDL drives a gate at VDDH, the PMOS never turns off – In the level converter, the cross-coupled PMOS transistors do the level conversion and the NMOS transistors operate on a reduced supply – Level converters are not needed for a step-down change in voltage – Overhead of level converters can be mitigated by doing conversions at register boundaries and embedding the level conversion inside the flipflop [Circuit: level converter with input Vin from the VDDL domain and output Vout in the VDDH domain]
  • 59. ELEC516/10 Lecture 9 62 Dual-Supply Inside a Logic Block • Minimum energy consumption is achieved if all logic paths are critical (have the same delay) • Clustered voltage-scaling – Each path starts with VDDH and switches to VDDL (gray logic gates) when delay slack is available – Level conversion is done in the flipflops at the end of the paths
  • 60. ELEC516/10 Lecture 9 63 Power Conscious Behavioral Design • Run functional units at the minimum allowed voltage while satisfying the timing constraints • Parallelize or pipeline data-path, memory and controllers to compensate for throughput loss due to reduced supply voltage • Power down functional units which are not in use; put in “dynamic power management” capability • Avoid centralized resources (controllers, functional blocks, global busses, etc.) as much as possible • Map functions to hardware so that inter-chip communication is minimized • Schedule and bind operations to functional units so as to reduce the activity of the input operands • Reorder operands to reduce switching activity; keep the inputs to an idle unit unchanged
  • 61. ELEC516/10 Lecture 9 64 Low Power Datapath Design - reducing the supply voltage • Reducing supply voltage has quadratic effect on power saving, but a negative effect on performance. • Performance can be gained back by logical and architectural optimizations, e.g. lookahead adder instead of ripple-carry adder, using parallelism to increase performance
  • 62. ELEC516/10 Lecture 9 65 Power Considerations – reduce f and VDD • Reducing frequency does not save energy, just reduces the rate at which it is consumed – Power is lower but system must run longer • Reducing supply voltage is very effective (scaling the voltage by a factor of 0.5 scales energy/transition by a factor of 0.25). • Dropping the voltage will result in reduced performance in terms of speed (need to recover the performance using parallelism). • Trade surplus performance for lower energy by reducing the supply voltage until performance is as required
  • 63. ELEC516/10 Lecture 9 66 Power considerations – Dynamic voltage Scaling Can run at a lower voltage and hence reduce power consumption quadratically. – 8-bit adder/comparator (Chandrakasan et al.): • Consumes Pref at 40MHz, 5V with Area = 0.53mm2 – Two parallel interleaved architecture: • Consumes 0.36Pref at 20MHz, 2.9V with Area = 1.80mm2 – One pipelined architecture: • Consumes 0.39Pref at 40MHz, 2.9V with Area = 0.69mm2 – Pipelined and parallel: • Consumes 0.2Pref at 20MHz, 2.0V with Area = 1.96mm2
  • 64. ELEC516/10 Lecture 9 67 Minimizing the power consumption using parallelism • Reference Design: critical path = 25ns, clock frequency = 40MHz, supply voltage = 5V, $P_{ref} = C_{ref} V_{ref}^2 f_{ref}$
  • 65. ELEC516/10 Lecture 9 68 • Parallel implementation of the reference design: new critical path = 50nsec, Cpar -> 2.15 Cref, Vpar -> 2.9V, fpar -> 0.5 fref, $P_{par} = C_{par} V_{par}^2 f_{par} = (2.15\, C_{ref})(0.58\, V_{ref})^2 \left(\frac{f_{ref}}{2}\right) \approx 0.36\, P_{ref}$
  • 66. ELEC516/10 Lecture 9 69 Pipelined Data Path • Critical path delay is reduced to max[Tadder, Tcomparator]. • Keeping clock rate constant: fpipe = fref, voltage can be dropped to Vpipe = Vref/1.7, while maintaining the original throughput • Capacitance slightly higher: Cpipe = 1.15Cref • $P_{pipe} = (1.15\, C_{ref})(V_{ref}/1.7)^2 f_{ref} \approx 0.39\, P_{ref}$
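The distinction between power and energy on this slide can be made concrete for a fixed amount of work. The sketch below assumes a task of a fixed number of cycles, each switching the same effective capacitance; the function name and the numbers in the usage example are hypothetical.

```python
def task_energy_power_runtime(n_cycles, c_eff, vdd, f_clk):
    """Return (energy in J, average power in W, runtime in s) for a fixed task."""
    energy = n_cycles * c_eff * vdd ** 2   # independent of clock frequency
    runtime = n_cycles / f_clk
    return energy, energy / runtime, runtime


if __name__ == "__main__":
    # Hypothetical task: 1e9 cycles, 1 nF switched per cycle
    for vdd, f_clk in ((5.0, 40e6), (5.0, 20e6), (2.5, 20e6)):
        e, p, t = task_energy_power_runtime(1e9, 1e-9, vdd, f_clk)
        print(f"Vdd = {vdd} V, f = {f_clk / 1e6:.0f} MHz: "
              f"E = {e:.2f} J, P = {p:.3f} W, t = {t:.0f} s")
```

Halving the clock alone halves power but leaves energy unchanged; halving the supply as well cuts the energy to a quarter.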
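The parallel and pipelined power figures on the preceding slides follow from scaling each factor of P = C V^2 f relative to the reference design. A minimal sketch, using only the ratios quoted on the slides (the function name is illustrative):

```python
def relative_power(c_ratio, v_ratio, f_ratio):
    """P / Pref for P = C * V^2 * f, with each factor given relative to the reference."""
    return c_ratio * v_ratio ** 2 * f_ratio


# Parallel: Cpar = 2.15 Cref, Vpar = 2.9 V (vs 5 V), fpar = 0.5 fref
parallel = relative_power(2.15, 2.9 / 5.0, 0.5)

# Pipelined: Cpipe = 1.15 Cref, Vpipe = Vref / 1.7, fpipe = fref
pipelined = relative_power(1.15, 1.0 / 1.7, 1.0)

if __name__ == "__main__":
    print(f"Parallel : {parallel:.3f} Pref")
    print(f"Pipelined: {pipelined:.3f} Pref")
```

The pipelined result depends slightly on whether the voltage ratio is taken as 1/1.7 or as 2.9/5, which is why it lands a little either side of the 0.39 quoted on the slide.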
  • 67. ELEC516/10 Lecture 9 70 How low a voltage can be used • Capacitance overhead starts to dominate at “high” levels of parallelism and results in an optimum voltage.
  • 68. ELEC516/10 Lecture 9 71 Voltage as a design variable Adapting voltage to workload yields a cubic reduction in power (P ∝ Vdd^2 f, with f scaling roughly in proportion to Vdd)
  • 69. ELEC516/10 Lecture 9 72 Multiple supply voltages – Filter Example [Figure: filter data-flow graph scheduled over control steps 1 to 10, with multiplications (*) and additions (+) assigned to 5V, 3V and 2.4V supplies] Power(5V)/Power(5V,3V,2.4V) = 1.5 [Raje95]
  • 70. ELEC516/10 Lecture 9 73 Using Multiple Voltages C1 C2 C3 Vdd1 Vdd2 Vdd3 Vdd1Vdd2 Vdd3 critical path non-critical Vdd1 Vdd2 Vdd1Vdd2 [Horowitz-95]
  • 71. ELEC516/10 Lecture 9 74 Processor Usage Model • System Optimizations: – Maximize Peak Throughput – Minimize Average Energy/operation – Maximize computation per battery life [Plot: desired throughput versus time for a single-user system that is not always computing, showing compute-intensive and low-latency processes, background and high-latency processes, and a ceiling set by the top speed of the processor]
  • 72. ELEC516/10 Lecture 9 75 Scale Supply Voltage with fclk
  • 73. ELEC516/10 Lecture 9 76 Dynamic Voltage Scaling Implementation • VCO: ring oscillator which matches the µP critical path • DC-DC: performs D/A, converts battery voltage to regulated Vdd • Provides both voltage regulation and clock generation. [Block diagram: frequency detector (inputs Mfref and fout), loop filter, DC-DC converter fed from the battery, and VCO and µP supplied by Vdd]
  • 74. ELEC516/10 Lecture 9 77 Dynamic Voltage Scaling in Practice • Fixed throughput, energy/operation: – Throughput = 8MIPS, Energy/op = 0.24nJ/inst. (1.92mW) – Throughput = 100MIPS, Energy/op = 2.2nJ/inst. (220mW) • Occasionally demand peak throughput: – Peak Throughput = 100MIPS, Average Energy/op = 0.24nJ/inst. (~1.8mW) • DVS only advantageous when a majority of computation is performed at low throughput
  • 75. ELEC516/10 Lecture 9 78 Low-voltage Switching Regulator • Arbitrary Vdd (< Vin) generated using the Buck converter – Vdd = Vin × (duty cycle at node X) • Chief sources of inefficiencies: – Conduction loss ($I^2 R$) – Switching loss ($C_x V_{in}^2 f_s$ and $L_s I^2 f_s$) – Gate-drive loss ($C_g V_{in}^2 f_s$) [Stratakos-94]
  • 76. ELEC516/10 Lecture 9 79 Adaptive Power Supply Voltage • Exploit Data Dependent Computation Times To vary the Supply [Nielsen94] REG FIFO FIFO REG Control Self-timed Processor Power Supply Vdd(t)
  • 77. ELEC516/10 Lecture 9 80 Variable Supply Voltage Control Scheme
  • 78. ELEC516/10 Lecture 9 81 Voltage Scheduling for variable workload system • Voltage scheduling under timing constraints • Example [Ishihara-98] – Energy consumption of a processor: • 10nJ/cycle at 2.5V • 25nJ/cycle at 4 V • 40nJ/cycle at 5V – maximum clock frequencies: • 50MHz at 5V, 40MHz at 4V, 25MHz at 2.5V – Given that an application needs 1000M cycles to finish and the timing constraint is 25sec.
  • 79. ELEC516/10 Lecture 9 82 Different Voltage Schedules [Plots: energy consumption (proportional to Vdd^2) versus time (0 to 25 sec), with the timing constraint at 25 sec] (A) 1000M cycles at 5.0V, 50MHz: 40J (B) 750M cycles at 5.0V, 50MHz, then 250M cycles at 2.5V, 25MHz: 32.5J (C) 1000M cycles at 4.0V, 40MHz: 25J
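The three schedules can be checked against the operating points given in the [Ishihara-98] example (energy per cycle and maximum clock frequency at 5 V, 4 V and 2.5 V). The sketch below simply totals time and energy per schedule; the data structure and function names are illustrative.

```python
# Operating points from the example: Vdd -> (max clock in Hz, energy per cycle in J)
OPERATING_POINTS = {
    5.0: (50e6, 40e-9),
    4.0: (40e6, 25e-9),
    2.5: (25e6, 10e-9),
}


def evaluate(schedule):
    """schedule: list of (vdd, cycles). Returns (total time in s, total energy in J)."""
    total_time = sum(cycles / OPERATING_POINTS[vdd][0] for vdd, cycles in schedule)
    total_energy = sum(cycles * OPERATING_POINTS[vdd][1] for vdd, cycles in schedule)
    return total_time, total_energy


SCHEDULES = {
    "A": [(5.0, 1000e6)],
    "B": [(5.0, 750e6), (2.5, 250e6)],
    "C": [(4.0, 1000e6)],
}

if __name__ == "__main__":
    for name, schedule in SCHEDULES.items():
        t, e = evaluate(schedule)
        print(f"({name}) time = {t:.1f} s, energy = {e:.1f} J")
```

All three schedules meet the 25 sec constraint; running the whole task at the single lowest voltage that still meets the deadline, schedule (C), uses the least energy of the three.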
  • 80. ELEC516/10 Lecture 9 83 Reduce Power Further by Buffering Two samples processed every two sample periods -> increased latency
  • 81. ELEC516/10 Lecture 9 84 Example of Buffering [Timing diagram: in the first schedule, Blocks 1 to 4 are processed separately within Tsample at Vdd = 5, 2.5, 5 and 2.5; in the second schedule, Blocks 1,2 and Blocks 3,4 are processed together at Vdd = 3.75 throughout] Energy per sample ≈ 14C for the first schedule (terms in $(5)^2$ and $(2.5)^2$) and ≈ 10.5C for the second ($(0.75)(3.75)^2 C$)
  • 82. ELEC516/10 Lecture 9 85 Voltage Island Concept • Trade off power for delay by running functional blocks at different voltages • Can use mix of Low and High Vt to balance performance and leakage • Switch off inactive blocks to reduce leakage power • Requires IP standards for power management, clock gating, etc. [Plot: delay (ps, 0 to 30) versus voltage (Vdd, 0.7 to 1.3) for Std. Vt and Low Vt] E.g.: Telecom ASIC with 1.0/1.2 V islands saved 16% active power and 50% standby power [Diagram: power management unit controlling switches on supplies Vddo, Vdd1 and Vdd2 that feed IP1, IP2, logic and low VT logic] Source from Bergamaschi *Slide from Prof Kyung of KAIST
  • 83. ELEC516/10 Lecture 9 86 Power Management I/O’s, VReg, Gnd Memory Arrays Vdd 4 High Vt device arrays Optimized for low active power Memory Arrays Vdd 3 Low Vt device arrays Optimized for low active power Microcontroller Vdd 2 DSP Vdd 2 ROM Vdd 1 Monitor Logic Vdd 4 ROM Vdd1 RLM 1 RLM 2 Memory Arrays Vdd 3 Low Vt device arrays Optimized for low active power I/O’s, VReg, Gnd Analog Vdd 5 RLM 3 Vdd 1 I/O’s, VReg, Gnd I/O’s, VReg, Gnd  Independently controlled domain power switches  Multiple On-Chip Voltage Islands  On-Chip Voltage Regulators *Slide from Prof Kyung of KAIST
  • 84. ELEC516/10 Lecture 9 87 Controlling VDD and VTH for low power
                      Active         Stand-by
      Multiple VTH    Dual-VTH       MTCMOS
      Variable VTH    VTH hopping    VTCMOS
      Multiple VDD    Dual-VDD       Boosted gate MOS
      Variable VDD    VDD hopping
    Annotations: Software-hardware cooperation; Technology-circuit cooperation • MTCMOS: Multi-Threshold CMOS • VTCMOS: Variable Threshold CMOS • Multiple: spatial assignment • Variable: temporal assignment *Slide from Prof Kyung of KAIST
  • 85. ELEC516/10 Lecture 9 88 RTL-level optimization-Reducing effective switching activity • General Principle: Avoid Waste. – Application-specific processing – Resource sharing/Locality of reference – Data representations – Preservation of Data correlations – architectural restructuring – Distributed processing – Demand-driven/data-driven computation
  • 86. ELEC516/10 Lecture 9 89 Application Specific Processing Application Specific Processing Reduces “Implementation Overhead”
  • 87. ELEC516/10 Lecture 9 90 Eliminating Redundant Computation • Dynamically vary the number of operations per sample. • Trade power consumption and filter quality [Ludwig-96] [Figure: chain of stages, each clocked at fs, feeding Out, with a Power Down control]
  • 88. ELEC516/10 Lecture 9 91 Eliminating Redundant Computation Switched capacitance reduction ≈ (peak number of operations) / (average number of operations) ≈ 2. Strong function of signal statistics.
  • 89. ELEC516/10 Lecture 9 92 Reducing switching activity • Multiplexing multiple operations on a single hardware unit can have detrimental effect on the power consumption, because the switching activity may be increased, e.g. shared bus
  • 90. ELEC516/10 Lecture 9 93 Bus Multiplexing • Share long data buses with time multiplexing (S1 uses even cycles, S2 odd) S2 S1 D1 D2 S1 S2 D2 D1 • Buses are a significant source of power dissipation due to high switching activities and large capacitive loading – 15% of total power in Alpha 21064 – 30% of total power in Intel 80386 • But what if data samples are correlated (e.g., sign bits)?
  • 91. ELEC516/10 Lecture 9 94 Reducing the Effective Capacitance • Circuit and Logic Style - select a circuit style with low capacitance and/or switching activity, e.g. 8-bit adder
  • 92. ELEC516/10 Lecture 9 95 Reducing Glitching Activity • Some circuit structures can be the cause of spurious transients, e.g. a 16-bit ripple carry adder • Glitches can be reduced by selecting structures that have balanced signal paths, e.g. a tree. • The Brent-Kung lookahead adder and the Wallace tree multiplier both have this property and are therefore more attractive for low power.
  • 93. ELEC516/10 Lecture 9 96 Data Representation Two’s complement Sign Magnitude • Sign-extension activity significantly reduced using sign magnitude representation. • An accumulator example: sign magnitude datapath switches 30% less capacitance for uniformly distributed inputs
  • 94. ELEC516/10 Lecture 9 97 Data Representation – Accumulator Example Sign magnitude datapath switches 30% less capacitance for uniformly distributed inputs
  • 95. ELEC516/10 Lecture 9 98 Two’s Complement vs. Sign-Magnitude Two’s complement datapath has a significantly higher switching activity
  • 96. ELEC516/10 Lecture 9 99 Bus encoding to reduce switching activity • Minimizing temporal bit transition activity by data representation • Bit encoding: – Active high encoding • high-level voltage for 1, low-level voltage for 0 – Transition-based encoding • voltage change identifies logic 1, no voltage change identifies logic 0 • Word encoding – assign patterns of 1’s and 0’s to each word of information – Non-redundant codes vs. redundant codes • Example of low power coding – Limited-weight code – Gray code – One-hot code – Bus-invert code
  • 97. ELEC516/10 Lecture 9 100 Example of Bit Encoding • Reducing the average no. of transitions on a data bus • Transition-based encoding may limit the no. of transitions for non-equiprobable input lines • Let p(0) > p(1) and assume no temporal correlation exists. If active-high encoding is used, the average no. of transitions is $p(0)\,p(1) + p(1)\,p(0) = 2\,p(1)\,(1 - p(1))$ • For transition-based encoding, it is simply p(1). If p(1) << p(0), transition-based is better than active-high encoding. • To reduce transitions, the input patterns are transformed in such a way that the p(0) and p(1) probabilities of each input line become as different as possible, and a transition-based encoding is then applied before the data is transmitted.
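The comparison between active-high and transition-based encoding can be illustrated both with the expected-value formulas and with a tiny encoder. This is a sketch only: the function names are illustrative, and the line is assumed to start at logic level 0.

```python
def active_high_avg_transitions(p1):
    """Expected transitions per cycle with active-high encoding: 2 p(1) (1 - p(1))."""
    return 2.0 * p1 * (1.0 - p1)


def transition_based_avg_transitions(p1):
    """Expected transitions per cycle with transition-based encoding: p(1)."""
    return p1


def transition_encode(bits, initial_level=0):
    """Toggle the line for every logic 1, hold it for every logic 0."""
    level, levels = initial_level, []
    for bit in bits:
        level ^= bit
        levels.append(level)
    return levels


def count_transitions(levels, initial_level=0):
    """Number of level changes on the line, starting from initial_level."""
    previous, transitions = initial_level, 0
    for level in levels:
        transitions += int(level != previous)
        previous = level
    return transitions


if __name__ == "__main__":
    data = [0, 0, 1, 0, 0, 0, 1, 0]            # mostly zeros, p(1) << p(0)
    print("active-high     :", count_transitions(data))
    print("transition-based:", count_transitions(transition_encode(data)))
```

For isolated 1s in a stream of 0s, active-high encoding pays two transitions per 1 (up and back down), while transition-based encoding pays one.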
  • 98. ELEC516/10 Lecture 9 101 Word encoding • One-hot coding •Gray coding - good for sequential data, e.g. addressing for microprocessor [Su-94] •Disadvantage of Gray code - only good for address bus, not for data bus, additional conversion circuitry is needed
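The benefit of Gray coding for sequential addresses can be seen by counting bit transitions over a run of consecutive values. A minimal sketch using the standard binary-reflected Gray code (the helper names are illustrative):

```python
def to_gray(value):
    """Binary-reflected Gray code of a non-negative integer."""
    return value ^ (value >> 1)


def hamming_distance(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def total_transitions(words):
    """Total bit transitions when the words are sent one after another."""
    return sum(hamming_distance(a, b) for a, b in zip(words, words[1:]))


if __name__ == "__main__":
    addresses = list(range(16))                 # sequential 4-bit addresses
    print("binary:", total_transitions(addresses))
    print("gray  :", total_transitions([to_gray(a) for a in addresses]))
```

Consecutive Gray codewords differ in exactly one bit, so a sequential sweep costs one transition per step; for random data the advantage disappears, which matches the remark that the code suits address buses rather than data buses.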
  • 99. ELEC516/10 Lecture 9 102 Word encoding (cont.) • Bus-invert encoding - use redundancy to save power. • Given an n-bit data bus with 2^n patterns to be represented, assume all the patterns are equiprobable with no temporal correlation. With p(0) = p(1) = 0.5, the average no. of transitions is n/2 per cycle, while the worst case is n. • Bus-invert coding - add an extra line, I, to the bus and compare consecutive patterns before transmission. Two cases: – If the Hamming distance between the two patterns <= n/2, the current pattern is transmitted as it is and I is set to 0. – If the Hamming distance between the two patterns > n/2, the current pattern is first inverted and then transmitted, and I is set to 1. • Max. no. of transitions is limited to n/2 and the average is reduced by up to about 25% • Drawback - extra line I to indicate whether a pattern has been inverted.
  • 101. ELEC516/10 Lecture 9 104 Bus encoding for cross coupling cap. • Cross-coupling cap dominates for very deep sub-micron technology, e.g. < 70nm • Bus model – Stand alone cap. Cs – Cross coupling cap. Cc • Stand alone switching – Apply to single bit line – 0-1 transition • Cross coupling switching – Occurs on adjacent wires – Four types of coupling transition (Type 1 to Type 4) [Figure: three adjacent bit lines (bit 0, bit 1, bit 2), each with stand-alone capacitance Cs and with cross-coupling capacitance Cc between neighbours; table of the H/L transition pairs on adjacent wires for Type 1 to Type 4]
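The two-case rule above maps directly onto a small encoder and decoder. The sketch below follows the slide's description, comparing each new word with the word currently on the data lines; the function names are illustrative, and the bus is assumed to start at all zeros.

```python
def hamming_distance(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def bus_invert_encode(words, n_bits):
    """Return a list of (bus_word, invert_bit) pairs for an n_bits-wide data bus."""
    mask = (1 << n_bits) - 1
    previous_bus = 0                      # assumed initial bus state
    encoded = []
    for word in words:
        if hamming_distance(previous_bus, word) > n_bits / 2:
            bus_word, invert = (~word) & mask, 1   # send the inverted pattern
        else:
            bus_word, invert = word, 0             # send the pattern as it is
        encoded.append((bus_word, invert))
        previous_bus = bus_word
    return encoded


def bus_invert_decode(encoded, n_bits):
    """Recover the original words from (bus_word, invert_bit) pairs."""
    mask = (1 << n_bits) - 1
    return [(~bus_word) & mask if invert else bus_word
            for bus_word, invert in encoded]


if __name__ == "__main__":
    data = [0x00, 0xFF, 0x0F, 0xF0, 0xAA, 0x55]
    encoded = bus_invert_encode(data, 8)
    print([(hex(w), i) for w, i in encoded])
    print(bus_invert_decode(encoded, 8) == data)
```

In this variant the data lines never see more than n/2 transitions per cycle; the extra line I can toggle in addition, which is the overhead the slide lists as a drawback.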
  • 102. ELEC516/10 Lecture 9 105 Permutation Based Address code • Rearrange the physical order of the bit lines • Work efficiently on address bus (40%), but not on instruction bus (4%) – Correlation is lower – Permutation is fixed Sender Receiver
  • 103. ELEC516/10 Lecture 9 106 Dynamic Reconfigurable Bus Encoding Scheme for Instruction Bus
  • 104. ELEC516/10 Lecture 9 107 Overview of the Scheme • Encoding: bit lines reordering, done during compilation for the target processor • Decoding: – Decoding information is stored as a header – Decoding information is loaded to the LUT first – Instruction is called – Decoding information stored in the LUT controls the Cross_bar – Instructions enter the Cross_bar to be decoded [Block diagram: MEM connected over Mem_bus (B1 ... Bn) to the processing element; inside the target processor, CACHE connected over Cache_bus (B1 ... Bm) to the decoder (Cross_bar), which is controlled by the Look-Up Table and feeds the CPU core; Address_bus]
  • 105. ELEC516/10 Lecture 9 108 Demand/Data-driven operation • Clocking strategy – Gated clocking • System Power Down • Computing Paradigm
  • 106. ELEC516/10 Lecture 9 109 Logic Level Power Down Technique • Activity driven - precomputation-based sequential logic optimization – Selectively precompute the output logic values of the circuit one clock cycle before they are required, and use the precomputed values to reduce internal switching activity in the succeeding cycle • It is required that g1 = 1 -> f = 1 and g2 = 1 -> f = 0 [Figure: original circuit (register R1, logic block A, output f) and modified circuit (registers R1 and R2, precomputation functions g1 and g2 with flip-flops FF, load-enable signal LE, logic block A, output f)]
  • 107. ELEC516/10 Lecture 9 110 An example: n-bit comparator • This circuit compares two n-bit numbers A and B and computes the function A > B • In general, precomputation works best when there are a small number of complex functions corresponding to the logic block A
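A behavioural model can show how precomputation applies to the comparator. The sketch below assumes unsigned operands and a common choice of precomputation functions based on the most significant bits (g1 = A_msb AND NOT B_msb, g2 = NOT A_msb AND B_msb); this choice is an assumption for illustration, not something stated on the slide. When either function is 1, the output is already known and the remaining input bits need not be loaded.

```python
def compare_with_precomputation(a, b, n_bits):
    """Return (f, load_enable) for f = (a > b) on unsigned n_bits-wide inputs.

    load_enable is False when the MSBs alone decide the result, i.e. when the
    registers holding the lower bits could be disabled in hardware.
    """
    msb = n_bits - 1
    a_msb = (a >> msb) & 1
    b_msb = (b >> msb) & 1
    g1 = a_msb & (1 - b_msb)      # g1 = 1 implies f = 1
    g2 = (1 - a_msb) & b_msb      # g2 = 1 implies f = 0
    if g1:
        return 1, False
    if g2:
        return 0, False
    low_mask = (1 << msb) - 1     # MSBs are equal: compare the lower bits
    return int((a & low_mask) > (b & low_mask)), True


if __name__ == "__main__":
    n = 4
    results = [compare_with_precomputation(a, b, n)
               for a in range(1 << n) for b in range(1 << n)]
    disabled = sum(1 for _, load_enable in results if not load_enable)
    print(f"lower-bit registers disabled in {disabled} of {len(results)} cases")
```

For uniformly distributed inputs the MSBs differ half the time, so under these assumptions the lower-bit registers would be disabled in about 50% of cycles.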