VLSI and Embedded System DESIGN –
(An Overview)
Prof. N.S.Murthy
(Former Dean and HOD/ ECE/
NIT- Warangal)
nsmurthy58@gmail.com
1
Lecture Outline
• What are the challenges in VLSI Design?
• What is an Embedded System?
• What is SOC?
• What is FPGA?
• ASIC vs FPGAs?
• Applications
2
3
4
The First Integrated Circuits
Bipolar logic
1960’s
ECL 3-input Gate
Motorola 1966
5
Intel 4004 Micro-Processor
1971
About 2,300 transistors
Clock up to about 740 kHz
6
7
Intel Pentium (IV) microprocessor
8
9
Moore’s Law
[Figure: log2 of the number of components per integrated function (0 to 16) versus year, 1959 to 1975.]
Source: Electronics, April 19, 1965.
10
Transistor Counts
[Figure: transistors per chip (in thousands, log scale) versus year, 1975 to 2010, for the 8086, 80286, i386, i486, Pentium, Pentium Pro, Pentium II and Pentium III, with a projection to 1 billion transistors. Source: Intel]
Courtesy, Intel
11
Frequency
[Figure: clock frequency (MHz, log scale) versus year, 1970 to 2010, for the 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium and P6.]
Lead microprocessor frequency doubles every 2 years
Courtesy, Intel
12
Power will be a major problem
[Figure: power (W, log scale) versus year, 1971 to 2008, for the 4004 through the Pentium; projected points are labelled 500 W, 1.5 kW, 5 kW and 18 kW.]
Power delivery and dissipation will be prohibitive
Courtesy, Intel
13
Power density
[Figure: power density (W/cm², log scale) versus year, 1970 to 2010, for the 4004 through the P6, with reference levels marked for a hot plate, a nuclear reactor and a rocket nozzle.]
Power density too high to keep junctions at low temperature
Courtesy, Intel
Power Consumption
• Dynamic
– Transition
– Short circuit
• Leakage
– Sub-threshold leakage
– Diode/Drain leakage
– Gate leakage
At 250 nm, leakage accounted for only 5% of power, but its share is
increasing rapidly as geometries shrink
14
15
16
17
18
19
20
21
Not Only Microprocessors
Digital Cellular Market (Phones Shipped)
Year   1996  1997  1998  1999  2000
Units  48M   86M   162M  260M  435M
[Figure: cell phone block diagram: analog baseband, digital baseband (DSP + MCU), power management, small-signal RF, power RF.]
(data from Texas Instruments)
22
Challenges in Digital Design
“Microscopic Problems” (∝ DSM)
• Ultra-high speed design
• Interconnect
• Noise, Crosstalk
• Reliability, Manufacturability
• Power Dissipation
• Clock distribution
Everything Looks a Little Different
“Macroscopic Issues” (∝ 1/DSM)
• Time-to-Market
• Millions of Gates
• High-Level Abstractions
• Reuse & IP: Portability
• Predictability
• etc.
…and There’s a Lot of Them
23
Productivity Trends
[Figure: logic transistors per chip (millions) and design productivity (thousands of transistors per staff-month), both on log scales, versus year, 1981 to 2009.]
Complexity growth rate: 58%/yr compounded
Productivity growth rate: 21%/yr compounded
Complexity outpaces design productivity
Source: Sematech
Courtesy, ITRS Roadmap
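The two compound rates quoted on this chart imply that the gap itself grows exponentially. A minimal Python sketch of that arithmetic, assuming both rates stay constant; the function name `design_gap` is only illustrative and does not come from the slides:

```python
def design_gap(years: int,
               complexity_rate: float = 0.58,
               productivity_rate: float = 0.21) -> float:
    """Factor by which chip complexity outgrows design productivity.

    Both quantities are assumed to compound yearly at fixed rates
    (58 %/yr and 21 %/yr by default, as quoted on the chart).
    """
    return ((1.0 + complexity_rate) / (1.0 + productivity_rate)) ** years


if __name__ == "__main__":
    for years in (1, 5, 10):
        print(f"after {years:2d} years the gap has grown {design_gap(years):.1f}x")
```

Under these assumptions the gap widens by roughly 30% every year, which is the motivation for the higher abstraction levels discussed next.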
24
Why Scaling?
• Technology shrinks by 0.7/generation
• With every generation can integrate 2x more
functions per chip; chip cost does not increase
significantly
• Cost of a function decreases by 2x
• But …
– How to design chips with more and more functions?
– Design engineering population does not double every two
years…
• Hence, a need for more efficient design methods
– Exploit different levels of abstraction
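The "0.7 per generation" and "2x more functions" figures above are linked by simple geometry: a linear shrink of 0.7 scales area by 0.7², about one half. A small Python sketch of that relation, assuming ideal scaling with no layout overhead (`density_gain` is a hypothetical helper name):

```python
def density_gain(linear_shrink: float = 0.7, generations: int = 1) -> float:
    """Functions per unit area relative to the starting technology node.

    Assumes every linear dimension shrinks by `linear_shrink` per
    generation, so area per function scales with its square.
    """
    area_scale = linear_shrink ** (2 * generations)
    return 1.0 / area_scale


if __name__ == "__main__":
    for g in (1, 2, 3):
        print(f"{g} generation(s): about {density_gain(generations=g):.1f}x more functions per area")
```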
25
Design Abstraction Levels
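[Figure: design abstraction levels, from a transistor cross-section (n+ regions; S, G, D terminals) upward: DEVICE, CIRCUIT, GATE, MODULE, SYSTEM.]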
Major Design Challenges
26
Microscopic issues
• ultra-high speeds
• power dissipation and supply rail drop
• growing importance of interconnect
• noise, crosstalk
• reliability, manufacturability
• clock distribution
Macroscopic issues
• time-to-market
• design complexity (millions of gates)
• high levels of abstractions
• reuse and IP, portability
• systems on a chip (SoC)
• tool interoperability
Design Approach
Top-down approach
Define the top-level block, identify the sub-blocks needed
to build it, and keep subdividing down
to the leaf cells
Bottom-up approach
Identify the available building blocks, use them to
build bigger cells, and use those to build the top-level
block
Combination of both
27
Design Metrics
• How to evaluate performance of a digital
circuit (gate, block, …)?
– Cost
– Reliability
– Scalability
– Speed (delay, operating frequency)
– Power dissipation
– Energy to perform a function
28
Cost of Integrated Circuits
• NRE (non-recurring engineering) costs
– design time and effort, mask generation
– one-time cost factor
• Recurring costs
– silicon processing, packaging, test
– proportional to volume
– proportional to chip area
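One common way to combine the two cost classes above is to spread the one-time NRE cost over the production volume and add the per-unit cost. The sketch below assumes that simple model; the dollar figures in the usage lines are made-up illustrations, not data from the slides:

```python
def cost_per_ic(nre: float, unit_cost: float, volume: int) -> float:
    """Total cost of one IC: recurring unit cost plus amortized NRE.

    nre       -- one-time cost (design effort, masks), in dollars
    unit_cost -- recurring cost per chip (silicon, packaging, test)
    volume    -- number of chips the NRE is spread over
    """
    if volume <= 0:
        raise ValueError("volume must be positive")
    return unit_cost + nre / volume


if __name__ == "__main__":
    # Hypothetical numbers: $1M NRE, $5 recurring cost per chip.
    for volume in (10_000, 100_000, 1_000_000):
        print(f"{volume:>9,d} units -> ${cost_per_ic(1_000_000, 5.0, volume):.2f} each")
```

At low volume the NRE term dominates; at high volume the cost approaches the recurring unit cost.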
29
NRE Cost is Increasing
30
Die Cost
Single die
Wafer
From http://www.amd.com
Going up to 12” (30cm)
31
Cost per Transistor
[Figure: fabrication capital cost per transistor (log scale, from 1 down to 0.0000001) versus year, 1982 to 2012.]
Fabrication capital cost per transistor (Moore’s law)
32
Some Examples (1994)
Chip          Metal   Line width  Wafer   Defects  Area    Dies/   Yield  Die
              layers  (µm)        cost    /cm²     (mm²)   wafer          cost
386DX         2       0.90        $900    1.0      43      360     71%    $4
486 DX2       3       0.80        $1200   1.0      81      181     54%    $12
Power PC 601  4       0.80        $1700   1.3      121     115     28%    $53
HP PA 7100    3       0.80        $1300   1.0      196     66      27%    $73
DEC Alpha     3       0.70        $1500   1.2      234     53      19%    $149
Super Sparc   3       0.70        $1700   1.6      256     48      13%    $272
Pentium       3       0.80        $1500   1.5      296     40      9%     $417
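The die costs in this table are consistent with dividing the wafer cost by the number of good dies (dies per wafer times yield). A short Python check under that assumption; the function name `die_cost` is illustrative, and the figures are copied from the table:

```python
def die_cost(wafer_cost: float, dies_per_wafer: int, die_yield: float) -> float:
    """Cost of one good die: wafer cost spread over the dies that work."""
    return wafer_cost / (dies_per_wafer * die_yield)


# (chip, wafer cost in $, dies per wafer, yield) from the 1994 table
CHIPS = [
    ("386DX", 900, 360, 0.71),
    ("486 DX2", 1200, 181, 0.54),
    ("Power PC 601", 1700, 115, 0.28),
    ("HP PA 7100", 1300, 66, 0.27),
    ("DEC Alpha", 1500, 53, 0.19),
    ("Super Sparc", 1700, 48, 0.13),
    ("Pentium", 1500, 40, 0.09),
]

if __name__ == "__main__":
    for name, wafer, dies, y in CHIPS:
        print(f"{name:13s} ${die_cost(wafer, dies, y):7.2f}")
```

The larger dies lose twice: fewer fit on a wafer, and a smaller fraction of them work.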
Embedded Systems Design: A
Unified Hardware/Software
Introduction
33
Introduction to
Embedded Systems Design
34
Embedded systems overview
• Computing systems are everywhere
• Most of us think of “desktop” computers
– PC’s
– Laptops
– Mainframes
– Servers
• But there’s another type of computing system
– Far more common...
35
Embedded systems overview
• Embedded computing systems
– Computing systems embedded
within electronic devices
– Hard to define. Nearly any
computing system other than a
desktop computer
– Billions of units produced yearly,
versus millions of desktop units
– Perhaps 50 per household and
per automobile
Computers are in
here...
and here...
and even here...
Lots more of
these, though they
cost a lot less
each.
36
A “short list” of embedded systems
And the list goes on and on
Anti-lock brakes
Auto-focus cameras
Automatic teller
machines
Automatic toll systems
Automatic transmission
Avionic systems
Battery chargers
Camcorders
Cell phones
Cell-phone base stations
Cordless phones
Cruise control
Curbside check-in
systems
Digital cameras
Disk drives
Electronic card readers
Electronic instruments
Electronic toys/games
Factory control
Fax machines
Fingerprint identifiers
Home security systems
Life-support systems
Medical testing systems
Modems
MPEG decoders
Network cards
Network switches/routers
On-board navigation
Pagers
Photocopiers
Point-of-sale systems
Portable video games
Printers
Satellite phones
Scanners
Smart ovens/dishwashers
Speech recognizers
Stereo systems
Teleconferencing systems
Televisions
Temperature controllers
Theft tracking systems
TV set-top boxes
VCR’s, DVD players
Video game consoles
Video phones
Washers and dryers
37
Some common characteristics of
embedded systems
• Single-functioned
– Executes a single program, repeatedly
• Tightly-constrained
– Low cost, low power, small, fast, etc.
• Reactive and real-time
– Continually reacts to changes in the system’s
environment
– Must compute certain results in real-time without
delay
38
An embedded system example -- a
digital camera
[Figure: digital camera chip block diagram: lens, CCD, A2D, CCD preprocessor, pixel coprocessor, microcontroller, JPEG codec, multiplier/accumulator, DMA controller, memory controller, display controller, LCD controller, D2A, UART, ISA bus interface.]
• Single-functioned -- always a digital camera
• Tightly-constrained -- Low cost, low power, small, fast
• Reactive and real-time -- only to a small extent
39
Design challenge – optimizing
design metrics
• Obvious design goal:
– Construct an implementation with desired
functionality
• Key design challenge:
– Simultaneously optimize numerous design metrics
• Design metric
– A measurable feature of a system’s
implementation
– Optimizing design metrics is a key challenge
40
Design challenge – optimizing
design metrics
• Common metrics
– Unit cost: the monetary cost of manufacturing each copy of the system,
excluding NRE cost
– NRE cost (Non-Recurring Engineering cost): The one-
time monetary cost of designing the system
– Size: the physical space required by the system
– Performance: the execution time or throughput of the system
– Power: the amount of power consumed by the system
– Flexibility: the ability to change the functionality of the system without
incurring heavy NRE cost
41
Design challenge – optimizing
design metrics
• Common metrics (continued)
– Time-to-prototype: the time needed to build a working version of
the system
– Time-to-market: the time required to develop a system to the point
that it can be released and sold to customers
– Maintainability: the ability to modify the system after its initial
release
– Correctness, safety, many more
42
Design metric competition --
improving one may worsen others
• Expertise with both
software and hardware is
needed to optimize design
metrics
– Not just a hardware or
software expert, as is common
– A designer must be
comfortable with various
technologies in order to
choose the best for a given
application and constraints
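[Figure: the design metrics Size, Performance, Power and NRE cost compete with one another.]
[Figure: digital camera chip block diagram (lens, CCD, A2D, CCD preprocessor, pixel coprocessor, microcontroller, JPEG codec, multiplier/accumulator, DMA controller, memory controller, display controller, LCD controller, D2A, UART, ISA bus interface), spanning hardware and software.]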
43
Three key embedded system
technologies
• Technology
– A manner of accomplishing a task, especially using
technical processes, methods, or knowledge
• Three key technologies for embedded systems
– Processor technology
– IC technology
– Design technology
44
Processor technology
• The architecture of the computation engine used to implement
a system’s desired functionality
• Processor does not have to be programmable
– “Processor” not equal to general-purpose processor
[Figure: three processor architectures, each with a controller, a datapath and data memory.
General-purpose (“software”): control logic and state register, IR and PC, register file, general ALU, program memory holding assembly code for: total = 0, for i = 1 to …
Application-specific: control logic and state register, IR and PC, registers, custom ALU, program memory holding assembly code for: total = 0, for i = 1 to …
Single-purpose (“hardware”): control logic, state register, and a datapath with index and total registers and an adder.]
45
Processor technology
• Processors vary in their customization for the problem at hand
Desired functionality:
    total = 0
    for i = 1 to N loop
        total += M[i]
    end loop
[Figure: the desired functionality implemented on a general-purpose processor, an application-specific processor and a single-purpose processor.]
46
The co-design ladder
• In the past:
– Hardware and software
design technologies were
very different
– Recent maturation of
synthesis enables a unified
view of hardware and
software
• Hardware/software “codesign”
[Figure: the co-design ladder. Both sides start from sequential program code (e.g., C, VHDL) and descend to an implementation.
Software side: compilers (1960's, 1970's) produce assembly instructions; assemblers and linkers (1950's, 1960's) produce machine instructions. Result: microprocessor plus program bits, “software”.
Hardware side: behavioral synthesis (1990's) produces register transfers; RT synthesis (1980's, 1990's) produces logic equations / FSM's; logic synthesis (1970's, 1980's) produces logic gates. Result: VLSI, ASIC, or PLD implementation, “hardware”.]
The choice of hardware versus software for a particular function is simply a
tradeoff among various design metrics, like performance, power, size, NRE
cost, and especially flexibility; there is no fundamental difference between
what hardware or software can implement.
47
Summary
• Embedded systems are everywhere
• Key challenge: optimization of design metrics
– Design metrics compete with one another
• A unified view of hardware and software is necessary to
improve productivity
• Three key technologies
– Processor: general-purpose, application-specific, single-purpose
– IC: Full-custom, semi-custom, PLD
– Design: Compilation/synthesis, libraries/IP, test/verification
Why Worry about Power?
• Total energy of the Milky Way galaxy: $10^{59}$ J
• Minimum switching energy for a digital gate (1 electron @ 100 mV): $1.6 \times 10^{-20}$ J (limit: thermal noise)
• Upper bound on the number of digital operations: $6 \times 10^{78}$
• Operations/year performed by 1 billion 100 MOPS computers: $3 \times 10^{24}$
• Energy consumed in 180 years, assuming a doubling of computational requirements every year (Moore’s Law)
The Tongue-in-Cheek Answer
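The numbers on this slide can be checked with a few lines of arithmetic. The sketch below takes the slide's constants at face value and asks how many yearly doublings it takes for the annual operation count to reach the upper bound; the helper name `doubling_years` is hypothetical:

```python
import math

GALAXY_ENERGY_J = 1e59        # total energy of the Milky Way (from the slide)
SWITCH_ENERGY_J = 1.6e-20     # 1 electron at 100 mV (from the slide)
OPS_PER_YEAR_TODAY = 3e24     # 1 billion computers at 100 MOPS (from the slide)

# Upper bound on the number of digital operations the galaxy could power.
MAX_OPERATIONS = GALAXY_ENERGY_J / SWITCH_ENERGY_J


def doubling_years(max_ops: float, ops_per_year: float) -> float:
    """Years of yearly doubling until the annual count reaches max_ops."""
    return math.log2(max_ops / ops_per_year)


if __name__ == "__main__":
    print(f"upper bound on operations: {MAX_OPERATIONS:.2e}")
    print(f"years of doubling: {doubling_years(MAX_OPERATIONS, OPS_PER_YEAR_TODAY):.0f}")
```

With these inputs the bound is about 6 × 10^78 operations and is reached after roughly 180 doublings, matching the slide.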
48
Power the Dominant Design Constraint (1)
Cost of large data centers solely determined by power bill …
Google data centre on the Columbia River, Oregon
[Figure labels: 8,000; 100,000; 450,000]
NY Times, June 06
49
50
400 million personal computers
worldwide (year 2000)
- Assumed to consume 0.16 Tera kWh per
year
Equivalent to 26 nuclear power plants
Over 1 Giga kWh per year just for cooling
Including manufacturing electricity
[Ref: Bar-Cohen et al., 2000]
Power the Dominant Design Constraint
51
Chip Architecture and Power Density
Integration of diverse functionality
on an SoC causes major variations in activity
(and hence power density)
The past: temperature
uniformity
Today: steep
gradients
Temperature variations cause
performance degradation –
higher temperature means
slower clock speed
52
Temperature Gradients (and Performance)
IBM Power PC 4 temperature map
Hot spot: 138 W/cm² (3.6 × chip average flux)
Glass ceramic substrate
SiC spreader (chip underneath spreader)
Copper hat (heat sink on top not shown)
53
54
Power The Dominant Design Constraint (3)
Exciting emerging applications require “zero-power”
Example: Computation/Communication Nodes
for Wireless Sensor Networks
Meso-scale low-cost wireless transceivers for
ubiquitous wireless data acquisition that
• are fully integrated
– Size smaller than 1 cm³
• are dirt cheap
– At or below $1
• minimize power/energy dissipation
– Limiting power dissipation to 100 µW
enables energy scavenging
• and form self-configuring, robust, ad-hoc networks
containing 100’s to 1000’s of nodes
55
How to Make Electronics Truly Disappear?
From 10’s of cm³ and 10’s to 100’s of mW
To 10’s of mm³ and 10’s of µW
Power the Dominant Design Constraint
Exciting emerging applications require “zero-power”
Real-time Health Monitoring
Smart Surfaces
Artificial Skin
Philips Sand module
UCBmm3 radio
UCB PicoCube
Still at least one order of magnitude away
57
Evolution of Supply Voltages in the Past
[Figure: supply voltage (V, 0.5 to 5) versus minimum feature size (micron, log scale).]
Supply voltage scaling only from the 1990’s
58
Subthreshold Leakage As an Extra Complication
[Figure, first panel: technology node (nm, 0 to 120) and voltage (V, 0 to 1.2) versus year, 2002 to 2016, showing VDD and VTH.]
[Figure, second panel: power per gate (µW/gate, 0 to 2) versus year, 2002 to 2016, showing PDYNAMIC and PLEAK; the sub-threshold leakage (active leakage) is highlighted.]
© IEEE 2003
59
Complicating the Issue: The Diversity of SoCs
Power budgets of leading general purpose (MPU) and
special purpose (ASSP) processors
60
Supply and Threshold Voltage Trends
[Figure: VDD and VT (V, 0 to 1) versus year, 2004 to 2022. Annotation: VDD/VTH = 2!]
Voltage reduction projected to saturate
Optimistic scenario – some claims exist that VDD may get stuck aro
61
The Design Productivity Challenge
A growing gap between design complexity and design
productivity
[Figure: logic transistors per chip (thousands) and productivity (transistors per staff-month), both on log scales, versus year, 1981 to 2009, with the 2.5 µm, 0.35 µm and 0.10 µm technology nodes marked.]
Complexity growth rate: 58%/yr compounded
Productivity growth rate: 21%/yr compounded
Source: Sematech
62
Impact of Implementation Choices
[Figure: energy efficiency (in MOPS/mW) versus flexibility (or application scope), from none to fully flexible.]
• Hardwired custom: 100-1000
• Configurable/parameterizable: 10-100
• Domain-specific processor (e.g. DSP): 1-10
• Embedded microprocessor: 0.1-1
63
MANY CORE SYSTEM ON CHIP TRENDS
64
Motivation for Low Power Design
Low power design is important for three
different reasons
• Device temperature
– Failure rate, cooling and packaging costs
• Life of the battery
– Mean time between charges, system cost
• Environment
– Overall energy consumption
65
Power Consumption
• Dynamic
– Transition
– Short circuit
• Leakage
– Sub-threshold leakage
– Diode/Drain leakage
– Gate leakage
At 250 nm, leakage accounted for only 5% of power, but its share is
increasing rapidly as geometries shrink
66
Dynamic Energy Consumption
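$E_{\text{transition}} = C_L \cdot V_{DD}^2 \cdot P_{0 \to 1}$
$P = C_L \cdot V_{DD}^2 \cdot f$
[Figure: inverter with input Vin, output Vout and load capacitance CL, supplied from Vdd.]
Transition power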
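The quadratic dependence on supply voltage is the key point of these two expressions. A minimal Python sketch, assuming the same formulas; the capacitance, voltage and frequency values in the usage lines are made-up examples, and `p01` defaults to 1 so that the power expression matches the slide:

```python
def energy_per_transition(c_load: float, vdd: float, p01: float = 1.0) -> float:
    """Average energy per clock cycle in joules: C_L * V_DD^2 * P(0->1)."""
    return c_load * vdd ** 2 * p01


def dynamic_power(c_load: float, vdd: float, freq: float, p01: float = 1.0) -> float:
    """Dynamic (transition) power in watts: energy per cycle times frequency."""
    return energy_per_transition(c_load, vdd, p01) * freq


if __name__ == "__main__":
    # Hypothetical node: 1 pF switched at 1 GHz.
    for vdd in (1.0, 0.7, 0.5):
        print(f"VDD = {vdd:.1f} V -> {dynamic_power(1e-12, vdd, 1e9) * 1e3:.2f} mW")
```

Halving the supply voltage cuts the transition energy to a quarter, which is why voltage scaling dominates the low-power techniques later in the deck.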
67
Leakage Energy
Independent of switching
[Figure: inverter output Vout with the transistor OFF, showing sub-threshold current, drain junction leakage and gate leakage.]
68
Modification for Circuits with Reduced Swing
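[Figure: gate driving load CL from supply Vdd; the input swings to Vdd, the output only to Vdd − Vt.]
$E_{0 \to 1} = C_L \cdot V_{dd} \cdot (V_{dd} - V_t)$
Can exploit reduced swing to lower power
(e.g., reduced bit-line swing in memory)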
69
Dynamic vs Static Power
[Figure: power density (W/cm², log scale from 1E-8 to 1E+4) versus gate length (microns, 0.01 to 10), showing active power and sub-threshold power with a shrinking margin between them.]
Source: Leon Stok, DAC 42 ©
70
Short Circuit Currents
[Figure: inverter (Vin, Vout, CL, Vdd) and its supply current I_VDD (mA, 0 to 0.15) plotted against Vin (V, 0 to 5).]
71
Reverse-Biased Diode Leakage
[Figure: reverse-biased p+/N junction diodes under the gate, supplied from Vdd, showing the reverse leakage current.]
$I_{DL} = J_S \times A$
$J_S$ = 1-5 pA/µm² for a 1.2 µm CMOS technology
$J_S$ = 10-100 pA/µm² at 25 °C for 0.25 µm CMOS
$J_S$ doubles for every 9 °C increase in temperature!
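The "doubles every 9 degrees" rule makes diode leakage a strong function of junction temperature. A rough Python sketch of that scaling, assuming a pure exponential referenced to 25 °C; the function name and the junction area used in the example are hypothetical:

```python
def diode_leakage_pa(js_25c: float, area_um2: float, temp_c: float,
                     doubling_deg: float = 9.0) -> float:
    """Reverse-bias diode leakage I_DL = J_S(T) * A, in picoamps.

    js_25c       -- leakage current density at 25 C, in pA/um^2
    area_um2     -- junction area, in um^2
    temp_c       -- junction temperature, in degrees C
    doubling_deg -- temperature rise that doubles J_S (9 C per the slide)
    """
    js = js_25c * 2.0 ** ((temp_c - 25.0) / doubling_deg)
    return js * area_um2


if __name__ == "__main__":
    # Hypothetical junction: 10 pA/um^2 at 25 C, 0.5 um^2 area.
    for temp in (25, 52, 85):
        print(f"{temp:3d} C -> {diode_leakage_pa(10.0, 0.5, temp):8.1f} pA")
```

Going from 25 °C to 85 °C multiplies the leakage by about a hundred under this model.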
72
Subthreshold Leakage Component
73
Basic VLSI Design Flow
• SYSTEM SPECIFICATION
• ARCHITECTURAL DESIGN
• LOGIC DESIGN
• CIRCUIT DESIGN
• DEVICE DESIGN
• LAYOUT
• Fabrication, On-Wafer Testing, Packaging, Testing
(Algorithmic Level)
(Register Transfer Level)
(Gate Level)
(Transistor Level)
- - - - - - - - - - - - - - - - - -
Physical Design
74
Design Levels
• System
• Algorithmic/Module
• RTL
• Gate
• Circuit
• Device technology
75
Optimization is possible at every level
• ALGORITHMIC LEVEL, e.g., DFT: O(N²); FFT: O(N log N)
• ARCHITECTURAL LEVEL
• LOGIC LEVEL
• CIRCUIT LEVEL
• DEVICE LEVEL
…..
76
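To give a feel for the algorithmic-level saving in the DFT versus FFT example, the sketch below compares order-of-magnitude operation counts, N² against N·log2(N). Constant factors are deliberately ignored, and the function names are illustrative only:

```python
def dft_ops(n: int) -> int:
    """Order-of-magnitude operation count for a direct N-point DFT."""
    return n * n


def fft_ops(n: int) -> int:
    """Order-of-magnitude operation count for a radix-2 FFT (N * log2 N)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    return n * (n.bit_length() - 1)


if __name__ == "__main__":
    for n in (64, 1024, 16384):
        print(f"N = {n:6d}: DFT ~{dft_ops(n):>11,d}  FFT ~{fft_ops(n):>9,d}")
```

Fewer operations mean fewer switching events, so a better algorithm saves energy before any circuit-level technique is applied.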
System Level Design
The same MP3 application
running on different systems
consumes significantly
different amounts of power
• System partitioning
• Busses/Memory/IO devices /interfaces
• Choice of components
• Coding
• System states (sleep/snooze etc)
• DVS/DFS/..
77
Algorithmic/sub-system Level
• Choice of algorithm (operation count etc.)
• Word length choices
• Module interfaces
• Implementation technology
– SW: Processor selection
– HW: ASIC/FPGA/..
• Behavioral synthesis constraints and trade-off
78
RTL
• Pipelining/retiming
• Module selection
• Multiple frequency and voltage islands
• Reduction in switching activity through
transformations
79
Gate Level
• Clock gating
• Power gating
• Clock tree optimization
• Logic level transformations to reduce
switching activity
80
Circuit Level
• Transistor sizing
• Power efficient circuits
• Cell design
• Multi-threshold circuits
81
Device Technology
• Multi-oxide devices
• Multiple “cell types” on a single substrate
– Logic, SRAM, Flash etc.
• Support for many other low power design
techniques (multiple thresholds, multiple
voltages, multiple frequencies etc.)
82
83
84
85
Reduction of Sub-threshold Leakage
Current
• Reduce supply voltage
• Reduce size of the circuit
– Resize transistors as per performance requirements
– Dynamically cut power supply to unused circuits
• Cooling
• Raise the effective threshold voltage
– Stack the off-transistors in series
– Isolating supply through sleep transistors
– Dual threshold; higher threshold on non-critical paths
– Adaptive body biasing
86
OPTIMIZATION AT LOGIC LEVEL
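2:1 MULTIPLEXER
[Figure: two gate-level realizations of a 2:1 multiplexer with inputs A and B, select S and output Y. The first uses three 6-transistor gates and a 2-transistor inverter: 20 transistors. The second uses three 4-transistor gates and a 2-transistor inverter: 14 transistors.]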
87
OPTIMIZATION AT CIRCUIT LEVEL
2:1 MULTIPLEXER USING TRANSMISSION GATE LOGIC
[Figure: 2:1 multiplexer built from transmission gates controlled by S and its complement, passing A or B to Y.]
6 Transistors
(including 2 for inverting S)
88
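Y = (AB + C)D
[Figure: two realizations of the function with inputs A, B, C and D: a gate-level version with 18 transistors and a single complex gate supplied from VDD with 8 transistors.]
Optimized transistor level realization of Boolean function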
89
Low Power RTL Synthesis Techniques
• Module selection
• Retiming
• Pipelining
• Parallelism
• Bus data encoding
• FSM encoding
• Transformations for Switching activity
reduction
90
Module Selection
• Modules are used for implementing functional
units, small memory modules etc.
• Significant difference in power consumption
of different implementations
• Word-length as well as number coding
techniques employed can play a significant
role
91
Ripple Carry Adder
Carry signal switching propagates through all
the stages
and consumes Power
ACTEL:
MAPLD2004
92
Carry Look Ahead Adder
Carry signal switching propagates through far
fewer stages and thus not only reduces
delay but can also consume less power
ACTEL:
MAPLD2004
93
Other Operations and Operators
• ALUs
– Traditional method: Perform all operations and
use select for the output; very inefficient in terms
of switching activity
– Permit switching activity only in the operator
required in this cycle
• Complex operators like MAC
• Cordic functions
– Look up table vs computation
94
Alternative ALU Structures
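[Figure: two ALU structures built from function units F1, F2, …, Fn. In the first, the inputs drive every unit and a mux controlled by the function select picks the output. In the second, a demux controlled by the function select steers the inputs to one unit only, and a mux picks its output.]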
95
Retiming - Positioning a Flip-flop and
Power Consumption
[Figure: a flip-flop (FF) placed at two different positions relative to a logic block, with nodes labelled CL, CR, Eg and ER.]
P1 = k * Eg * CL
P2 = k * (Eg * CR + ER * CL)
P2 can be less than P1
96
Pipelining
• Pipelining affects power in two different ways
• One factor is similar to retiming where flip-
flops can cut down on glitches
• As pipelining can reduce the critical path to
give higher frequency and performance
(throughput), this can be used to reduce the
voltage for the given throughput to reduce
power
97
Effect of Pipelining
[Figure: three arrangements of logic blocks and flip-flops (FF).]
Case 1, no pipelining: freq f0, voltage v0
Case 2, pipelining for performance: f1 > f0, v1 = v0
Case 3, pipelining for low power: f2 = f0, v2 < v0
98
Increasing Parallelism/ Concurrency
• Chandrakasan[4] first showed that concurrency can
be used to reduce power instead of increasing
performance
• Primary idea is to reduce the frequency of operation
and/or voltage to meet a certain throughput
• Power consumed by additional logic required to
distribute computation and multiplex results needs
to be accounted for
99
Effect of Parallelism
[Figure: a single functional unit (FU) with register, and two parallel FUs with registers whose results are combined by a mux.]
Case 1, single FU: freq f0, voltage v0, throughput T0
Case 2, two FUs for enhanced performance: f1 = f0, v1 = v0, T1 > T0
Case 3, two FUs for reducing power: f2 < f0, v2 < v0, T2 = T0
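Case 3 trades area for power: the duplicated hardware runs at a lower frequency, which allows a lower supply voltage, and power scales with C·V²·f. The sketch below evaluates that ratio. The specific ratios in the usage lines (2.15x capacitance for the extra FU and mux, 0.58x voltage, 0.5x frequency) are illustrative assumptions, and the voltage that a given frequency allows is not modelled:

```python
def relative_power(cap_ratio: float, voltage_ratio: float, freq_ratio: float) -> float:
    """Power of a modified design relative to the reference, using P ~ C * V^2 * f.

    Each argument is the ratio new/reference for switched capacitance,
    supply voltage and clock frequency respectively.
    """
    return cap_ratio * voltage_ratio ** 2 * freq_ratio


if __name__ == "__main__":
    # Assumed example: two FUs plus mux overhead, each FU clocked at half rate.
    p = relative_power(cap_ratio=2.15, voltage_ratio=0.58, freq_ratio=0.5)
    print(f"parallel design uses about {p:.2f}x the reference power")
```

Even with more than double the switched capacitance, the quadratic voltage term lets the parallel design come out ahead at the same throughput.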
100
Examples and Case study
• Usage of redundant arithmetic
• Usage of alternative number representation
(normalized / Gray coded)
• Usage of running transforms
____________________________________________
• Design of an alternative arithmetic unit (e.g. CORDIC)
• Design of an FFT address generator
101
ALTERNATIVE ARITHMETIC UNIT
CO-ORDINATE ROTATION DIGITAL COMPUTER (CORDIC)
[Figure: the point (x, y) and the rotation angle θ.]
$x' = x\cos\theta + y\sin\theta$
$y' = -x\sin\theta + y\cos\theta$
102
CORDIC Algorithm – a simple, smart way of computing
trigonometric quantities (e.g., $\cos\theta$) in digital hardware
and to realize multiplierless architectures.
CORDIC: “Coordinate Rotation Digital Computer”
Define a set of basic CORDIC angles $\alpha_k$, $0^\circ \le \alpha_k \le 90^\circ$:
• $\alpha_k = \tan^{-1}(2^{-k}), \quad k = 0, 1, 2, \ldots$
• $\alpha_0 = 45^\circ, \quad \alpha_0 > \alpha_1 > \alpha_2 > \cdots > \alpha_k > \alpha_{k+1} > \cdots > 0^\circ$
Given an angle $\theta$, $0^\circ \le \theta \le 90^\circ$, we can write
$\theta = \sum_{i=0}^{\infty} \sigma_i \alpha_i$, where $\sigma_i = \pm 1$ ($\sigma_0 = 1$)
103
In practice, the summation is truncated to a finite
number of terms, say, M (called the wordlength).
The sequence $\sigma_i$ is generated as:
$\theta(0) = \theta, \quad \sigma_0 = 1$
For $i = 0$ to $M-1$
    $\theta(i+1) = \theta(i) - \sigma_i \alpha_i$;
    $\sigma_{i+1} = \mathrm{sign}[\theta(i+1)]$;
End
104
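The sign-generation loop above translates almost line for line into code. A minimal Python sketch, assuming angles in degrees and treating sign(0) as +1 (a choice the slide does not specify); `cordic_signs` is a hypothetical name:

```python
import math


def cordic_signs(theta_deg: float, m: int) -> tuple[list[int], float]:
    """Decompose theta into +/- alpha_i terms, alpha_i = atan(2^-i).

    Returns ([sigma_0, ..., sigma_{m-1}], residual) where the residual is
    theta(m) = theta - sum(sigma_i * alpha_i), in degrees.
    """
    residual = theta_deg          # theta(0) = theta
    sigma = 1                     # sigma_0 = 1
    signs = []
    for i in range(m):
        signs.append(sigma)
        alpha_i = math.degrees(math.atan(2.0 ** -i))
        residual -= sigma * alpha_i            # theta(i+1) = theta(i) - sigma_i * alpha_i
        sigma = 1 if residual >= 0 else -1     # sigma_{i+1} = sign[theta(i+1)]
    return signs, residual


if __name__ == "__main__":
    signs, residual = cordic_signs(30.0, 16)
    print("signs:", signs)
    print(f"residual after 16 terms: {residual:.6f} degrees")
```

Each extra term roughly halves the worst-case residual, so M sets the angular precision.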
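CORDIC Rotation
[Figure: the point (x, y) is rotated through the angle θ to (x', y').]
$$\mathrm{Rot}(\theta) = \prod_{k=0}^{M-1} \mathrm{Rot}(\sigma_k \alpha_k) = \mathrm{Rot}(\sigma_{M-1}\alpha_{M-1}) \cdots \mathrm{Rot}(\sigma_0 \alpha_0)$$
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \cos\theta \begin{bmatrix} 1 & \tan\theta \\ -\tan\theta & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \mathrm{Rot}(\theta) \begin{bmatrix} x \\ y \end{bmatrix}$$
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \left( \prod_{k=0}^{M-1} \cos\alpha_k \right) \begin{bmatrix} 1 & \sigma_{M-1}\tan\alpha_{M-1} \\ -\sigma_{M-1}\tan\alpha_{M-1} & 1 \end{bmatrix} \cdots \begin{bmatrix} 1 & \sigma_0\tan\alpha_0 \\ -\sigma_0\tan\alpha_0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$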
105
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \left( \prod_{k=0}^{M-1} \cos\alpha_k \right) \begin{bmatrix} 1 & \sigma_{M-1} 2^{-(M-1)} \\ -\sigma_{M-1} 2^{-(M-1)} & 1 \end{bmatrix} \cdots \begin{bmatrix} 1 & \sigma_0 2^{0} \\ -\sigma_0 2^{0} & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$
• $\prod_{k=0}^{M-1} \cos\alpha_k$: Universal constant
• Elementary rotations need only shifting operations and can be pipelined
• Shifting done by direct bus connections: $v = 2^{-i} u$
106
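[Figure: rotation by $\sigma_i \alpha_i$ (i-th stage of a pipelined CORDIC unit): inputs $x_i$, $y_i$, $\theta_i$ and $\sigma_i$; shifts by $2^{-i}$; the angle $\alpha_i$; a sign block; outputs $x_{i+1}$, $y_{i+1}$, $\theta_{i+1}$ and $\sigma_{i+1}$.]
[Figure: the inputs $x_0 = x$, $y_0 = y$, $\theta_0 = \theta$ pass through the stages $\mathrm{Rot}(\sigma_0\alpha_0)$ to $\mathrm{Rot}(\sigma_{M-1}\alpha_{M-1})$, separated by pipeline latches, producing $x_M$, $y_M$.]
Pipelined CORDIC Unit (PCU)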
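The whole rotation can be sketched in software to check the idea: each stage applies one elementary rotation with tan(α_i) = 2^-i, and the product of the cos(α_k) terms is applied once at the end. This Python version uses floating point for clarity, so the multiplications by 2^-i stand in for the hardware shifts; it follows the sign convention x' = x cos θ + y sin θ, y' = −x sin θ + y cos θ, and the function name `cordic_rotate` is hypothetical:

```python
import math


def cordic_rotate(x: float, y: float, theta_deg: float, m: int = 16) -> tuple[float, float]:
    """Approximate x' = x cos(t) + y sin(t), y' = -x sin(t) + y cos(t).

    Uses m elementary rotations by +/- atan(2^-i); theta is expected
    to lie between 0 and 90 degrees.
    """
    residual = math.radians(theta_deg)   # theta(0) = theta
    sigma = 1                            # sigma_0 = 1
    gain = 1.0                           # accumulates prod(cos(alpha_k))
    for i in range(m):
        shift = 2.0 ** -i                # a right shift by i bits in hardware
        alpha = math.atan(shift)
        # One pipeline stage: multiply by [[1, s*2^-i], [-s*2^-i, 1]].
        x, y = x + sigma * shift * y, y - sigma * shift * x
        residual -= sigma * alpha
        sigma = 1 if residual >= 0 else -1
        gain *= math.cos(alpha)
    return gain * x, gain * y


if __name__ == "__main__":
    xr, yr = cordic_rotate(1.0, 0.0, 30.0)
    print(f"CORDIC:    x' = {xr:.5f}, y' = {yr:.5f}")
    print(f"reference: x' = {math.cos(math.radians(30)):.5f}, "
          f"y' = {-math.sin(math.radians(30)):.5f}")
```

Because the gain depends only on M, a hardware unit can store it as a constant or fold it into the input scaling, leaving only shifts and additions in each stage.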
107
CONCLUSION
•There ALWAYS EXISTS A BETTER SOLUTION
than the present one and we can think of that.
•But
There ALWAYS EXISTS A BETTER SOLUTION
than what we can think of!
108
Thank You
109

Kala jadu for love marriage | Real amil baba | Famous amil baba | kala jadu n...babafaisel
 
VIP Call Girls Service Kukatpally Hyderabad Call +91-8250192130
VIP Call Girls Service Kukatpally Hyderabad Call +91-8250192130VIP Call Girls Service Kukatpally Hyderabad Call +91-8250192130
VIP Call Girls Service Kukatpally Hyderabad Call +91-8250192130Suhani Kapoor
 
Call Girls in Okhla Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Okhla Delhi 💯Call Us 🔝8264348440🔝Call Girls in Okhla Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Okhla Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Bus tracking.pptx ,,,,,,,,,,,,,,,,,,,,,,,,,,
Bus tracking.pptx ,,,,,,,,,,,,,,,,,,,,,,,,,,Bus tracking.pptx ,,,,,,,,,,,,,,,,,,,,,,,,,,
Bus tracking.pptx ,,,,,,,,,,,,,,,,,,,,,,,,,,bhuyansuprit
 

Recently uploaded (20)

Call Us ✡️97111⇛47426⇛Call In girls Vasant Vihar༒(Delhi)
Call Us ✡️97111⇛47426⇛Call In girls Vasant Vihar༒(Delhi)Call Us ✡️97111⇛47426⇛Call In girls Vasant Vihar༒(Delhi)
Call Us ✡️97111⇛47426⇛Call In girls Vasant Vihar༒(Delhi)
 
Dubai Call Girls Pro Domain O525547819 Call Girls Dubai Doux
Dubai Call Girls Pro Domain O525547819 Call Girls Dubai DouxDubai Call Girls Pro Domain O525547819 Call Girls Dubai Doux
Dubai Call Girls Pro Domain O525547819 Call Girls Dubai Doux
 
306MTAMount UCLA University Bachelor's Diploma in Social Media
306MTAMount UCLA University Bachelor's Diploma in Social Media306MTAMount UCLA University Bachelor's Diploma in Social Media
306MTAMount UCLA University Bachelor's Diploma in Social Media
 
Kindergarten Assessment Questions Via LessonUp
Kindergarten Assessment Questions Via LessonUpKindergarten Assessment Questions Via LessonUp
Kindergarten Assessment Questions Via LessonUp
 
办理(宾州州立毕业证书)美国宾夕法尼亚州立大学毕业证成绩单原版一比一
办理(宾州州立毕业证书)美国宾夕法尼亚州立大学毕业证成绩单原版一比一办理(宾州州立毕业证书)美国宾夕法尼亚州立大学毕业证成绩单原版一比一
办理(宾州州立毕业证书)美国宾夕法尼亚州立大学毕业证成绩单原版一比一
 
VIP Call Girl Amravati Aashi 8250192130 Independent Escort Service Amravati
VIP Call Girl Amravati Aashi 8250192130 Independent Escort Service AmravatiVIP Call Girl Amravati Aashi 8250192130 Independent Escort Service Amravati
VIP Call Girl Amravati Aashi 8250192130 Independent Escort Service Amravati
 
Abu Dhabi Call Girls O58993O4O2 Call Girls in Abu Dhabi`
Abu Dhabi Call Girls O58993O4O2 Call Girls in Abu Dhabi`Abu Dhabi Call Girls O58993O4O2 Call Girls in Abu Dhabi`
Abu Dhabi Call Girls O58993O4O2 Call Girls in Abu Dhabi`
 
3D Printing And Designing Final Report.pdf
3D Printing And Designing Final Report.pdf3D Printing And Designing Final Report.pdf
3D Printing And Designing Final Report.pdf
 
young call girls in Pandav nagar 🔝 9953056974 🔝 Delhi escort Service
young call girls in Pandav nagar 🔝 9953056974 🔝 Delhi escort Serviceyoung call girls in Pandav nagar 🔝 9953056974 🔝 Delhi escort Service
young call girls in Pandav nagar 🔝 9953056974 🔝 Delhi escort Service
 
VIP Kolkata Call Girl Gariahat 👉 8250192130 Available With Room
VIP Kolkata Call Girl Gariahat 👉 8250192130  Available With RoomVIP Kolkata Call Girl Gariahat 👉 8250192130  Available With Room
VIP Kolkata Call Girl Gariahat 👉 8250192130 Available With Room
 
Housewife Call Girls NRI Layout - Call 7001305949 Rs-3500 with A/C Room Cash ...
Housewife Call Girls NRI Layout - Call 7001305949 Rs-3500 with A/C Room Cash ...Housewife Call Girls NRI Layout - Call 7001305949 Rs-3500 with A/C Room Cash ...
Housewife Call Girls NRI Layout - Call 7001305949 Rs-3500 with A/C Room Cash ...
 
young call girls in Vivek Vihar🔝 9953056974 🔝 Delhi escort Service
young call girls in Vivek Vihar🔝 9953056974 🔝 Delhi escort Serviceyoung call girls in Vivek Vihar🔝 9953056974 🔝 Delhi escort Service
young call girls in Vivek Vihar🔝 9953056974 🔝 Delhi escort Service
 
PORTAFOLIO 2024_ ANASTASIYA KUDINOVA
PORTAFOLIO   2024_  ANASTASIYA  KUDINOVAPORTAFOLIO   2024_  ANASTASIYA  KUDINOVA
PORTAFOLIO 2024_ ANASTASIYA KUDINOVA
 
NO1 Famous Amil Baba In Karachi Kala Jadu In Karachi Amil baba In Karachi Add...
NO1 Famous Amil Baba In Karachi Kala Jadu In Karachi Amil baba In Karachi Add...NO1 Famous Amil Baba In Karachi Kala Jadu In Karachi Amil baba In Karachi Add...
NO1 Famous Amil Baba In Karachi Kala Jadu In Karachi Amil baba In Karachi Add...
 
VIP Call Girls Service Mehdipatnam Hyderabad Call +91-8250192130
VIP Call Girls Service Mehdipatnam Hyderabad Call +91-8250192130VIP Call Girls Service Mehdipatnam Hyderabad Call +91-8250192130
VIP Call Girls Service Mehdipatnam Hyderabad Call +91-8250192130
 
Kala jadu for love marriage | Real amil baba | Famous amil baba | kala jadu n...
Kala jadu for love marriage | Real amil baba | Famous amil baba | kala jadu n...Kala jadu for love marriage | Real amil baba | Famous amil baba | kala jadu n...
Kala jadu for love marriage | Real amil baba | Famous amil baba | kala jadu n...
 
VIP Call Girls Service Kukatpally Hyderabad Call +91-8250192130
VIP Call Girls Service Kukatpally Hyderabad Call +91-8250192130VIP Call Girls Service Kukatpally Hyderabad Call +91-8250192130
VIP Call Girls Service Kukatpally Hyderabad Call +91-8250192130
 
Call Girls in Okhla Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Okhla Delhi 💯Call Us 🔝8264348440🔝Call Girls in Okhla Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Okhla Delhi 💯Call Us 🔝8264348440🔝
 
Cheap Rate ➥8448380779 ▻Call Girls In Iffco Chowk Gurgaon
Cheap Rate ➥8448380779 ▻Call Girls In Iffco Chowk GurgaonCheap Rate ➥8448380779 ▻Call Girls In Iffco Chowk Gurgaon
Cheap Rate ➥8448380779 ▻Call Girls In Iffco Chowk Gurgaon
 
Bus tracking.pptx ,,,,,,,,,,,,,,,,,,,,,,,,,,
Bus tracking.pptx ,,,,,,,,,,,,,,,,,,,,,,,,,,Bus tracking.pptx ,,,,,,,,,,,,,,,,,,,,,,,,,,
Bus tracking.pptx ,,,,,,,,,,,,,,,,,,,,,,,,,,
 

VLSI and Embedded System DESIGN - An Overview of Key Challenges

  • 1. VLSI and Embedded System Design (An Overview). Prof. N. S. Murthy (Former Dean and HOD, ECE, NIT Warangal), nsmurthy58@gmail.com
  • 2. Lecture Outline • What are the challenges in VLSI design? • What is an embedded system? • What is an SoC? • What is an FPGA? • ASICs vs FPGAs • Applications
  • 4. The First Integrated Circuits: bipolar logic, 1960s. ECL 3-input gate, Motorola, 1966
  • 5. Intel 4004 microprocessor, 1971: 1000 transistors, 1 MHz operation
  • 7. Intel Pentium (IV) microprocessor
  • 10. Transistor Counts [figure: transistor count (K), log scale, versus year, 1975–2010, for 8086, 80286, i386, i486, Pentium, Pentium Pro, Pentium II, and Pentium III, projected to 1 billion transistors]. Source: Intel. Courtesy, Intel
  • 11. Frequency [figure: clock frequency (MHz), log scale from 0.1 to 10000, versus year, 1970–2010, for 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium, and P6]. Lead microprocessor frequency doubles every 2 years. Courtesy, Intel
  • 12. Power will be a major problem [figure: power (W), log scale from 0.1 to 100000, versus year, 1971–2008, for 4004 through Pentium, with projected points at 500 W, 1.5 kW, 5 kW, and 18 kW]. Power delivery and dissipation will be prohibitive. Courtesy, Intel
  • 13. Power density [figure: power density (W/cm²), log scale from 1 to 10000, versus year, 1970–2010, for 4004 through P6, with reference levels for a hot plate, a nuclear reactor, and a rocket nozzle]. Power density too high to keep junctions at low temperature. Courtesy, Intel
  • 14. Power Consumption • Dynamic – Transition – Short circuit • Leakage – Sub-threshold leakage – Diode/drain leakage – Gate leakage. At 250 nm, leakage power was only 5%, but it is increasing rapidly as geometries decrease
  • 21. Not Only Microprocessors. Digital cellular market (phones shipped): 1996: 48M; 1997: 86M; 1998: 162M; 1999: 260M; 2000: 435M. Cell phone building blocks: analog baseband, digital baseband (DSP + MCU), power management, small-signal RF, power RF (data from Texas Instruments)
  • 22. Challenges in Digital Design. “Microscopic problems” (∝ DSM): • Ultra-high-speed design • Interconnect • Noise, crosstalk • Reliability, manufacturability • Power dissipation • Clock distribution. Everything looks a little different. “Macroscopic issues” (∝ 1/DSM): • Time-to-market • Millions of gates • High-level abstractions • Reuse and IP: portability • Predictability • etc. …and there’s a lot of them
  • 23. Productivity Trends [figure: logic transistors per chip (M) and productivity (K transistors per staff-month), log scales, versus year, 1981–2009]. Complexity growth rate: 58%/yr compounded. Productivity growth rate: 21%/yr compounded. Complexity outpaces design productivity. Source: Sematech. Courtesy, ITRS Roadmap
  • 24. Why Scaling? • Technology shrinks by 0.7× per generation • With every generation, 2× more functions can be integrated per chip; chip cost does not increase significantly • Cost of a function decreases by 2× • But … – How to design chips with more and more functions? – The design engineering population does not double every two years… • Hence, a need for more efficient design methods – Exploit different levels of abstraction
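The link between the 0.7 shrink and the 2x integration claim is plain arithmetic: a 0.7x linear shrink scales area by 0.7² ≈ 0.49, so roughly twice as many functions fit in the same silicon area. A minimal Python sketch follows; the function name and return layout are illustrative and not part of the lecture.

```python
def scaling_after(generations, shrink=0.7):
    """Return (linear scale, area scale, density gain) after some generations.

    Assumes an ideal shrink of `shrink` per generation in both dimensions.
    """
    linear = shrink ** generations   # feature size relative to the start
    area = linear ** 2               # area of the same function
    density = 1.0 / area             # functions per unit area, relative
    return linear, area, density


if __name__ == "__main__":
    for gen in range(1, 4):
        linear, area, density = scaling_after(gen)
        print(f"gen {gen}: linear x{linear:.2f}, area x{area:.2f}, "
              f"density x{density:.2f}")
```

After one generation the density gain is about 2.04x, which is the "2x more functions per chip" on the slide.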
  • 26. Major Design Challenges. Microscopic issues: • ultra-high speeds • power dissipation and supply rail drop • growing importance of interconnect • noise, crosstalk • reliability, manufacturability • clock distribution. Macroscopic issues: • time-to-market • design complexity (millions of gates) • high levels of abstraction • reuse and IP, portability • systems on a chip (SoC) • tool interoperability. Design Approach: • Top-down approach: define the top-level block, identify the sub-blocks needed to build it, and keep dividing down to the leaf cells • Bottom-up approach: identify the available building blocks, use them to build bigger cells, and use those to build the top-level block • Combination of both
  • 27. Design Metrics • How to evaluate the performance of a digital circuit (gate, block, …)? – Cost – Reliability – Scalability – Speed (delay, operating frequency) – Power dissipation – Energy to perform a function
  • 28. Cost of Integrated Circuits • NRE (non-recurring engineering) costs – design time and effort, mask generation – one-time cost factor • Recurring costs – silicon processing, packaging, test – proportional to volume – proportional to chip area
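One common way to combine the two cost components above is to amortize the one-time NRE cost over the production volume and add the per-unit cost. The sketch below assumes that simple model; the function name and the example figures are hypothetical.

```python
def cost_per_chip(nre_cost, unit_cost, volume):
    """Total cost per chip: per-unit cost plus NRE amortized over the volume.

    Assumed model, not stated on the slide: total = unit_cost + nre_cost / volume.
    """
    if volume <= 0:
        raise ValueError("volume must be positive")
    return unit_cost + nre_cost / volume


if __name__ == "__main__":
    # Hypothetical numbers: $1M NRE, $5 per-unit cost.
    for volume in (1_000, 100_000, 10_000_000):
        print(volume, round(cost_per_chip(1_000_000, 5.0, volume), 2))
```

At low volume the NRE term dominates; at high volume the cost approaches the per-unit cost.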
  • 29. NRE Cost is Increasing
  • 30. Die Cost [figure: a wafer and a single die; from http://www.amd.com]. Wafers going up to 12 in (30 cm)
  • 31. Cost per Transistor [figure: fabrication capital cost per transistor, log scale from 1 down to 0.0000001, versus year, 1982–2012 (Moore’s law)]
  • 32. Some Examples (1994)
      Chip          Metal layers  Line width (µm)  Wafer cost  Def./cm²  Area (mm²)  Dies/wafer  Yield  Die cost
      386DX         2             0.90             $900        1.0       43          360         71%    $4
      486 DX2       3             0.80             $1200       1.0       81          181         54%    $12
      Power PC 601  4             0.80             $1700       1.3       121         115         28%    $53
      HP PA 7100    3             0.80             $1300       1.0       196         66          27%    $73
      DEC Alpha     3             0.70             $1500       1.2       234         53          19%    $149
      Super Sparc   3             0.70             $1700       1.6       256         48          13%    $272
      Pentium       3             0.80             $1500       1.5       296         40          9%     $417
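The die-cost column of the 1994 examples is consistent with dividing the wafer cost by the number of good dies, that is, dies per wafer times yield. The slide does not state this relation, so it is assumed here; the sketch recomputes the column from the other values in the table.

```python
# (chip, wafer cost in $, dies per wafer, yield, die cost listed on the slide)
EXAMPLES_1994 = [
    ("386DX", 900, 360, 0.71, 4),
    ("486 DX2", 1200, 181, 0.54, 12),
    ("Power PC 601", 1700, 115, 0.28, 53),
    ("HP PA 7100", 1300, 66, 0.27, 73),
    ("DEC Alpha", 1500, 53, 0.19, 149),
    ("Super Sparc", 1700, 48, 0.13, 272),
    ("Pentium", 1500, 40, 0.09, 417),
]


def die_cost(wafer_cost, dies_per_wafer, die_yield):
    """Cost of one good die: wafer cost spread over the good dies only."""
    return wafer_cost / (dies_per_wafer * die_yield)


if __name__ == "__main__":
    for chip, wafer, dies, yld, listed in EXAMPLES_1994:
        print(f"{chip:13s} computed ${die_cost(wafer, dies, yld):7.2f}  "
              f"listed ${listed}")
```

The large dies pay twice: fewer dies fit on a wafer, and a smaller fraction of them work.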
  • 33. Embedded Systems Design: A Unified Hardware/Software Introduction 33 Introduction to Embedded Systems Design
  • 34. 34 Embedded systems overview • Computing systems are everywhere • Most of us think of “desktop” computers – PC’s – Laptops – Mainframes – Servers • But there’s another type of computing system – Far more common...
  • 35. 35 Embedded systems overview • Embedded computing systems – Computing systems embedded within electronic devices – Hard to define. Nearly any computing system other than a desktop computer – Billions of units produced yearly, versus millions of desktop units – Perhaps 50 per household and per automobile Computers are in here... and here... and even here... Lots more of these, though they cost a lot less each.
  • 36. 36 A “short list” of embedded systems And the list goes on and on Anti-lock brakes Auto-focus cameras Automatic teller machines Automatic toll systems Automatic transmission Avionic systems Battery chargers Camcorders Cell phones Cell-phone base stations Cordless phones Cruise control Curbside check-in systems Digital cameras Disk drives Electronic card readers Electronic instruments Electronic toys/games Factory control Fax machines Fingerprint identifiers Home security systems Life-support systems Medical testing systems Modems MPEG decoders Network cards Network switches/routers On-board navigation Pagers Photocopiers Point-of-sale systems Portable video games Printers Satellite phones Scanners Smart ovens/dishwashers Speech recognizers Stereo systems Teleconferencing systems Televisions Temperature controllers Theft tracking systems TV set-top boxes VCR’s, DVD players Video game consoles Video phones Washers and dryers
  • 37. 37 Some common characteristics of embedded systems • Single-functioned – Executes a single program, repeatedly • Tightly-constrained – Low cost, low power, small, fast, etc. • Reactive and real-time – Continually reacts to changes in the system’s environment – Must compute certain results in real-time without delay
  • 38. 38 An embedded system example -- a digital camera Microcontroller CCD preprocessor Pixel coprocessor A2D D2A JPEG codec DMA controller Memory controller ISA bus interface UART LCD ctrl Display ctrl Multiplier/Accum Digital camera chip lens CCD • Single-functioned -- always a digital camera • Tightly-constrained -- Low cost, low power, small, fast • Reactive and real-time -- only to a small extent
  • 39. 39 Design challenge – optimizing design metrics • Obvious design goal: – Construct an implementation with desired functionality • Key design challenge: – Simultaneously optimize numerous design metrics • Design metric – A measurable feature of a system’s implementation – Optimizing design metrics is a key challenge
  • 40. Design challenge – optimizing design metrics • Common metrics – Unit cost: the monetary cost of manufacturing each copy of the system, excluding NRE cost – NRE cost (non-recurring engineering cost): the one-time monetary cost of designing the system – Size: the physical space required by the system – Performance: the execution time or throughput of the system – Power: the amount of power consumed by the system – Flexibility: the ability to change the functionality of the system without incurring heavy NRE cost
  • 41. 41 Design challenge – optimizing design metrics • Common metrics (continued) – Time-to-prototype: the time needed to build a working version of the system – Time-to-market: the time required to develop a system to the point that it can be released and sold to customers – Maintainability: the ability to modify the system after its initial release – Correctness, safety, many more
  • 42. Design metric competition -- improving one may worsen others [figure: the competing metrics size, performance, power, and NRE cost, shown with the digital camera chip and its hardware and software parts] • Expertise with both software and hardware is needed to optimize design metrics – Not just a hardware or software expert, as is common – A designer must be comfortable with various technologies in order to choose the best for a given application and constraints
  • 43. 43 Three key embedded system technologies • Technology – A manner of accomplishing a task, especially using technical processes, methods, or knowledge • Three key technologies for embedded systems – Processor technology – IC technology – Design technology
  • 44. Processor technology • The architecture of the computation engine used to implement a system’s desired functionality • Processor does not have to be programmable – “Processor” not equal to general-purpose processor [figure: three processor types. General-purpose (“software”): controller (control logic and state register, IR, PC), program memory holding assembly code for “total = 0; for i = 1 to …”, datapath (register file, general ALU), data memory. Application-specific: controller (control logic and state register, IR, PC), program memory holding the same assembly code, datapath (registers, custom ALU), data memory. Single-purpose (“hardware”): controller (control logic, state register), datapath (index, total, +), data memory]
  • 45. Processor technology • Processors vary in their customization for the problem at hand [figure: the desired functionality “total = 0; for i = 1 to N loop total += M[i]; end loop” mapped onto a general-purpose processor, an application-specific processor, and a single-purpose processor]
  • 46. The co-design ladder • In the past: – Hardware and software design technologies were very different – Recent maturation of synthesis enables a unified view of hardware and software • Hardware/software “codesign” [figure: two ladders descending from sequential program code (e.g., C, VHDL) to implementation. Software side: compilers (1960s, 1970s) to assembly instructions; assemblers, linkers (1950s, 1960s) to machine instructions; microprocessor plus program bits: “software”. Hardware side: behavioral synthesis (1990s) to register transfers; RT synthesis (1980s, 1990s) to logic equations / FSMs; logic synthesis (1970s, 1980s) to logic gates; VLSI, ASIC, or PLD implementation: “hardware”] The choice of hardware versus software for a particular function is simply a tradeoff among various design metrics, like performance, power, size, NRE cost, and especially flexibility; there is no fundamental difference between what hardware or software can implement.
  • 47. 47 Summary • Embedded systems are everywhere • Key challenge: optimization of design metrics – Design metrics compete with one another • A unified view of hardware and software is necessary to improve productivity • Three key technologies – Processor: general-purpose, application-specific, single-purpose – IC: Full-custom, semi-custom, PLD – Design: Compilation/synthesis, libraries/IP, test/verification
  • 48. Why Worry about Power? The Tongue-in-Cheek Answer • Total energy of the Milky Way galaxy: $10^{59}$ J • Minimum switching energy for a digital gate (1 electron @ 100 mV): $1.6 \times 10^{-20}$ J (limit: thermal noise) • Upper bound on the number of digital operations: $6 \times 10^{78}$ • Operations/year performed by 1 billion 100 MOPS computers: $3 \times 10^{24}$ • The entire energy is consumed in 180 years, assuming a doubling of computational requirements every year (Moore’s Law)
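The 180-year figure can be checked from the other numbers on the slide. The sketch below divides the galaxy's energy by the minimum switching energy to get the operation budget, then accumulates yearly operations that double each year until the budget runs out. The constant and function names are illustrative, and a 365-day year is assumed.

```python
GALAXY_ENERGY_J = 1e59          # total energy of the Milky Way (slide value)
SWITCH_ENERGY_J = 1.6e-20       # one electron at 100 mV (slide value)
SECONDS_PER_YEAR = 365 * 24 * 3600

# Upper bound on the number of operations: about 6e78.
OPS_BUDGET = GALAXY_ENERGY_J / SWITCH_ENERGY_J

# One billion computers at 100 MOPS each: about 3e24 operations per year.
OPS_FIRST_YEAR = 1e9 * 100e6 * SECONDS_PER_YEAR


def years_until_exhausted(budget, first_year_ops):
    """Count full years that fit in the budget if yearly demand doubles."""
    used, yearly, years = 0.0, first_year_ops, 0
    while used + yearly <= budget:
        used += yearly
        yearly *= 2.0
        years += 1
    return years


if __name__ == "__main__":
    print(f"operation budget: {OPS_BUDGET:.2e}")
    print(f"first-year operations: {OPS_FIRST_YEAR:.2e}")
    print("years:", years_until_exhausted(OPS_BUDGET, OPS_FIRST_YEAR))
```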
  • 49. Power the Dominant Design Constraint (1). Cost of large data centers solely determined by power bill … Google data center on the Columbia River, Oregon [figure labels: 8,000; 100,000; 450,000]. NY Times, June 06
  • 50. 400 million personal computers worldwide (year 2000): assumed to consume 0.16 tera-kWh per year, equivalent to 26 nuclear power plants. Over 1 giga-kWh per year just for cooling, including manufacturing electricity [Ref: Bar-Cohen et al., 2000]
  • 51. Power the Dominant Design Constraint 51
  • 52. Chip Architecture and Power Density. Integration of diverse functionality on an SoC causes major variations in activity (and hence power density). The past: temperature uniformity. Today: steep gradients. Temperature variations cause performance degradation: higher temperature means slower clock speed
  • 53. Temperature Gradients (and Performance). IBM Power PC 4 temperature map. Hot spot: 138 W/cm² (3.6 × chip average flux). Package: glass ceramic substrate, SiC spreader (chip underneath spreader), copper hat (heat sink on top not shown)
  • 55. Power the Dominant Design Constraint (3). Exciting emerging applications require “zero-power”. Example: computation/communication nodes for wireless sensor networks. Meso-scale low-cost wireless transceivers for ubiquitous wireless data acquisition that • are fully integrated – size smaller than 1 cm³ • are dirt cheap – at or below $1 • minimize power/energy dissipation – limiting power dissipation to 100 µW enables energy scavenging • and form self-configuring, robust, ad-hoc networks containing 100s to 1000s of nodes
  • 56. How to Make Electronics Truly Disappear? From 10s of cm³ and 10s to 100s of mW, to 10s of mm³ and 10s of µW
  • 57. Power the Dominant Design Constraint. Exciting emerging applications require “zero-power”: real-time health monitoring, smart surfaces, artificial skin. Examples: Philips Sand module, UCB mm³ radio, UCB PicoCube. Still at least one order of magnitude away
  • 58. Evolution of Supply Voltages in the Past [figure: supply voltage (V), 0.5 to 5, versus minimum feature size (micron), log scale]. Supply voltage scaling only from the 1990s
  • 59. Subthreshold Leakage As an Extra Complication [figure, © IEEE 2003: one panel shows VDD and VTH (V, 0 to 1.2) and technology node (nm, 0 to 120) versus year, 2002–2016; the other shows PDYNAMIC and PLEAK, power (µW/gate, 0 to 2), versus year, 2002–2016, with subthreshold leak (active leakage) marked]
  • 60. Complicating the Issue: The Diversity of SoCs. Power budgets of leading general-purpose (MPU) and special-purpose (ASSP) processors
  • 61. Supply and Threshold Voltage Trends [figure: VDD and VT (V), 0 to 1, versus year, 2004–2022]. VDD/VTH = 2! Voltage reduction projected to saturate. Optimistic scenario – some claims exist that VDD may get stuck aro…
  • 62. The Design Productivity Challenge [figure: logic transistors per chip (K) and productivity (transistors per staff-month), log scales, versus year, 1981–2009, with technology nodes 2.5 µm, 0.35 µm, and 0.10 µm marked]. Complexity growth rate: 58%/yr compounded. Productivity growth rate: 21%/yr compounded. A growing gap between design complexity and design productivity. Source: Sematech
  • 63. Impact of Implementation Choices [figure: energy efficiency (in MOPS/mW) versus flexibility (or application scope), from none through somewhat flexible to fully flexible]. Hardwired custom: 100–1000. Configurable/parameterizable: 10–100. Domain-specific processor (e.g. DSP): 1–10. Embedded microprocessor: 0.1–1
  • 64. MANY CORE SYSTEM ON CHIP TRENDS 64
  • 65. Motivation for Low Power Design. Low-power design is important for three reasons: • Device temperature – Failure rate, cooling and packaging costs • Battery life – Mean time between charges, system cost • Environment – Overall energy consumption
  • 66. Power Consumption • Dynamic – Transition – Short circuit • Leakage – Sub-threshold leakage – Diode/drain leakage – Gate leakage. At 250 nm, leakage power was only 5%, but it is increasing rapidly as geometries decrease
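  • 67. Dynamic Energy Consumption (transition power) [figure: CMOS inverter with input Vin, output Vout, supply Vdd, and load capacitance CL]. Energy/transition = $C_L \cdot V_{DD}^2 \cdot P_{0 \to 1}$. Power = $C_L \cdot V_{DD}^2 \cdot f$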
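The quadratic dependence on supply voltage is the key point of the dynamic power expression. A small sketch of the two formulas follows; the function names and the example values (1 pF, 1 GHz) are illustrative assumptions.

```python
import math


def energy_per_transition(c_load, vdd, p01=1.0):
    """Energy/transition = C_L * VDD^2 * P(0->1), in joules."""
    return c_load * vdd ** 2 * p01


def dynamic_power(c_load, vdd, freq, p01=1.0):
    """Power = C_L * VDD^2 * f, scaled by the 0->1 transition probability."""
    return energy_per_transition(c_load, vdd, p01) * freq


if __name__ == "__main__":
    c_load, freq = 1e-12, 1e9            # hypothetical: 1 pF switched at 1 GHz
    for vdd in (1.2, 0.9, 0.6):
        print(f"VDD = {vdd} V -> {dynamic_power(c_load, vdd, freq) * 1e3:.3f} mW")
```

Halving the supply voltage cuts dynamic power to a quarter at the same frequency and load.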
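  • 68. Leakage Energy: independent of switching. Components: sub-threshold current, drain junction leakage, gate leakage [figure: inverter with one transistor OFF and output Vout]
  • 69. Modification for Circuits with Reduced Swing [figure: load capacitance CL charged from Vdd through a pass device, so the node reaches only Vdd − Vt]. $E_{0 \to 1} = C_L \cdot V_{DD} \cdot (V_{DD} - V_t)$. Can exploit reduced swing to lower power (e.g., reduced bit-line swing in memory)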
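To see how much a reduced swing saves, the reduced-swing energy can be compared with the full-swing value C_L · VDD². The sketch below assumes the expression E = C_L · VDD · (VDD − Vt) from the slide; the function names and the example voltages are hypothetical.

```python
import math


def full_swing_energy(c_load, vdd):
    """Energy drawn from the supply for a full 0 -> VDD transition."""
    return c_load * vdd * vdd


def reduced_swing_energy(c_load, vdd, vt):
    """Energy drawn from the supply when the node only rises to VDD - Vt."""
    return c_load * vdd * (vdd - vt)


if __name__ == "__main__":
    c_load, vdd, vt = 1e-12, 1.2, 0.4    # hypothetical values
    saving = 1.0 - reduced_swing_energy(c_load, vdd, vt) / full_swing_energy(c_load, vdd)
    print(f"energy saving from reduced swing: {saving:.1%}")
```

The relative saving is simply Vt / VDD, so it grows as the supply voltage is lowered toward the threshold.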
  • 70. Dynamic vs Static Power [figure: power density (W/cm²), 1E-8 to 1E+4, versus gate length (microns), 0.01 to 10, for active power and subthreshold power, showing a shrinking margin]. Source: Leon Stok, DAC 42
  • 71. Short Circuit Currents [figure: CMOS inverter (Vin, Vout, CL, Vdd) and a plot of supply current IVDD (mA), 0 to 0.15, versus Vin (V), 0 to 5]
  • 72. Reverse-Biased Diode Leakage [figure: reverse leakage current through the reverse-biased p+/N junctions, with supply Vdd and the gate marked]. $I_{DL} = J_S \times A$. $J_S$ = 1–5 pA/µm² for a 1.2 µm CMOS technology; $J_S$ = 10–100 pA/µm² at 25 °C for 0.25 µm CMOS. $J_S$ doubles with every 9 °C increase in temperature
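The "doubles every 9 degrees" rule compounds quickly. The sketch below scales a 25 °C leakage current density to another temperature and multiplies by the junction area; the function names are illustrative, and the exponential form is an assumption that simply encodes the doubling rule from the slide.

```python
import math


def leakage_density(js_25c, temp_c, doubling_step_c=9.0):
    """Current density at temp_c, given its 25 C value; doubles every 9 C."""
    return js_25c * 2.0 ** ((temp_c - 25.0) / doubling_step_c)


def diode_leakage(js, area):
    """I_DL = J_S * A (units follow whatever js and area are given in)."""
    return js * area


if __name__ == "__main__":
    js_25 = 10.0                      # lower end of the slide's 10-100 range
    for temp in (25, 34, 85):
        print(f"{temp} C: J_S = {leakage_density(js_25, temp):.1f}")
```

Going from 25 °C to 85 °C raises the junction leakage by roughly two orders of magnitude under this rule.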
  • 74. Basic VLSI Design Flow • System specification • Architectural design • Logic design • Circuit design • Device design • Layout (physical design) • Fabrication, on-wafer testing, packaging, testing. Abstraction levels marked alongside the flow: algorithmic level, register transfer level, gate level, transistor level
  • 75. Design Levels • System • Algorithmic/Module • RTL • Gate • Circuit • Device technology 75
  • 76. Optimization is possible at every level • Algorithmic level, e.g. DFT: $O(N^2)$; FFT: $O(N \log N)$ • Architectural level • Logic level • Circuit level • Device level …
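The algorithmic-level gain can be made concrete by counting complex multiplications: a direct N-point DFT needs N² of them, while a radix-2 FFT needs (N/2)·log2(N). The sketch below uses these standard counts; the function names are illustrative.

```python
def dft_mults(n):
    """Complex multiplications in a direct N-point DFT."""
    return n * n


def fft_mults(n):
    """Complex multiplications in a radix-2 FFT; n must be a power of two."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    stages = n.bit_length() - 1          # log2(n)
    return (n // 2) * stages


if __name__ == "__main__":
    for n in (64, 1024, 4096):
        print(f"N = {n}: DFT {dft_mults(n)}, FFT {fft_mults(n)}, "
              f"ratio {dft_mults(n) / fft_mults(n):.0f}x")
```

Fewer operations mean fewer switching events, which is why the choice of algorithm matters for power before any circuit is drawn.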
  • 77. System Level Design. The same MP3 application running on different systems consumes significantly different amounts of power • System partitioning • Buses/memory/IO devices/interfaces • Choice of components • Coding • System states (sleep/snooze etc.) • DVS/DFS/..
  • 78. Algorithmic/sub-system Level • Choice of algorithm (operation count etc.) • Word length choices • Module interfaces • Implementation technology – SW: Processor selection – HW: ASIC/FPGA/.. • Behavioral synthesis constraints and trade-off 78
  • 79. RTL • Pipelining/retiming • Module selection • Multiple frequency and voltage islands • Reduction in switching activity through transformations 79
  • 80. Gate Level • Clock gating • Power gating • Clock tree optimization • Logic level transformations to reduce switching activity 80
  • 81. Circuit Level • Transistor sizing • Power efficient circuits • Cell design • Multi-threshold circuits 81
  • 82. Device Technology • Multi-oxide devices • Multiple “cell types” on a single substrate – Logic, SRAM, Flash etc. • Support for many other low power design techniques (multiple thresholds, multiple voltages, multiple frequencies etc.) 82
  • 86. Reduction of Sub-threshold Leakage Current • Reduce supply voltage • Reduce size of the circuit – Resize transistors as per performance requirements – Dynamically cut power supply to unused circuits • Cooling • Raise the (effective) threshold voltage – Stack the off-transistors in series – Isolate the supply through sleep transistors – Dual threshold: higher threshold on non-critical paths – Adaptive body biasing
  • 87. Optimization at Logic Level: 2:1 multiplexer (inputs A, B; select S; output Y) [figure: one gate-level realization with 6 + 6 + 6 + 2 = 20 transistors and an optimized one with 4 + 4 + 4 + 2 = 14 transistors]
  • 88. Optimization at Circuit Level: 2:1 multiplexer using transmission-gate logic (inputs A, B; select S; output Y): 6 transistors (including 2 for inverting S)
  • 89. Optimized transistor-level realization of a Boolean function: Y = (AB + C)D [figure: a gate-level realization with 18 transistors versus a single complex gate between VDD and ground with 8 transistors]
  • 90. Low Power RTL Synthesis Techniques • Module selection • Retiming • Pipelining • Parallelism • Bus data encoding • FSM encoding • Transformations for Switching activity reduction 90
  • 91. Module Selection • Modules are used for implementing functional units, small memory modules etc. • Significant difference in power consumption of different implementations • Word-length as well as number coding techniques employed can play a significant role 91
  • 92. Ripple Carry Adder. Carry signal switching propagates through all the stages and consumes power. ACTEL: MAPLD2004
  • 93. Carry Look Ahead Adder. Carry signal switching propagates through far fewer stages, which not only reduces delay but can also reduce power consumption. ACTEL: MAPLD2004
  • 94. Other Operations and Operators • ALUs – Traditional method: perform all operations and use a select signal to choose the output; very inefficient in terms of switching activity – Alternative: permit switching activity only in the operator required in this cycle • Complex operators like MAC • CORDIC functions – Look-up table vs computation
  • 95. Alternative ALU Structures [figure: in the first structure the inputs feed all function units F1, F2, …, Fn and a mux driven by the function select picks the output; in the second, a demux driven by the function select steers the inputs to the selected unit only, ahead of the output mux]
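The difference between the two ALU structures can be mimicked in software by counting how many function units see new inputs in a cycle. This is only a behavioral sketch: the operator set and the function names are hypothetical, and the count of active units stands in for switching activity.

```python
import operator

# Hypothetical function units F1..Fn.
OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "and": operator.and_,
    "or": operator.or_,
}


def alu_compute_all(select, a, b):
    """Traditional ALU: every unit evaluates, then a mux picks one result."""
    results = {name: fn(a, b) for name, fn in OPS.items()}
    return results[select], len(results)      # (output, units that switched)


def alu_compute_selected(select, a, b):
    """Input-demux ALU: only the selected unit receives the operands."""
    return OPS[select](a, b), 1


if __name__ == "__main__":
    print(alu_compute_all("add", 6, 3))
    print(alu_compute_selected("add", 6, 3))
```

Both structures return the same result; the second keeps the unused units quiet at the cost of the extra demux logic.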
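  • 96. Retiming - Positioning a Flip-flop and Power Consumption [figure: a logic block driving the load capacitance CL directly, versus the same block followed by a flip-flop (FF) with input capacitance CR that then drives CL]. $P_1 = k \cdot E_g \cdot C_L$. $P_2 = k \cdot (E_g \cdot C_R + E_R \cdot C_L)$. $P_2$ can be less than $P_1$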
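The two expressions can be compared numerically. The reading of the symbols is an assumption here: E_g is taken as the (glitchy) switching activity at the logic output, E_R as the lower activity at the flip-flop output, C_R as the flip-flop input capacitance, and C_L as the load capacitance. The example values are hypothetical.

```python
def power_without_ff(k, e_g, c_l):
    """P1 = k * Eg * CL: glitchy logic output drives the large load directly."""
    return k * e_g * c_l


def power_with_ff(k, e_g, e_r, c_r, c_l):
    """P2 = k * (Eg * CR + ER * CL): glitches only reach the small FF input."""
    return k * (e_g * c_r + e_r * c_l)


if __name__ == "__main__":
    k, e_g, e_r, c_r, c_l = 1.0, 0.8, 0.3, 1.0, 10.0   # hypothetical values
    print("P1 =", power_without_ff(k, e_g, c_l))
    print("P2 =", power_with_ff(k, e_g, e_r, c_r, c_l))
```

P2 is lower whenever the flip-flop filters enough glitches (E_R well below E_g) and its input capacitance is small compared with the load.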
  • 97. Pipelining • Pipelining affects power in two different ways • One factor is similar to retiming, where flip-flops can cut down on glitches • Because pipelining shortens the critical path, allowing a higher frequency and performance (throughput), the supply voltage can instead be lowered at the same throughput to reduce power
  • 98. Effect of Pipelining • Case 1: no pipelining: frequency f0, voltage v0 • Case 2: pipelining for performance: f1 > f0, v1 = v0 • Case 3: pipelining for low power: f2 = f0, v2 < v0 [figure: logic blocks separated by flip-flops (FF) for each case]
  • 99. Increasing Parallelism/ Concurrency • Chandrakasan[4] first showed that concurrency can be used to reduce power instead of increasing performance • Primary idea is to reduce the frequency of operation and/or voltage to meet a certain throughput • Power consumed by additional logic required to distribute computation and multiplex results needs to be accounted for 99
  • 100. Effect of Parallelism • Case 1: single FU: frequency f0, voltage v0, throughput T0 • Case 2: two FUs for enhanced performance: f1 = f0, v1 = v0, T1 > T0 • Case 3: two FUs for reducing power: f2 < f0, v2 < v0, T2 = T0 [figure: functional units (FU) with registers, combined through multiplexers]
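The three cases can be compared with the dynamic power relation P ∝ C · V² · f, expressed relative to the single-FU design. The sketch below does that; the capacitance overhead (2.2x) and the reduced voltage (0.6x) in case 3 are hypothetical illustration values, not figures from the lecture.

```python
import math


def relative_power(cap_ratio, volt_ratio, freq_ratio):
    """Power relative to the reference design, using P ~ C * V^2 * f."""
    return cap_ratio * volt_ratio ** 2 * freq_ratio


if __name__ == "__main__":
    # Case 1: single FU, the reference.
    print("case 1:", relative_power(1.0, 1.0, 1.0))
    # Case 2: two FUs at the same frequency and voltage: 2x throughput, 2x power.
    print("case 2:", relative_power(2.0, 1.0, 1.0))
    # Case 3: two FUs at half frequency and reduced voltage, same throughput.
    # The 2.2x capacitance includes an assumed mux/distribution overhead.
    print("case 3:", relative_power(2.2, 0.6, 0.5))
```

Without the voltage reduction, halving the frequency of two units only breaks even; the saving comes entirely from the V² term, and the overhead logic eats into it.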
  • 101. Examples and Case Study • Use of redundant arithmetic • Use of alternative number representations (normalized / Gray coded) • Use of running transforms • Design of an alternative arithmetic unit (e.g. CORDIC) • Design of an FFT address generator
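  • 102. Alternative Arithmetic Unit: Co-ordinate Rotation Digital Computer (CORDIC) [figure: vector (x, y) rotated through the angle θ]. $x' = x\cos\theta + y\sin\theta$, $y' = -x\sin\theta + y\cos\theta$
  • 103. CORDIC Algorithm – a simple, smart way of computing trigonometric quantities (e.g., $\cos\theta$) in digital hardware and of realizing multiplierless architectures. CORDIC: “Coordinate Rotation Digital Computer”. Define a set of basic CORDIC angles $\alpha_k$, $0^\circ \le \alpha_k \le 90^\circ$: • $\alpha_k = \tan^{-1}(2^{-k})$, $k = 0, 1, 2, \ldots$ • $\alpha_0 = 45^\circ > \alpha_1 > \alpha_2 > \cdots > \alpha_k > \alpha_{k+1} > \cdots \to 0^\circ$. Given an angle $\theta$, $0^\circ \le \theta \le 90^\circ$, we can write $\theta = \sum_{i=0}^{\infty} \sigma_i \alpha_i$, where $\sigma_0 = 1$ and $\sigma_i = \pm 1$
  • 104. In practice, the summation is truncated to a finite number of terms, say $M$ (called the wordlength). The $\sigma_i$ sequence is generated as: $\theta(0) = \theta$, $\sigma_0 = 1$; for $i = 0$ to $M-1$: $\theta(i+1) = \theta(i) - \sigma_i \alpha_i$; $\sigma_{i+1} = \mathrm{sign}[\theta(i+1)]$; end
  • 105. CORDIC Rotation [figure: (x, y) rotated through θ to (x', y')]. $\mathrm{Rot}(\theta) = \prod_{k=0}^{M-1} \mathrm{Rot}(\sigma_k \alpha_k) = \mathrm{Rot}(\sigma_{M-1}\alpha_{M-1}) \cdots \mathrm{Rot}(\sigma_0 \alpha_0)$. $\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \cos\theta \begin{bmatrix} 1 & \tan\theta \\ -\tan\theta & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \mathrm{Rot}(\theta) \begin{bmatrix} x \\ y \end{bmatrix}$. Hence $\begin{bmatrix} x' \\ y' \end{bmatrix} = \left( \prod_{k=0}^{M-1} \cos\alpha_k \right) \begin{bmatrix} 1 & \sigma_{M-1}\tan\alpha_{M-1} \\ -\sigma_{M-1}\tan\alpha_{M-1} & 1 \end{bmatrix} \cdots \begin{bmatrix} 1 & \sigma_0\tan\alpha_0 \\ -\sigma_0\tan\alpha_0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$
  • 106. With $\tan\alpha_k = 2^{-k}$: $\begin{bmatrix} x' \\ y' \end{bmatrix} = \left( \prod_{k=0}^{M-1} \cos\alpha_k \right) \begin{bmatrix} 1 & \sigma_{M-1} 2^{-(M-1)} \\ -\sigma_{M-1} 2^{-(M-1)} & 1 \end{bmatrix} \cdots \begin{bmatrix} 1 & \sigma_0 2^{0} \\ -\sigma_0 2^{0} & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$ • $\prod_{k=0}^{M-1} \cos\alpha_k$: universal constant • Elementary rotations need only shifting operations and can be pipelined • Shifting is done by direct bus connections: $v = 2^{-i} u$
  • 107. Rotation by $\sigma_i \alpha_i$ (i-th stage of a pipelined CORDIC unit) [figure: inputs $x_i$, $y_i$, $\theta_i$; two shifters by $2^{-i}$; the constant $\alpha_i$; a sign block producing $\sigma_i$; outputs $x_{i+1}$, $y_{i+1}$, $\theta_{i+1}$]. Pipelined CORDIC Unit (PCU) [figure: $x_0 = x$, $y_0 = y$, $\theta_0 = \theta$ enter $\mathrm{Rot}(\sigma_0\alpha_0)$; pipeline latches carry $x_1$, $y_1$, $\theta_1$ and so on up to $x_{M-1}$, $y_{M-1}$, $\theta_{M-1}$, which enter $\mathrm{Rot}(\sigma_{M-1}\alpha_{M-1})$ to give $x_M$, $y_M$]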
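The CORDIC slides can be summarized as a short behavioral model. The sketch below is a floating-point illustration under stated assumptions, not the lecture's hardware: multiplications by 2^-i stand in for the wired shifts, the rotation sense follows x' = x cos θ + y sin θ, y' = −x sin θ + y cos θ, the residual angle drives the sign choice, and the function name and the default of 16 stages are hypothetical.

```python
import math


def cordic_rotate(x, y, theta_deg, m=16):
    """Rotate (x, y) by theta (0..90 degrees) using m CORDIC micro-rotations."""
    # Basic CORDIC angles alpha_k = atan(2^-k), in radians.
    alphas = [math.atan(2.0 ** -k) for k in range(m)]
    # Universal constant: product of cos(alpha_k), applied once at the end.
    gain = math.prod(math.cos(a) for a in alphas)

    residual = math.radians(theta_deg)          # theta(0) = theta
    for i, alpha in enumerate(alphas):
        sigma = 1.0 if residual >= 0.0 else -1.0   # sigma_i = sign of residual
        shift = 2.0 ** -i                          # a wired shift in hardware
        # Elementary rotation [[1, s*2^-i], [-s*2^-i, 1]]: shifts and adds only.
        x, y = x + sigma * shift * y, y - sigma * shift * x
        residual -= sigma * alpha                  # theta(i+1)
    return gain * x, gain * y


if __name__ == "__main__":
    xr, yr = cordic_rotate(1.0, 0.0, 30.0)
    print(f"x' = {xr:.5f} (cos 30 = {math.cos(math.radians(30)):.5f})")
    print(f"y' = {yr:.5f} (-sin 30 = {-math.sin(math.radians(30)):.5f})")
```

Each loop iteration corresponds to one pipeline stage of the PCU; in a fixed-point implementation the final multiplication by the constant gain is usually folded into the initial values of x and y.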
  • 108. CONCLUSION •There ALWAYS EXISTS A BETTER SOLUTION than the present one and we can think of that. •But There ALWAYS EXISTS A BETTER SOLUTION than what we can think of! 108