4. nMOS: 1 = ON
pMOS: 0 = ON
Series: both must be ON
Parallel: either can be ON
[Figure: whether terminals a and b are connected for gate inputs g1 g2 = 00, 01, 10, 11]
(a) nMOS in series: OFF OFF OFF ON
(b) pMOS in series: ON OFF OFF OFF
(c) nMOS in parallel: OFF ON ON ON
(d) pMOS in parallel: ON ON ON OFF
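The series and parallel rules on this slide can be checked with a short Python sketch. The function names are illustrative and do not come from the slides; each transistor is modeled as an ideal switch.

```python
from itertools import product

def nmos(g):
    """An nMOS switch conducts when its gate is 1."""
    return g == 1

def pmos(g):
    """A pMOS switch conducts when its gate is 0."""
    return g == 0

def series(switch, g1, g2):
    """Two switches in series conduct only if both are ON."""
    return switch(g1) and switch(g2)

def parallel(switch, g1, g2):
    """Two switches in parallel conduct if either is ON."""
    return switch(g1) or switch(g2)

def row(network, switch):
    """ON/OFF state of the network for g1 g2 = 00, 01, 10, 11."""
    return ["ON" if network(switch, g1, g2) else "OFF"
            for g1, g2 in product((0, 1), repeat=2)]

for label, network, switch in [("nMOS series", series, nmos),
                               ("pMOS series", series, pmos),
                               ("nMOS parallel", parallel, nmos),
                               ("pMOS parallel", parallel, pmos)]:
    print(label, row(network, switch))
```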
5. Complementary CMOS gates always produce 0 or 1
Ex: NAND gate
◦ Series nMOS: Y=0 when both inputs are 1
◦ Thus Y=1 when either input is 0
◦ Requires parallel pMOS
Rule of Conduction Complements
◦ Pull-up network is complement of pull-down
◦ Parallel -> series, series -> parallel
[Figure: CMOS NAND gate with inputs A and B and output Y]
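As a rough illustration of why a complementary gate always produces 0 or 1, the NAND gate can be modeled with ideal switches. This is a sketch, and `nand_gate` is a hypothetical helper name: exactly one of the two networks conducts for every input combination, so the output is never floating and never shorted.

```python
from itertools import product

def nand_gate(a, b):
    """Model a CMOS NAND gate as two ideal switch networks."""
    pull_down = (a == 1) and (b == 1)   # series nMOS: conducts when both inputs are 1
    pull_up = (a == 0) or (b == 0)      # parallel pMOS: conducts when either input is 0
    # Complementary networks: exactly one of them conducts.
    assert pull_up != pull_down
    return 1 if pull_up else 0

truth_table = {(a, b): nand_gate(a, b) for a, b in product((0, 1), repeat=2)}
print(truth_table)
```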
6. Compound gates can do any inverting function
Ex: AND-AND-OR-INV (AOI22): Y = ((A•B) + (C•D))'
[Figure: panels (a)-(f) showing the construction of the AOI22 gate with inputs A, B, C, D and output Y]
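The rule of conduction complements can be applied mechanically to a compound gate. The sketch below assumes a small tuple encoding of switch networks (an illustrative choice, not something the slides define), derives the AOI22 pull-up network from its pull-down network, and evaluates the gate.

```python
from itertools import product

# A network is an input name, or ("series", x, y), or ("parallel", x, y).
def complement(net):
    """Swap series and parallel at every level of the network."""
    if isinstance(net, str):
        return net
    kind, x, y = net
    dual = "parallel" if kind == "series" else "series"
    return (dual, complement(x), complement(y))

def conducts(net, inputs, on_level):
    """on_level is 1 for an nMOS network and 0 for a pMOS network."""
    if isinstance(net, str):
        return inputs[net] == on_level
    kind, x, y = net
    cx = conducts(x, inputs, on_level)
    cy = conducts(y, inputs, on_level)
    return (cx and cy) if kind == "series" else (cx or cy)

# Pull-down of AOI22: (A series B) in parallel with (C series D).
pull_down = ("parallel", ("series", "A", "B"), ("series", "C", "D"))
pull_up = complement(pull_down)

def aoi22(a, b, c, d):
    inputs = {"A": a, "B": b, "C": c, "D": d}
    down = conducts(pull_down, inputs, 1)   # nMOS network to GND
    up = conducts(pull_up, inputs, 0)       # pMOS network to VDD
    assert up != down                       # exactly one network conducts
    return 1 if up else 0

for bits in product((0, 1), repeat=4):
    print(bits, aoi22(*bits))
```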
9. Transistors can be used as switches
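nMOS (gate g, source s, drain d):
◦ g = 0: s and d disconnected (OFF)
◦ g = 1: s and d connected (ON)
◦ Input 0 -> output strong 0; input 1 -> output degraded 1
pMOS (gate g, source s, drain d):
◦ g = 0: s and d connected (ON)
◦ g = 1: s and d disconnected (OFF)
◦ Input 0 -> output degraded 0; input 1 -> output strong 1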
10. Figure 3: How voltages correspond to logic levels.
◦ VH to VDD: logic 1
◦ VL to VH: unknown (X)
◦ VSS to VL: logic 0
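The mapping from voltage to logic level can be written as a small classifier. The threshold values below are assumptions chosen only for illustration, as is the choice to treat the thresholds themselves as valid logic levels; real values depend on the process and the logic family.

```python
# Assumed example voltages (in volts); not taken from the slides.
VDD, VH, VL, VSS = 1.0, 0.7, 0.3, 0.0

def logic_level(v):
    """Classify a node voltage as logic '1', logic '0', or unknown 'X'."""
    if not VSS <= v <= VDD:
        raise ValueError("voltage outside the supply range")
    if v >= VH:
        return "1"
    if v <= VL:
        return "0"
    return "X"

for v in (0.9, 0.5, 0.1):
    print(v, logic_level(v))
```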
11. Strength of signal
◦ How closely it approximates an ideal voltage source
VDD and GND rails are the strongest 1 and 0
nMOS transistors pass a strong 0
◦ But a degraded or weak 1
pMOS transistors pass a strong 1
◦ But a degraded or weak 0
Thus nMOS transistors are best for the pull-down network
Thus pMOS transistors are best for the pull-up network
13. Pass transistors produce degraded outputs
Transmission gates pass both 0 and 1 well
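Transmission gate (terminals a and b, gate g, complementary gate gb):
◦ g = 0, gb = 1: a and b disconnected
◦ g = 1, gb = 0: a and b connected
◦ With g = 1, gb = 0: input 0 -> output strong 0; input 1 -> output strong 1
[Figure: equivalent circuit symbols for the transmission gate]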
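A first-order numeric sketch shows why single pass transistors degrade one logic level while a transmission gate does not. It assumes a single threshold voltage `VT` for both device types and an example supply `VDD`; both values are illustrative, and the model ignores body effect and leakage.

```python
VDD = 1.0   # assumed supply voltage (V)
VT = 0.3    # assumed threshold voltage magnitude (V)

def nmos_pass(vin):
    """nMOS with gate at VDD: the output cannot rise above VDD - VT."""
    return min(vin, VDD - VT)

def pmos_pass(vin):
    """pMOS with gate at 0: the output cannot fall below VT."""
    return max(vin, VT)

def transmission_gate(vin):
    """nMOS and pMOS in parallel: the device that passes vin better wins."""
    n, p = nmos_pass(vin), pmos_pass(vin)
    return n if abs(n - vin) <= abs(p - vin) else p

for vin in (0.0, VDD):
    print(vin, nmos_pass(vin), pmos_pass(vin), transmission_gate(vin))
```

In this model the nMOS output for a 1 stops at VDD - VT (degraded 1), the pMOS output for a 0 stops at VT (degraded 0), and the transmission gate reproduces both inputs exactly.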
14. Other Forms of CMOS Logic (or) Alternate gate circuits:
Pseudo-nMOS logic:
Pseudo-nMOS NAND gate.
17. Figure: the circuit for a particular DCVSL gate. This gate
computes a+bc on one output and (a+bc)' = a'b'+a'c' on its other
output.
[Figure: DCVSL gate with inputs a, b, c and their complements a', b', c']
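The identity (a+bc)' = a'b'+a'c' follows from De Morgan's laws, and it can also be confirmed by enumerating all eight input combinations. A minimal check, with illustrative function names:

```python
from itertools import product

def f(a, b, c):
    """True output: a + bc."""
    return a | (b & c)

def f_bar(a, b, c):
    """Complementary output: a'b' + a'c'."""
    na, nb, nc = 1 - a, 1 - b, 1 - c
    return (na & nb) | (na & nc)

# The two outputs must always be complementary (one is 1, the other 0).
complementary = all(f(a, b, c) + f_bar(a, b, c) == 1
                    for a, b, c in product((0, 1), repeat=3))
print(complementary)
```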
28. Traditionally, chip is surrounded by pad frame
◦ Metal pads on 100 – 200 µm pitch
◦ Gold bond wires attach pads to package
◦ Lead frame distributes signals in package
◦ Metal heat spreader helps with cooling
29. Decompose a large complex system into smaller
subsystems
Decompose hierarchically until each subsystem is
of manageable size
Design each subsystem separately to speed up
the process
Minimize connection between two subsystems to
reduce interdependency
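The goal of minimizing connections between subsystems is usually measured as a cut size: the number of nets that touch more than one partition. The sketch below uses a made-up four-cell netlist; the cell names and the `cut_size` helper are illustrative only.

```python
def cut_size(nets, part_of):
    """Count nets whose cells do not all lie in the same partition."""
    return sum(1 for net in nets
               if len({part_of[cell] for cell in net}) > 1)

# Hypothetical netlist: each net is a tuple of the cells it connects.
nets = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]

good = {"a": 0, "b": 0, "c": 1, "d": 1}   # neighbors kept together
bad = {"a": 0, "b": 1, "c": 0, "d": 1}    # neighbors split apart

print(cut_size(nets, good), cut_size(nets, bad))
```

A partitioning algorithm would search over assignments like `good` and `bad` for one with a small cut size, subject to limits on the size of each partition.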
31. Several blocks after partitioning:
Need to:
◦ Put the blocks together.
◦ Design each block.
32. How to put the blocks together without knowing
their shapes and the positions of the I/O pins?
If we design the blocks first, those blocks may not
be able to form a tight packing.
33. Top-down partitioning
◦ Iterative improvement
◦ Spectral based
◦ Clustering methods
◦ Network flow based
◦ Analytical based
◦ Multi-level
Bottom-up clustering
◦ Unit delay model
◦ General delay model
◦ Sequential circuits with retiming
34. The floorplanning problem is to plan the
positions and shapes of the modules at the
beginning of the design cycle to optimize the
circuit performance
◦ chip area
◦ total wirelength
◦ delay of critical path
◦ routability
◦ others, e.g., noise, heat dissipation, etc.
35. Floorplanning and placement both determine block positions to
optimize circuit performance.
Floorplanning:
◦ Details like shapes of blocks, I/O pin positions, etc. are not
yet fixed (blocks with flexible shape are called soft blocks).
Placement:
◦ Details like module shapes and I/O pin positions are fixed
(blocks with no flexibility in shape are called hard blocks).
36. Output from partitioning used for floorplanning
Inputs:
◦ Blocks with well-defined shapes and area
◦ Blocks with approximated area and no particular shape
◦ Netlist specifying block connections
Outputs:
◦ Locations for all blocks
37. Objectives
◦ Minimize area
◦ Reduce wirelength
◦ Maximize routability
◦ Determine shapes of
flexible blocks
Constraints
◦ Shape of each block
◦ Area of each block
◦ Pin locations for each
block
◦ Aspect ratio
38. Dead space is the space that is wasted.
Minimizing area is the same as minimizing
dead space.
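Dead space can be computed as the chip area minus the total block area, which is why minimizing one minimizes the other when the blocks are fixed. A small sketch with made-up block dimensions, assuming the blocks fit in the chip outline without overlapping:

```python
def dead_space(blocks, chip_w, chip_h):
    """Chip area not covered by any block (blocks are (width, height) pairs)."""
    used = sum(w * h for w, h in blocks)
    return chip_w * chip_h - used

# Hypothetical blocks that pack into a 4 x 3 outline.
blocks = [(2, 3), (2, 2), (1, 1)]
print(dead_space(blocks, 4, 3))
```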
39. Slicing Floorplan:
One that can be obtained by
repetitively subdividing (slicing)
rectangles horizontally or
vertically.
Non-Slicing Floorplan:
One that may not be obtained
by repetitively subdividing alone.
Otten (LSSS-82) pointed out
that slicing floorplans are much
easier to handle.
40. General case: all modules are soft macros
Phase 1: bottom-up
◦ Input: floorplan tree, module shapes
◦ Start with a sorted list of shapes for each module
◦ Perform vertical_node_sizing and horizontal_node_sizing
◦ On reaching the root node, we have a list of shapes;
select the one that is best in terms of area
Phase 2: top-down
◦ Traverse the floorplan tree and set module locations
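Phase 1 can be sketched as follows. The slides name `vertical_node_sizing` and `horizontal_node_sizing` but do not define them, so the bodies below are assumptions: a vertical cut places the children side by side (widths add, heights take the maximum), and a horizontal cut stacks them. The tree encoding and the `prune` helper are also illustrative, and the sketch tries all shape pairs rather than using a faster merge of sorted lists. Phase 2 (setting locations) is omitted.

```python
from itertools import product

def prune(shapes):
    """Keep only (width, height) shapes not dominated by a smaller one."""
    best = []
    for w, h in sorted(set(shapes)):        # increasing width, then height
        if not best or h < best[-1][1]:     # must be strictly shorter to help
            best.append((w, h))
    return best

def vertical_node_sizing(left, right):
    """Children side by side: widths add, height is the larger one."""
    return prune([(wl + wr, max(hl, hr))
                  for (wl, hl), (wr, hr) in product(left, right)])

def horizontal_node_sizing(bottom, top):
    """Children stacked: heights add, width is the larger one."""
    return prune([(max(wb, wt), hb + ht)
                  for (wb, hb), (wt, ht) in product(bottom, top)])

def size_tree(node):
    """Bottom-up pass. A node is a list of leaf shapes or (cut, child, child)."""
    if isinstance(node, list):
        return prune(node)
    cut, a, b = node
    sa, sb = size_tree(a), size_tree(b)
    return vertical_node_sizing(sa, sb) if cut == "V" else horizontal_node_sizing(sa, sb)

def best_shape(node):
    """At the root, pick the shape with the smallest area."""
    return min(size_tree(node), key=lambda s: s[0] * s[1])

# Two soft modules, each with a few permissible shapes.
A = [(1, 4), (2, 2), (4, 1)]
B = [(1, 2), (2, 1)]
print(size_tree(("V", A, B)), best_shape(("V", A, B)))
```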
43. The process of arranging circuit components on a
layout surface
Inputs: Set of fixed modules, netlist
Output: Best position for each module based on
various cost functions
Cost functions include wirelength, wire routability,
hotspots, performance, I/O pads
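A wirelength cost function needs an estimate that can be computed before routing. One common estimate, used here as an assumption since the slides do not name one, is the half-perimeter wirelength (HPWL): the half perimeter of the bounding box around each net's pins. Module names and coordinates are made up.

```python
def hpwl(net, pos):
    """Half-perimeter of the bounding box around the modules on one net."""
    xs = [pos[m][0] for m in net]
    ys = [pos[m][1] for m in net]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def total_wirelength(nets, pos):
    """Sum of the HPWL estimates over all nets."""
    return sum(hpwl(net, pos) for net in nets)

nets = [("m1", "m2"), ("m1", "m2", "m3")]
placement_a = {"m1": (0, 0), "m2": (3, 0), "m3": (0, 4)}
placement_b = {"m1": (0, 0), "m2": (3, 0), "m3": (1, 1)}

print(total_wirelength(nets, placement_a), total_wirelength(nets, placement_b))
```

Under this estimate `placement_b` is the better of the two, because moving m3 closer shrinks the bounding box of the three-pin net.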
44. Good placement
◦ No congestion
◦ Shorter wires
◦ Fewer metal levels
◦ Smaller delay
◦ Lower power dissipation
Bad placement
◦ Congestion
◦ Longer wire lengths
◦ More metal levels
◦ Longer delay
◦ Higher power dissipation
47. Connect the various standard cells using wires
Input:
◦ Cell locations, netlist
Output:
◦ Geometric layout of each net connecting various
standard cells
Two-step process
◦ Global routing
◦ Detailed routing
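To make the output of routing concrete, the sketch below routes one two-pin net on a grid with a breadth-first search, in the style of a Lee maze router. The slides do not specify a routing algorithm, so this is only one possible illustration; the grid encoding (0 = free, 1 = blocked) is an assumption.

```python
from collections import deque

def maze_route(grid, src, dst):
    """Shortest path of free cells from src to dst, or None if unroutable."""
    rows, cols = len(grid), len(grid[0])
    prev = {src: None}                 # visited cells and their predecessors
    queue = deque([src])
    while queue:
        cell = queue.popleft()
        if cell == dst:                # backtrack from the target to the source
            path = []
            while cell is not None:
                path.append(cell)
                cell = prev[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and (nr, nc) not in prev):
                prev[(nr, nc)] = cell
                queue.append((nr, nc))
    return None

grid = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],   # blocked cells force the wire through column 2
    [0, 0, 0, 0],
]
path = maze_route(grid, (0, 0), (2, 0))
print(path)
```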
51. Objective
◦ 100% connectivity of a system
◦ Minimize area
◦ Minimize wirelength
Constraints
◦ Number of routing layers
◦ Design rules
◦ Timing (delay)
◦ Crosstalk
◦ Process variations
53. Looked at the physical design flow
Involved several steps
◦ Partitioning
◦ Floorplanning
◦ Placement
◦ Routing
Each step can be formulated as an optimization
problem
Need to go through 2 or more iterations in each step
to generate an optimized solution
Editor's Notes
The next step in the physical design flow is the floorplanning algorithm. The output of partitioning is fed to the floorplanning algorithm.
Let us start with partitioning. The idea is to decompose a large, complex system into smaller subsystems. One can use either hierarchical or flat partitioning. Partitioning continues until each subsystem is of manageable size; in short, it is a divide-and-conquer strategy. Each subsystem can then be designed separately, which speeds up the process. One important constraint while partitioning a system is to minimize the number of connections between any two subsystems, which reduces their interdependency.
Various algorithms have been proposed since 1969 to partition a system. You can adopt either a top-down partitioning approach or a bottom-up clustering approach. Each partitioning algorithm adopts a different strategy: iterative improvement, for example, refines a partition over several iterations, while the analytical method takes a mathematical approach to the partitioning problem. The objective is chosen to suit the system.
The inputs to the floorplanning stage are a set of blocks that have well-defined shapes and areas (like memory blocks, which are highly regular), blocks that have approximate area and no particular shape (like any new architecture), and the netlist that specifies the connections between blocks. The output of the floorplanning stage is the location of each block.
The floorplanning problem can be formulated as follows. The objective of the floorplanning stage is to minimize the aggregate area of the system, reduce the wirelength, maximize the routability, and determine the shapes of flexible blocks. We are constrained by the shape of each block, the area of each block, the pin locations for each block, and the aspect ratio. Here we have a sample of an optimal floorplan in terms of area and the corresponding non-optimal floorplan using the same blocks.
Let us look at a simple algorithm for slicing floorplans. The modules used for this algorithm are all soft macros, which means the aspect ratio is flexible and the modules can be rotated. The algorithm is a two-phase process. The first phase is bottom-up. It uses the floorplan tree and the different module shapes as input. The floorplan tree is generated based on the partitioning results. Starting from a sorted list of module shapes, we perform vertical node sizing and horizontal node sizing. We will look at horizontal and vertical sizing in the next slide. On reaching the root node, we have a list of shapes and we select the one that is best in terms of area. The second phase is top-down. Here we traverse the floorplan tree and set the location of each module.
Here we have an example of vertical sizing and horizontal sizing. We have two blocks, A and B. We need to figure out the best aspect ratio for each block and the least aggregate area of the two blocks. For the floorplanning step there is an upper and a lower limit on the aspect ratio of the two blocks. As you can see from the slide, different permissible aspect ratios and orientations are tried for both blocks. As you can see from the figure, sizes a3 and b2 are selected.
The next stage after floorplanning is placement.
The objective of placement is straightforward: arrange the circuit components on a layout surface under certain constraints. The placement algorithms take as input a set of fixed modules and the netlist describing the connections between them. The output of the algorithms is the best possible position for each module based on various cost functions. You can have one or more cost functions depending on your design. The cost functions include total wirelength, wire routability, congestion, performance, and I/O pad locations.
Here we have two sample placement results. The one on the left is clearly better than the one on the right. So what is good about the placement on the left, and what are the implications? Notice that the I/O pads for the placement on the left are distributed along the periphery, while for the placement on the right the I/O pads are clustered in one place. On the left there is no congestion for wire routing, which avoids hotspots, and the wires are shorter, which reduces area, power, and delay. It also reduces the number of metal levels. The bad placement, on the other hand, has congestion, long wires, more delay, more metal levels, and higher power dissipation, leading to an inefficient design.
The next stage in the physical design flow is routing.
The routing stage connects the various standard cells of our system using wires. The routing algorithms take the cell locations and the netlist as input and generate a geometric layout of each net that connects the various standard cells. Routing is a two-step process: global routing and detailed routing.
So what is the difference between global routing and detailed routing? Suppose the chip is North America and some travelers in California need advice on how to drive from Stanford (near San Francisco) to Caltech (near Los Angeles). The floorplanner has decided that California is on the left (west) side of the ASIC, and the placement tool has put Stanford in Northern California and Caltech in Southern California. Floorplanning and placement have defined the roads and freeways. There are two ways to go: the coastal route (using Highway 101) or the inland route (using Interstate I5, which is usually faster). The global router specifies the coastal route because the travelers are not in a hurry and I5 is congested (the global router knows this because it has already routed onto I5 many other travelers who are in a hurry today). Next, the detailed router looks at a map and gives directions from Stanford onto Highway 101 south through San Jose, Monterey, and Santa Barbara to Los Angeles, and then off the freeway to Caltech in Pasadena. In the case of a chip, global routing generates a loose route for each net and assigns it a list of routing regions, while detailed routing produces the geometric layout of each net within the assigned routing regions. As this example shows, global routing determines the approximate routes between the various modules of the system, while detailed routing generates the geometric layout of each net.
The system routing problem can be formulated as follows – the objective is to provide 100% connectivity of the system, while minimizing area and wirelength. And we are constrained by the number of metal layers, design rules like spacing between different wires, wire dimensions, timing, interconnect crosstalk and variations in the technology manufacturing process.
Once we have routed the signal interconnects, next we route the clocks and power lines. We talked about clock routing last week. The idea is to minimize the clock skew and the delay along the clock routing path. In case of power routing, we need to have low resistance metal lines to ensure adequate current.