A Report on
Designing and Implementing a 32 bit ALU
with Four 8 bit ALUs Using
Cadence Tools
BY
Name: K. Gautham Reddy
ID No: 2011A8ps364G
IN PARTIAL FULFILLMENT OF THE COURSE ‘ANALOG AND DIGITAL VLSI DESIGN’
ACKNOWLEDGEMENT
I sincerely thank Mr. Pravin Mane for giving me this opportunity, through which I gained
practical experience of working with the Cadence tools. This project gave me great
insight.
Abstract:
This project describes the design and implementation of the layout of a basic 32 bit
ALU using four 8 bit ALUs. ALU stands for Arithmetic Logic Unit, a digital circuit
that performs arithmetic and logical operations. In this project, the 32 bit ALU is
designed, implemented and simulated using the Cadence software. The report covers
the Verilog code from which the chip layout is generated using the Cadence tools,
the testbench used in NCverilog to verify the code, and the Encounter implementation,
which is then imported into Virtuoso for layout extraction and checking.
Table of contents:
1) Introduction and working of design
2) Flow chart
3) Code for implementing 32 bit ALU using four 8 bit ALUs
4) Testbench
5) NCverilog
6) RTL compiler
i) Schematic of 32 bit ALU
ii) Schematic of 8 bit ALU
iii) Initial slack report
iv) Report mapped gates
v) Report statistics
vi) Worst case critical path
vii) Instance power usage
viii) Probability histograms
ix) Net power usage
x) End point slack histograms
xi) Datapath area
xii) Toggle rate histograms
7) Encounter tool
i) Specify floor plan
ii) Add rings
iii) Add stripes
iv) Special route
v) Place standard cell
vi) Pre CTS
vii) Post CTS
viii) Slack after Post CTS
ix) Nano route
x) Post route
xi) Add filler
xii) Verify geometry
xiii) Verify connectivity
8) Virtuoso tool
i) Layout
ii) Extracted view
iii) DRC check error report
9) Check netlist and check placement report
10) Part of summary report
1) Introduction and working of design
The Arithmetic and Logic Unit (ALU) is one of the most basic components of any computing machine.
It performs arithmetic and logical operations such as addition, subtraction, AND, OR and XOR
on given data and generates the results. The number of operations it can perform and the size
of the data it can work with depend on the design of the ALU.
In this project, I built a 32 bit ALU from four 8 bit ALUs. The inputs to the ALU are two
32 bit operands, a one bit cin (carry in) used when the selected operation is addition, and a
3 bit select that chooses the operation to be performed. The 32 bit operands are divided into
four 8 bit slices and sent to the four ALUs. When the selected operation is addition, the carry
out of the first ALU is given as the carry in to the second ALU and so on, and the carry out of
the fourth ALU is the final cout. This design can also be extended to implement 64 bit or
128 bit ALUs using the same algorithm.
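The extension to wider words can be sketched with a generate loop. This is only an illustrative sketch and not part of the implemented design: the module name alu_nbit, the parameter SLICES and the instance name u_alu are hypothetical, it assumes the alu_8bit module from section 3 is compiled alongside it, and it leaves the per-slice comparator flags uncombined for brevity.

```verilog
// Hypothetical wrapper (not part of the original design): chains SLICES
// copies of the 8 bit ALU, so SLICES = 4 gives 32 bits and SLICES = 8 gives 64.
// Assumes alu_8bit(out,cout,g,e,A,B,cin,S) from section 3 is compiled with it.
module alu_nbit #(parameter SLICES = 8) (
  output [8*SLICES-1:0] out,
  output                cout,
  input  [8*SLICES-1:0] A, B,
  input                 cin,
  input  [2:0]          S
);
  wire [SLICES:0]   c;     // c[i] is the carry into slice i
  wire [SLICES-1:0] g, e;  // per-slice comparator flags (not combined here)

  assign c[0] = cin;       // external carry in feeds the lowest slice
  assign cout = c[SLICES]; // carry out of the highest slice

  genvar i;
  generate
    for (i = 0; i < SLICES; i = i + 1) begin : slice
      // Slice i works on bits [8*i+7 : 8*i] and passes its carry to slice i+1
      alu_8bit u_alu (out[8*i +: 8], c[i+1], g[i], e[i],
                      A[8*i +: 8], B[8*i +: 8], c[i], S);
    end
  endgenerate
endmodule
```

With SLICES = 4 the carry chain is the same as in the hand-written 32 bit module; a wider version would still need the greater/equal flags combined in the same most-significant-slice-first manner.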
I first described the ALU in Verilog and then extracted the layout of the design using the
Cadence tools Encounter and Virtuoso. The Cadence suite is electronic design automation
software that supports circuit schematic and layout design and circuit simulation.
The Verilog code was used to generate the stick diagrams of all the possible circuits in
Encounter. The layout was then imported into Virtuoso and verified. The results of the design
were satisfactory in terms of accuracy.
This report documents the steps in the construction of the ALU. First, the code for the two modules,
the 32 bit ALU and the 8 bit ALU, is written along with its testbench. The code is then verified using
NCverilog, and the resulting waveform is displayed with 0 errors and 0 warnings. Next, the
schematics of both the 32 bit ALU and the 8 bit ALU are displayed, and analyses such as total power
usage, instance power usage and worst case critical path are carried out. Then the Encounter tool is
used to implement the circuit, and the final slack is brought to a small positive value near zero by
adjusting the clock period. Finally, the design is saved as a DEF file, which is used in
Virtuoso for the extraction of the layout.
2) FLOW CHART:
Start
Inputs: [31:0] A, B, [2:0] S and cin
The inputs are split into four 8 bit slices, each with [2:0] S and a carry in:
Inputs: [7:0] A, B
Inputs: [15:8] A, B
Inputs: [23:16] A, B
Inputs: [31:24] A, B
Each slice performs the arithmetic and logic operations
Outputs: [31:0] out, cout (in case of addition)
End
3) Code for implementing 32 bit ALU using four 8 bit ALUs:
module alu_8bit(out,cout,g,e,A,B,cin,S);
  output reg [7:0] out;
  output reg cout,g,e;
  input [7:0] A,B;
  input cin;
  input [2:0] S;
  // Operation select codes
  parameter BUF_A = 3'b000;
  parameter NOT_A = 3'b001;
  parameter ADD = 3'b010;
  parameter OR = 3'b011;
  parameter AND = 3'b100;
  parameter NOT_B = 3'b101;
  parameter BUF_B = 3'b110;
  parameter LOW = 3'b111;
  always @(A or B or S or cin) begin
    // Comparator: g is high when A > B, e is high when A == B
    g = A>B;
    e = A==B;
    // Operation selected by S
    // Note: cout is assigned only for ADD; for every other operation it
    // keeps its previous value, which synthesis infers as a latch.
    case(S)
      BUF_A: out = A;
      NOT_A: out = ~A;
      ADD: {cout,out} = A+B+cin;
      OR: out = A | B;
      AND: out = A & B;
      NOT_B: out = ~B;
      BUF_B: out = B;
      LOW: out = {8{1'b0}};
    endcase
  end
endmodule
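module alu_32bit(out,cout,g,e,A,B,cin,S);
  output [31:0] out;
  output cout,g,e;
  input [31:0] A,B;
  input cin;
  input [2:0] S;
  wire e1,e2,e3,e4;      // per-slice "equal" flags
  wire g1,g2,g3,g4;      // per-slice "greater" flags
  wire cin2,cin3,cin4;   // carry chain between the slices
  // Four 8 bit slices; ALU1 handles the least significant byte
  alu_8bit ALU1(out[7:0],cin2,g1,e1,A[7:0],B[7:0],cin,S);
  alu_8bit ALU2(out[15:8],cin3,g2,e2,A[15:8],B[15:8],cin2,S);
  alu_8bit ALU3(out[23:16],cin4,g3,e3,A[23:16],B[23:16],cin3,S);
  alu_8bit ALU4(out[31:24],cout,g4,e4,A[31:24],B[31:24],cin4,S);
  // A > B when the most significant byte that differs is greater in A
  assign g = g4 | (e4 & g3) |(e4 & e3 & g2) | (e4& e3 & e2 & g1);
  // A == B when all four bytes are equal
  assign e = e4 & e3 & e2 & e1;
endmodule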
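The way the byte-wise comparator flags are combined can be checked on its own. The following stand-alone sketch is not part of the original project: the module name cmp_combine_check and the stimulus scheme are assumptions made for illustration. It recomputes the flags of each byte, combines them with the same expressions used in alu_32bit, and compares the result with a direct 32 bit comparison.

```verilog
// Stand-alone check (hypothetical, simulation only): the byte-wise
// greater/equal flags combined as in alu_32bit must agree with a
// direct unsigned 32 bit comparison of A and B.
module cmp_combine_check;
  reg [31:0] A, B;
  reg g1, g2, g3, g4, e1, e2, e3, e4;
  reg g, e;
  integer k, errors;

  initial begin
    errors = 0;
    for (k = 0; k < 3000; k = k + 1) begin
      A = $random;
      // Mix of stimulus: equal operands, a single differing bit
      // (so the higher bytes are equal), and unrelated random values
      if (k % 3 == 0)      B = A;
      else if (k % 3 == 1) B = A ^ (32'd1 << (k % 32));
      else                 B = $random;

      // Flags of each 8 bit slice (slice 1 is the least significant byte)
      g1 = A[7:0]   > B[7:0];    e1 = A[7:0]   == B[7:0];
      g2 = A[15:8]  > B[15:8];   e2 = A[15:8]  == B[15:8];
      g3 = A[23:16] > B[23:16];  e3 = A[23:16] == B[23:16];
      g4 = A[31:24] > B[31:24];  e4 = A[31:24] == B[31:24];

      // Same combining expressions as in alu_32bit
      g = g4 | (e4 & g3) | (e4 & e3 & g2) | (e4 & e3 & e2 & g1);
      e = e4 & e3 & e2 & e1;

      if (g !== (A > B) || e !== (A == B)) errors = errors + 1;
    end
    if (errors == 0) $display("comparator combine: no mismatches");
    else             $display("comparator combine: %0d mismatches", errors);
    $finish;
  end
endmodule
```

For example, with A = 'h12340000 and B = 'h12330000 the top byte is equal (e4 = 1) and the next byte is greater (g3 = 1), so the term e4 & g3 sets g.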
4) Testbench for 32 bit ALU using four 8 bit ALUs
module ALUtester;
  wire [31:0] out;
  wire cout,g,e;
  reg [31:0] A,B;
  reg cin;
  reg [2:0] S;
  alu_32bit alu(out,cout,g,e,A,B,cin,S);
  initial begin
        A = 'h00000000; B = 'h80000001; cin = 'b1; S = 'b010;
    #10 A = 'h00000000; B = 'h80000001; cin = 'b1; S = 'b011;
    #10 A = 'h00000000; B = 'h80000001; cin = 'b1; S = 'b100;
    #10 A = 'h00000000; B = 'h80000001; cin = 'b1; S = 'b101;
    #10 A = 'h00000000; B = 'h80000001; cin = 'b1; S = 'b110;
    #10 A = 'h00000000; B = 'h80000001; cin = 'b1; S = 'b111;
    $strobe( "A=%h B=%h out=%h ",A, B, out );
    #1
    $finish;
  end
endmodule
5) NCverilog process:
The code is checked with NCverilog using the command
ncverilog +access+r +gui -f test.txt
and the following results are obtained with 0 errors and 0 warnings.
Figure: Design Browser in SimVision
6) RTL Compiler:
In digital circuit design, register transfer level (RTL) is a design abstraction which models a
synchronous digital circuit in terms of the flow of digital signals between hardware registers
and the logic operations performed on those signals.
We need a ‘syn-rtl.tcl’ file, which is required for timing analysis. It contains the clock period and
delays.
Now invoke the Encounter RTL Compiler using the command
‘rc -gui -f syn-rtl.tcl’
The gate level schematic is then generated:
Schematic of 32 bit ALU
Gate level schematic of each 8bit ALU
After this, the rtl.v and rtl.sdc files are generated, which are later used in Encounter. Along
with these, .log and .cmd files are also generated.
Initial slack = 34 ps for a clock period of 3900 ps
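As a rough reading of this figure, slack is the required time minus the arrival time of the worst path. Assuming, as a simplification, that the required time equals the clock period (so that any input and output delays set in syn-rtl.tcl are counted as part of the path), the numbers above imply the following. The symbols are introduced here only for illustration.

```latex
\[
t_{\text{path}} = T_{\text{clk}} - t_{\text{slack}}
               = 3900\,\text{ps} - 34\,\text{ps}
               = 3866\,\text{ps}
\]
```

Under the same assumption, the clock period could be reduced by at most about 34 ps before the slack turns negative.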
Instance Power Usage
Probability Histogram
Net Power Usage
Endpoint Slack Histogram
Datapath Area
Togglerate Histogram
7) Encounter tool:
The Cadence Encounter RTL-to-GDSII system supports large-scale, complex flat and
hierarchical designs. It combines advanced RTL and physical synthesis, silicon virtual
prototyping, automated floorplan synthesis, clock tree and clock mesh synthesis, advanced
nanometer routing, advanced low power implementation, etc.
We invoke this system using the command
‘encounter -32’.
Once the system is invoked, the .global file is created. To create it, we load the rtl.v file,
the technology libraries (.lef file), the timing libraries (.lib) and the timing constraints
file (.sdc file).
Once made, the file can be loaded directly for subsequent runs.
Specify floor plan
Floor planning determines the shape of each subcircuit, the pin locations on its boundary,
and the approximate location of each module within the rectangular floorplan.
The aim of good floor planning is to reduce chip area, improve performance and make the
routing phase simpler.
ADD RINGS
Add Rings places the vdd and gnd rings around the rectangular core area so that the
power nets are easily available to all the modules.
Check netlist report:
###############################################################
# Generated by:   Cadence Encounter 11.10-p003_1
# OS:             Linux x86_64(Host ID eee-08)
# Generated on:   Thu Nov 7 21:11:27 2013
# Design:         alu_32bit
# Command:        checkDesign -io -netlist -physicalLibrary powerGround -tieHilo -timingLibrary -spef -floorplan -place -outdir checkDesign
###############################################################
Design: alu_32bit
------ Design Summary:
Total Standard Cell Number   (cells) : 1085
Total Block Cell Number      (cells) : 0
Total I/O Pad Cell Number    (cells) : 0
Total Standard Cell Area     ( um^2) : 235872.00
Total Block Cell Area        ( um^2) : 0.00
Total I/O Pad Cell Area      ( um^2) : 0.00
------ Design Statistics:
Number of Instances            : 1085
Number of Nets                 : 752
Average number of Pins per Net : 3.08
Maximum number of Pins in Net  : 25
------ I/O Port summary
Number of Primary I/O Ports    : 103
Number of Input Ports          : 68
Number of Output Ports         : 35
Number of Bidirectional Ports  : 0
Number of Power/Ground Ports   : 0
------------------------------------------------------------
Detail report:
------ I/O Pad/Port Checking
------ Primitive Pins DRC Checking
------ Primitive Net DRC Check
Check Placement report:
###############################################################
# Generated by:   Cadence Encounter 11.10-p003_1
# OS:             Linux x86_64(Host ID eee-08)
# Generated on:   Thu Nov 7 21:11:27 2013
# Design:         alu_32bit
# Command:        checkDesign -io -netlist -physicalLibrary powerGround...
###############################################################
## No violations found ##
## Summary:
#########################################################
## Number of Placed Instances = 1085
## Number of Unplaced Instances = 0
## Placement Density:115.04%(235872/205027)
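The density figure is the ratio of the two areas quoted in parentheses. The first is the total standard cell area from the check netlist report; the second is read here as the area available for placement, which is an interpretation rather than something the report states.

```latex
\[
\text{Placement density} = \frac{235872}{205027} \approx 1.1504 = 115.04\%
\]
```

The same value appears as "Effective Utilization: 1.1504e+00" in the summary report.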
Part of summary report:
==============================
General Library Information
==============================
# Routing Layers: 3
# Masterslice Layers: 1
# Pin Layers: 3
General Caution:
1) Library have metal1, metal2 and metal3 pins, you should setPreRouteAsObs {1 2 3} to ensure these pins are accessible after placement
------------------------------
Pin Layers
------------------------------
metal3
metal2
metal1
# Layers: 8
------------------------------
Layer OVERLAP Information
------------------------------
Type Overlap
------------------------------
Layer metal3 Information
------------------------------
Type Routing
Wire Pitch X 3.000 um
Wire Pitch Y 3.000 um
Offset X 1.500 um
Offset Y 1.500 um
Wire Width 1.500 um
Spacing 0.900 um
------------------------------
Layer via2 Information
------------------------------
Type Cut
Vias
------------------------------
Via list in layer via2
------------------------------
Vias in via2    Default
M3_M2_via       Yes
Multiple Orientation Vias CAUTION: There is only one default via in this layer
------------------------------
Layer metal2 Information
------------------------------
Type Routing
Wire Pitch X 2.400 um
Wire Pitch Y 2.400 um
Offset X 1.200 um
Offset Y 1.200 um
Wire Width 0.900 um
Spacing 0.900 um
------------------------------
Layer via Information
------------------------------
Type Cut
Vias
------------------------------
Via list in layer via
------------------------------
Vias in via     Default
M2_M1_via       Yes
Multiple Orientation Vias CAUTION: There is only one default via in this layer
------------------------------
Layer metal1 Information
------------------------------
Type Routing
Wire Pitch X 3.000 um
Wire Pitch Y 3.000 um
Offset X 1.500 um
Offset Y 1.500 um
Wire Width 0.900 um
Spacing 0.900 um
------------------------------
Layer cc Information
------------------------------
Type Cut
Vias
------------------------------
Via list in layer cc
------------------------------
------------------------------
Layer poly Information
------------------------------
Type Masterslice
# Pins without Physical Port: 0
# Pins in Library without Timing Lib: 0
# Pins Missing Direction: 0
Antenna Summary Report:
General Caution:
1) All Antenna Constructs are absent for the layer section of LEF.
2) All Antenna Constructs are absent for the macro section of LEF.
# Cells Missing LEF Info: 0
# Cells with Dimension Errors: 0
==============================
Netlist Information
==============================
# HFO (>200) Nets: 0
# No-driven Nets: 0
# Multi-driven Nets: 0
# Assign Statements: 0
Is Design Uniquified: YES
# Pins in Netlist without timing lib: 0
==============================
==============================
                        Internal      External
No of Nets:             750           0
No of Connections:      1563          0
Total Net Length (X):   3.2352e+04    0.0000e+00
Total Net Length (Y):   3.0275e+04    0.0000e+00
Total Net Length:       6.2628e+04    0.0000e+00
==============================
Timing Information
==============================
# Clocks in design: 0
# Generated clocks: 0
# "dont_use" cells from .libs: 0
# "dont_touch" cells from .libs: 0
# Cells in .lib with max_tran: 0
# Cells in .lib with max_cap: 33
------------------------------
Cell List with max_cap
------------------------------
Cell Name    Max Capacitance (pf)
XOR2X1       0.325168
XNOR2X1      0.322263
OAI22X1      0.225398
OAI21X1      0.404479
ENINVX1      20
DCX1         20
DCNX1        20
DCBX1        20
DCBNX1       20
BUFX8        20
BUFX4        20
BUFX2        20
AOI22X1      20
AOI21X1      20
AND3X1       20
SDC max_cap: N/A
SDC max_tran: N/A
SDC max_fanout: N/A
Default Ext. Scale Factor: 1.000
Detail Ext. Scale Factor: 1.000
==============================
Floorplan/Placement Information
==============================
Total area of Standard cells: 314496.000 um^2
Total area of Standard cells(Subtracting Physical Cells): 238485.600 um^2
Total area of Macros: 0.000 um^2
Total area of Blockages: 0.000 um^2
Total area of Pad cells: 0.000 um^2
Total area of Core: 303409.800 um^2
Total area of Chip: 374283.000 um^2
Effective Utilization: 1.1504e+00
Number of Cell Rows: 14
% Pure Gate Density #1 (Subtracting BLOCKAGES): 103.654%
% Pure Gate Density #2 (Subtracting BLOCKAGES and Physical Cells): 78.602%
% Pure Gate Density #3 (Subtracting MACROS): 103.654%
% Pure Gate Density #4 (Subtracting MACROS and Physical Cells): 78.602%
% Pure Gate Density #5 (Subtracting MACROS and BLOCKAGES): 103.654%
% Pure Gate Density #6 (Subtracting MACROS and BLOCKAGES and Physical Cells): 78.602%
% Core Density (Counting Std Cells and MACROs): 103.654%
% Core Density #2(Subtracting Physical Cells): 78.602%
% Chip Density (Counting Std Cells and MACROs and IOs): 84.026%
% Chip Density #2(Subtracting Physical Cells): 63.718%
# Macros within 5 sites of IO pad: No
Macro halo defined?: No
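The core and chip density percentages above follow directly from the areas listed at the start of this section. As a worked check using only those reported values:

```latex
\begin{align*}
\text{Core density}      &= \frac{314496.000}{303409.800} \approx 103.654\% \\
\text{Core density \#2}  &= \frac{238485.600}{303409.800} \approx 78.602\% \\
\text{Chip density}      &= \frac{314496.000}{374283.000} \approx 84.026\% \\
\text{Chip density \#2}  &= \frac{238485.600}{374283.000} \approx 63.718\%
\end{align*}
```

The "#2" figures use the standard cell area with physical cells subtracted, which is why they fall below 100% while the unadjusted core density exceeds it.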
==============================
Wire Length Distribution
==============================
Total metal1 wire length: 3255.6000 um
Total metal2 wire length: 34792.9500 um
Total metal3 wire length: 33720.6000 um
Total wire length: 71769.1500 um
Average wire length/net: 95.4377 um
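The totals in this section can be cross-checked by hand. The average is consistent with the 752 nets reported in the check netlist statistics:

```latex
\begin{align*}
L_{\text{total}} &= 3255.6000 + 34792.9500 + 33720.6000 = 71769.1500\ \mu\text{m} \\
L_{\text{avg}}   &= \frac{71769.1500\ \mu\text{m}}{752} \approx 95.4377\ \mu\text{m}
\end{align*}
```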
Area of Power Net Distribution:
------------------------------
Area of Power Net Distribution
------------------------------
Layer Name   Area of Power Net   Routable Area   Percentage
metal1       36117.0900          303409.8000     11.9037%
metal2       26127.3600          303409.8000     8.6112%
metal3       0.0000              303409.8000     0.0000%