Introduction:
• 16-bit arithmetic units are found mainly in microcontroller applications, where real-time constraints make speed important and low-power methodologies dominate system design.

• Static CMOS, traditionally used for digital logic design, has become unattractive at advanced technology nodes because of its high static and dynamic power dissipation (Reference 1).
• Design techniques for static CMOS, such as Logical Effort, have been developed to give the best delay for a logic chain.
• This project attempts to reduce the dynamic and static power dissipation of static CMOS.

Prepared by: Nirav Desai. Work done as a student at University of Minnesota Twin Cities
Brent Kung Adder Transistor Level Design
Inverter Design Optimization

[Figure: inverter schematic (VDD, Vin, Vout, load capacitance CL) and a plot of TD*Iavg (axis range 40 to 110) versus PMOS width (120 nm to 300 nm).]

NMOS Width = 90 nm
PMOS / NMOS Length = 50 nm
Vdd = 1.1 V
Current averaged over one period of 2 ns
• Optimal PMOS Width = 165 nm
• βinverter = 165/90 = 1.833
• Sizing for NAND, NOR and XOR changed appropriately

Work done as a student at the University of Minnesota, Twin Cities by Nirav Desai desai.nirav.12.09@gmail.com
Design Equations for Brent Kung Adder
$G_i = A_i \ \text{AND}\ B_i \quad (1)$
$P_i = A_i \ \text{XOR}\ B_i \quad (2)$
$G_i^{1} = G_i + P_i \ \text{AND}\ G_{i-1} \quad (3)$
$P_i^{1} = P_i \ \text{AND}\ P_{i-1} \quad (4)$
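The following is a minimal behavioural sketch of how equations (1) to (4) combine into a 16-bit sum. It assumes the textbook Brent Kung prefix-tree wiring (an up-sweep followed by a down-sweep) and a carry-in of 0; the slides do not spell out the tree, so the function names `dot` and `brent_kung_add` and the loop structure are illustrative assumptions rather than the author's netlist.

```python
def dot(high, low):
    """Dot operator: merge (G, P) of a more significant span with a less significant one."""
    g_hi, p_hi = high
    g_lo, p_lo = low
    # Equation (3): G = G_hi + P_hi AND G_lo; equation (4): P = P_hi AND P_lo.
    return (g_hi | (p_hi & g_lo), p_hi & p_lo)


def brent_kung_add(a, b, width=16):
    """Add two unsigned integers; width must be a power of two. Returns (sum, carry_out)."""
    # Equations (1) and (2): per-bit generate and propagate.
    gp = [(((a >> i) & (b >> i)) & 1, ((a >> i) ^ (b >> i)) & 1) for i in range(width)]
    p_bit = [p for _, p in gp]

    # Up-sweep: build group terms over spans of 2, 4, 8, ... bits.
    step = 1
    while step < width:
        for i in range(2 * step - 1, width, 2 * step):
            gp[i] = dot(gp[i], gp[i - step])
        step *= 2

    # Down-sweep: fill in the remaining prefixes so gp[i][0] is the carry out of bit i.
    step = width // 4
    while step >= 1:
        for i in range(3 * step - 1, width, 2 * step):
            gp[i] = dot(gp[i], gp[i - step])
        step //= 2

    # Post-computation: S_i = P_i XOR C_(i-1), with carry-in assumed to be 0.
    total = 0
    carry_in = 0
    for i in range(width):
        total |= (p_bit[i] ^ carry_in) << i
        carry_in = gp[i][0]
    return total, carry_in


if __name__ == "__main__":
    print(brent_kung_add(0xFFFF, 0x0001))
```

The sketch models logic function only; it says nothing about gate sizing or delay, which the slides handle with Logical Effort.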

Brent Kung Adder Gate Level Diagram
1. Input Block with Pre Computation

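[Figure: gate-level diagram of the input block. Four signal chains, labeled Input Adder Chain 1 to Input Adder Chain 4, produce the outputs Gi + Pi*Gi-1 and Pi*Pi-1, with output buffers to drive capacitive loads.]

Gate sizes annotated on the diagram:
Input Adder Chain 1: 1X, 1.224X, 1.097X, 3.883X, 10.1683X, 36X
Input Adder Chain 2: 1X, 1.562X, 1.553X, 3.043X
Input Adder Chain 3: 1X, 1.23X, 1.108X
Input Adder Chain 4: 1X, 1.274X, 1.034X, 2.943X, 10.8506X, 40X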
Brent Kung Adder Gate Level Diagram
2. Intermediate Dot Product Blocks

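[Figure: gate-level diagram of an intermediate dot product block. Two signal chains, labeled Intermediate Adder Chain 1 and Intermediate Adder Chain 2, produce the outputs Gi + Pi*Gi-1 and Pi*Pi-1, with output buffers to drive capacitive loads.]

Gate sizes annotated on the diagram: 1X, 6X, 16X, 1.72X, 1X, 1X, 4X, 1X, 16X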

Brent Kung Adder Gate Level Diagram
3. Output Block for Post Computation

[Figure: gate-level diagram of the output block. The inputs Pi and Ci-1 produce the sum bit Si; the gates are annotated 1.182X and 1.117X, and output buffers drive the capacitive loads.]

Brent Kung Adder Transistor Level Design

1. Input Block with Pre Computation
Logical Effort design for the signal chains labeled in the gate level diagram of the input block.
g = logical effort, f = fan out, b = branching, S = sizing

Input Adder Block Chain 1

Gate Number    1        2          3        4          5
Gate Name      BUFFER   INVERTER   NOR      INVERTER   NAND       LOAD
g value        1.000    1.000      1.646    1.000      1.352
f value        3.540    3.540      2.151    3.540      2.618648
b value        2.893    2.400      1.000    1.000      1.000      1.000
S value        1.000    1.224      1.097    3.883      10.16831   36.000

Stage G = 2.225
Stage F = 36.000
Stage B = 6.943
Stage H = 556.248
Gate H (h) = 3.540
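The sizing procedure behind the chain tables can be sketched as follows, using the standard Logical Effort relations H = G·F·B and h = H^(1/N). The function name `size_chain` and its argument layout are assumptions made for illustration; the g and b values and the load of 36 are taken from the Input Adder Block Chain 1 table, and the first gate is assumed to have a size of 1.

```python
import math


def size_chain(g, b, load, first_size=1.0):
    """Size a logic chain for minimum delay with Logical Effort.

    g: logical effort of each gate, b: branching effort at each gate output,
    load: final load in units of a minimum inverter's input capacitance.
    Returns (H, h, f, S): path effort, stage effort, fan out per gate, size per gate.
    """
    n = len(g)
    path_g = math.prod(g)                      # Stage G
    path_b = math.prod(b)                      # Stage B
    path_f = load / (g[0] * first_size)        # Stage F
    path_h = path_g * path_f * path_b          # Stage H
    stage_h = path_h ** (1.0 / n)              # Gate H (h), equal for every stage

    fan_out = [stage_h / gi for gi in g]       # f = h / g
    sizes = [first_size]
    c_in = g[0] * first_size                   # input capacitance of the first gate
    for i in range(n - 1):
        c_total = c_in * fan_out[i]            # total capacitance driven by gate i
        c_in = c_total / b[i]                  # share that belongs to the next gate on the path
        sizes.append(c_in / g[i + 1])          # S = input capacitance / logical effort
    return path_h, stage_h, fan_out, sizes


if __name__ == "__main__":
    # Input Adder Block Chain 1: BUFFER, INVERTER, NOR, INVERTER, NAND driving a load of 36.
    H, h, f, S = size_chain(
        g=[1.000, 1.000, 1.646, 1.000, 1.352],
        b=[2.893, 2.400, 1.000, 1.000, 1.000],
        load=36.0,
    )
    print(round(H, 3), round(h, 3))
    print([round(x, 3) for x in f])
    print([round(x, 3) for x in S])
```

Under these assumptions the computed stage effort, fan out and sizes agree with the tabulated values for Chain 1 to about three decimal places; the other chains can be checked by substituting their g, b and load values.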

Input Adder Block Chain 2

Gate Number    1        2          3        4
Gate Name      BUFFER   INVERTER   XOR      NAND     LOAD
g value        1.000    1.000      1.893    1.295
f value        4.518    4.518      2.386    3.488
b value        2.893    2.400      1.780    1.000    1.000
S value        1.000    1.562      1.553    3.043    13.748

Stage G = 2.451
Stage F = 13.748
Stage B = 12.359
Stage H = 416.510
Gate H (h) = 4.518

Input Adder Block Chain 3

Gate Number    1        2          3
Gate Name      BUFFER   INVERTER   NOR      LOAD
g value        1.000    1.000      1.646
f value        3.558    3.558      2.162
b value        2.893    2.400      1.000
S value        1.000    1.230      1.108    3.941

Stage G = 1.646
Stage F = 3.941
Stage B = 6.943
Stage H = 45.038
Gate H (h) = 3.558

Input Adder Block Chain 4

Gate Number    1        2          3        4        5
Gate Name      BUFFER   INVERTER   XOR      NAND     INVERTER   LOAD
g value        1.000    1.000      1.893    1.295    1.000
f value        3.686    3.686      1.947    2.847    3.686447
b value        2.893    2.400      1.000    1.000    1.000      1.000
S value        1.000    1.274      1.034    2.943    10.85056   40.000

Stage G = 2.451
Stage F = 40.000
Stage B = 6.943
Stage H = 680.832
Gate H (h) = 3.686

Brent Kung Adder Transistor Level Design
2. Intermediate Dot Product Blocks
Intermediate Adder Block Chain 1

Gate Number    1          2
Gate Name      INVERTER   NAND     LOAD
g value        1.000      1.352    1.000
f value        2.848      2.107    2.848
b value        1.000      1.000    1.000
S value        1.000      2.107    6.000

Stage G = 1.352
Stage F = 6.000
Stage B = 1.000
Stage H = 8.112
Gate H (h) = 2.848

Intermediate Adder Block Chain 2

Gate Number    1        2
Gate Name      BUFFER   NAND     LOAD
g value        1.000    1.352
f value        2.775    2.053
b value        2.000    1.000
S value        1.000    1.026    2.848

Stage G = 1.352
Stage F = 2.848
Stage B = 2.000
Stage H = 7.701
Gate H (h) = 2.775

Brent Kung Adder Transistor Level Design
XOR GATE

Brent Kung Adder Layout
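Input Block with Pre Computation
[Figure: layout of the input block, with labeled regions for the input inverters for bit 0 and bit 1, the XOR gate, the 10X NAND gate, and the output buffers.]
Output buffers: PEX waveforms show that a larger size may be needed.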

Brent Kung Adder Layout
XOR 1.553X

Brent Kung Adder Layout
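Intermediate Dot Product Generator
[Figure: layout of the intermediate dot product generator, with the output buffers labeled.]
Output buffers: PEX waveforms show that a larger size may be necessary here.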

Brent Kung Adder Layout

Output Stage with Buffers

Brent Kung Adder Layout
Full Layout: 49.5 µm × 48.6 µm

Brent Kung Adder Worst Case Delay
Input pattern (hexadecimal): A = FFFF, B switches from 0000 to 0001
[Figure: simulated worst-case delay waveforms. The dotted lines show carry bits 15 and 14.]
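One way to see why this input pattern exercises the longest carry path is to compute the carry out of every bit position before and after B changes. The sketch below uses a plain bit-by-bit carry recurrence as a reference model (not the prefix tree itself), and the helper name `carry_bits` is an illustrative assumption.

```python
def carry_bits(a, b, width=16):
    """Return the carry out of each bit position for a + b with carry-in 0."""
    carries = []
    carry = 0
    for i in range(width):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        # Carry = generate OR (propagate AND incoming carry).
        carry = (ai & bi) | ((ai ^ bi) & carry)
        carries.append(carry)
    return carries


if __name__ == "__main__":
    before = carry_bits(0xFFFF, 0x0000)
    after = carry_bits(0xFFFF, 0x0001)
    toggled = [i for i in range(16) if before[i] != after[i]]
    print("carry bits that toggle:", toggled)
```

With A = FFFF, the carry is generated at bit 0 and propagated by every higher bit once B becomes 0001, so all sixteen carries change, including carry bits 14 and 15 shown in the waveforms.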

Output waveforms after parasitic extraction from layout: Sum Bit 0

Output waveforms after parasitic extraction from layout: Sum Bit 14

Brent Kung Adder Simulated Performance
Simulations with maximally sized 1 stage buffers as determined by Logical Effort Design of individual chains

Voltage (V)   Delay Max-C14 (ns)   Power Max (mW)   Power-Delay Product (×1E-12 J)
1.1           0.359                6.73             2.41
0.9           0.503                2.95             1.483
0.7           0.937                0.924            0.865

Simulations with minimally sized 1 stage buffers

Voltage (V)   Delay Max-C14 (ns)   Power Max (mW)   Power-Delay Product (×1E-12 J)
1.1           0.403                5.186            2.089
0.9           0.569                2.277            1.295
0.7           1.069                0.692            0.739

Without parasitic extraction and interconnect parasitics, buffering does not improve performance significantly.
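The power-delay products in the two tables follow directly from the delay and power columns, because nanoseconds multiplied by milliwatts gives picojoules (1E-12 J). The short sketch below recomputes them from the tabulated delay and power; the names `MAX_SIZED`, `MIN_SIZED` and `power_delay_product_pj` are illustrative assumptions.

```python
# (supply voltage in V, worst-case delay in ns, maximum power in mW), copied from the tables.
MAX_SIZED = [(1.1, 0.359, 6.73), (0.9, 0.503, 2.95), (0.7, 0.937, 0.924)]
MIN_SIZED = [(1.1, 0.403, 5.186), (0.9, 0.569, 2.277), (0.7, 1.069, 0.692)]


def power_delay_product_pj(delay_ns, power_mw):
    """Power-delay product in picojoules: 1 ns * 1 mW = 1E-12 J."""
    return delay_ns * power_mw


def summarize(rows):
    """Return (voltage, delay, power, power-delay product) for each row."""
    return [(v, d, p, power_delay_product_pj(d, p)) for v, d, p in rows]


if __name__ == "__main__":
    for label, rows in (("maximally sized", MAX_SIZED), ("minimally sized", MIN_SIZED)):
        print(label)
        for v, d, p, pdp in summarize(rows):
            print(f"  {v:.1f} V: {d:.3f} ns, {p:.3f} mW, {pdp:.3f} pJ")
```

At 1.1 V the maximally sized buffers reduce the delay by roughly 11 percent while drawing roughly 30 percent more power, which is consistent with the observation that buffering gives little benefit before parasitics are included.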
Comparison with other similar works:

Sr. No.  Group Name                                         Adder Type                     Technology                 Adder Delay  Power Consumption  Power-Delay Product (J)
1        [1] University of Waterloo, Department of ECE      16 bit Kogge Stone             FPGA 45 nm                 419 ps       13.29 mW           5.57E-12
2        This work                                          16 bit Brent Kung Adder        NCSU 45 nm Free PDK ASIC   359 ps       6.73 mW            2.41E-12
3        [4] University of Texas, Tyler, Department of EE   16 bit Kogge Stone             Spartan 3E 90 nm           6.286 ns     --                 --
4        [2] VIT University, Vellore                        16 bit Kogge Stone             Spartan 3E 90 nm           599 ns       46.16 µW           2.76E-11
5        [2] VIT University, Vellore                        16 bit Brent Kung Adder        Spartan 3E 90 nm           762 ns       32.465 µW          2.47E-11
8        [5] University of Wisconsin, Madison               16 bit Ripple Carry Adder      LSI Logic 110 nm ASIC      2.59 ns      --                 --
9        [5] University of Wisconsin, Madison               16 bit Carry Lookahead Adder   LSI Logic 110 nm ASIC      1.09 ns      --                 --
6        [3] Concordia University, Department of ECS        16 bit Brent Kung Adder        Virtex 2 130 nm            26.94 ns     1.15 W             3.10E-08
7        [3] Concordia University, Department of ECS        16 bit Kogge Stone             Virtex 2 130 nm            25.59 ns     1.5546 W           3.97E-08

References: Comparison with other similar works:

References:
1. K. Nose and T. Sakurai, “Optimization of VDD and VTH for low power high speed applications,” in ACM/IEEE Design Automation Conference (DAC) Digest of Technical Papers, 2000, pp. 469-474. Cited from: Alice Wang, Benton Highsmith Calhoun, and Anantha P. Chandrakasan, Sub-threshold Design for Ultra Low-Power Systems.
2. Ivan Sutherland, Bob Sproull, and David Harris, Logical Effort (book).
3. Jan Rabaey, Anantha Chandrakasan, and Borivoje Nikolic, Digital Integrated Circuits (book).
4. Tae-Hyoung Kim, Jason Liu, John Keane, and Chris Kim (University of Minnesota), “A high-density sub-threshold SRAM with data-independent bit line leakage and virtual-ground replica scheme,” ISSCC 2007.
5. Thomas Lee, The Design of CMOS Radio-Frequency Integrated Circuits (book).


Design of a High Speed, Low Power Brent Kung Adder in 45 nm CMOS
