Carry-Save Addition
Prof. Loh
CS3220 - Processor Design - Spring 2005
February 2, 2005
1 Adding Multiple Numbers
It is often necessary to add more than two numbers together. The straightforward way of adding
together m numbers (all n bits wide) is to add the first two, then add that sum to the next, and so on. This requires
a total of m − 1 additions, for a total gate delay of O(m lg n) (assuming lookahead carry adders, or LCAs). Instead, a tree of
adders can be formed, taking only O(lg m · lg n) gate delays.
Using carry save addition, the delay can be reduced further still. The idea is to take three numbers that we want to add
together, x + y + z, and convert them into two numbers c and s such that x + y + z = c + s, and to do this in O(1) time.
Ordinary addition cannot be performed in O(1) time because the carry information must be propagated. In carry
save addition, we refrain from directly passing on the carry information until the very last step. We will first illustrate
the general concept with a base 10 example.
To add three numbers by hand, we typically align the three operands and then proceed column by column, in the
same fashion as when adding two numbers. The three digits in a column are added, and any overflow goes
into the next column. Observe that when there is a non-zero carry, we are really adding four digits (the digits of
x, y and z, plus the carry).

  carry:    1 1 2 1
      x:    1 2 3 4 5
      y:    3 8 1 7 2
      z:  + 2 0 5 8 7
    sum:    7 1 1 0 4

The carry save approach breaks this process down into two steps. The first is to compute the sum ignoring any carries:

      x:    1 2 3 4 5
      y:    3 8 1 7 2
      z:  + 2 0 5 8 7
      s:    6 0 9 9 4

Each s_i is equal to x_i + y_i + z_i modulo 10. Now, separately, we can compute the carry on a column by
column basis:

      x:    1 2 3 4 5
      y:    3 8 1 7 2
      z:  + 2 0 5 8 7
      c:    1 0 1 1

In this case, each c_i is the sum of the digits from the previous column divided by 10 (ignoring any remainder). Another
way to look at it is that any carry out of one column gets put into the next column. Now, we can add together c
and s, and we’ll verify that the result is indeed equal to x + y + z:

      s:    6 0 9 9 4
      c:  + 1 0 1 1
    sum:    7 1 1 0 4

Figure 1: The carry save adder block is the same circuit as the full adder. (The full adder, FA, has inputs x, y, c_in and outputs c_out, s; the carry save adder, CSA, has inputs x, y, z and outputs c, s.)

Figure 2: One CSA block is used for each bit. This circuit adds three n = 8 bit numbers together into two new
numbers. (The block for bit i takes x_i, y_i, z_i and produces s_i and c_{i+1}, for i = 0 to 7.)

The important point is that c and s can be computed independently, and furthermore, each c_i (and s_i) can be computed
independently of all of the other c’s (and s’s). This achieves our original goal of converting three numbers that we
wish to add into two numbers that add up to the same sum, and in O(1) time.
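
The base 10 example above can be checked with a short Python sketch. The function name `carry_save_decimal` and its `digits` parameter are illustrative choices, not part of the original notes. Each column is processed on its own, so no carry ever travels from one column to the next inside the loop.

```python
def carry_save_decimal(x, y, z, digits=5):
    """Split x + y + z into (s, c) column by column, without propagating carries."""
    s = 0
    c = 0
    for i in range(digits):
        # Sum of the three digits in column i (each digit is 0..9, so this is 0..27).
        column = (x // 10**i) % 10 + (y // 10**i) % 10 + (z // 10**i) % 10
        s += (column % 10) * 10**i           # s_i: column sum modulo 10
        c += (column // 10) * 10**(i + 1)    # carry out of column i lands in column i + 1
    return s, c


s, c = carry_save_decimal(12345, 38172, 20587)
print(s, c, s + c)  # prints "60994 10110 71104"
```

The value 10110 for c is the row "1 0 1 1" from the table with the blank least significant column written as zero.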
The same concept can be applied to binary numbers. As a quick example:

      x:      1 0 0 1 1
      y:      1 1 0 0 1
      z:  +   0 1 0 1 1
      s:      0 0 0 0 1
      c:  + 1 1 0 1 1
    sum:    1 1 0 1 1 1

What does the circuit to compute s and c look like? It is actually identical to the full adder, but with some of the
signals renamed. Figure 1 shows a full adder and a carry save adder. A carry save adder is simply a full adder with
the c_in input renamed to z, the z output (the original “answer” output) renamed to s, and the c_out output renamed to
c. Figure 2 shows how n carry save adders are arranged to add three n-bit numbers x, y and z into two numbers c
and s. Note that the CSA block in bit position zero generates c_1, not c_0. Similar to the least significant column when
adding numbers by hand (the “blank”), c_0 is equal to zero. Since all of the CSA blocks are independent, the
entire circuit takes only O(1) time. To get the final sum, we still need an LCA, which will cost us O(lg n) delay. The
asymptotic gate delay to add three n-bit numbers is thus the same as that of adding only two n-bit numbers.
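
As a rough software model of Figure 2, the sketch below builds the n-bit carry save adder out of one-bit full adders, then shows an equivalent word-level form using XOR and a majority function. The names `full_adder`, `carry_save_add` and `carry_save_add_wordwise` are assumptions made for illustration. A Python loop is of course sequential, whereas the hardware blocks all operate in parallel.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum_bit, carry_out)."""
    total = a + b + cin
    return total & 1, total >> 1


def carry_save_add(x, y, z, n):
    """Reduce three n-bit numbers to (s, c) such that x + y + z == s + c."""
    s = 0
    c = 0  # c_0 is always zero: no block produces a carry into bit 0
    for i in range(n):
        # The CSA block for bit i is a full adder whose carry input is z_i.
        s_i, c_next = full_adder((x >> i) & 1, (y >> i) & 1, (z >> i) & 1)
        s |= s_i << i
        c |= c_next << (i + 1)  # block i generates c_{i+1}
    return s, c


def carry_save_add_wordwise(x, y, z):
    """Same reduction on whole words: XOR gives s, majority shifted left gives c."""
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1
    return s, c


s, c = carry_save_add(0b10011, 0b11001, 0b01011, n=5)
print(bin(s), bin(c), bin(s + c))  # prints "0b1 0b110110 0b110111"
```

The final `s + c` stands in for the LCA, which is the only step where carries are propagated.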
So how long does it take us to add m different n-bit numbers together? The simple approach is just to repeat this
trick approximately m times over. This is illustrated in Figure 3. There are m − 2 CSA blocks (each block in the figure
actually represents many one-bit CSA blocks in parallel) that we have to go through, and then the final LCA. Note that
every time we pass through a CSA block, the numbers can grow in size by one bit. Therefore, the numbers that go to
the LCA will be at most n + m − 2 bits long. So the final LCA will have a gate delay of O(lg (n + m)). Therefore
the total gate delay is O(m + lg (n + m)).
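
A minimal sketch of the chain, assuming non-negative Python integers as operands and a word-level `csa` helper (both names are hypothetical). It counts the CSA stages and records the width of the two numbers handed to the final adder, so the m − 2 stage count and the n + m − 2 width bound can be checked on small inputs.

```python
def csa(x, y, z):
    """Word-level carry save adder: returns (s, c) with x + y + z == s + c."""
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1
    return s, c


def add_with_csa_chain(operands):
    """Add the operands with a chain of CSAs followed by one ordinary addition.

    Returns (total, csa_stages, width), where width is the larger bit length
    of the two numbers passed to the final adder.
    """
    if len(operands) < 2:
        raise ValueError("need at least two operands")
    s, c = operands[0], operands[1]
    stages = 0
    for operand in operands[2:]:
        s, c = csa(s, c, operand)  # fold one more operand into the (s, c) pair
        stages += 1
    width = max(s.bit_length(), c.bit_length())
    return s + c, stages, width    # s + c stands in for the final LCA


total, stages, width = add_with_csa_chain(list(range(1, 10)))
print(total, stages)  # prints "45 7"
```

With m = 9 operands the chain uses m − 2 = 7 stages, one after another, which is where the O(m) term in the delay comes from.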
Figure 3: Adding m n-bit numbers with a chain of CSAs. (In the figure, nine operands X_0 through X_8 feed the chain, and a final LCA produces the sum of X_i for i = 0 to 8.)

Instead of arranging the CSA blocks in a chain, a tree formation can be used. This is slightly awkward
because of the odd ratio of 3 to 2. Figure 4 shows how to build a tree of CSAs. This circuit is called a Wallace tree.
The depth of the tree is approximately log_{3/2} m, since each level reduces the number of operands by a factor of about 3/2. As before, the width of the numbers increases as we get deeper in the tree. At
the end of the tree, the numbers will be O(n + log m) bits wide, and therefore the LCA will have an O(lg (n + log m))
gate delay. The total gate delay of the Wallace tree is thus O(log m + lg (n + log m)).
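
The tree arrangement can be sketched in the same style. This is one simple grouping strategy, assumed here for illustration and not necessarily the exact wiring of Figure 4: at each level the operands are taken three at a time, each group of three becomes two numbers, and any one or two leftovers pass through unchanged. The names `csa` and `wallace_tree_sum` are hypothetical.

```python
def csa(x, y, z):
    """Word-level carry save adder: returns (s, c) with x + y + z == s + c."""
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1
    return s, c


def wallace_tree_sum(operands):
    """Reduce the operands level by level until two remain, then add them.

    Returns (total, csa_levels).
    """
    values = list(operands)
    levels = 0
    while len(values) > 2:
        grouped = len(values) - len(values) % 3   # how many values fit in full groups of 3
        next_values = []
        for i in range(0, grouped, 3):
            # All CSAs on one level are independent, so they count as one level of delay.
            next_values.extend(csa(values[i], values[i + 1], values[i + 2]))
        next_values.extend(values[grouped:])      # leftovers skip this level
        values = next_values
        levels += 1
    return sum(values), levels                    # the final sum stands in for the LCA


total, levels = wallace_tree_sum(range(1, 10))
print(total, levels)  # prints "45 4"
```

For nine operands the count goes 9, 6, 4, 3, 2, so four CSA levels are needed, compared with seven stages in the chain.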

Figure 4: A Wallace tree for adding m n-bit numbers. (In the figure, nine operands X_0 through X_8 feed the tree, and a final LCA produces the sum of X_i for i = 0 to 8.)
