1
BINARY ARITHMETIC
Dr. Avijit Kumar Chaudhuri
Digital Systems
2
DIGITAL
CIRCUITS
Why Binary Arithmetic?
3
3 + 5 = 8
0011 + 0101 = 1000
Why Binary Arithmetic?
 Hardware can only deal with binary digits, 0
and 1.
 Must represent all numbers, integers or
floating point, positive or negative, by binary
digits, called bits.
 Can devise electronic circuits to perform
arithmetic operations: add, subtract, multiply
and divide, on binary numbers.
4
Positive Integers
 Decimal system: made of 10 digits, {0,1,2, . . . , 9}
41 = 4×10^1 + 1×10^0
255 = 2×10^2 + 5×10^1 + 5×10^0
 Binary system: made of two digits, {0,1}
00101001 = 0×2^7 + 0×2^6 + 1×2^5 + 0×2^4
+ 1×2^3 + 0×2^2 + 0×2^1 + 1×2^0
= 32 + 8 + 1 = 41
11111111 = 255, largest number with 8
binary digits, 2^8 – 1
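The positional expansion above can be sketched in a few lines of Python. The function name binary_to_decimal is only an illustrative choice, not part of the lecture.

```python
def binary_to_decimal(bits: str) -> int:
    """Evaluate an unsigned binary string as the sum of bit * 2**position."""
    value = 0
    # Walk from the least significant bit (rightmost) upward.
    for position, bit in enumerate(reversed(bits)):
        value += int(bit) * 2 ** position
    return value


print(binary_to_decimal("00101001"))  # 41
print(binary_to_decimal("11111111"))  # 255
```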
5
Base or Radix
 For decimal system, 10 is called the base or
radix.
 Decimal 41 is also written as 41_10 or 41ten
 Base (radix) for binary system is 2.
 Thus, 41ten = 101001_2 or 101001two
 Also, 111ten = 1101111two
and 111two = 7ten
 What about negative numbers?
6
Signed Magnitude – What Not to
Do
 Use fixed length binary representation
 Use left-most bit (called most significant bit or
MSB) for sign:
0 for positive
1 for negative
 Example: +18ten = 00010010two
–18ten = 10010010two
7
Difficulties with Signed
Magnitude
 Sign and magnitude bits should be differently
treated in arithmetic operations.
 Addition and subtraction require different logic
circuits.
 Overflow is difficult to detect.
 “Zero” has two representations:
+ 0ten = 00000000two
– 0ten = 10000000two
 Signed-magnitude integers are not used in modern computers.
8
Problems with Finite Math
 Finite size of representation:
 Digital circuit cannot be arbitrarily large.
 Overflow detection – easy to determine when the
number becomes too large.
 Represent negative numbers:
 Unique representation of 0.
9
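[Figure: the infinite universe of integers (–∞ to ∞) on a number line marked –4, 0, 4, 8, 12, 16, 20, with binary labels 0000, 0100, 1000, 1100, 10000, 10100. Only part of the line can be written as 4-bit numbers.]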
4-bit Universe
10
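[Figure: modulo-16 (4-bit) universe drawn as a circle of the patterns 0000 through 1111, with 0000, 0100, 1000 and 1100 marked at 0 (or 16), 4, 8 and 12. A second copy of the circle reads the same patterns as signed values from –7 through 7, with –0 at 1111.]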
Only 16 integers: 0 through 15, or – 7 through 7
One Way to Divide Universe
1’s Complement Numbers
11
[Figure: 4-bit circle read as 1’s complement numbers: 0000 = 0, 0111 = 7, 1000 = –7, 1100 = –3, 1111 = –0.]
Decimal magnitude  Positive  Negative (binary numbers)
0 0000 1111
1 0001 1110
2 0010 1101
3 0011 1100
4 0100 1011
5 0101 1010
6 0110 1001
7 0111 1000
Negation rule: invert bits.
Problem: 0 ≠ – 0
Another Way to Divide Universe
2’s Complement Numbers
12
[Figure: 4-bit circle read as 2’s complement numbers: 0000 = 0, 0111 = 7, 1000 = –8, 1100 = –4, 1111 = –1.]
Decimal magnitude  Positive  Negative (binary numbers)
0 0000
1 0001 1111
2 0010 1110
3 0011 1101
4 0100 1100
5 0101 1011
6 0110 1010
7 0111 1001
8 (none) 1000
Negation rule: invert bits and add 1
(figure note on the negative side: subtract 1 on this side)
Integers With Sign – Two Ways
 Use fixed-length representation, but no explicit sign
bit:
 1’s complement:To form a negative number, complement
each bit in the given number.
 2’s complement:To form a negative number, start with
the given number, subtract one, and then complement
each bit, or
first complement each bit, and then add 1.
 2’s complement is the preferred representation.
13
2’s-Complement Integers
 Why not 1’s-complement? Don’t like two zeros.
 Negation rule:
 Subtract 1 and then invert bits, or
 Invert bits and add 1
 Some properties:
 Only one representation for 0
 Exactly as many non-negative numbers (zero included) as negative numbers
 Slight asymmetry – there is one negative number with no
positive counterpart
14
General Method for Binary
Integers with Sign
 Select number (n) of bits in representation.
 Partition 2^n integers into two sets:
 00…0 through 01…1 are 2^n/2 positive integers.
 10…0 through 11…1 are 2^n/2 negative integers.
 Negation rule transforms negative to positive, and vice-versa:
 Signed magnitude: invert MSB (most significant bit)
 1’s complement: Subtract from 2^n – 1 or 1…1 (same as “inverting all
bits”)
 2’s complement: Subtract from 2^n or 10…0 (same as 1’s complement +
1)
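The three negation rules can be compared side by side. This is a minimal sketch that treats each number as an n-bit pattern held in a Python integer; the function names are illustrative assumptions.

```python
def negate_signed_magnitude(x: int, n: int) -> int:
    """Invert only the most significant (sign) bit."""
    return x ^ (1 << (n - 1))


def negate_ones_complement(x: int, n: int) -> int:
    """Subtract from 2**n - 1 (pattern 1...1), i.e. invert every bit."""
    return (2 ** n - 1) - x


def negate_twos_complement(x: int, n: int) -> int:
    """Subtract from 2**n (pattern 10...0) and keep only n bits."""
    return (2 ** n - x) % 2 ** n


for negate in (negate_signed_magnitude, negate_ones_complement, negate_twos_complement):
    print(negate.__name__, format(negate(0b0101, 4), "04b"))
```

For +5 = 0101 with n = 4 the three systems give 1101, 1010 and 1011 respectively, which agrees with the 1’s and 2’s complement tables shown earlier.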
15
Three Systems (n = 4)
16
[Figure: three 4-bit circles.
Signed magnitude: 0000 = 0, 0111 = 7, 1000 = –0, 1111 = –7; the pattern 1010 = –2.
1’s complement integers: 0000 = 0, 0111 = 7, 1000 = –7, 1111 = –0; the pattern 1010 = –5.
2’s complement integers: 0000 = 0, 0111 = 7, 1000 = –8, 1111 = –1; the pattern 1010 = –6.]
Three Representations
17
Sign-magnitude
000 = +0
001 = +1
010 = +2
011 = +3
100 = - 0
101 = - 1
110 = - 2
111 = - 3
2’s complement
000 = +0
001 = +1
010 = +2
011 = +3
100 = - 4
101 = - 3
110 = - 2
111 = - 1
(Preferred)
1’s complement
000 = +0
001 = +1
010 = +2
011 = +3
100 = - 3
101 = - 2
110 = - 1
111 = - 0
2’s Complement Numbers (n = 3)
18
[Figure: 3-bit circle: 000 = 0, 001 = +1, 010 = +2, 011 = +3, 100 = –4, 101 = –3, 110 = –2, 111 = –1. Arrows mark the directions of addition, subtraction and negation.]
2’s Complement n-bit Numbers
 Range: –2^(n–1) through 2^(n–1) – 1
 Unique zero: 00000000 . . . . . 0
 Negation rule: see slide 11 or 13.
 Expansion of bit length: stretch the left-most bit all the
way, e.g., 11111101 is still 101 or – 3. Also, 00000011 is
same as 011 or 3.
 Most significant bit (MSB) indicates sign.
 Overflow rule: If two numbers with the same sign bit
(both positive or both negative) are added, the overflow
occurs if and only if the result has the opposite sign.
 Subtraction rule: for A – B, add – B and A.
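A short sketch of three of the properties listed above: the range, sign extension and the subtraction rule. Bit patterns are held as strings or non-negative Python integers, and the helper names are assumptions made for illustration.

```python
def twos_complement_range(n: int):
    """Smallest and largest values of an n-bit 2's complement number."""
    return -(2 ** (n - 1)), 2 ** (n - 1) - 1


def sign_extend(bits: str, width: int) -> str:
    """Stretch the left-most (sign) bit to reach the requested width."""
    return bits[0] * (width - len(bits)) + bits


def subtract(a: int, b: int, n: int) -> int:
    """Compute A - B on n-bit patterns by adding the 2's complement of B."""
    mask = (1 << n) - 1
    neg_b = (~b + 1) & mask      # invert bits and add 1
    return (a + neg_b) & mask    # carry out of bit n-1 is discarded


print(twos_complement_range(8))               # (-128, 127)
print(sign_extend("101", 8))                  # 11111101
print(format(subtract(0b0111, 0b0110, 4), "04b"))  # 0001
```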
19
Summary
 For a given number (n) of digits we have a finite
set of integers. For example, there are 10^3 = 1,000
decimal integers and 2^3 = 8 binary integers in 3-digit
representations.
 We divide the finite set of integers [0, r^n – 1],
where radix r = 10 or 2, into two equal parts
representing positive and negative numbers.
 Positive and negative numbers of equal
magnitudes are complements of each other: x +
complement (x) = 0.
20
Summary: Defining Complement
 Decimal integers:
 10’s complement: – x = Complement (x) = 10^n – x
 9’s complement: – x = Complement (x) = 10^n – 1 – x
 For 9’s complement, subtract each digit from 9
 For 10’s complement, add 1 to 9’s complement
 Binary integers:
 2’s complement: – x = Complement (x) = 2^n – x
 1’s complement: – x = Complement (x) = 2^n – 1 – x
 For 1’s complement, subtract each digit from 1
 For 2’s complement, add 1 to 1’s complement
21
Understanding Complement
 Complement means “something that
completes”:
e.g., X + complement (X) = “Whole”.
 Complement also means “opposite”, e.g.,
complementary colors are placed opposite
in the primary color chart.
 Complementary numbers are like electric
charges. Positive and negative charges of
equal magnitudes annihilate each other.
22
2’s-Complement Numbers
23
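[Figure: the infinite universe of integers (–∞ to ∞; . . . –1, 0, 1, 2, 3, 4, 5 . . . , with binary labels 000 through 101) compared with two finite circular universes. Finite universe of 3-digit decimal numbers: 000, 001, . . . , 499, 500, 501, . . . , 999, wrapping around at 1000. Finite universe of 3-bit binary numbers: 000, 001, . . . , 011, 100, 101, . . . , 111, wrapping around at 1000.]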
Examples of Complements
 Decimal integers (r = 10, n = 3):
 10’s complement: – 50 = Compl (50) = 10^3 – 50 = 950; 50 +
950 = 1,000 = 0 (in 3 digit representation)
 9’s complement: – 50 = Compl (50) = 10^n – 1 – 50 = 949; 50 +
949 = 999 = – 0 (in 9’s complement rep.)
 Binary integers (r = 2, n = 4):
 2’s complement: – 5 = Complement (5) = 2^4 – 5 = 11ten or
1011two; 5 + 11 = 16 = 0 (in 4-bit representation)
 1’s complement: – 5 = Complement (5) = 2^4 – 1 – 5 = 10ten or
1010two; 5 + 10 = 15 = – 0 (in 1’s complement representation)
24
2’s-Complement to Decimal
Conversion
25
b_(n-1) b_(n-2) . . . b_1 b_0 = – 2^(n-1) × b_(n-1) + Σ (i = 0 to n-2) 2^i × b_i
8-bit conversion box (bit weights):
-128 64 32 16 8 4 2 1
Example:
-128 64 32 16 8 4 2 1
   1  1  1  1 1 1 0 1
– 128 + 64 + 32 + 16 + 8 + 4 + 1 = – 128 + 125 = – 3
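The conversion box translates directly into code: the MSB carries weight –2^(n–1) and every other bit its usual positive weight. A minimal sketch, with the function name chosen only for illustration:

```python
def twos_complement_to_decimal(bits: str) -> int:
    """Apply the weights -2**(n-1), 2**(n-2), ..., 2, 1 to an n-bit string."""
    n = len(bits)
    value = -int(bits[0]) * 2 ** (n - 1)          # sign bit has negative weight
    for i in range(n - 1):                        # remaining bits b_(n-2) .. b_0
        value += int(bits[n - 1 - i]) * 2 ** i
    return value


print(twos_complement_to_decimal("11111101"))  # -3
```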
For More on 2’s-Complement
 Chapter 4 in D. E. Knuth, The Art of Computer Programming:
Seminumerical Algorithms, Volume II, Second Edition, Addison-Wesley,
1981.
 A. al’Khwarizmi, Hisab al-jabr w’al-muqabala, 830.
 Read: A two part interview with D. E. Knuth, Communications of the
ACM (CACM), vol. 51, no. 7, pp. 35-39 (July), and no. 8, pp. 31-35
(August), 2008.
26
Donald E. Knuth (1938 - ) Abu Abd-Allah ibn Musa
al’Khwarizmi (~780 – 850)
Addition
 Adding bits:
 0 + 0 = 0
 0 + 1 = 1
 1 + 0 = 1
 1 + 1 = (1) 0
 Adding integers:
27
  0 0 0 . . . . . . 0 1 1 1 two = 7ten
+ 0 0 0 . . . . . . 0 1 1 0 two = 6ten
= 0 0 0 . . . . . . 1 (1)1 (1)0 (0)1 two = 13ten
carry out of each column (in parentheses above): 1 1 0
Subtraction
 Direct subtraction
  0 0 0 . . . . . . 0 1 1 1 two = 7ten
– 0 0 0 . . . . . . 0 1 1 0 two = 6ten
= 0 0 0 . . . . . . 0 0 0 1 two = 1ten
 Two’s complement subtraction by adding
  0 0 0 . . . . . . 0 1 1 1 two = 7ten
+ 1 1 1 . . . . . . 1 0 1 0 two = – 6ten
= 0 0 0 . . . . . . 0 (1)0 (1)0 (0)1 two = 1ten
carry: 1 1 1 . . . . . . 1 1 0 (the final carry out is discarded)
28
Overflow: An Error
 Examples: Addition of 3-bit integers (range – 4 to +3)
 –2 – 3 = –5:
    110 = –2
  + 101 = –3
  = 1011 → 011 = 3 (error)
 3 + 2 = 5:
    011 = 3
  + 010 = 2
  = 101 = –3 (error)
 Overflow rule: If two numbers with the same sign bit (both
positive or both negative) are added, the overflow occurs if
and only if the result has the opposite sign.
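The overflow rule can be checked mechanically. The sketch below adds two n-bit patterns, discards the carry out, and flags overflow when both operands share a sign bit that the result does not; the function name and the returned tuple are illustrative assumptions.

```python
def add_with_overflow(a: int, b: int, n: int):
    """Add two n-bit 2's complement patterns; return (result pattern, overflow flag)."""
    mask = (1 << n) - 1
    sign = 1 << (n - 1)
    result = (a + b) & mask                       # carry out of the MSB is discarded
    same_sign = (a & sign) == (b & sign)
    overflow = same_sign and (result & sign) != (a & sign)
    return result, overflow


print(add_with_overflow(0b110, 0b101, 3))  # (3, True):  -2 + -3 overflows
print(add_with_overflow(0b011, 0b010, 3))  # (5, True):   3 +  2 overflows (101 reads as -3)
print(add_with_overflow(0b011, 0b111, 3))  # (2, False):  3 + -1 = 2
```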
29
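[Figure: 3-bit circle (000 = 0, 001 = 1, 010 = 2, 011 = 3, 100 = –4, 101 = –3, 110 = –2, 111 = –1) with the overflow crossing marked between the positive side (011 = 3) and the negative side (100 = –4).]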
Overflow and Finite Universe
30
[Figure: in the infinite universe of integers (. . . 1111, 0000, 0001, 0010, 0011, 0100, 0101 . . . , from –∞ to ∞) numbers can increase or decrease with no overflow. In the finite universe of 4-bit binary integers, drawn as a circle of 0000 through 1111, a forbidden fence separates 0111 from 1000; increasing or decreasing across it is overflow.]
Adding Two Bits
a b | s = a + b (decimal) | binary (CARRY SUM)
0 0 | 0 | 0 0
0 1 | 1 | 0 1
1 0 | 1 | 0 1
1 1 | 2 | 1 0
31
Half-Adder Adds two Bits
“half” because it has no carry
input
 Adding two bits:
a b | a + b (carry sum)
0 0 | 0 0
0 1 | 0 1
1 0 | 0 1
1 1 | 1 0
32
[Figure: half-adder (HA) with inputs a and b; sum = a XOR b, carry = a AND b.]
Full Adder: Include Carry
Input
a b c | s = a + b + c (decimal) | binary (CARRY SUM)
0 0 0 | 0 | 0 0
0 0 1 | 1 | 0 1
0 1 0 | 1 | 0 1
0 1 1 | 2 | 1 0
1 0 0 | 1 | 0 1
1 0 1 | 2 | 1 0
1 1 0 | 2 | 1 0
1 1 1 | 3 | 1 1
33
Full-Adder Adds Three Bits
34
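[Figure: full-adder (FA) built from two half-adders (HA), each an XOR gate and an AND gate, plus an OR gate that combines the two half-adder carries. Inputs a, b, c; outputs sum and carry.]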
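The gate-level structure can be mirrored with Python’s bitwise operators on single bits (0 or 1). This is a sketch of the construction from two half-adders and an OR gate, not a timing model.

```python
def half_adder(a: int, b: int):
    """Return (sum, carry) for two input bits: XOR gives sum, AND gives carry."""
    return a ^ b, a & b


def full_adder(a: int, b: int, c: int):
    """Return (sum, carry) for three input bits using two half-adders and an OR."""
    s1, c1 = half_adder(a, b)      # first half-adder adds a and b
    s, c2 = half_adder(s1, c)      # second half-adder adds the carry input
    return s, c1 | c2              # at most one of c1, c2 can be 1


for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            s, carry = full_adder(a, b, c)
            print(a, b, c, "->", carry, s)
```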
32-bit Ripple-Carry Adder
35
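[Figure: 32 full-adders FA0, FA1, FA2, . . . , FA31 in a chain. FAi takes ai, bi and the carry ci (c0 = 0) and produces sum bit si and carry c(i+1), which feeds the next stage. The final carry c32 is discarded.]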
c32 c31 . . . c2 c1 0
a31 . . . a2 a1 a0
+ b31 . . . b2 b1 b0
s31 . . . s2 s1 s0
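A bit-serial simulation of the ripple-carry chain, assuming operands are given as non-negative Python integers holding n-bit patterns. The function name is an illustrative choice; the loop body is the full-adder logic for one bit position.

```python
def ripple_carry_add(a: int, b: int, n: int = 32) -> int:
    """Add two n-bit patterns one bit at a time, passing the carry upward."""
    carry = 0                                  # c0 = 0
    result = 0
    for i in range(n):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        s = ai ^ bi ^ carry                    # sum bit s_i
        carry = (ai & bi) | (carry & (ai ^ bi))  # carry c_(i+1)
        result |= s << i
    return result                              # final carry c_n is discarded


print(ripple_carry_add(7, 6))             # 13
print(ripple_carry_add(7, 0xFFFFFFFA))    # 1, i.e. 7 + (-6) in 32-bit 2's complement
```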
How Fast is Ripple-Carry
Adder?
 Longest delay path (critical path) runs from (a0,
b0) to sum31.
 Suppose delay of full-adder is 100ps.
 Critical path delay = 32 × 100ps = 3,200ps
 Clock rate cannot be higher than 1/(3,200×10^–12)
Hz = 312.5MHz.
 Must use more efficient ways to handle carry.
36
Speeding Up the Adder
37
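[Figure: one 16-bit ripple carry adder adds a0-a15 and b0-b15 with c0 = 0 to produce s0-s15. Two more 16-bit ripple carry adders both add a16-a31 and b16-b31, one with carry-in 0 and the other with carry-in 1. A multiplexer with inputs 0 and 1 selects s16-s31 from the two results according to the carry out of the lower adder.]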
This is a carry-select adder
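A behavioural sketch of the carry-select idea for 32 bits: the upper half is computed for both possible carry-in values and the carry out of the lower half picks one. The helper add_block stands in for a 16-bit ripple-carry adder and is an assumption of this sketch.

```python
def add_block(a: int, b: int, carry_in: int, width: int):
    """Model one adder block: return (sum pattern, carry out)."""
    total = a + b + carry_in
    return total & ((1 << width) - 1), total >> width


def carry_select_add32(a: int, b: int) -> int:
    """32-bit addition with the upper 16 bits precomputed for carry-in 0 and 1."""
    mask16 = 0xFFFF
    low, c16 = add_block(a & mask16, b & mask16, 0, 16)
    high_if_0, _ = add_block((a >> 16) & mask16, (b >> 16) & mask16, 0, 16)
    high_if_1, _ = add_block((a >> 16) & mask16, (b >> 16) & mask16, 1, 16)
    high = high_if_1 if c16 else high_if_0     # the multiplexer
    return (high << 16) | low


print(hex(carry_select_add32(0x0000FFFF, 1)))  # 0x10000
```

In hardware the three blocks work in parallel, so the delay is roughly that of one 16-bit adder plus the multiplexer, at the cost of a duplicated upper adder.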
Fast Adders
 In general, any output of a 32-bit adder can be
evaluated as a logic expression in terms of all 65
inputs.
 Number of levels of logic can be reduced to
log_2 N for N-bit adder. Ripple-carry has N levels.
 More gates are needed, about log_2 N times that
of ripple-carry design.
 Fastest design is known as carry lookahead
adder.
38
N-bit Adder Design Options
Type of adder | Time complexity (delay) | Space complexity (size)
Ripple-carry | O(N) | O(N)
Carry-lookahead | O(log_2 N) | O(N log_2 N)
Carry-skip | O(√N) | O(N)
Carry-select | O(√N) | O(N)
39
Reference: J. L. Hennessy and D. A. Patterson, Computer Architecture:
A Quantitative Approach, Second Edition, San Francisco, California,
1990, page A-46.
Binary Multiplication
(Unsigned)
40
      1 0 0 0 two = 8ten multiplicand
      1 0 0 1 two = 9ten multiplier
      ____________
      1 0 0 0
    0 0 0 0        partial products
  0 0 0 0
1 0 0 0
____________
1 0 0 1 0 0 0 two = 72ten
Basic algorithm: For n = 1 to 32,
if the nth bit of the multiplier is 1,
then add multiplicand × 2^(n–1)
to the product.
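The basic algorithm can be sketched as a shift-and-add loop. The loop index k below runs from 0, so it corresponds to n – 1 in the slide’s wording; the function name is illustrative.

```python
def multiply_unsigned(multiplicand: int, multiplier: int, n: int = 32) -> int:
    """Add multiplicand * 2**k to the product for every 1 bit k of the multiplier."""
    product = 0
    for k in range(n):
        if (multiplier >> k) & 1:
            product += multiplicand << k       # shifted partial product
    return product


print(format(multiply_unsigned(0b1000, 0b1001, 4), "b"))  # 1001000
```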
Digital Circuits for
Multiplication
 Need:
 Three registers for multiplicand, multiplier and product.
 Adder or arithmetic logic unit (ALU).
 What is a register
 A memory device – unit cell stores one bit.
 A 32-bit register has 32 storage cells. It can store a 32-bit
integer.
41
bit 0
bit 31
1 bit right shift divides integer by 2
1 bit left shift multiplies integer by 2
Multiplication Flowchart
42
LSB
of multiplier
?
Initialize product register to 0 Partial product number, n = 1
Left shift multiplicand register 1 bit
Right shift multiplier register 1 bit
n = ? n = n + 1
Done
Start
Add multiplicand to
product and place result
in product register
1 0
n < 32
n = 32
N = 32
Serial Multiplication
43
64-bit product register, initially 0
64
64
64
64-bit ALU
Test LSB
N = 32 times
shift right
32-bit multiplier
shift left
write
3 operations per bit:
shift right
shift left
add
Need 64-bit ALU
Multiplicand (expanded 64-bits)
LSB = 0
LSB
= 1
add
Shift l/r
LSB
after
add
N = 32
after
add
Serial Multiplication
(Improved)
44
Multiplicand
64-bit product register
32
32
32
32-bit ALU
Test LSB
32 times
LSB
(3) shift right
00000 . . . 00000 32-bit multiplier Initialized product register
2 operations per bit:
shift right
add
32-bit ALU
1
1
(1) add
1 or 0
N = 32
Example: 0010two× 0011two
Iteration Step Multiplicand Product
0 Initial values 0010 0000 0011
1 LSB =1 → Prod = Prod + Mcand 0010 0010 0011
Right shift product 0010 0001 0001
2 LSB =1 → Prod = Prod + Mcand 0010 0011 0001
Right shift product 0010 0001 1000
3 LSB = 0 → no operation 0010 0001 1000
Right shift product 0010 0000 1100
4 LSB = 0 → no operation 0010 0000 1100
Right shift product 0010 0000 0110
45
0010two × 0011two = 0110two, i.e., 2ten × 3ten = 6ten
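The improved scheme can be simulated with a single integer standing in for the product register, whose lower half initially holds the multiplier. This is a sketch under that assumption; a Python integer absorbs the carry out of the upper-half addition, which hardware would keep in an extra bit.

```python
def serial_multiply(multiplicand: int, multiplier: int, n: int) -> int:
    """Unsigned n-bit multiply using one 2n-bit product register."""
    product = multiplier                       # upper half 0, lower half = multiplier
    for _ in range(n):
        if product & 1:                        # test LSB
            product += multiplicand << n       # add multiplicand to the upper half
        product >>= 1                          # shift the whole register right
    return product


print(format(serial_multiply(0b0010, 0b0011, 4), "08b"))  # 00000110
```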
Multiplying with Signs
 Convert numbers to magnitudes.
 Multiply the two magnitudes through 32 iterations.
 Negate the result if the signs of the multiplicand
and multiplier differed.
 Alternatively, the previous algorithm will work with
some modifications. See B. Parhami, Computer
Architecture, NewYork: Oxford University Press,
2005, pp. 199-200.
46
Alternative Method with
Signs
 In the improved method:
 Use 2N + 1 bit product register
 Use N + 1 bit multiplicand register
 Use N + 1 bit adder
 Proceed as in the improved method, except –
 In the last (Nth) iteration, if LSB = 1, subtract
multiplicand instead of adding.
47
Example 1: 1010two× 0011two
Iteration Step Multiplicand Product
0 Initial values 11010 00000 0011
1 LSB = 1 → Prod = Prod + Mcand 11010 11010 0011
Right shift product 11010 11101 0001
2 LSB = 1 → Prod = Prod + Mcand 11010 10111 0001
Right shift product 11010 11011 1000
3 LSB = 0 → no operation 11010 11011 1000
Right shift product 11010 11101 1100
4 LSB = 0 → no operation 11010 11101 1100
Right shift product 11010 11110 1110
48
1010two × 0011two = 101110two, i.e., – 6ten × 3ten = – 18ten
Example 2: 1010two× 1011two
Iteration Step Multiplicand Product
0 Initial values 11010 00000 1011
1 LSB =1 → Prod = Prod + Mcand 11010 11010 1011
Right shift product 11010 11101 0101
2 LSB =1 → Prod = Prod + Mcand 11010 10111 0101
Right shift product 11010 11011 1010
3 LSB = 0 → no operation 11010 11011 1010
Right shift product 11010 11101 1101
4 LSB =1 → Prod = Prod – Mcand* 00110 00011 1101
Right shift product 11010 00001 1110
49
1010two × 1011two = 011110two, i.e., – 6ten × ( – 5ten) = 30ten
*Last iteration with a negative multiplier in 2’s complement.
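The signed variant can be sketched the same way. Here the upper part of the product register is kept as a signed Python integer (standing in for the N + 1 bit field), the shift is arithmetic, and the last iteration subtracts instead of adding. The function name and the register split are assumptions of this sketch.

```python
def signed_serial_multiply(multiplicand: int, multiplier: int, n: int) -> int:
    """Multiply two n-bit 2's complement patterns; return the signed product."""
    # Interpret the multiplicand pattern as a signed value (sign extension).
    m = multiplicand - (1 << n) if multiplicand >> (n - 1) else multiplicand
    upper, lower = 0, multiplier               # product register halves
    for i in range(n):
        if lower & 1:
            # The MSB of a 2's complement multiplier has negative weight.
            upper = upper - m if i == n - 1 else upper + m
        # Arithmetic right shift of the combined (upper, lower) register.
        lower = (lower >> 1) | ((upper & 1) << (n - 1))
        upper >>= 1                            # >> on a negative int keeps the sign
    return upper * (1 << n) + lower


print(signed_serial_multiply(0b1010, 0b0011, 4))  # -18
print(signed_serial_multiply(0b1010, 0b1011, 4))  # 30
```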
Adding Partial Products
50
y3 y2 y1 y0 multiplicand
x3 x2 x1 x0 multiplier
________________________
x0y3 x0y2 x0y1 x0y0 four
carry←x1y3 x1y2 x1y1 x1y0 partial
carry←x2y3 x2y2 x2y1 x2y0 products
carry← x3y3 x3y2 x3y1 x3y0 to be
__________________________________________________ summed
p7 p6 p5 p4 p3 p2 p1 p0
Requires three 4-bit additions. Slow.
Array Multiplier: Carry
Forward
51
y3 y2 y1 y0 multiplicand
x3 x2 x1 x0 multiplier
________________________
x0y3 x0y2 x0y1 x0y0 four
x1y3 x1y2 x1y1 x1y0 partial
x2y3 x2y2 x2y1 x2y0 products
x3y3 x3y2 x3y1 x3y0 to be
__________________________________________________ summed
p7 p6 p5 p4 p3 p2 p1 p0
Note: Carry is added to the next partial product (carry-save addition).
Adding the carry from the final stage needs an extra (ripple-carry)
stage. These additions are faster but we need four stages.
Basic Building Blocks
 Two-input AND
 Full-adder
52
Full
adder
yi x0
p0i = x0yi
0th partial product
sum bit
to (k+1)th
sum
sum bit
from (k-1)th
sum
yi xk
carry bits
from (k-1)th
sum
carry bits
to (k+1)th
sum
Slide 24
ith bit of
kth partial
product
Array Multiplier
53
y3 y2 y1 y0
x0
x1
x2
x3
FA
xi
yj
ppk
ppk+1
co
0
0
0
ci
0
0 0 0 0
p7 p6 p5 p4 p3 p2 p1 p0
FA FA FA FA
Critical path
0
Types of Array Multipliers
 Baugh-Wooley Algorithm: Signed product by
two’s complement addition or subtraction
according to the MSB’s.
 Booth multiplier algorithm
 Tree multipliers
 Reference: N. H. E.Weste and D. Harris,
CMOSVLSI Design, A Circuits and Systems
Perspective,Third Edition, Boston: Addison-
Wesley, 2005.
54
Binary Division (Unsigned)
55
1 3 Quotient
1 1 / 1 4 7 Divisor / Dividend
1 1
3 7 Partial remainder
3 3
4 Remainder
0 0 0 0 1 1 0 1
1 0 1 1 / 1 0 0 1 0 0 1 1
1 0 1 1
0 0 1 1 1 0
1 0 1 1
0 0 1 1 1 1
1 0 1 1
1 0 0
4-bit Binary Division
(Unsigned)
56
Dividend: 6 = 0110
Divisor: 4 = 0100
– 4 = 1100
6
─ = 1, remainder 2
4
0 0 0 1
0 0 0 0 1 1 0
1 1 0 0
1 1 0 0 negative → quotient bit 0
0 1 0 0 → restore remainder
0 0 0 0 1 1 0
1 1 0 0
1 1 0 1 negative → quotient bit 0
0 1 0 0 → restore remainder
0 0 0 1 1 0
1 1 0 0
1 1 1 1 negative → quotient bit 0
0 1 0 0 → restore remainder
0 0 1 1 0
1 1 0 0
0 0 1 0 positive → quotient bit 1
Iteration
4
Iteration
3
Iteration
2
Iteration
1
32-bit Binary Division
Flowchart
57
$R = 0, $M = Divisor, $Q = Dividend, count = n
Shift 1-bit left $R, $Q
$R ← $R – $M
$R < 0?
$Q0 = 1
$Q0=0
$R ← $R + $M
count = count – 1
count = 0?
Done
$Q = Quotient
$R = Remainder
Start
Yes
Yes
No
No
$R and $M have
one extra sign bit
beyond 32 bits.
Restore $R
(remainder)
$R (33 b) | $Q (32 b)
4-bit Example: 6/4 = 1, Remainder
2 Actions n $R, $Q $M = Divisor
Initialize 4 0 0 0 0 0 | 0 1 1 0 0 0 1 0 0
Shift left $R, $Q 4 0 0 0 0 0 | 1 1 0 0 0 0 1 0 0
Add – $M (11100) to $R 4 1 1 1 0 0 | 1 1 0 0 0 0 1 0 0
Restore, add $M (00100) to $R 3 0 0 0 0 0 | 1 1 0 0 0 0 1 0 0
Shift left $R, $Q 3 0 0 0 0 1 | 1 0 0 0 0 0 1 0 0
Add – $M (11100) to $R 3 1 1 1 0 1 | 1 0 0 0 0 0 1 0 0
Restore, add $M (00100) to $R 2 0 0 0 0 1 | 1 0 0 0 0 0 1 0 0
Shift left $R, $Q 2 0 0 0 1 1 | 0 0 0 0 0 0 1 0 0
Add – $M (11100) to $R 2 1 1 1 1 1 | 0 0 0 0 0 0 1 0 0
Restore, add $M (00100) to $R 1 0 0 0 1 1 | 0 0 0 0 0 0 1 0 0
Shift left $R, $Q 1 0 0 1 1 0 | 0 0 0 0 0 0 1 0 0
Add – $M (11100) to $R 1 0 0 0 1 0 | 0 0 0 0 0 0 1 0 0
Set LSB of $Q = 1 0 0 0 0 1 0 | 0 0 0 1 0 0 1 0 0
58
Remainder | Quotient
count
4
3
2
1
0
Division
59
Initialize
$R←0
33-bit $M (Divisor)
33-bit $R (Remainder)
33
33
33
33-bit ALU
32 times
Step 1: 1- bit left shift $R and $Q
32-bit $Q (Dividend)
Step 2: Subtract $R ← $R – $M
Step 3: If sign-bit ($R) = 0, set Q0 = 1
If sign-bit ($R) = 1, set Q0 = 0 and restore $R
V. C. Hamacher, Z. G. Vranesic and S. G. Zaky, Computer Organization, Fourth Edition,
New York: McGraw-Hill, 1996.
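Steps 1 to 3 can be simulated with two Python integers for $R and $Q. This is a minimal sketch that assumes unsigned n-bit operands and a nonzero divisor; $R is allowed to go negative, playing the role of the extra sign bit.

```python
def restoring_divide(dividend: int, divisor: int, n: int):
    """Restoring division; return (quotient, remainder)."""
    r, q = 0, dividend
    mask = (1 << n) - 1
    for _ in range(n):
        # Step 1: shift the combined ($R, $Q) register left by one bit.
        r = (r << 1) | ((q >> (n - 1)) & 1)
        q = (q << 1) & mask
        # Step 2: subtract the divisor.
        r -= divisor
        # Step 3: set the quotient bit, restoring $R if it went negative.
        if r < 0:
            r += divisor                       # Q0 stays 0
        else:
            q |= 1                             # Q0 = 1
    return q, r


print(restoring_divide(6, 4, 4))    # (1, 2)
print(restoring_divide(147, 11, 8)) # (13, 4)
```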
Example: 8/3 = 2, Remainder = 2
60
Initialize $R = 0 0 0 0 0 $Q = 1 0 0 0 $M = 0 0 0 1 1
Step 1, L-shift $R = 0 0 0 0 1 $Q = 0 0 0 0
Step 2, Add – $M = 1 1 1 0 1
$R = 1 1 1 1 0
Step 3, Set Q0 $Q = 0 0 0 0
Restore + $M = 0 0 0 1 1
$R = 0 0 0 0 1
Step 1, L-shift $R = 0 0 0 1 0 $Q = 0 0 0 0 $M = 0 0 0 1 1
Step 2, Add – $M = 1 1 1 0 1
$R = 1 1 1 1 1
Step 3, Set Q0 $Q = 0 0 0 0
Restore + $M = 0 0 0 1 1
$R = 0 0 0 1 0
Iteration
2
Iteration
1
Example: 8/3 = 2 (Remainder = 2)
(Continued)
61
$R = 0 0 0 1 0 $Q = 0 0 0 0 $M = 0 0 0 1 1
Step 1, L-shift $R = 0 0 1 0 0 $Q = 0 0 0 0 $M = 0 0 0 1 1
Step 2, Add – $M = 1 1 1 0 1
$R = 0 0 0 0 1
Step 3, Set Q0 $Q = 0 0 0 1
Step 1, L-shift $R,Q = 0 0 0 1 0 $Q = 0 0 1 0 $M = 0 0 0 1 1
Step 2, Add – $M = 1 1 1 0 1
$R = 1 1 1 1 1
Step 3, Set Q0 $Q = 0 0 1 0 Final quotient
Restore + $M = 0 0 0 1 1
$R = 0 0 0 1 0
Iteration
4
Iteration
3
Note “Restore $R” in Iterations 1, 2 and 4. This method is known as
RESTORING DIVISION. An improved method, NON-RESTORING
DIVISION, is possible (see Hamacher, et al.)
Remainder
Non-Restoring Division
 Avoid unnecessary addition (restoration).
 How does it work?
 Initially $R contains dividend ✕ 2^–n for n-bit numbers. Example (n = 8):
 In some iteration after left shift, suppose $R = x and divisor is y
 Subtract divisor, $R = x – y
 Restore: If $R is negative, add y, $R = x
 Next step: Left shift, $R = 2x+b, and subtract y, $R = 2x – y + b
62
00101101
00000000 00101101
Dividend
$R, $Q
How It Works: Last two Steps
 Suppose we do not restore and go to next step:
 Left shift, $R = 2(x – y) + b = 2x – 2y + b, and add y, then $R = 2x – 2y + y
+ b = 2x – y + b (same result as with restoration)
 Non-restoring division
 Initialize and start iterations same as in restoring division
by subtracting divisor
 In any iteration after left shift and subtraction/addition
 If $R is positive, subtract divisor (y) in next iteration
 If $R is negative, add divisor (y) in next iteration
 After final iteration, if $R is negative then restore it by
adding divisor (y)
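The rules above can be sketched as follows, again assuming unsigned n-bit operands and a nonzero divisor. The sign of $R left by the previous iteration decides whether the divisor is added or subtracted, and only one restoring addition may be needed at the very end.

```python
def nonrestoring_divide(dividend: int, divisor: int, n: int):
    """Non-restoring division; return (quotient, remainder)."""
    r, q = 0, dividend
    mask = (1 << n) - 1
    for _ in range(n):
        was_negative = r < 0
        # Left shift the combined ($R, $Q) register by one bit.
        r = (r << 1) | ((q >> (n - 1)) & 1)
        q = (q << 1) & mask
        # Add the divisor if $R was negative, otherwise subtract it.
        r = r + divisor if was_negative else r - divisor
        if r >= 0:
            q |= 1                             # Q0 = 1 only when $R is not negative
    if r < 0:
        r += divisor                           # final restoration
    return q, r


print(nonrestoring_divide(8, 3, 4))  # (2, 2)
```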
63
Example: 8/3 = 2, Remainder = 2
Non-Restoring Division
64
Initialize $R = 0 0 0 0 0 $Q = 1 0 0 0 $M = 0 0 0 1 1
Step 1, L-shift $R = 0 0 0 0 1 $Q = 0 0 0 0
Step 2, Add – $M = 1 1 1 0 1
$R = 1 1 1 1 0 $Q = 0 0 0 0
Step 3, Set Q0
Step 1, L-shift $R = 1 1 1 0 0 $Q = 0 0 0 0 $M = 0 0 0 1 1
Step 2, Add + $M = 0 0 0 1 1
$R = 1 1 1 1 1 $Q = 0 0 0 0
Step 3, Set Q0
Iteration
2
Iteration
1
Add + $M in next iteration
Example: 8/3 = 2 (Remainder = 2)
Non-Restoring Division
(Continued)
65
$R = 1 1 1 1 1 $Q = 0 0 0 0 $M = 0 0 0 1 1
Step 1, L-shift $R = 1 1 1 1 0 $Q = 0 0 0 0 $M = 0 0 0 1 1
Step 2, Add + $M = 0 0 0 1 1
$R = 0 0 0 0 1 $Q = 0 0 0 1
Step 3, Set Q0
Step 1, L-shift $R,Q = 0 0 0 1 0 $Q = 0 0 1 0 $M = 0 0 0 1 1
Step 2, Add – $M = 1 1 1 0 1
$R = 1 1 1 1 1 $Q = 0 0 1 0 Final quotient = 2
Step 3, Set Q0
Restore + $M = 0 0 0 1 1
$R = 0 0 0 1 0
Iteration
4
Iteration
3
See V. C. Hamacher, Z. G. Vranesic and S. G. Zaky, Computer
Organization, Fourth Edition, McGraw-Hill, 1996, Section 6.9, pp.
281-285.
Remainder = 2
Signed Division
 Remember the signs and divide magnitudes.
 Negate the quotient if the signs of divisor and
dividend disagree.
 There is no other direct division method for
signed division.
66
Symbol Representation
 Early versions (60s and 70s)
 Six-bit binary code (Control Data Corp., CDC)
 EBCDIC – extended binary coded decimal interchange
code (IBM)
 Presently used –
 ASCII – American standard code for information
interchange – 7 bit code specified by American National
Standards Institute (ANSI), see Table 1.11 on page 63; an
eighth MSB is often used as parity bit to construct a
byte-code.
 Unicode – 16 bit code and an extended 32 bit version
67
ASCII
 Each byte pattern represents a character (symbol)
 Convenient to write in hexadecimal, e.g., with even parity,
 00000000 0ten 00hex null
 01000001 65ten 41hex A
 11100001 225ten E1hex a
 Table 1.11 on page 63 gives the 7-bit ASCII code.
 C program – string – terminating with a null byte (odd parity):
01000101 = 69ten or 45hex = E
01000011 = 67ten or 43hex = C
01000101 = 69ten or 45hex = E
10000000 = 128ten or 80hex = (null)
68
Error Detection Code
 Errors: Bits can flip due to noise in circuits and in
communication.
 Extra bits used for error detection.
 Example: a parity bit in ASCII code
69
Even parity code for A 01000001
(even number of 1s)
Odd parity code for A 11000001
(odd number of 1s)
7-bit ASCII code
Parity bits
Single-bit error in 7-bit code of “A”, e.g., 1000101, will change
symbol to “E” or 1000000 to “@”. But error will be detected in
the 8-bit code because the error changes the specified parity.
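A small sketch of how the parity bit is formed and checked, with the parity bit placed in the eighth (most significant) position as in the examples above. Function names are illustrative.

```python
def add_parity(code7: int, odd: bool = False) -> int:
    """Prefix a 7-bit code with a parity bit (even parity unless odd=True)."""
    parity = bin(code7).count("1") % 2         # 1 if the 7-bit code has an odd number of 1s
    if odd:
        parity ^= 1
    return (parity << 7) | code7


def parity_ok(byte: int, odd: bool = False) -> bool:
    """Check that an 8-bit code has the specified parity."""
    return bin(byte).count("1") % 2 == (1 if odd else 0)


print(format(add_parity(ord("A")), "08b"))            # 01000001
print(format(add_parity(ord("A"), odd=True), "08b"))  # 11000001
print(parity_ok(0b01000101))                          # False: "A" with one flipped bit
```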
Richard W. Hamming
 Error-correcting codes
(ECC).
 Also known for
 Hamming distance (HD) =
Number of bits two binary
vectors differ in
 Example:
HD(1101, 1010) = 3
 Hamming Medal, 1988
70
1915 -1998
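Hamming distance is simple to compute. A minimal sketch over bit strings of equal length:

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions in which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


print(hamming_distance("1101", "1010"))  # 3
```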
The Idea of Hamming Code
71
Code space contains 2^N possible N-bit code words.
[Figure: a 1-bit error in “A” (1010) produces “E” (1110), “B” (1011), “8” (1000) or “2” (0010), each at HD = 1 from “A”.]
Error not correctable. Reason: No redundancy.
Hamming’s idea: Increase HD between valid code words.
N = 4
Code Symbol
0000 0
0001 1
0010 2
0011 3
0100 4
0101 5
0110 6
0111 7
1000 8
1001 9
1010 A
1011 B
1100 C
1101 D
1110 E
1111 F
Hamming’s Distance ≥ 3 Code
72
1010010
”A”
1-bit error in “A”
shortest distance
decoding eliminates
error
HD = 2
HD = 1
0010101
”2”
1000111
”8”
1011001
”B”
1110100
”E”
HD = 3
HD = 3
HD = 3
HD = 4
0010010
”?”
HD = 3
HD = 4
HD = 4
0011110
”3”
HD = 3
Minimum Distance-3 Hamming Code
Symbol  Original code  Odd-parity code  ECC (HD ≥ 3)
0 0000 10000 0000000
1 0001 00001 0001011
2 0010 00010 0010101
3 0011 10011 0011110
4 0100 00100 0100110
5 0101 10101 0101101
6 0110 10110 0110011
7 0111 00111 0111000
8 1000 01000 1000111
9 1001 11001 1001100
A 1010 11010 1010010
B 1011 01011 1011001
C 1100 11100 1100001
D 1101 01101 1101010
E 1110 01110 1110100
F 1111 11111 1111111
73
Original code: Symbol “0” with a
single-bit error will be Interpreted as
“1”, “2”, “4” or “8”.
Reason: Hamming distance between
codes is 1. A code with any bit error will
map onto another valid code.
Remedy 1: Design codes with HD ≥ 2.
Example: Parity code. Single bit error
detected but not correctable.
Remedy 2: Design codes with HD ≥ 3.
For single bit error correction, decode
as the valid code at HD = 1.
For more error bit detection or
correction, design code with HD ≥ 4.
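Shortest-distance decoding with the distance-3 code from the table can be sketched as below. The code words are copied from the ECC column; the dictionary and function names are illustrative assumptions.

```python
CODEWORDS = {
    "0": "0000000", "1": "0001011", "2": "0010101", "3": "0011110",
    "4": "0100110", "5": "0101101", "6": "0110011", "7": "0111000",
    "8": "1000111", "9": "1001100", "A": "1010010", "B": "1011001",
    "C": "1100001", "D": "1101010", "E": "1110100", "F": "1111111",
}


def hamming_distance(a: str, b: str) -> int:
    """Count the positions in which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def decode(received: str) -> str:
    """Return the symbol whose code word is closest to the received word."""
    return min(CODEWORDS, key=lambda s: hamming_distance(CODEWORDS[s], received))


print(decode("0010010"))  # A: the code word 1010010 with its first bit flipped
```

Because every pair of valid code words differs in at least 3 bits, a word with a single flipped bit is still at distance 1 from exactly one code word, so the nearest code word is the one that was sent.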
Integers and Real Numbers
 Integers: the universe is infinite but discrete
 No fractions
 No numbers between consecutive integers, e.g., 5 and 6
 A countable (finite) number of items in a finite range
 Referred to as fixed-point numbers
 Real numbers – the universe is infinite and continuous
 Fractions represented by decimal notation
 Rational numbers, e.g., 5/2 = 2.5
 Irrational numbers, e.g., π = 3.14159265 . . . (22/7 is only a rational approximation)
 Infinite numbers exist even in the smallest range
 Referred to as floating-point numbers
74
Wide Range of Numbers
 A large number:
976,000,000,000,000 = 9.76 × 10^14
 A small number:
0.0000000000000976 = 9.76 × 10^–14
75
Scientific Notation
 Decimal numbers
 0.513×105, 5.13×104 and 51.3×103 are written in scientific
notation.
 5.13×104 is the normalized scientific notation.
 Binary numbers
 Base 2
 Binary point – multiplication by 2 moves the point to the
right.
 Normalized scientific notation, e.g., 1.0two×2 –1
76
Floating Point Numbers
 General format
±1.bbbbbtwo × 2^eeee
or (-1)^S × (1+F) × 2^E
 Where
 S = sign, 0 for positive, 1 for negative
 F = fraction (or mantissa) as a binary integer,
1+F is called significand
 E = exponent as a binary integer, positive or
negative (two’s complement)
77
Binary to Decimal Conversion
78
Binary: (-1)^S × (1.b1b2b3b4) × 2^E
Decimal: (-1)^S × (1 + b1×2^-1 + b2×2^-2 + b3×2^-3 + b4×2^-4) × 2^E
Example: -1.1100 × 2^-2 (binary) = - (1 + 2^-1 + 2^-2) × 2^-2
= - (1 + 0.5 + 0.25)/4
= - 1.75/4
= - 0.4375 (decimal)
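The conversion formula can be sketched as a short function taking the sign bit, the fraction bits b1 b2 . . . as a string, and the exponent as a Python integer. The name and parameter layout are illustrative choices.

```python
def binary_float_to_decimal(sign: int, fraction_bits: str, exponent: int) -> float:
    """Evaluate (-1)**S * (1 + sum of b_i * 2**-i) * 2**E."""
    significand = 1.0                          # implicit leading 1
    for i, bit in enumerate(fraction_bits, start=1):
        significand += int(bit) * 2.0 ** -i
    return (-1) ** sign * significand * 2.0 ** exponent


print(binary_float_to_decimal(1, "1100", -2))  # -0.4375
```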
William Morton (Velvel)
Kahan
79
1989 Turing Award Citation:
For his fundamental contributions to
numerical analysis. One of the foremost
experts on floating-point computations.
Kahan has dedicated himself to "making
the world safe for numerical
computations."
Architect of the IEEE floating point standard
b. 1933, Canada
Professor of Computer Science, UC-Berkeley
Numbers in 32-bit Formats
 Two’s complement integers
 Floating point numbers
 Ref:W. Stallings, Computer Organization and Architecture, Sixth Edition,
Upper Saddle River, NJ: Prentice-Hall.
80
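[Figure: number lines for the two 32-bit formats. Two’s complement integers: expressible numbers run from –2^31 through 2^31 – 1, with 0 in the middle. Floating point numbers: expressible negative numbers run from –(2 – 2^–23)×2^128 to –2^–127 and expressible positive numbers from 2^–127 to (2 – 2^–23)×2^128; negative overflow and positive overflow lie beyond the outer limits toward – ∞ and + ∞; negative underflow and positive underflow lie between ±2^–127 and zero, where negative zero and positive zero are marked.]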
IEEE 754 Floating Point
Standard
 Biased exponent: true exponent range
[-126,127] is changed to [1, 254]:
 Biased exponent is an 8-bit positive binary integer.
 True exponent obtained by subtracting 127ten or
01111111two
 First bit of significand is always 1:
± 1.bbbb . . . b × 2E
 1 before the binary point is implicitly assumed.
 Significand field represents 23 bit fraction after the binary
point.
 Significand range is [1, 2), to be exact [1, 2 – 2^–23]
81
Examples
82
1.1010001two × 2^10100two = 0 10010011 10100010000000000000000 = 1.6328125 × 2^20
-1.1010001two × 2^10100two = 1 10010011 10100010000000000000000 = -1.6328125 × 2^20
1.1010001two × 2^-10100two = 0 01101011 10100010000000000000000 = 1.6328125 × 2^-20
-1.1010001two × 2^-10100two = 1 01101011 10100010000000000000000 = -1.6328125 × 2^-20
Fields, left to right: sign bit, 8-bit biased exponent, 23-bit fraction (F) of significand.
Biased exponent (1-254), bias 127 (01111111) to be subtracted, e.g., 107 – 127 = – 20.
Significand: 1.0 + 0.5 + 0.125 + 0.0078125 = 1.6328125
Example: Conversion to Decimal
1 10000001 01000000000000000000000
Fields: sign bit S | bits 23-30, biased exponent E | bits 0-22, fraction F (normalized)
83
 Sign bit is 1, number is negative
 Biased exponent is 2^7 + 2^0 = 129
 The number is
(-1)^S × (1 + F) × 2^(exponent – bias) = (-1)^1 × (1 + F) × 2^(129 – 127)
= - 1 × 1.25 × 2^2
= - 1.25 × 4
= - 5.0
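The same decoding can be done in code and cross-checked against Python’s own single-precision unpacking. This sketch handles only normalized numbers (biased exponent 1 through 254); the function names are illustrative.

```python
import struct


def decode_single(bits: str) -> float:
    """Decode a normalized IEEE 754 single-precision bit string by hand."""
    bits = bits.replace(" ", "")
    sign = int(bits[0])
    biased_exponent = int(bits[1:9], 2)
    fraction = int(bits[9:], 2) / 2 ** 23      # F as a fraction in [0, 1)
    return (-1) ** sign * (1 + fraction) * 2.0 ** (biased_exponent - 127)


def decode_with_struct(bits: str) -> float:
    """Let the standard library interpret the same 32 bits as a float."""
    raw = int(bits.replace(" ", ""), 2).to_bytes(4, "big")
    return struct.unpack(">f", raw)[0]


word = "1 10000001 01000000000000000000000"
print(decode_single(word), decode_with_struct(word))  # -5.0 -5.0
```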
IEEE 754 Floating Point
Format
 Floating point numbers
84
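[Figure: IEEE 754 single-precision number line. Expressible negative numbers run from –(2 – 2^–23)×2^127 to –2^–126 and expressible positive numbers from 2^–126 to (2 – 2^–23)×2^127. Negative underflow and positive underflow lie between these ranges and – 0 and +0; negative overflow and positive overflow lie beyond them, toward – ∞ and + ∞.]
1 1011001 01001100000000010001101
Fields: sign bit S | bits 23-30, biased exponent (positive integer – 127 = E) | bits 0-22, fraction F (normalized)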
Positive Zero in IEEE 754
 + 1.0 × 2 –127
 Smaller than the smallest positive number in single-
precision IEEE 754 standard.
 Interpreted as positive zero.
 True exponent less than –126 is positive underflow;
can be regarded as zero.
85
0 00000000 00000000000000000000000
Biased
exponent
Fraction
Negative Zero in IEEE 754
 – 1.0 × 2 –127
 Greater than the largest negative number in single-
precision IEEE 754 standard.
 Interpreted as negative zero.
 True exponent less than –126 is negative underflow;
may be regarded as 0.
86
1 00000000 00000000000000000000000
Biased
exponent
Fraction
Positive Infinity in IEEE
754
 + 1.0 × 2128
 Greater than the largest positive number in single-
precision IEEE 754 standard.
 Interpreted as + ∞
 If true exponent > 127, then the number is greater
than ∞. It is called “not a number” or NaN and may
be interpreted as ∞.
87
0 11111111 00000000000000000000000
Biased
exponent
Fraction
Negative Infinity in IEEE
754
 –1.0 × 2128
 Smaller than the smallest negative number in single-
precision IEEE 754 standard.
 Interpreted as - ∞
 If true exponent > 127, then the number is less than -
∞. It is called “not a number” or NaN and may be
interpreted as - ∞.
88
1 11111111 00000000000000000000000
Biased
exponent
Fraction
Addition and Subtraction
0. Zero check
- Change the sign of subtrahend, i.e., convert to summation
- If either operand is 0, the other is the result
1. Significand alignment: right shift significand of
smaller exponent until two exponents match.
2. Addition: add significands and report error if
overflow occurs. If significand = 0, return result as
0.
3. Normalization
- Shift significand bits to normalize.
- report overflow or underflow if exponent goes out of range.
4. Rounding
89
Example (4 Significant Fraction
Bits)
 Subtraction: 0.5ten – 0.4375ten
 Step 0: Floating point numbers to be added
1.000two × 2^–1 and –1.110two × 2^–2
 Step 1: Significand of lesser exponent is shifted
right until exponents match
–1.110two × 2^–2 → – 0.111two × 2^–1
 Step 2: Add significands, 1.000two + ( – 0.111two)
Result is 0.001two × 2^–1
90
2’s complement addition, one bit added for sign:
  01000
+ 11001
= 00001
Example (Continued)
 Step 3: Normalize, 1.000two × 2^–4
No overflow/underflow since
127 ≥ exponent ≥ –126
 Step 4: Rounding, no change since the sum
fits in 4 bits.
1.000two × 2^–4 = (1+0)/16 = 0.0625ten
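Steps 0 to 3 of the example can be sketched with a toy format. Each number is a tuple (sign, significand, exponent), where the significand is an integer with frac_bits fraction bits, so 1.000two is 0b1000 when frac_bits = 3. This layout is an assumption of the sketch; bits shifted out during alignment are simply dropped, and rounding and exponent range checks are omitted.

```python
def fp_add(x, y, frac_bits: int = 3):
    """Add two toy floating point numbers given as (sign, significand, exponent)."""
    (sx, mx, ex), (sy, my, ey) = x, y
    # Step 0: zero check.
    if mx == 0:
        return y
    if my == 0:
        return x
    # Step 1: right shift the significand of the smaller exponent.
    if ex < ey:
        mx, ex = mx >> (ey - ex), ey
    else:
        my, ey = my >> (ex - ey), ex
    # Step 2: add the signed significands.
    total = (-mx if sx else mx) + (-my if sy else my)
    if total == 0:
        return (0, 0, 0)
    sign, mag, exp = (1 if total < 0 else 0), abs(total), ex
    # Step 3: normalize so that the significand lies in [1, 2).
    while mag >= (2 << frac_bits):
        mag, exp = mag >> 1, exp + 1
    while mag < (1 << frac_bits):
        mag, exp = mag << 1, exp - 1
    return (sign, mag, exp)


# 0.5 - 0.4375: 1.000two x 2^-1 plus -1.110two x 2^-2
print(fp_add((0, 0b1000, -1), (1, 0b1110, -2)))  # (0, 8, -4), i.e. 1.000two x 2^-4
```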
91
FP Multiplication: Basic
Idea
1. Separate sign
2. Add exponents (integer addition)
3. Multiply significands (integer multiplication)
4. Normalize, round, check overflow/underflow
5. Replace sign
92
FP Multiplication: Step 0
93
Multiply, X × Y = Z
X = 0? Y = 0?
Z = 0
Return
Steps 1 - 5
yes
no
yes
no
FP Multiplication
Illustration
 Multiply 0.5ten and – 0.4375ten
(answer = – 0.21875ten) or
 Multiply 1.000two × 2^–1 and –1.110two × 2^–2
 Step 1: Add exponents
–1 + (–2) = – 3
 Step 2: Multiply significands
      1.000
    × 1.110
       0000
      1000
     1000
    1000
    1110000    Product is 1.110000
94
FP Mult. Illustration
(Cont.)
 Step 3:
 Normalization: If necessary, shift significand right and
increment exponent.
Normalized product is 1.110000two × 2^–3 (already normalized, no shift needed)
 Check overflow/underflow: 127 ≥ exponent ≥ –126
 Step 4: Rounding: 1.110two × 2^–3
 Step 5: Sign: Operands have opposite signs,
Product is –1.110two × 2^–3
(Decimal value = – (1+0.5+0.25)/8 = – 0.21875ten)
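The multiplication steps can be sketched in Python on the same kind of toy format. The assumptions are: a number is a tuple (sign, sig, exp) with sign = +1 or -1, sig an integer significand with 3 fraction bits, both operands normalized, and rounding done by truncation. The names fp_mul and to_float are hypothetical.

```python
def to_float(x, frac_bits=3):
    """Value of a toy number (sign, sig, exp): sign * sig * 2^(exp - frac_bits)."""
    sign, sig, exp = x
    return sign * sig * 2.0 ** (exp - frac_bits)

def fp_mul(a, b, frac_bits=3):
    """Multiply two normalized toy floating-point numbers."""
    (sa, ma, ea), (sb, mb, eb) = a, b
    # Step 0: if either operand is zero, the product is zero.
    if ma == 0 or mb == 0:
        return (1, 0, 0)
    # Add exponents (integer addition).
    exp = ea + eb
    # Multiply significands (integer multiplication);
    # the product carries 2 * frac_bits fraction bits.
    prod = ma * mb
    prod_frac_bits = 2 * frac_bits
    # Normalize: the product of two values in [1, 2) lies in [1, 4).
    # If it is >= 2, move the binary point one place left and
    # increment the exponent.
    if prod >= (1 << (prod_frac_bits + 1)):
        prod_frac_bits += 1
        exp += 1
    if not -126 <= exp <= 127:
        raise OverflowError("exponent out of range")
    # Round back to frac_bits fraction bits (assumption: truncation).
    mag = prod >> (prod_frac_bits - frac_bits)
    # Sign: negative when the operand signs differ.
    return (sa * sb, mag, exp)

x = (1, 0b1000, -1)     # 1.000two x 2^-1 = 0.5
y = (-1, 0b1110, -2)    # -1.110two x 2^-2 = -0.4375
z = fp_mul(x, y)
print(z, to_float(z))
```

For the operands of the illustration, the integer product is 1110000two, which is already normalized, so truncating to three fraction bits gives –1.110two × 2^–3.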
FP Division: Basic Idea
 Separate sign.
 Check for zeros and infinity.
 Subtract exponents.
 Divide significands.
 Normalize and detect overflow/underflow.
 Perform rounding.
 Replace sign.
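A division sketch in the same spirit follows. It is an assumption-laden illustration: the toy format (sign, sig, exp) has 3 fraction bits and no encoding for infinity, so division by zero raises an exception instead of returning ± ∞; operands are assumed normalized, and rounding is by truncation. The names fp_div and to_float are hypothetical.

```python
def to_float(x, frac_bits=3):
    """Value of a toy number (sign, sig, exp): sign * sig * 2^(exp - frac_bits)."""
    sign, sig, exp = x
    return sign * sig * 2.0 ** (exp - frac_bits)

def fp_div(a, b, frac_bits=3):
    """Divide toy floating-point number a by b (both normalized)."""
    (sa, ma, ea), (sb, mb, eb) = a, b
    # Separate sign: negative when the operand signs differ.
    sign = sa * sb
    # Check for zeros (the toy format cannot represent infinity).
    if mb == 0:
        raise ZeroDivisionError("toy format has no infinity")
    if ma == 0:
        return (1, 0, 0)
    # Subtract exponents.
    exp = ea - eb
    # Divide significands, keeping one extra fraction bit because the
    # quotient of two values in [1, 2) lies in (0.5, 2).
    q = (ma << (frac_bits + 1)) // mb      # frac_bits + 1 fraction bits
    # Normalize.
    if q < (1 << (frac_bits + 1)):         # quotient < 1.0: shift left once
        exp -= 1
        mag = q
    else:                                  # quotient already in [1, 2)
        mag = q >> 1                       # rounding by truncation
    if not -126 <= exp <= 127:
        raise OverflowError("exponent out of range")
    # Replace sign.
    return (sign, mag, exp)

p = (-1, 0b1110, -3)    # -1.110two x 2^-3 = -0.21875
x = (1, 0b1000, -1)     # 1.000two x 2^-1 = 0.5
z = fp_div(p, x)
print(z, to_float(z))
```

Dividing the product from the multiplication illustration by 0.5 recovers the other operand, –1.110two × 2^–2.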