Transcript of "Quick tutorial on IEEE 754 FLOATING POINT representation"
1.
QUICK TUTORIAL ON IEEE 754
FLOATING POINT REPRESENTATION
-by
RITU RANJAN SHRIVASTWA
2.
Decimal to IEEE 754 Floating point
representation
The standard IEEE 754 single-precision representation of a floating point number uses 32 bits, divided into three parts, namely:
• Sign bit
• Exponent
• Mantissa
The representation in bit format is as follows
Sign bit (1 bit: 0 or 1) | EXPONENT (8 bits) | MANTISSA (23 bits)
To be represented in this format, a number should be in the following normalized form.
(+ or -) 1.(mantissa) x 2^(exponent)
Unless a question explicitly says not to normalize, the number should first be converted to its normalized form.
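The 1 + 8 + 23 bit split can be inspected on any machine. The following is a minimal Python sketch, not part of the original slides; the helper name float32_fields is made up for illustration, and it relies only on the standard struct module.

```python
import struct

def float32_fields(x):
    """Return the (sign, exponent, mantissa) bit strings of x as a 32-bit float."""
    # Pack x as a big-endian single-precision float, then reread the same
    # 4 bytes as an unsigned 32-bit integer so the raw bits can be printed.
    (n,) = struct.unpack(">I", struct.pack(">f", x))
    bits = f"{n:032b}"
    # 1 sign bit, 8 exponent bits, 23 mantissa bits
    return bits[0], bits[1:9], bits[9:]

print(float32_fields(1.0))
print(float32_fields(-2.0))
```

The exponent field printed here is the stored (biased) value, which the following slides explain.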
3.
Decimal to IEEE 754 Floating point
representation
To convert a number into its normalized form, we proceed as follows.
For example, we will take the decimal number +4.6
The part before the decimal point is not equal to 1, so the number is not yet in normalized form. To bring a 1 there, we keep dividing by 2 until just a 1 is left before the decimal point, counting the divisions. (A number smaller than 1 is multiplied by 2 instead, which gives a negative exponent.)
This means
4.6 / 2 = 2.3
2.3 / 2 = 1.15
Hence we get the normalized form, and since we divided by 2 twice we can write
+4.6 = +1.15 x 2^2
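The repeated division can be sketched in a few lines of Python. This is an illustrative helper (the name normalize is an assumption, not from the slides) that handles positive finite numbers only and also covers values below 1 by multiplying instead of dividing.

```python
import math

def normalize(x):
    """Return (significand, exponent) with 1 <= significand < 2
    and x == significand * 2**exponent."""
    if x <= 0 or not math.isfinite(x):
        raise ValueError("this sketch handles positive finite numbers only")
    exponent = 0
    while x >= 2:   # too large: divide by 2, exponent goes up
        x /= 2
        exponent += 1
    while x < 1:    # too small: multiply by 2, exponent goes down
        x *= 2
        exponent -= 1
    return x, exponent

print(normalize(4.6))
print(normalize(0.375))
```

Python's standard math.frexp does a similar job, but it returns a significand in the range 0.5 to 1 rather than 1 to 2, so its exponent is one larger than the one used here.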
Now we will represent this using IEEE 754 standard
4.
Decimal to IEEE 754 Floating point
representation
We have +1.15 x 2^2 to represent
1. The sign bit will be '0' as the number is positive.
2. The exponent will be 127+2=129. (Here we are using 127 as the bias value because the 8-bit exponent part can hold 256 values, i.e., 0-255, and this range must cover both positive and negative powers. Adding the bias maps power 0 to the stored value 127, so stored values below 127 denote negative powers and stored values above 127 denote positive powers; the values 0 and 255 are reserved for special cases. Thus, unless the question specifies Excess-128 or Excess-64, we will use 2^(n-1) - 1 as the bias value, where n is the number of bits in the exponent part.) Hence, if the power had been negative, say -2, then the exponent value would have been
127+(-2) = 127-2 = 125
3. Now that we have the sign bit and the exponent, let us fill them into the bit pattern.
0 | 10000001 | MANTISSA (23 bits)
where (129)10 = (10000001)2
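The bias arithmetic of step 2 can be checked with a short Python sketch. The function names bias and stored_exponent are hypothetical; the formula used is 2^(n-1) - 1, which gives 127 for the 8-bit exponent field.

```python
def bias(n_bits):
    """Bias for an exponent field of n_bits bits: 2^(n-1) - 1."""
    return 2 ** (n_bits - 1) - 1

def stored_exponent(power, n_bits=8):
    """Return the exponent field (as a bit string) for a normalized number."""
    value = power + bias(n_bits)
    # The all-zeros and all-ones patterns are reserved for special cases,
    # so a normalized number must fall strictly between them.
    if not 0 < value < 2 ** n_bits - 1:
        raise ValueError("power out of range for a normalized number")
    return format(value, f"0{n_bits}b")

print(bias(8))               # bias for the 8-bit exponent field
print(stored_exponent(2))    # power +2
print(stored_exponent(-2))   # power -2
```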
5.
Decimal to IEEE 754 Floating point
representation
Now we need to find the mantissa part.
First of all, note that the leading '1' is NOT stored in the bit pattern: since the number is in normalized form, it is known that the '1' is always there. Thus only the fractional part, i.e., (0.15), needs to be represented in the mantissa.
Let us convert 0.15 to binary (the digit before the point in each result is the next bit):
0.15 x 2 = 0.3    0
0.3 x 2 = 0.6     0
0.6 x 2 = 1.2     1    (i)
0.2 x 2 = 0.4     0
0.4 x 2 = 0.8     0
0.8 x 2 = 1.6     1    (ii)
After (ii) the fractional part is 0.6 again, so the steps from (i) to (ii) repeat from here on; we keep repeating their bits (1001) until all 23 bits are filled.
Thus the bits obtained are 00100110011001100110011
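The multiply-by-2 procedure can be reproduced with a small Python sketch. The name fraction_bits is an assumption for illustration, and fractions.Fraction is used so that 0.15 is handled exactly rather than as an already-rounded binary float. Note that this sketch truncates after 23 bits, as the slides do, whereas real hardware rounds to nearest; for this value the two agree.

```python
from fractions import Fraction

def fraction_bits(frac, n_bits=23):
    """Binary digits of a fraction in [0, 1), by repeated multiplication by 2."""
    bits = []
    for _ in range(n_bits):
        frac *= 2
        if frac >= 1:        # integer part is 1: emit 1 and drop it
            bits.append("1")
            frac -= 1
        else:                # integer part is 0: emit 0
            bits.append("0")
    return "".join(bits)

print(fraction_bits(Fraction(15, 100)))
```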
Hence the bit pattern in the 32-bit format is
0 | 10000001 | 00100110011001100110011
= (40933333)16
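As a cross-check, the hand-derived fields can be joined and compared with what the machine stores for 4.6. This is a sketch using Python's standard struct module; the variable names are illustrative.

```python
import struct

# Fields derived by hand in the slides
sign, exponent, mantissa = "0", "10000001", "00100110011001100110011"

# Join the 1 + 8 + 23 bits and show them as 8 hex digits
word = int(sign + exponent + mantissa, 2)
hand_hex = f"{word:08X}"

# What the machine stores for 4.6 as a big-endian single-precision float
machine_hex = struct.pack(">f", 4.6).hex().upper()

print(hand_hex, machine_hex)
```

Both values should agree here; for other inputs the last mantissa bit can differ, because the hand method truncates while the hardware rounds.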
6.
EXAMPLE PROBLEM
NOTE: In this question, the total number of bits is only 16. The question gives the bias as 64, although it should be 63, so you need to use 64. Also, the given number need not be converted into its normalized form.
7.
IEEE 754 Floating point to Decimal
conversion
Here you simply reverse the steps above.
For example:
Given Binary representation: 11000001101111110……0
Thus we will break it into three parts as:
1 | 10000011 | 01111110000000000000000
The sign bit is 1, so the number is negative, and the power is 131-127 = 4, since (10000011)2 = (131)10.
Mantissa is: 2^-1 x 0 + 2^-2 x 1 + 2^-3 x 1 + 2^-4 x 1 + 2^-5 x 1 + 2^-6 x 1 + 2^-7 x 1 = 0.4921875
The number is -1.4921875 x 2^4 [note that the implicit '1' is restored before the point, as in the normalized form]
which is equal to -23.875
ANS: -23.875
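The reverse conversion can be sketched in Python as well. The helper name decode_float32 is hypothetical, and the sketch covers normalized numbers only (exponent fields of all zeros or all ones have special meanings that are not handled). It takes the three fields exactly as split above.

```python
def decode_float32(sign, exponent, mantissa):
    """Decode a normalized number from its sign, exponent and mantissa bit strings."""
    s = -1 if sign == "1" else 1
    power = int(exponent, 2) - 127                    # remove the bias
    fraction = int(mantissa, 2) / 2 ** len(mantissa)  # bits after the point
    return s * (1 + fraction) * 2 ** power            # restore the implicit 1

print(decode_float32("1", "10000011", "01111110000000000000000"))
```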