Title: COMP201 Computer Systems Floating Point Numbers
1. COMP201 Computer Systems: Floating Point Numbers
2. Floating Point Numbers
- The representations considered so far have a limited range, dependent on the number of bits:
  - 16 bits: 0 to 65535 unsigned; -32768 to 32767 for 2's complement
  - 32 bits: 0 to 4294967295 unsigned; -2147483648 to 2147483647 for 2's complement
- How do we represent the following numbers?
  - Mass of an electron: 0.000000000000000000000000000000910956 kg
  - Mass of the earth: 5975000000000000000000000 kg
- Both these numbers have very few significant digits but are well beyond the range above.
3. Floating Point Numbers
- We normally use scientific notation:
  - Mass of electron: 9.10956 x 10^-31 kg
  - Mass of earth: 5.975 x 10^24 kg
- These numbers can be split into two components:
  - Mantissa
  - Exponent
- e.g. 9.10956 x 10^-31 kg: mantissa 9.10956, exponent -31
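The mantissa/exponent split above can be sketched in Python, using the language's own scientific-notation formatting (`split_sci` is a hypothetical helper, not part of the course material):

```python
# Split a number into its mantissa and exponent via scientific notation.
def split_sci(x: float) -> tuple[float, int]:
    mantissa, exponent = f"{x:e}".split("e")  # e.g. "9.109560e-31"
    return float(mantissa), int(exponent)

split_sci(9.10956e-31)  # (9.10956, -31)
split_sci(5.975e24)     # (5.975, 24)
```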
4. Floating Point Numbers
- We also need to be able to represent negative numbers.
- The same approach can be taken with binary numbers, i.e. a x r^e, where a is the mantissa, r is the base (radix) and e is the exponent.
- Notice that we are trading off precision, and possibly making computation times longer, in exchange for being able to represent a larger range of numbers.
5. Defining a floating point number
- Several things must be defined in order to define a floating point number. For example, with 9.10956 x 10^-31:
  - Size of mantissa: 9.10956
  - Sign of mantissa: positive
  - Size of exponent: 31
  - Sign of exponent: negative
  - Number base in use: 10
6. Floating Point Numbers
- Consider the following format, using eight digits and radix 10:
  - the high-order digit is the sign digit (0 for +, 5 for -)
  - followed by the exponent (2 digits)
  - then the mantissa (5 digits), from most significant digit (MSD) to least significant digit (LSD)
7. Floating Point Numbers
- But this has serious limitations!
  - The range of numbers which can be represented is only 10^0 to 10^99, since only two digits are devoted to the exponent.
  - Precision is only 5 digits.
  - There is no provision for negative exponents.
8. Floating Point Numbers (Continued)
- One solution for the limitations of range and negative exponents is to decrease the positive range to 49 by applying an offset of 50 (excess-50 notation). The range of numbers which can be expressed then becomes 0.00001 x 10^-50 to 0.99999 x 10^49.
- Implementations often require that the most significant digit not be zero, which restricts the most negative value to 0.10000 x 10^-50; the all-zeros pattern represents 0.
9. From the text
[Figure from the textbook: excess-50 notation and the range of represented numbers]
10. Examples (from textbook)
- Problem: Convert 246.8035 to the standard format.
  - Write as 0.2468035 x 10^3
  - Truncate at 5 digits: 0.24680 x 10^3
  - Final number: 05324680
- Problem: Convert -0.00000075 to FP format.
  - Write as -0.75 x 10^-6
  - Zero-fill to 5 digits: -0.75000 x 10^-6
  - Final number: 54475000
- NOTE: Plus is represented by 0, minus by 5, in this format.
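The two textbook conversions can be reproduced with a short sketch (`encode_excess50` is a hypothetical helper, not from the textbook, assuming the layout described earlier: sign digit, 2-digit excess-50 exponent, 5-digit mantissa):

```python
from decimal import Decimal, ROUND_DOWN

def encode_excess50(num: str) -> str:
    """Encode a decimal number in the 8-digit format from the slides:
    sign digit (0 = plus, 5 = minus), 2-digit excess-50 exponent,
    then the 5 truncated fraction digits of the mantissa 0.ddddd."""
    d = Decimal(num)
    if d == 0:
        return "00000000"  # the all-zeros pattern represents 0
    sign = "0" if d > 0 else "5"
    d = abs(d)
    exponent = 0
    while d >= 1:                 # normalize into the range 0.1 <= d < 1
        d /= 10
        exponent += 1
    while d < Decimal("0.1"):
        d *= 10
        exponent -= 1
    fraction = d.quantize(Decimal("0.00001"), rounding=ROUND_DOWN)
    return sign + f"{exponent + 50:02d}" + str(fraction)[2:]

encode_excess50("246.8035")     # '05324680'
encode_excess50("-0.00000075")  # '54475000'
```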
11. Floating point in the computer
- Within the computer we work in base 2, using a similar format: a sign bit, an exponent field, and a mantissa field.
- NOTE: Bit order depends upon the particular computer.
12. Numbers represented
- Use 32 bits to represent a number, with 1 sign bit, 8 bits for the exponent, and the remaining 23 bits for the mantissa.
- In order to represent negative exponents, use excess-128 notation.
- The range of numbers represented is approximately 10^-38 to 10^38 (in decimal terms).
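The excess-128 exponent encoding can be sketched as follows (hypothetical helpers, assuming the 8-bit exponent field above):

```python
# Excess-128: store the actual exponent plus an offset of 128, so that
# exponents from -128 to +127 fit in an unsigned 8-bit field.
def to_excess128(exponent: int) -> int:
    assert -128 <= exponent <= 127
    return exponent + 128

def from_excess128(stored: int) -> int:
    assert 0 <= stored <= 255
    return stored - 128

to_excess128(-38)   # stored as 90
from_excess128(90)  # -38
```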
13. But this is not the end!
- The precision of the mantissa can be improved from 23 bits to 24 bits by noticing that the MSB of the mantissa in a normalized binary number is always 1.
- And so it can be implied, rather than expressed directly. The added complications (see below) are felt to be a good tradeoff for the added precision.
- The complication is that certain numbers are too small to be normalized, and the number 0.0 cannot be represented at all!
- The solution is to use certain codes for these special cases.
14. Floating point numbers: the IEEE standard
- IEEE Standard 754
- Most (but not all) computer manufacturers use the IEEE 754 format.
- Number represented: (-1)^S x (1.M) x 2^(E - Bias)
- Two main formats: single and double.
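The formula can be evaluated directly for the normalized single-precision case, where the bias is 127 (`ieee754_value` is a hypothetical name for this sketch):

```python
# (-1)^S * 1.M * 2^(E - 127) for a normalized single-precision number,
# where the 23 mantissa bits are given as a bit string.
def ieee754_value(sign: int, exponent: int, mantissa_bits: str) -> float:
    mantissa = 1 + int(mantissa_bits, 2) / 2 ** len(mantissa_bits)  # 1.M
    return (-1) ** sign * mantissa * 2.0 ** (exponent - 127)

ieee754_value(0, 127, "0" * 23)  # 1.0
ieee754_value(1, 128, "0" * 23)  # -2.0
```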
15. And there are special cases to be considered
  Exponent   Mantissa   Represents
  0          +/- 0      zero
  0          not 0      denormalized: +/- 2^-126 x 0.M
  1-254      any        normalized: +/- 2^(E-127) x 1.M
  255        +/- 0      +/- infinity
  255        not 0      special condition
16. Special cases (continued)
- The thing to remember about the special cases is that they are SPECIAL, that is, not expected to occur.
- They cover numbers outside the expected range by using some unlikely-to-be-used codes:
  - a special code for 0.0
  - a special code for numbers too small to be normalized
  - a special code for infinity
- What you are expected to remember is that these special codes exist, and that they limit the range of numbers that can be represented by the standard.
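The special bit patterns can be inspected with Python's struct module (a sketch; `bits_of` is a hypothetical helper that splits a single-precision pattern into its three fields):

```python
import struct

# Show the sign / exponent / mantissa fields of a single-precision float.
def bits_of(x: float) -> str:
    b = f"{int.from_bytes(struct.pack('>f', x), 'big'):032b}"
    return f"{b[0]} {b[1:9]} {b[9:]}"

bits_of(0.0)           # exponent and mantissa all zeros
bits_of(float("inf"))  # exponent all ones, mantissa zero
bits_of(1e-45)         # denormalized: exponent zero, mantissa nonzero
```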
17. Final ranges represented
- Single precision: 2^-126 to 2^127, approximately 10^-38 to 10^38.
- Similarly, double precision uses sixty-four bits and represents a range of approximately 10^-300 to 10^300.
18. Floating Point Numbers
- IEEE Standard 754
- Example: What decimal number does the following IEEE floating point number represent?
  0 10000001 10110000000000000000000
  - Sign: 0 (positive)
  - Exponent: 2 (0x81 = 129; 129 - 127 = 2)
  - Mantissa: 1.10110000000000000000000
  - Final answer: 110.11 in binary, or 6.75 in decimal
19. More
- What is the IEEE f.p. number 01000001111111100000000000000000 converted to decimal?
- Begin by dividing up the number into the fields of sign, exponent and mantissa (23 bits):
  0 10000011 11111100000000000000000
- Then convert the exponent (10000011 = 131) and subtract the bias (127) to get the shift (4).
- Add in the implied one to the mantissa: 1.11111100000000000000000
- And shift 4 places to get the final number, 11111.11 in binary, or 31.75 in decimal.
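The decoding steps above can be checked against the machine's own single-precision interpretation, using Python's struct module (`decode_ieee32` is a hypothetical name for this sketch):

```python
import struct

# Interpret a 32-bit pattern, given as a bit string, as an IEEE single.
def decode_ieee32(bits: str) -> float:
    assert len(bits) == 32
    raw = int(bits, 2).to_bytes(4, byteorder="big")
    return struct.unpack(">f", raw)[0]

decode_ieee32("01000001111111100000000000000000")  # 31.75
```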
20. And more
- What is the decimal number 32.125 converted to IEEE fp format?
- First, convert the number to binary: 100000.001
- Then normalize to 1.xxxx format: 1.00000001, shift 5
- Add the bias (127) to the shift and convert the total (132) to binary: 10000100 (0x84)
- Assemble the final number: 0 10000100 00000001000000000000000
21. Floating Point Application
- Floating point arithmetic is very important in certain types of application, primarily scientific computation.
- Many applications use no floating point arithmetic at all:
  - communications
  - operating systems
  - most graphics
22. Floating point implementation
- Floating point arithmetic may be implemented in software or hardware.
- It is much more complex than integer arithmetic.
- Hardware implementations require more gates than the equivalent integer instructions, and are often slower.
- Software implementations are generally very slow.
- CPU floating point performance can be very different from integer performance.