Online Book Reader

Home Category

Choose a category
All
Classic-Fiction

High Performance Computing - Charles Severance [25]

By Root 1308 0

defined. You can generate a NaN by dividing zero by zero, dividing infinity by infinity, or taking the square root of -1. The difference between infinity and NaN is that the NaN value has a nonzero significand. The NaN value is very sticky. Any operation that has a NaN as one of its inputs always produces a NaN result.

Exceptions and Traps*

In addition to defining the results of computations that aren’t mathematically defined, the IEEE standard provides programmers with the ability to detect when these special values are being produced. This way, programmers can write their code without adding extensive IF tests throughout the code checking for the magnitude of values. Instead they can register a trap handler for an event such as underflow and handle the event when it occurs. The exceptions defined by the IEEE standard include:

Overflow to infinity

Underflow to zero

Division by zero

Invalid operation

Inexact operation

According to the standard, these traps are under the control of the user. In most cases, the compiler runtime library manages these traps under the direction from the user through compiler flags or runtime library calls. Traps generally have significant overhead compared to a single floating-point instruction, and if a program is continually executing trap code, it can significantly impact performance.

In some cases it’s appropriate to ignore traps on certain operations. A commonly ignored trap is the underflow trap. In many iterative programs, it’s quite natural for a value to keep reducing to the point where it “disappears.” Depending on the application, this may or may not be an error situation so this exception can be safely ignored.

If you run a program and then it terminates, you see a message such as:

Overflow handler called 10,000,000 times

It probably means that you need to figure out why your code is exceeding the range of the floating-point format. It probably also means that your code is executing more slowly because it is spending too much time in its error handlers.

Compiler Issues*

The IEEE 754 floating-point standard does a good job describing how floating- point operations are to be performed. However, we generally don’t write assembly language programs. When we write in a higher-level language such as FORTRAN, it’s sometimes difficult to get the compiler to generate the assembly language you need for your application. The problems fall into two categories:

The compiler is too conservative in trying to generate IEEE-compliant code and produces code that doesn’t operate at the peak speed of the processor. On some processors, to fully support gradual underflow, extra instructions must be generated for certain instructions. If your code will never underflow, these instructions are unnecessary overhead.

The optimizer takes liberties rewriting your code to improve its performance, eliminating some necessary steps. For example, if you have the following code:

Z = X + 500

Y = Z - 200

The optimizer may replace it with Y = X + 300. However, in the case of a value for X that is close to overflow, the two sequences may not produce the same result.

Sometimes a user prefers “fast” code that loosely conforms to the IEEE standard, and at other times the user will be writing a numerical library routine and need total control over each floating-point operation. Compilers have a challenge supporting the needs of both of these types of users. Because of the nature of the high performance computing market and benchmarks, often the “fast and loose” approach prevails in many compilers.

Closing Notes*

While this is a relatively long chapter with a lot of technical detail, it does not even begin to scratch the surface of the IEEE floating-point format or the entire field of numerical analysis. We as programmers must be careful about the accuracy of our programs, lest the results become meaningless. Here are a few basic rules to get you started:

Look for compiler options that relax or enforce strict IEEE compliance and choose the appropriate option for your program.

Online Book Reader

High Performance Computing - Charles Severance [25]

®Online Book Reader