Online Book Reader

High Performance Computing - Charles Severance [34]

2, for instance, to eliminate bottlenecks. Register renaming keeps instructions that are recycling the same registers for different purposes from having to wait until previous instructions have finished with them.

The same situation can occur in programs — the same variable (i.e., memory location) can be recycled for two unrelated purposes. For example, see the variable x in the following fragment:

x = y * z;
q = r + x + x;
x = a + b;

When the compiler recognizes that a variable is being recycled, and that its current and former uses are independent, it can substitute a new variable to keep the calculations separate:

x0 = y * z;
q = r + x0 + x0;
x = a + b;

Variable renaming is an important technique because it clarifies that calculations are independent of each other, which increases the number of things that can be done in parallel.

Common Subexpression Elimination

Subexpressions are pieces of expressions. For instance, A+B is a subexpression of C*(A+B). If A+B appears in several places, as it does below, we call it a common subexpression:

D = C * (A + B)
E = (A + B)/2.

Rather than calculate A + B twice, the compiler can generate a temporary variable and use it wherever A + B is required:

temp = A + B
D = C * temp
E = temp/2.

Different compilers go to different lengths to find common subexpressions. Most pairs, such as A+B, are recognized. Some can recognize reuse of intrinsics, such as SIN(X). Don’t expect the compiler to go too far, though. Subexpressions like A+B+C are not computationally equivalent to reassociated forms like B+C+A, even though they are algebraically the same, because floating-point rounding depends on the order in which the additions are performed. To provide predictable results, FORTRAN must either perform operations in the order specified by the user or reorder them in a way that guarantees exactly the same result. Sometimes the user doesn’t care which way A+B+C associates, but the compiler cannot assume that.
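The point about A+B+C and B+C+A can be checked directly. The following is a minimal sketch in C rather than FORTRAN; the function names are hypothetical, and it assumes IEEE 754 double precision and a compiler that has not been told to relax floating-point rules.

```c
/* Evaluates A+B+C in source order: (a + b) + c. */
double sum_abc(double a, double b, double c) {
    return (a + b) + c;
}

/* Evaluates the reassociated form B+C+A: (b + c) + a. */
double sum_bca(double a, double b, double c) {
    return (b + c) + a;
}
```

With a = 1.0e20, b = -1.0e20, and c = 1.0, the first function returns 1.0 because the two large terms cancel before c is added. The second returns 0.0 because c is lost to rounding when it is added to b. The two forms agree algebraically but not numerically, so a compiler that merged them as a common subexpression would change the answer.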

Address calculations provide a particularly rich opportunity for common subexpression elimination. You don’t see the calculations in the source code; they’re generated by the compiler. For instance, a reference to an array element A(I,J) may translate into an intermediate language expression such as:

address(A) + (I-1)*sizeof_datatype(A)
           + (J-1)*sizeof_datatype(A) * column_dimension(A)

If A(I,J) is used more than once, we have multiple copies of the same address computation. Common subexpression elimination will (hopefully) discover the redundant computations and group them together.
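To make the hidden address arithmetic visible, here is a rough sketch in C that writes it out by hand for a statement equivalent to A(I,J) = A(I,J) + A(I,J)*S. All names are hypothetical. It assumes double-precision elements, 1-based indices, and that column_dimension(A) means the number of elements in one column; a real compiler's intermediate form will differ.

```c
#include <stddef.h>

/* Byte offset of A(i,j) from the start of A, for 1-based indices.
   col_len is the number of elements in one column (an assumption
   about what column_dimension(A) stands for). */
size_t offset_of(size_t i, size_t j, size_t col_len) {
    return (i - 1) * sizeof(double) + (j - 1) * sizeof(double) * col_len;
}

/* A(I,J) = A(I,J) + A(I,J)*S with the full address computation
   repeated for each of the three references. */
void update_naive(double *a, size_t i, size_t j, size_t col_len, double s) {
    *(double *)((char *)a + offset_of(i, j, col_len)) =
        *(double *)((char *)a + offset_of(i, j, col_len)) +
        *(double *)((char *)a + offset_of(i, j, col_len)) * s;
}

/* The same statement after common subexpression elimination:
   the address is computed once and reused. */
void update_cse(double *a, size_t i, size_t j, size_t col_len, double s) {
    double *p = (double *)((char *)a + offset_of(i, j, col_len));
    *p = *p + *p * s;
}
```

Both functions store the same value; the second simply performs one address computation instead of three.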

Loop-Invariant Code Motion

Loops are where many high performance computing programs spend a majority of their time. The compiler looks for every opportunity to move calculations out of a loop body and into the surrounding code. Expressions that don’t change after the loop is entered (loop-invariant expressions) are prime targets. The following loop has two loop-invariant expressions:

DO I=1,N
  A(I) = B(I) + C * D
  E = G(K)
ENDDO

Below, we have modified the expressions to show how they can be moved to the outside:

temp = C * D
DO I=1,N
  A(I) = B(I) + temp
ENDDO
E = G(K)

It is possible to move code before or after the loop body. As with common subexpression elimination, address arithmetic is a particularly important target for loop-invariant code motion. Slowly changing portions of index calculations can be pushed into the suburbs, to be executed only when needed.
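The two loop forms above can be compared side by side in a small C sketch. The function names and signatures are hypothetical, arrays are 0-based here, and the arrays are assumed not to overlap each other or the location of E. The guard on n in the second function is an addition of this sketch, not something shown in the fragment: the original loop assigns E only if the body runs at least once, so an unconditional assignment after the loop would behave differently when N is less than 1.

```c
#include <stddef.h>

/* Original form: c * d and g[k] are re-evaluated on every iteration. */
void loop_original(double *a, const double *b, size_t n,
                   double c, double d,
                   const double *g, size_t k, double *e) {
    for (size_t i = 0; i < n; i++) {
        a[i] = b[i] + c * d;
        *e = g[k];
    }
}

/* After loop-invariant code motion: c * d is computed once before
   the loop, and the assignment to *e is moved after it. */
void loop_hoisted(double *a, const double *b, size_t n,
                  double c, double d,
                  const double *g, size_t k, double *e) {
    double temp = c * d;
    for (size_t i = 0; i < n; i++) {
        a[i] = b[i] + temp;
    }
    if (n > 0) {
        /* Preserve the zero-trip case: leave *e untouched. */
        *e = g[k];
    }
}
```

For any n, the two functions leave the same values in a and in the location E points to, while the second performs one multiplication and at most one load of g[k] in total.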

Induction Variable Simplification

Loops can contain what are called induction variables. Their value changes as a linear function of the loop iteration count. For example, K is an induction variable in the following loop. Its value is tied to the loop index:

DO I=1,N
  K = I*4 + M
  ...
ENDDO

Induction variable simplification replaces calculations for variables like K with simpler ones. Given a starting point and the expression’s first derivative (here, the constant step of 4 per iteration), you can arrive at K’s value for the nth iteration by stepping through the n-1 intervening iterations:

K = M
DO I=1,N
  K = K + 4
  ...
ENDDO
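As a quick check that the multiply and the running add produce the same sequence, here is a small C sketch. The function names are hypothetical, and the elided loop body is stood in for by recording K for each iteration into an output array; integer arithmetic is assumed, so the two forms agree exactly.

```c
/* Records K for I = 1..n using the original expression K = I*4 + M,
   which costs a multiply and an add per iteration. */
void k_by_multiply(long *k_out, long n, long m) {
    for (long i = 1; i <= n; i++) {
        k_out[i - 1] = i * 4 + m;
    }
}

/* Records the same values after induction variable simplification:
   K starts at M and is stepped by 4 at the top of each iteration. */
void k_by_increment(long *k_out, long n, long m) {
    long k = m;
    for (long i = 1; i <= n; i++) {
        k = k + 4;
        k_out[i - 1] = k;
    }
}
```

The increment must happen before K is used within the iteration; stepping K at the bottom of the loop would require starting from M + 4 instead of M.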

The two forms of the loop
