High Performance Computing - Charles Severance [111]

there are a large number of these intrinsic functions, most applications use only a few of the operations.


HPF extrinsics

To let vendors with diverse architectures exploit their particular strengths, HPF included the capability to link "extrinsic" functions. These functions didn't need to be written in FORTRAN 90/HPF, and they gave access to a number of vendor-supported capabilities. With them, users could, for example, build hybrid applications that combine HPF with message passing.

High performance computing programmers always like the ability to do things their own way in order to eke out that last drop of performance.


Heat Flow in HPF

To port our heat flow application to HPF, there is really only a single line of code that needs to be added. In the example below, we've changed to a larger two-dimensional array:

      INTEGER PLATESIZ,MAXTIME
      PARAMETER(PLATESIZ=2000,MAXTIME=200)
!HPF$ DISTRIBUTE PLATE(*,BLOCK)
      REAL*4 PLATE(PLATESIZ,PLATESIZ)
      INTEGER TICK

      PLATE = 0.0

* Add Boundaries
      PLATE(1,:) = 100.0
      PLATE(PLATESIZ,:) = -40.0
      PLATE(:,PLATESIZ) = 35.23
      PLATE(:,1) = 4.5

      DO TICK = 1,MAXTIME
        PLATE(2:PLATESIZ-1,2:PLATESIZ-1) = (
     +       PLATE(1:PLATESIZ-2,2:PLATESIZ-1) +
     +       PLATE(3:PLATESIZ-0,2:PLATESIZ-1) +
     +       PLATE(2:PLATESIZ-1,1:PLATESIZ-2) +
     +       PLATE(2:PLATESIZ-1,3:PLATESIZ-0) ) / 4.0
        PRINT 1000,TICK, PLATE(2,2)
 1000   FORMAT('TICK = ',I5, F13.8)
      ENDDO
*
      END

Notice that the HPF directive distributes the array's columns using the BLOCK approach, keeping all the elements of a column on a single processor. At first glance, it might appear that (BLOCK,BLOCK) is the better distribution. However, a (*,BLOCK) distribution has two advantages. First, striding down a column is a unit-stride operation, so you might just as well process an entire column. Second, and more significant, a (BLOCK,BLOCK) distribution forces each processor to communicate with up to four other processors to get the neighboring values this stencil needs (and up to eight for a stencil that also reads diagonal neighbors). Using the (*,BLOCK) distribution, each processor exchanges data with at most two processors each time step.
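The neighbor counts for the two distributions can be checked with a small model. The sketch below is not HPF; it is plain Python that assumes a small N-by-N plate, a hypothetical helper `block_owner` that maps an index to the processor owning its block, and a square processor grid for the (BLOCK,BLOCK) case. It walks the four-point stencil from the example and records which other processors each processor would have to read from.

```python
from math import ceil

def block_owner(index, n, nprocs):
    """Owner of 0-based `index` when n elements are split into nprocs contiguous blocks."""
    block = ceil(n / nprocs)   # BLOCK gives each processor ceiling(n / nprocs) elements
    return index // block

def neighbor_sets(n, owner):
    """Map each processor to the set of other processors it reads from
    when updating interior points with the four-point stencil."""
    needs = {}
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            me = owner(i, j)
            needs.setdefault(me, set())
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                other = owner(i + di, j + dj)
                if other != me:
                    needs[me].add(other)
    return needs

N = 64   # assumed plate size, kept small so the model runs quickly
P = 16   # assumed number of processors
Q = 4    # (BLOCK,BLOCK) is modeled on a Q x Q processor grid, Q * Q == P

# (*,BLOCK): whole columns stay together; columns are dealt out in blocks.
star_block = neighbor_sets(N, lambda i, j: block_owner(j, N, P))
# (BLOCK,BLOCK): both dimensions are split into blocks.
block_block = neighbor_sets(
    N, lambda i, j: (block_owner(i, N, Q), block_owner(j, N, Q)))

max_star = max(len(s) for s in star_block.values())
max_block = max(len(s) for s in block_block.values())
print("(*,BLOCK) max partners:", max_star)
print("(BLOCK,BLOCK) max partners:", max_block)
```

With these assumed sizes the model reports two partners for (*,BLOCK) and four for (BLOCK,BLOCK). Adding the four diagonal offsets to the stencil loop would raise the second figure to eight.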

When we look at PVM, we will look at this same program implemented in an SPMD-style message-passing fashion. In that example, you will see some of the details that HPF must handle to properly execute this code. After reviewing that code, you will probably choose to implement all of your future heat flow applications in HPF!
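To give a feel for those details, here is a minimal sketch in Python rather than HPF or PVM. It assumes a small plate, splits the columns into contiguous blocks as (*,BLOCK) would, and has each simulated processor receive one edge column from each neighbor before every time step. The names `make_plate`, `relax`, `step_serial`, and `step_distributed`, as well as the sizes, are hypothetical; a real HPF compiler is free to generate something quite different.

```python
def make_plate(n):
    """n x n plate stored as a list of columns, with the example's boundary values."""
    cols = [[0.0] * n for _ in range(n)]      # cols[j][i] plays the role of PLATE(i+1, j+1)
    for j in range(n):
        cols[j][0] = 100.0                    # PLATE(1,:)
        cols[j][n - 1] = -40.0                # PLATE(PLATESIZ,:)
    for i in range(n):
        cols[n - 1][i] = 35.23                # PLATE(:,PLATESIZ)
        cols[0][i] = 4.5                      # PLATE(:,1)
    return cols

def relax(left, mid, right):
    """New values for column `mid`, computed from old values only."""
    n = len(mid)
    new = mid[:]                              # first and last rows are boundary values
    for i in range(1, n - 1):
        new[i] = (mid[i - 1] + mid[i + 1] + left[i] + right[i]) / 4.0
    return new

def step_serial(cols):
    """One time step on the whole plate."""
    n = len(cols)
    inner = [relax(cols[j - 1], cols[j], cols[j + 1]) for j in range(1, n - 1)]
    return [cols[0]] + inner + [cols[n - 1]]

def step_distributed(blocks):
    """One time step over a list of column blocks, one block per simulated processor."""
    nproc = len(blocks)
    # Exchange phase: each processor receives at most two edge columns.
    ghosts = []
    for p in range(nproc):
        left = blocks[p - 1][-1] if p > 0 else None
        right = blocks[p + 1][0] if p < nproc - 1 else None
        ghosts.append((left, right))
    # Compute phase: each processor touches only its own columns plus its ghosts.
    new_blocks = []
    for p, block in enumerate(blocks):
        left, right = ghosts[p]
        padded = [left] + block + [right]
        new = []
        for c in range(1, len(padded) - 1):
            if padded[c - 1] is None or padded[c + 1] is None:
                new.append(padded[c])         # first or last global column: boundary
            else:
                new.append(relax(padded[c - 1], padded[c], padded[c + 1]))
        new_blocks.append(new)
    return new_blocks

N, NPROC, STEPS = 16, 4, 10                   # assumed sizes; NPROC must divide N
width = N // NPROC
serial = make_plate(N)
start = make_plate(N)
blocks = [start[p * width:(p + 1) * width] for p in range(NPROC)]
for _ in range(STEPS):
    serial = step_serial(serial)
    blocks = step_distributed(blocks)
gathered = [col for block in blocks for col in block]
print("results match:", gathered == serial)
```

The exchange phase is the part the single DISTRIBUTE directive hides: in a message-passing version, each of those edge-column copies becomes an explicit send and receive.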


HPF Summary

In some ways, HPF has been good for FORTRAN 90. Companies such as IBM, with its SP-1, needed to provide a high-level language for users who didn't want to write message-passing codes. Because of this, IBM has invested a great deal of effort in implementing and optimizing HPF. Interestingly, much of this effort will directly benefit the development of more sophisticated FORTRAN 90 compilers. The extensive data flow analysis required to minimize communications and manage the dynamic data structures will carry over into FORTRAN 90 compilers even when the HPF directives are not used.

Time will tell whether compilers become capable of analyzing straight FORTRAN 90 code well enough to optimize data placement and movement on their own, making the HPF data distribution directives unnecessary.

In its current form, HPF is an excellent vehicle for expressing highly data-parallel, grid-based applications. Its weaknesses are irregular communications and dynamic load balancing. A new effort to develop the next version of HPF is underway to address some of these issues. Unfortunately, it is more difficult to solve these runtime problems while maintaining good performance across a wide range of architectures.


Closing Notes

In this chapter, we have covered some of the efforts in the area of languages that have been developed to allow programs to be written for scalable computing. There is a tension between pure FORTRAN-77, FORTRAN 90, HPF, and message passing as to which will be the ultimate tools for scalable, high performance computing.

Certainly, there have
