High Performance Computing - Charles Severance
call, systems that have support for collective operations in hardware can make the best use of this hardware. Also, when MPI is operating in a shared-memory environment, the broadcast can be simplified, as all the slaves simply make a local copy of a shared variable.
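As an illustration of the collective style (the buffer name, length, and root rank here are our own, not taken from the example program), a broadcast in MPI is a single call made identically by every process:

```fortran
* Every process in the communicator makes the same call; the
* process whose rank equals the root argument (0 here) supplies
* the data, and all of the others receive a copy into TEMP.
      DOUBLE PRECISION TEMP(100)
      INTEGER IERR
      CALL MPI_BCAST(TEMP, 100, MPI_DOUBLE_PRECISION,
     +               0, MPI_COMM_WORLD, IERR)
```

Because every process makes the same call, the library (or hardware) is free to choose the best broadcast strategy for the platform.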

Clearly, the developers of the MPI specification had significant experience with developing message-passing applications, and they added many widely used features to the message-passing library. Without these features, each programmer would need to fall back on more primitive operations to construct his or her own versions of these higher-level operations.


Heat Flow in MPI

In this example, we implement our heat flow problem in MPI using a similar decomposition to the PVM example. There are several ways to approach the problem. We could almost translate PVM calls to corresponding MPI calls using the MPI_COMM_WORLD communicator. However, to showcase some of the MPI features, we create a Cartesian communicator:

      PROGRAM MHEATC
      INCLUDE 'mpif.h'
      INTEGER ROWS,COLS,TOTCOLS
      PARAMETER(MAXTIME=200)
* This simulation can be run on MINPROC or greater processes.
* It is OK to set MINPROC to 1 for testing purposes.
* For a large number of rows and columns, it is best to set MINPROC
* to the actual number of runtime processes.
      PARAMETER(MINPROC=2)
      PARAMETER(ROWS=200,TOTCOLS=200,COLS=TOTCOLS/MINPROC)
      DOUBLE PRECISION RED(0:ROWS+1,0:COLS+1),BLACK(0:ROWS+1,0:COLS+1)
      INTEGER S,E,MYLEN,R,C
      INTEGER TICK,MAXTIME
      CHARACTER*30 FNAME

The basic data structures are much the same as in the PVM example. We allocate a subset of the heat arrays in each process. In this example, the amount of space allocated in each process is set by the compile-time parameter MINPROC. The simulation can execute on more than MINPROC processes (wasting some space in each process), but it can't execute on fewer than MINPROC processes, or there won't be sufficient total space across all of the processes to hold the array:

      INTEGER COMM1D,INUM,NPROC,IERR
      INTEGER DIMS(1),COORDS(1)
      LOGICAL PERIODS(1)
      LOGICAL REORDER
      INTEGER NDIM
      INTEGER STATUS(MPI_STATUS_SIZE)
      INTEGER RIGHTPROC, LEFTPROC

These data structures are used for our interaction with MPI. As we will be doing a one-dimensional Cartesian decomposition, each of these arrays has a single element. If you were to do a two-dimensional decomposition, these arrays would need two elements:
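For example, a two-dimensional version might be set up as in the following sketch (COMM2D is an illustrative name, not from the example program; MPI_DIMS_CREATE asks MPI to factor NPROC into a balanced grid):

```fortran
      INTEGER DIMS(2),COORDS(2),COMM2D,IERR
      LOGICAL PERIODS(2)
* Ask MPI to choose a balanced factorization of NPROC processes
* across two dimensions; DIMS must be zeroed first so that both
* dimensions are free for MPI to choose.
      DIMS(1) = 0
      DIMS(2) = 0
      CALL MPI_DIMS_CREATE(NPROC, 2, DIMS, IERR)
      PERIODS(1) = .FALSE.
      PERIODS(2) = .FALSE.
      CALL MPI_CART_CREATE(MPI_COMM_WORLD, 2, DIMS, PERIODS,
     +        .TRUE., COMM2D, IERR)
```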

      PRINT *,'Calling MPI_INIT'
      CALL MPI_INIT( IERR )
      PRINT *,'Back from MPI_INIT'
      CALL MPI_COMM_SIZE( MPI_COMM_WORLD, NPROC, IERR )

The call to MPI_INIT creates the appropriate number of processes. Note that in the output, the PRINT statement before the call only appears once, but the second PRINT appears once for each process. We call MPI_COMM_SIZE to determine the size of the global communicator MPI_COMM_WORLD. We use this value to set up our Cartesian topology:

* Create new communicator that has a Cartesian topology associated
* with it - MPI_CART_CREATE returns COMM1D - a communicator descriptor
      DIMS(1) = NPROC
      PERIODS(1) = .FALSE.
      REORDER = .TRUE.
      NDIM = 1
      CALL MPI_CART_CREATE(MPI_COMM_WORLD, NDIM, DIMS, PERIODS,
     +        REORDER, COMM1D, IERR)

Now we create a one-dimensional (NDIM=1) arrangement of all of our processes (MPI_COMM_WORLD). All of the parameters on this call are input values except for COMM1D and IERR. COMM1D is an integer “communicator handle.” If you print it out, it will be a value such as 134. It is not actually data; it is merely a handle that is used in other calls, quite similar to a file descriptor or unit number used when performing input and output on files.

The topology we use is a one-dimensional decomposition that isn't periodic. If we had specified a periodic decomposition, the far-left and far-right processes would be neighbors in a wrapped-around fashion, making a ring. Because it isn't periodic, the far-left process has no left neighbor and the far-right process has no right neighbor.
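One place the non-periodic choice shows up is when asking the Cartesian communicator for our neighbors. As a sketch (MPI_CART_SHIFT is the standard call for this; the variable names match the declarations above):

```fortran
* Query our neighbors one step away along dimension 0. At the
* ends of a non-periodic topology, LEFTPROC or RIGHTPROC comes
* back as the special rank MPI_PROC_NULL; a send or receive
* addressed to MPI_PROC_NULL completes without transferring data,
* so the edge processes need no special-case code.
      CALL MPI_CART_SHIFT(COMM1D, 0, 1, LEFTPROC, RIGHTPROC, IERR)
```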

In our PVM example above, we declared that Process 0 was the far-right process, Process NPROC-1 was the far-left process, and the other processes
