Once the communicator is set up, we use it in all of our communication operations:
* Get my rank in the new communicator
CALL MPI_COMM_RANK( COMM1D, INUM, IERR)
Within each communicator, each process has a rank from zero to the size of the communicator minus one. The MPI_COMM_RANK call tells each process its rank within the communicator. A process may have a different rank in the COMM1D communicator than in the MPI_COMM_WORLD communicator because MPI was free to reorder the processes when it created the Cartesian topology.
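For example, a minimal sketch of the comparison (the WRANK and CRANK variable names are ours, not part of the heat-flow program):

* Sketch: compare our rank in MPI_COMM_WORLD with our rank in the
* new Cartesian communicator (WRANK and CRANK are hypothetical)
CALL MPI_COMM_RANK( MPI_COMM_WORLD, WRANK, IERR )
CALL MPI_COMM_RANK( COMM1D, CRANK, IERR )
PRINT *,'World rank',WRANK,' is rank',CRANK,' in COMM1D'

If MPI reordered the processes, some of these lines will print two different numbers.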
Given a Cartesian topology communicator, we can extract information from the communicator using the MPI_CART_GET routine:
* Given a communicator handle COMM1D, get the topology, and my position
* in the topology
CALL MPI_CART_GET(COMM1D, NDIM, DIMS, PERIODS, COORDS, IERR)
In this call, all of the parameters are output values, rather than input values as they were in the MPI_CART_CREATE call. The COORDS variable tells us our coordinates within the communicator. This is not so useful in our one-dimensional example, but in a two-dimensional process decomposition, it would tell us our current position in that two-dimensional grid.
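To make the two-dimensional case concrete, here is a minimal self-contained sketch; the names (COMM2D and the rest) and the 3x4 grid shape are our own assumptions for this example, and it expects to be run with exactly 12 processes:

PROGRAM CART2D
INCLUDE 'mpif.h'
INTEGER COMM2D, IERR
INTEGER DIMS(2), COORDS(2)
LOGICAL PERIODS(2)
CALL MPI_INIT( IERR )
* Build a 3x4 nonperiodic process grid, letting MPI reorder ranks
DIMS(1) = 3
DIMS(2) = 4
PERIODS(1) = .FALSE.
PERIODS(2) = .FALSE.
CALL MPI_CART_CREATE( MPI_COMM_WORLD, 2, DIMS, PERIODS, .TRUE., COMM2D, IERR )
* Ask the communicator for the grid shape and our (row,column) spot
CALL MPI_CART_GET( COMM2D, 2, DIMS, PERIODS, COORDS, IERR )
PRINT *,'I am at row',COORDS(1),' column',COORDS(2)
CALL MPI_FINALIZE( IERR )
END

Back in our one-dimensional program, we next find our neighbors and choose our strip of the global array: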
* Returns the left and right neighbors 1 unit away in the zeroth dimension
* of our Cartesian map - since we are not periodic, our neighbors may
* not always exist - MPI_CART_SHIFT handles this for us
CALL MPI_CART_SHIFT(COMM1D, 0, 1, LEFTPROC, RIGHTPROC, IERR)
CALL MPE_DECOMP1D(TOTCOLS, NPROC, INUM, S, E)
MYLEN = ( E - S ) + 1
IF ( MYLEN.GT.COLS ) THEN
PRINT *,'Not enough space, need',MYLEN,' have ',COLS
PRINT *,TOTCOLS,NPROC,INUM,S,E
STOP
ENDIF
PRINT *,INUM,NPROC,COORDS(1),LEFTPROC,RIGHTPROC, S, E
We use MPI_CART_SHIFT to determine the rank numbers of our left and right neighbors, so we can exchange our common points with them. This is necessary because we can't simply send to INUM-1 and INUM+1 if MPI has chosen to reorder our Cartesian decomposition. If we are the far-left or far-right process, the neighbor that doesn't exist is set to MPI_PROC_NULL, which indicates that we have no neighbor. Later, when we perform the message sending, MPI checks this value and sends messages only to real processes. By not sending a message to the "null process," MPI saves us an IF test.
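To see what this buys us, here is a minimal sketch of a ghost-column exchange using MPI_SENDRECV; the heat-flow program's actual exchange appears later, so the buffers, the MPI_REAL datatype, and the tags here are illustrative assumptions (BLACK is assumed to be REAL and dimensioned BLACK(0:ROWS+1,0:COLS+1), as in the PVM version):

INTEGER STATUS(MPI_STATUS_SIZE)
* Sketch: send our rightmost real column to the right neighbor and
* receive the left neighbor's edge column into our left ghost column
CALL MPI_SENDRECV( BLACK(1,MYLEN), ROWS, MPI_REAL, RIGHTPROC, 1,
+    BLACK(1,0), ROWS, MPI_REAL, LEFTPROC, 1,
+    COMM1D, STATUS, IERR )
* The mirror image: send our leftmost real column left and receive
* the right neighbor's edge column into our right ghost column
CALL MPI_SENDRECV( BLACK(1,1), ROWS, MPI_REAL, LEFTPROC, 2,
+    BLACK(1,MYLEN+1), ROWS, MPI_REAL, RIGHTPROC, 2,
+    COMM1D, STATUS, IERR )

On the far-left process, LEFTPROC is MPI_PROC_NULL, so the send and receive involving it complete immediately and leave the ghost column untouched - no IF test required.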
To determine which strip of the global array this process stores and computes, we call a utility routine, MPE_DECOMP1D, that simply does a few calculations to evenly split our 200 columns among the processes in contiguous strips. In the PVM version, we had to perform this computation by hand.
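The MPE source isn't reproduced here, but the arithmetic is simple; a minimal equivalent sketch (our own, not the actual MPE code) follows:

* Sketch: divide N columns among NPROC processes in contiguous
* strips; process INUM (numbered from zero) gets columns S through E,
* and any leftover columns go one apiece to the low-numbered processes
SUBROUTINE DECOMP1D( N, NPROC, INUM, S, E )
INTEGER N, NPROC, INUM, S, E, NLOCAL, LEFTOVER
NLOCAL = N / NPROC
LEFTOVER = MOD( N, NPROC )
S = INUM * NLOCAL + 1 + MIN( INUM, LEFTOVER )
IF ( INUM .LT. LEFTOVER ) NLOCAL = NLOCAL + 1
E = S + NLOCAL - 1
END

With 200 columns on four processes, every process gets a 50-column strip; on three processes, the split is 67, 67, and 66.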
The MPE_DECOMP1D routine is an example of an extended MPI library call (hence the MPE prefix). These extensions include graphics support and logging tools in addition to some general utilities. The MPE library consists of routines that were useful enough to standardize but not required to be supported by all MPI implementations. You will find the MPE routines supported on most MPI implementations.
Now that we have our communicator group set up, and we know which strip each process will handle, we begin the computation:
* Start Cold
DO C=0,COLS+1
DO R=0,ROWS+1
BLACK(R,C) = 0.0
ENDDO
ENDDO
As in the PVM example, we set the plate (including boundary values) to zero.
All processes begin the time step loop. Interestingly, as in PVM, there is no need for any synchronization; the messages implicitly synchronize our loops.
The first step is to store the permanent heat sources. We need a routine for this because the store operations must be made relative to our strip of the global array (a sketch of such a routine follows the listing below):
* Begin running the time steps
DO TICK=1,MAXTIME
* Set the persistent heat sources
CALL STORE(BLACK,ROWS,COLS,S,E,ROWS/3,TOTCOLS/3,10.0,INUM)
CALL STORE(BLACK,ROWS,COLS,S,E,2*ROWS/3,TOTCOLS/3,20.0,INUM)
CALL STORE(BLACK,ROWS,COLS,S,E,ROWS/3,2*TOTCOLS/3,-20.0,INUM)
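The STORE routine itself is not shown in this excerpt; a minimal sketch of what it must do, with the argument meanings inferred from the calls above, would be:

* Sketch: store VALUE at global position (R,C), but only if global
* column C lies inside this process's strip S through E; local
* column 1 holds global column S (INUM is unused in this sketch)
SUBROUTINE STORE( PLATE, ROWS, COLS, S, E, R, C, VALUE, INUM )
INTEGER ROWS, COLS, S, E, R, C, INUM
REAL PLATE(0:ROWS+1,0:COLS+1), VALUE
IF ( C .GE. S .AND. C .LE. E ) PLATE(R,C-S+1) = VALUE
END

Every process makes the same three calls each time step, but only the process whose strip contains the given global column actually writes the value.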