High Performance Computing - Charles Severance [113]
HOST DTID ARCH SPEED
frodo 40000 SUN4SOL2 1000
gollum 40001 SUN4SOL2 1000
mordor 40002 SUN4SOL2 1000
pvm> ps
HOST TID FLAG 0x COMMAND
frodo 40042 6/c,f pvmgs
pvm> reset
pvm> ps
HOST TID FLAG 0x COMMAND
pvm>
Many different users can be running virtual machines using the same pool of resources. Each user has their own view of an empty machine. The only way you might detect other virtual machines using your resources is in the percentage of the time your applications get the CPU.
There is a wide range of commands you can issue at the PVM console. The ps command shows the running processes in your virtual machine. It’s quite possible to have more processes than computer systems. Each process is time-shared on a system along with all the other load on the system. The reset command performs a soft reboot on your virtual machine. You are the virtual system administrator of the virtual machine you have assembled.
To execute programs on your virtual computer, you must compile and link your programs with the PVM library routines:[73]
% aimk mast slav
making in SUN4SOL2/ for SUN4SOL2
cc -O -I/opt/pvm3/include -DSYSVBFUNC -DSYSVSTR -DNOGETDTBLSIZ
-DSYSVSIGNAL -DNOWAIT3 -DNOUNIXDOM -o mast
../mast.c -L/opt/pvm3/lib/SUN4SOL2 -lpvm3 -lnsl -lsocket
mv mast ~crs/pvm3/bin/SUN4SOL2
cc -O -I/opt/pvm3/include -DSYSVBFUNC -DSYSVSTR -DNOGETDTBLSIZ
-DSYSVSIGNAL -DNOWAIT3 -DNOUNIXDOM -o slav
../slav.c -L/opt/pvm3/lib/SUN4SOL2 -lpvm3 -lnsl -lsocket
mv slav ~crs/pvm3/bin/SUN4SOL2
%
When the first PVM call is encountered, the application contacts your virtual machine and enrolls itself in it. At that point it should show up in the output of the ps command issued at the PVM console.
From that point on, your application issues PVM calls to create more processes and interact with those processes. PVM takes responsibility for distributing the processes across the different systems in the virtual machine, based on the load and your assessment of each system’s relative performance. Messages are moved across the network using the User Datagram Protocol (UDP) and delivered to the appropriate process.
Typically, the PVM application starts up some additional PVM processes. These can be additional copies of the same program or each PVM process can run a different PVM application. Then the work is distributed among the processes, and results are gathered as necessary.
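The distribute-then-gather flow described above can be sketched in plain C. This is a minimal single-process simulation, not PVM code: it makes no PVM calls, and the names block_range, partial_sum, and distribute_and_gather are hypothetical helpers invented for illustration. In a real PVM program, each slice would be sent to a separately spawned process, and the partial results would come back as messages.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: compute the half-open range [*start, *end) of work
 * items assigned to process `rank` when `nitems` items are divided as
 * evenly as possible among `nprocs` processes. */
void block_range(size_t nitems, size_t nprocs, size_t rank,
                 size_t *start, size_t *end)
{
    assert(nprocs > 0 && rank < nprocs);
    size_t base = nitems / nprocs;   /* every process gets at least this many */
    size_t extra = nitems % nprocs;  /* the first `extra` processes get one more */
    *start = rank * base + (rank < extra ? rank : extra);
    *end = *start + base + (rank < extra ? 1 : 0);
}

/* Work done by one process: sum its slice of the data. In a real PVM
 * program this would run in a separate process on some host. */
long partial_sum(const int *data, size_t start, size_t end)
{
    long sum = 0;
    for (size_t i = start; i < end; i++)
        sum += data[i];
    return sum;
}

/* Stand-in for the initial process: hand each simulated process its slice,
 * then gather the partial results into a single total. */
long distribute_and_gather(const int *data, size_t nitems, size_t nprocs)
{
    long total = 0;
    for (size_t rank = 0; rank < nprocs; rank++) {
        size_t start, end;
        block_range(nitems, nprocs, rank, &start, &end);  /* distribute */
        total += partial_sum(data, start, end);           /* gather */
    }
    return total;
}
```

Calling distribute_and_gather on an array of ten integers with nprocs set to 3 splits the work into slices of four, three, and three items. A static split like this is the simplest choice; it assumes the work items cost roughly the same, which is not always true in practice.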
There are several basic models of computing that are typically used when working with PVM:
Master/Slave: When operating in this mode, one process (usually the initial process) is designated as the master and spawns some number of worker (slave) processes. Work units are sent to each worker process, and the results are returned to the master. Often the master maintains a queue of work to be done, and as each worker finishes, the master delivers a new work item to it. This approach works well when there is little data interaction and each work unit is independent. It has the advantage that the overall problem is naturally load-balanced even when there is some variation in the execution time of individual processes.
Broadcast/Gather: This type of application is typically characterized by a shared data structure that is relatively small and can easily be copied to every node. At the beginning of each time step, all the global data structures are broadcast from the master process to all of the processes. Each process then operates on its portion of the data and produces a partial result that is sent back to, and gathered by, the master process. This pattern is repeated for each time step.
SPMD/Data decomposition: When the overall data structure is too large to have a copy stored in every process, it must be decomposed across multiple processes. Generally, at the beginning of a time step, all processes must exchange some data with each of their neighboring processes. Then with their local data augmented by the necessary subset of the remote data, they perform their computations. At the end of the time step,