High Performance Computing - Charles Severance [81]
of the multiprocessor memory as one big pool, we have seen that it is actually a carefully crafted system of caches, coherency protocols, and main memory. The problems come when your application causes lots of data to be traded between the caches. Each reference that falls out of a given processor’s cache (especially those that require an update in another processor’s cache) has to go out on the bus.

Often, it’s slower to get memory from another processor’s cache than from the main memory because of the protocol and processing overhead involved. Not only do we need to have programs with high locality of reference and unit stride, we also need to minimize the data that must be moved from one CPU to another.
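One common way to keep data from bouncing between caches is to give each processor its own cache line for the data it updates. The following is a minimal sketch of that layout, not a complete parallel program: the 64-byte line size, the NCPUS value, and the names padded_counter, count_event, and total_events are all assumptions made for illustration, and the code that would actually run on each processor is left out.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE 64   /* assumed line size; check your CPU's documentation */
#define NCPUS 4         /* hypothetical number of processors */

/* One counter per processor, padded to the size of a full cache line.
   Without the padding, neighboring counters would share a line, and every
   update by one CPU would invalidate that line in the other CPUs' caches. */
struct padded_counter {
    long value;
    char pad[CACHE_LINE - sizeof(long)];
};

static_assert(sizeof(struct padded_counter) == CACHE_LINE,
              "each counter should fill exactly one cache line");

/* Align the array so that element i starts exactly on a line boundary. */
static _Alignas(CACHE_LINE) struct padded_counter counters[NCPUS];

/* Meant to be called by the code running on processor `cpu`:
   it touches only that processor's own line. */
void count_event(int cpu)
{
    counters[cpu].value++;
}

/* Called once at the end: the only point where every line is read
   by a single processor. */
long total_events(void)
{
    long total = 0;
    for (int i = 0; i < NCPUS; i++)
        total += counters[i].value;
    return total;
}
```

The design trades a little memory (most of each 64-byte slot is unused) for the guarantee that updates made by one processor never touch a line another processor is writing.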


Multiprocessor Software Concepts

Now that we have examined the way shared-memory multiprocessor hardware operates, we need to examine how software operates on these types of computers. We still have to wait until the next chapters to begin making our FORTRAN programs run in parallel. For now, we use C programs to examine the fundamentals of multiprocessing and multithreading. There are several techniques used to implement multithreading, so the topics we will cover include:

Operating system–supported multiprocessing

User space multithreading

Operating system–supported multithreading

The last of these is the one we will primarily use to reduce the wall-clock time of our applications.


Operating System–Supported Multiprocessing

Most modern general-purpose operating systems support some form of multiprocessing. Multiprocessing doesn’t require more than one physical CPU; it is simply the operating system’s ability to run more than one process on the system. The operating system context-switches between each process at fixed time intervals, or on interrupts or input-output activity. For example, in UNIX, if you use the ps command, you can see the processes on the system:

% ps -a
  PID TTY      TIME CMD
28410 pts/34   0:00 tcsh
28213 pts/38   0:00 xterm
10488 pts/51   0:01 telnet
28411 pts/34   0:00 xbiff
11123 pts/25   0:00 pine
 3805 pts/21   0:00 elm
 6773 pts/44   5:48 ansys
...
% ps -a | grep ansys
 6773 pts/44   6:00 ansys

For each process we see the process identifier (PID), the terminal that is executing the command, the amount of CPU time the command has used, and the name of the command. The PID is unique across the entire system. Most UNIX commands are executed in a separate process. In the above example, most of the processes are waiting for some type of event, so they are taking very few resources except for memory. Process 6773[51] seems to be executing and using resources. Running ps again confirms that the CPU time is increasing for the ansys process, from 5:48 to 6:00. The vmstat command gives a system-wide view of the same activity:

% vmstat 5
 procs     memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr f0 s0 -- --   in   sy   cs us sy id
 3 0 0 353624 45432   0   0  1  0  0  0  0  0  0  0  0  461 5626  354 91  9  0
 3 0 0 353248 43960   0  22  0  0  0  0  0  0 14  0  0  518 6227  385 89 11  0

Running the vmstat 5 command tells us many things about the activity on the system. First, there are three runnable processes (the r column). If we had one CPU, only one would actually be running at a given instant. To allow all three jobs to progress, the operating system time-shares between the processes. Assuming equal priority, each process executes about 1/3 of the time. However, this system is a two-processor system, so each process executes about 2/3 of the time. Looking across the vmstat output, we can see paging activity (pi, po), context switches (cs), overall user time (us), system time (sy), and idle time (id).
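The 1/3 and 2/3 figures follow from dividing the available CPUs among the runnable processes. The sketch below works through that arithmetic under the same idealization the text uses (equal priority, perfectly even time-sharing); the function names cpu_share and print_share are hypothetical, and a real scheduler will deviate from these numbers.

```c
#include <assert.h>
#include <stdio.h>

/* Fraction of wall-clock time each runnable process receives, assuming
   equal priority and ideal time-sharing across all CPUs. */
double cpu_share(int runnable, int cpus)
{
    assert(runnable > 0 && cpus > 0);   /* precondition for this sketch */
    if (runnable <= cpus)
        return 1.0;                     /* every process has a CPU to itself */
    return (double)cpus / (double)runnable;
}

/* Print the share for one configuration. */
void print_share(int runnable, int cpus)
{
    printf("%d runnable on %d CPU(s): %.2f each\n",
           runnable, cpus, cpu_share(runnable, cpus));
}
```

Calling print_share(3, 1) reports a share of 0.33 and print_share(3, 2) reports 0.67, matching the one-processor and two-processor cases described above.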

Each process can execute a completely different program. While most processes are completely independent, they can cooperate and share information using interprocess communication (pipes, sockets) or various operating system-supported shared-memory areas. We generally don’t use multiprocessing on these shared-memory systems as a technique to increase single-application performance.
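As a rough illustration of two processes cooperating through a pipe, the sketch below has a child process compute a sum and send it back to its parent. It assumes a POSIX system; the helper name sum_in_child and its calling convention are invented for this example and are not taken from the book.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <limits.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the child computes a + b in its own process and
   sends the result to the parent through a pipe. Returns 0 on success and
   stores the sum in *out; returns -1 if any step fails. */
int sum_in_child(int a, int b, int *out)
{
    int fds[2];                      /* fds[0] = read end, fds[1] = write end */
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                  /* child: runs in a separate address space */
        close(fds[0]);
        int sum = a + b;
        assert(sizeof sum <= PIPE_BUF);   /* writes this small are atomic */
        ssize_t n = write(fds[1], &sum, sizeof sum);
        _exit(n == (ssize_t)sizeof sum ? 0 : 1);
    }

    /* parent: the pipe is the only channel carrying the child's result */
    close(fds[1]);
    int sum = 0;
    ssize_t n = read(fds[0], &sum, sizeof sum);
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (n != (ssize_t)sizeof sum || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    *out = sum;
    return 0;
}
```

A caller would declare an int, pass its address as the third argument, and check the return value before using the result. The explicit write and read calls show why this style suits independent programs better than a single application being sped up: nothing is shared unless it is deliberately sent.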


Multiprocessing software

In this section, we explore how programs access multiprocessing features.[52] In this example, the
