High Performance Computing - Charles Severance [125]
[75] Sounds a little like HPF, no?
[76] Remember, each communicator may have a topology associated with it. A topology can be grid, graph, or none. Interestingly, the MPI_COMM_WORLD communicator has no topology associated with it.
[77] Note that you could do two time steps (one black-red-black iteration) if you exchanged two ghost columns at the top of the loop.
Chapter 5. Appendixes
5.1. Appendix C: High Performance Microprocessors
Introduction*
High Performance Microprocessors
It has been said that history is rewritten by the victors. It is clear that high performance RISC-based microprocessors are defining the current history of high performance computing. We begin our study with the basic building blocks of modern high performance computing: the high performance RISC microprocessors.
A complex instruction set computer (CISC) instruction set is made up of powerful primitives, close in functionality to the primitives of high-level languages like C or FORTRAN. It captures the sense of “don’t do in software what you can do in hardware.” RISC, on the other hand, emphasizes low-level primitives, far below the complexity of a high-level language. You can compute anything you want using either approach, though it will probably take more machine instructions if you’re using RISC. The important difference is that with RISC you can trade instruction-set complexity for speed.
To be fair, RISC isn’t really all that new. There were some important early machines that pioneered RISC philosophies, such as the CDC 6600 (1964) and the IBM 801 project (1975). It was in the mid-1980s, however, that RISC machines first posed a direct challenge to the CISC installed base. Heated debate broke out — RISC versus CISC — and even lingers today, though it is clear that the RISC[78] approach is in greatest favor; late-generation CISC machines are looking more RISC-like, and some very old families of CISC, such as the DEC VAX, are being retired.
This chapter is about CISC and RISC instruction set architectures and the differences between them. We also describe newer processors that can execute more than one instruction at a time and can execute instructions out of order.
Why CISC?*
You might ask, “If RISC is faster, why did people bother with CISC designs in the first place?” The short answer is that in the beginning, CISC was the right way to go; RISC wasn’t always both feasible and affordable. Every kind of design incorporates trade-offs, and over time, the best systems will make them differently. In the past, the design variables favored CISC.
Space and Time
To start, we’ll ask you how well you know the assembly language for your work- station. The answer is probably that you haven’t even seen it. Why bother? Compilers and development tools are very good, and if you have a problem, you can debug it at the source level. However, 30 years ago, “respectable” programmers understood the machine’s instruction set. High-level language compilers were commonly available, but they didn’t generate the fastest code, and they weren’t terribly thrifty with memory. When programming, you needed to save both space and time, which meant you knew how to program in assembly language. Accordingly, you could develop an opinion about the machine’s instruction set. A good instruction set was both easy to use and powerful. In many ways these qualities were the same: “powerful” instructions accomplished a lot, and saved the programmer from specifying many little steps — which, in turn, made them easy to use. But they had other, less apparent (though perhaps more important) features as well: powerful instructions saved memory and time.
Back then, computers had very little storage by today’s standards. An instruction that could roll all the steps of a complex operation, such as a do-loop, into single opcode[79] was a plus, because memory was precious. To put some stakes in the ground, consider the last vacuum-tube computer that IBM built, the model 704 (1956). It had hardware floating-point, including a division operation, index registers, and instructions