
High Performance Computing - Charles Severance [142]

chip space for them. If a wide range of programs can run faster with this type of instruction, it’s often added.

The assembly code on the RS-6000 is:

        ai    r3,r3,-4                     # Address of A(0)
        ai    r5,r5,-4                     # Address of B(0)
        ai    r4,r4,-4                     # Address of C(0)
        bcr   BO_IF_NOT,CR0_GT
        mtspr CTR,r6                       # Store in the Counter Register
__L18:
        lfsu  fp0,4(r4)                    # Pre-increment load
        lfsu  fp1,4(r5)                    # Pre-increment load
        fa    fp0,fp0,fp1                  # Floating-point add
        frsp  fp0,fp0                      # Round to single precision
        stfsu fp0,4(r3)                    # Pre-increment store
        bc    BO_dCTR_NZERO,CR0_LT,__L18   # Branch on Counter

The RS-6000 also supports a memory addressing mode that can add a value to its address register before using the address register. Interestingly, these two features (branch on count and pre-increment load) eliminate several instructions when compared to the more “pure” SPARC processor. The SPARC processor has 10 instructions in the body of its loop, while the RS-6000 has 6 instructions.
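
To make the effect of these two features concrete, here is a minimal C sketch of the same loop shape. It is an illustration, not the compiler's output: the function name add_one_based is hypothetical, and the convention that each array carries an unused slot 0 (standing in for the A(0), B(0), and C(0) addresses in the listing) is an assumption made for this example.

```c
#include <stddef.h>

/*
 * Hypothetical helper: computes a[i] = b[i] + c[i] for i = 1..n.
 * Assumption: each array has n + 1 elements and slot 0 is unused, so the
 * incoming pointers play the role of the A(0), B(0), C(0) addresses.
 */
void add_one_based(float *a, const float *b, const float *c, size_t n)
{
    size_t ctr = n;            /* plays the role of the counter register */

    if (ctr == 0)
        return;                /* nothing to do; skip the loop entirely */

    do {
        float x = *++c;        /* pre-increment load */
        float y = *++b;        /* pre-increment load */
        *++a = x + y;          /* add, then pre-increment store */
    } while (--ctr != 0);      /* decrement the counter, branch if nonzero */
}
```

In this form each pointer update is folded into the memory access and the loop test is folded into the counter decrement, which is roughly where the separate integer add and compare instructions go away.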

The advantage of the RS-6000 in this particular loop might be less significant if both processors were two-way superscalar. The instructions that were eliminated on the RS-6000 were integer instructions. On a two-way superscalar processor, those integer instructions may simply execute on the integer units while the floating-point units are busy performing the floating-point computations.


Conclusion

In this section, we have attempted to give you some understanding of the variety of assembly language that is produced by compilers at different optimization levels and on different computer architectures. At some point during the tuning of your code, it can be quite instructive to take a look at the generated assembly language to be sure that the compiler is not doing something really stupid that is slowing you down.

Please don’t be tempted to rewrite portions in assembly language. Usually any problems can be solved by cleaning up and streamlining your high-level source code and setting the proper compiler flags.

It is interesting that very few people actually learn assembly language any more. Most folks find that the compiler is the best teacher of assembly language. By adding the appropriate option (often -S), the compiler starts giving you lessons. I suggest that you don’t print out all of the code. There are many pages of useless variable declarations, etc. For these examples, I cut out all of that useless information. It is best to view the assembly in an editor and only print out the portion that pertains to the particular loop you are tuning.
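
As a sketch of that workflow, the small C function below isolates a single loop in its own file so that the generated assembly stays short. The file name addem.c and the function name addem are assumptions made for this example, and the exact flag spelling depends on your compiler; with many Unix C compilers, a command such as cc -O2 -S addem.c writes the assembly to addem.s instead of producing an object file.

```c
/* addem.c: hypothetical file holding only the loop under study. */
#include <stddef.h>

/* Element-wise single-precision add: a[i] = b[i] + c[i] for i = 0..n-1. */
void addem(float *a, const float *b, const float *c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + c[i];
}
```

Searching the assembly file for the function's label is usually the quickest way to find the loop, and comparing the output at different optimization levels shows what each level changes.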


[78] One of the most interesting remaining topics is the definition of “RISC.” Don’t be fooled into thinking there is one definition of RISC. The best I have heard so far is from John Mashey: “RISC is a label most commonly used for a set of instruction set architecture characteristics chosen to ease the use of aggressive implementation techniques found in high performance processors (regardless of RISC, CISC, or irrelevant).”

[79] Opcode = operation code = instruction.

[80] In 1955, IBM began constructing a machine known as Stretch. It was the first computer to process several instructions at a time in stages, so that they streamed in, rather than being fetched in a piecemeal fashion. The goal was to make it 25 times faster than the then brand-new IBM 704. It was six years before the first Stretch was delivered to Los Alamos National Laboratory. It was indeed faster, but it was expensive to build. Eight were sold for a loss of $20 million.

[81] And they did it without ever taking out a single instruction!

[82] The typical CISC microprocessor in the 1980s supported floating-point operations in a separate coprocessor.

[83] Here is a simple analogy: imagine a line at a fast-food drive-up window. If there is only one window, one customer orders and pays, and the food is bagged and delivered to the customer before the second customer orders. For busier restaurants, there are three windows. First you order, then move ahead. Then at a second window, you pay and move ahead. At the third window you pull up, grab the food and roar off into the distance. While your wait at the three-window
