High Performance Computing - Charles Severance [49]
Exercise 2.15.3.
Write a program to determine the overhead of the getrusage and the etime calls. Other than consuming processor time, how can making a system call to check the time too often alter the application performance?
2.3. Eliminating Clutter
Introduction*
We have looked at code from the compiler’s point of view and at how to profile code to find the trouble spots. This is good information, but if you are dissatisfied with a code’s performance, you might still be wondering what to do about it. One possibility is that your code is too obtuse for the compiler to optimize properly. Excess code, too much modularization, or even previous optimization-related “improvements” can clutter up your code and confuse the compilers. Clutter is anything that contributes to the runtime without contributing to the answer. It comes in two forms:
Things that contribute to overhead
Subroutine calls, indirect memory references, tests within loops, wordy tests, type conversions, variables preserved unnecessarily
Things that restrict compiler flexibility
Subroutine calls, indirect memory references, tests within loops, ambiguous pointers
It’s not a mistake that some of the same items appear in both lists. Subroutine calls or if-statements within loops can both bite and scratch you by taking too much time and by creating fences — places in the program where instructions that appear before can’t be safely intermixed with instructions that appear after, at least not without a great deal of care. The goal of this chapter is to show you how to eliminate clutter, so you can restructure what’s left over for the fastest execution. We save a few specific topics that might fit here, especially those regarding memory references, for later chapters where they are treated as subjects by themselves.
Before we start, we’ll remind you: as you look for ways to improve what you have, keep your eyes and mind open to the possibility that there might be a fundamentally better way to do something—a more efficient sorting technique, random number generator, or solver. A different algorithm may buy you far more speed than tuning. Algorithms are beyond the scope of this book, but what we are discussing here should help you recognize “good” code, or help you to code a new algorithm to get the best performance.
Subroutine Calls*
A typical corporation is full of frightening examples of overhead. Say your department has prepared a stack of paperwork to be completed by another department. What do you have to do to transfer that work? First, you have to be sure that your portion is completed; you can’t ask them to take over if the materials they need aren’t ready. Next, you need to package the materials — data, forms, charge numbers, and the like. And finally comes the official transfer. Upon receiving what you sent, the other department has to unpack it, do their job, repackage it, and send it back.
A lot of time gets wasted moving work between departments. Of course, if the overhead is minimal compared to the amount of useful work being done, it won’t be that big a deal. But it might be more efficient for small jobs to stay within one department. The same is true of subroutine and function calls. If you only enter and exit modules once in a relative while, the overhead of saving registers and preparing argument lists won’t be significant. However, if you are repeatedly calling a few small subroutines, the overhead can buoy them to the top of the profile. It might be better if the work stayed where it was, in the calling routine.
Additionally, subroutine calls inhibit compiler flexibility. Given the right opportunity, you’d like your compiler to have the freedom to intermix instructions that aren’t dependent upon each other. These are found on either side of a subroutine call, in the caller and callee. But the opportunity is lost when the compiler can’t peer into subroutines and functions. Instructions that might overlap very nicely have to stay on their respective sides of the artificial fence.
It helps if we illustrate the challenge