Professional C__ - Marc Gregoire [423]
Consequently, it is often helpful to profile your program to determine which parts of the code require optimization. There are many profiling tools available that analyze programs as they run in order to generate data about their performance. Most profiling tools provide analysis at the function level by specifying the amount of time (or percent of total execution time) spent in each function in the program. After running a profiler on your program, you can usually tell immediately which parts of the program need optimization. Profiling before and after optimizing is also useful to prove that your optimizations had an effect.
If you are using Microsoft Visual C++ 2010 Premium or Ultimate, you already have a great built-in profiler, which is discussed later in this chapter. Another good profiling tool is Rational PurifyPlus from IBM. Both of these products require license fees, but you should check if your company or academic institution has a site license for their use. If the license restriction is prohibitive, there are several free profiling tools. One of the most well-known is gprof (GNU profiler), which can be found on most Unix systems, including Solaris and Linux.
Profiling Example with gprof
The power of profiling can best be seen with a real coding example. As a disclaimer, the performance bugs in the first attempt shown are not subtle. Real efficiency issues would probably be more complex, but a program long enough to demonstrate them would be too lengthy for this book.
Suppose that you work for the United States Social Security Administration. Every year the administration puts up a website that allows users to look up the popularity of new baby names from the previous year. Your job is to write the back-end program that looks up names for users. Your input is a file containing the name of every new baby. This file will obviously contain redundant names. For example, in the file for boys for 2003, the name Jacob was the most popular, showing up 29,195 times. Your program must read the file to construct an in-memory database. A user may then request the absolute number of babies with a given name, or the rank of that name among all the babies.
First Design Attempt
A logical design for this program consists of a NameDB class with the following public methods:
#include #include using std::string; class NameDB { public: // Reads the list of baby names in nameFile to populate the database. // Throws invalid_argument if nameFile cannot be opened or read. NameDB(const string& nameFile) throw (std::invalid_argument); // Returns the rank of the name (1st, 2nd, etc). // Returns -1 if the name is not found. int getNameRank(const string& name) const; // Returns the number of babies with this name. // Returns -1 if the name is not found. int getAbsoluteNumber(const string& name) const; // Protected and private members and methods not shown }; Code snippet from NameDB\FirstAttempt\NameDB.h The hard part is choosing a good data structure for the in-memory database. A first attempt might be an array, or a vector from the STL, of name/count pairs. Each entry in the vector would store one of the names, along with a count of the number of times that name shows up in the raw data file. Here is the complete class definition with such a design: #include #include #include #include using std::string; class NameDB { public: NameDB(const string& nameFile) throw (std::invalid_argument); int getNameRank(const string& name) const; int getAbsoluteNumber(const string& name) const; protected: std::vector // Helper methods bool nameExists(const