UNIX System Administration Handbook - Evi Nemeth [447]
• You can make sure that the system has enough memory. As we will see below, memory size has a major influence on performance. Memory is so inexpensive these days that you can usually afford to load every performance-sensitive machine to the gills.
• You can correct problems of usage, both those caused by users (too many jobs run at once, inefficient programming practices, jobs run at excessive priority, and large jobs run at inappropriate times of day) and those caused by the system (quotas, CPU accounting, unwanted daemons).
• For cases in which you are using UNIX as a web server or some other type of network application server, you may want to spread traffic among a number of systems with a commercial load balancing appliance, such as Cisco’s Local Director or Alteon Networks’ ACEswitch.
These boxes make several physical servers appear to be one logical server to the outside world. They balance the load according to one of several user-selectable algorithms, such as “most responsive server” or “round robin.”
These load balancers also provide useful redundancy should a server go down. They’re really quite necessary if your site must handle unexpected traffic spikes.
• You can organize the system’s hard disks and filesystems so that load is evenly balanced, maximizing I/O throughput. For specific applications such as databases, you can use a fancy multidisk technology such as RAID to optimize data transfers. Consult with your database vendor for recommendations.
• It’s important to note that different types of applications and databases respond differently to being spread across multiple disks. RAID comes in many forms, and you will need to put effort into determining which form (if any) is appropriate for your particular application.
• You can monitor your network to be sure that it is not saturated with traffic and that the error rate is low. Networks can be supervised with the netstat command, described on page 631. See also Chapter 20, Network Management and Debugging.
• You can configure the kernel to eliminate unwanted drivers and options and to use tables of an appropriate size. These topics are covered in Chapter 12, Drivers and the Kernel.
• You can identify situations in which the system is fundamentally inadequate to satisfy the demands being made of it.
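Several of the usage problems above can be spotted with a quick process survey from the shell. The sketch below is only a starting point: the ps flags are the common BSD-style ones, and column layouts vary from platform to platform.

```shell
#!/bin/bash
# Survey processes for the usage problems described above: heavy CPU
# consumers and jobs running at raised (negative nice) priority.
# BSD-style ps flags; adjust the columns for your platform.

usage_survey() {
    echo "== Top CPU consumers =="
    ps aux | sort -rnk 3 | head -5    # column 3 is %CPU in BSD-style ps

    echo "== Processes at raised priority (negative NI) =="
    ps axo pid,ni,comm | awk '$2 < 0'
}

usage_survey
```

The second report is usually empty on a healthy system; any entries it does show are jobs someone has boosted with nice or renice and are worth a second look.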
These steps are listed in rough order of effectiveness. Adding memory and balancing traffic across multiple servers can make a huge difference in performance. You might see some improvement from organizing the system’s disks correctly and from correcting network problems. The other factors may not make any difference at all.
25.2 FACTORS THAT AFFECT PERFORMANCE
Perceived performance is determined by the efficiency with which the system’s resources are allocated and shared. The exact definition of a “resource” is rather vague. It can include such items as cached contexts on the CPU chip and entries in the address table of the memory controller. However, to a first approximation, only the following four resources have much effect on performance:
• CPU time
• Memory
• Hard disk I/O bandwidth
• Network I/O bandwidth
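Each of these four resources has a customary first-look tool. The sketch below is hedged accordingly: tool availability, flags, and output formats all vary by platform, so each probe is guarded and only the first few lines of output are kept.

```shell
#!/bin/bash
# One standard observation tool per resource. Availability and output
# formats vary by platform, so each probe is guarded and truncated.

probe() {
    label=$1; shift
    echo "=== $label ==="
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>/dev/null | head -5
    else
        echo "($1 not installed here)"
    fi
}

probe "CPU time"     ps aux        # %CPU column per process
probe "Memory"       vmstat        # free, si/so (paging) columns
probe "Disk I/O"     iostat        # transfers and KB/s per drive
probe "Network I/O"  netstat -i    # packets and errors per interface
```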
All processes consume a portion of the system’s resources. If resources are still left after active processes have taken what they want, the system’s performance is about as good as it can be.
If there are not enough resources to go around, processes must take turns. A process that does not have immediate access to the resources it needs must wait around doing nothing. The amount of time spent waiting is one of the basic measures of performance degradation.
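The wait time is easy to see with the shell’s time facility: real (wall-clock) time minus CPU time (user plus sys) is time the process spent waiting. A sleep command, which uses almost no CPU, makes the gap obvious.

```shell
#!/bin/bash
# Real time minus CPU time (user + sys) is time spent waiting.
# 'sleep' consumes almost no CPU, so nearly all of its real time
# here is waiting -- the degradation measure described above,
# in miniature.
time sleep 1
```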
CPU time is one of the easiest resources to measure. A constant amount of processing power is always available. In theory, that amount is 100% of the CPU cycles, but overhead and various inefficiencies make the real-life number more like 95%. A process that’s using more than 90% of the CPU is entirely CPU-bound and is consuming most of the system’s available computing power.
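You can watch the CPU-bound case with time as well: for a busy loop that never waits, user plus sys comes out close to real. (The loop bound below is arbitrary; raise it for a longer measurement.)

```shell
#!/bin/bash
# For a purely CPU-bound job, user+sys CPU time is close to real
# (wall-clock) time: the process is on the CPU nearly every moment
# it exists. Compare with an I/O-bound job, where real time far
# exceeds CPU time.
time bash -c 'i=0; while [ "$i" -lt 200000 ]; do i=$((i+1)); done'
```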
Many people assume that the speed of the CPU is the