Apache Security - Ivan Ristic [136]
"Web Search for a Planet: The Google Cluster Architecture" by Luiz Andre Barroso et al. (http://www.computer.org/micro/mi2003/m2022.pdf)
"The Google File System" by Sanjay Ghemawat et al. (http://www.cs.rochester.edu/sosp2003/papers/p125-ghemawat.pdf)
The following sections describe various advanced architectures.
No load balancing, no high availability
At the bottom of the scale we have a single-server system. It is great if such a system works for you. Introducing scalability and increasing availability of a system involves hard work, and it is usually done under pressure and with (financial) constraints.
So, if you are having problems with that server, you should first look into ways to enhance the system without changing it too much:
Determine where the processing bottleneck is. This will ensure you are addressing the real problem.
Tune the operating system. Tune hard-disk access and examine memory requirements. Add more memory to the system because you can never have too much.
Tune the web server to make the most out of available resources (see Chapter 5).
Look for other easy solutions. For example, if you are running PHP, an optimization module (which caches compiled PHP scripts) can improve performance severalfold and lower the server load. There are many free solutions to choose from. One of them, mmCache (http://turck-mmcache.sourceforge.net), is considered to be as good as commercially available solutions.
Perform other application-level tuning techniques (which are beyond the scope of this book).
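The script-caching idea mentioned in the list above can be illustrated with a small analogy in Python. This is only a sketch of the general technique (keep the compiled form of a script in memory and recompile only when the file changes); it is not how mmCache or any PHP optimization module is implemented, and the ScriptCache class and its method names are hypothetical.

```python
import os


class ScriptCache:
    """Caches compiled code objects, keyed by file path and modification time."""

    def __init__(self):
        self._cache = {}    # path -> (mtime_ns, code object)
        self.compiles = 0   # number of real compilations, kept for illustration

    def load(self, path):
        """Return a code object for path, recompiling only if the file changed."""
        mtime = os.stat(path).st_mtime_ns
        entry = self._cache.get(path)
        if entry is not None and entry[0] == mtime:
            return entry[1]                    # cache hit: skip parse and compile
        with open(path, "r", encoding="utf-8") as fh:
            source = fh.read()
        code = compile(source, path, "exec")   # the expensive step being cached
        self.compiles += 1
        self._cache[path] = (mtime, code)
        return code

    def run(self, path):
        """Execute the (possibly cached) script and return its namespace."""
        namespace = {}
        exec(self.load(path), namespace)
        return namespace
```

Calling run() repeatedly on an unchanged file compiles it once; every later request reuses the stored code object, which is where the saving comes from.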
* * *
Tip
John Lim of PHP Everywhere maintains a detailed list of 34 steps to tune a server running Apache and PHP at http://phplens.com/phpeverywhere/tuning-apache-php.
* * *
If you have done all of this and you are still on the edge of the server's capabilities, then look into replacing the server with a more powerful machine. This is an easy step because hardware continues to improve and drop in price.
The approach I have just described is not very scalable but is adequate for many installations that will never grow to require more than one machine. There remains a problem with availability—none of this will increase the availability of the system.
High availability
A simple solution to increase availability is to introduce resource redundancy by way of a server mirror (illustrated in Figure 9-6). Create an exact copy of the system and install software to monitor the operations of the original. If the original breaks down for any reason, the mirrored copy becomes active and takes over. The High-Availability Linux Project (http://linux-ha.org) describes how this can be done on Linux.
Figure 9-6. Two web servers in a high availability configuration
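The monitoring logic behind such a mirror can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the mechanism used by the High-Availability Linux Project: the FailoverMonitor class, the tcp_probe helper, and the threshold of three consecutive failures are all hypothetical choices made for the example.

```python
import socket
from typing import Callable


def tcp_probe(host: str, port: int, timeout: float = 2.0) -> Callable[[], bool]:
    """Build a probe that reports whether a TCP connection to host:port succeeds."""
    def probe() -> bool:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False
    return probe


class FailoverMonitor:
    """Tracks the health of the original server and decides when the mirror takes over."""

    def __init__(self, probe: Callable[[], bool], max_failures: int = 3):
        self.probe = probe                # returns True when the original answers
        self.max_failures = max_failures  # assumed threshold, not from the text
        self.failures = 0
        self.active = "primary"

    def check(self) -> str:
        """Run one health check and return which server should be active."""
        if self.active == "mirror":
            return self.active            # no automatic failback in this sketch
        if self.probe():
            self.failures = 0             # a success resets the counter
        else:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.active = "mirror"    # promote the standby copy
        return self.active
```

In practice check() would be called on a timer, with something like tcp_probe("192.0.2.10", 80) as the probe (the address is a placeholder). Requiring several consecutive failures before switching is a design choice meant to avoid failing over on a single dropped connection.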
A simple solution such as this has its drawbacks:
It does not scale well. For each additional server you want to introduce to the system, you must purchase a mirror server. If you do this a couple of times, you will have way too much redundancy.
Resources are wasted because mirrored servers are not operational until a fault occurs; there is no load balancing in place.
Manual load balancing
Suppose you have determined that a single server is not enough to cope with the load. Before you jump to creating a cluster of servers, you should consider several crude but often successful techniques that are referred to as manual load balancing. There are many sites happily working like this. Here are three techniques you can use:
Separate services onto different servers. For example, use one machine for the web server and the other for the database server.
Separate web servers into groups. One group could serve images, while the other serves application pages. Even with only one machine, some people prefer to have two web servers: a "slim" one for static files and a "heavy" one for dynamic pages. Another similar approach is to split the application into many parts, but this does not result in an easily maintainable system.
Add a performance reverse proxy in front of the server.
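The second technique, splitting web servers into groups by content type, comes down to a routing decision that can be sketched as follows. The host names and the list of static file extensions are assumptions made for illustration; a real site would typically express the same rule in its proxy or web server configuration rather than in application code.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical host names; neither appears in the text.
STATIC_SERVER = "static.example.com"   # the "slim" server for static files
DYNAMIC_SERVER = "app.example.com"     # the "heavy" server for dynamic pages

# Assumed set of extensions treated as static content.
STATIC_EXTENSIONS = {".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico"}


def pick_server(url: str) -> str:
    """Choose a server group from the file extension of the request path."""
    path = urlsplit(url).path                   # ignore query string and fragment
    ext = posixpath.splitext(path)[1].lower()   # "" when there is no extension
    if ext in STATIC_EXTENSIONS:
        return STATIC_SERVER
    return DYNAMIC_SERVER
```

Only the path is inspected, so a request such as /index.php?img=a.png still goes to the dynamic server even though the query string mentions an image.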
So,