Squid_ The Definitive Guide - Duane Wessels [100]
In addition to the load-sharing algorithm, CARP also has a protocol component. The Membership Table has a well-defined structure and syntax so that all clients of a single array can have the same configuration. If some clients are configured differently, CARP becomes less useful because not all clients send the same request to the same parent. Note that Squid doesn't currently implement the Membership Table feature.
Squid's CARP implementation is lacking in another way. The protocol says that if a request can't be forwarded to the highest-scoring parent cache, it should be sent to the second-highest-scoring member. If that also fails, the application should give up. Squid currently uses only the highest-scoring parent cache.
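The difference between the protocol's fallback rule and Squid's behavior can be sketched roughly as follows. This is an illustration only, not Squid's code: the score function uses MD5 as a stand-in for CARP's own hash, and the names score, rank_parents, and pick are hypothetical.

```python
import hashlib

def score(url: str, parent: str) -> int:
    """Stand-in score: NOT the real CARP hash, just a deterministic mix."""
    digest = hashlib.md5((parent + "|" + url).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def rank_parents(url, parents):
    """Return parents ordered from highest to lowest score for this URL."""
    return sorted(parents, key=lambda p: score(url, p), reverse=True)

def pick(url, parents, forward_ok, max_tries):
    """Try at most max_tries of the top-ranked parents; None means give up."""
    for parent in rank_parents(url, parents)[:max_tries]:
        if forward_ok(parent):
            return parent
    return None

parents = ["neighbor1.host.name", "neighbor2.host.name", "neighbor3.host.name"]
url = "http://www.example.com/index.html"
best = rank_parents(url, parents)[0]

def forward_ok(parent):
    # Pretend that forwarding to the top-scoring parent fails.
    return parent != best

print(pick(url, parents, forward_ok, max_tries=2))  # protocol: second choice is tried
print(pick(url, parents, forward_ok, max_tries=1))  # Squid as described: top choice only
```

Because every client computes the same ranking for a given URL, they all agree on both the first and the second choice, which is what makes the fallback rule useful.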
CARP was originally documented in a 1998 Internet Draft, which has since expired. It was developed by Vinod Valloppillil of Microsoft and Keith W. Ross of the University of Pennsylvania. With a little searching, you can still find the old document on the Internet. You may even be able to find some documentation on the Microsoft sites. You can also find more information on CARP in my O'Reilly book Web Caching.
Configuring Squid for CARP
To use CARP in Squid, you must first run the ./configure script with the --enable-carp option. Next, you must add carp-load-factor options to the cache_peer lines for parents that are members of the array. The following is an example.
cache_peer neighbor1.host.name parent 3128 0 carp-load-factor=0.3
cache_peer neighbor2.host.name parent 3128 0 carp-load-factor=0.3
cache_peer neighbor3.host.name parent 3128 0 carp-load-factor=0.4
Note that all carp-load-factor values must add up to 1.0. Squid checks for this condition and complains if it finds a discrepancy. Additionally, the cache_peer lines must be listed in order of increasing load factor values. Only recent versions of Squid check that this condition is true.
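If you generate your cache_peer lines from a script, you may want to check both conditions before handing the file to Squid. The following is a minimal sketch, not Squid's own check: the function name and the tolerance are assumptions, and equal neighboring values are accepted because the example above repeats 0.3.

```python
import math

def check_carp_load_factors(factors, tolerance=1e-6):
    """Return a list of problems found; an empty list means the values look fine."""
    problems = []
    total = math.fsum(factors)
    # Compare with a small tolerance (an assumption) to allow for float rounding.
    if abs(total - 1.0) > tolerance:
        problems.append(f"load factors sum to {total:.3f}, expected 1.0")
    # Each value must be no smaller than the one listed before it.
    if any(a > b for a, b in zip(factors, factors[1:])):
        problems.append("load factors are not listed in increasing order")
    return problems

print(check_carp_load_factors([0.3, 0.3, 0.4]))  # values from the example above
print(check_carp_load_factors([0.4, 0.3, 0.3]))  # same values, wrong order
```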
Remember that CARP is treated somewhat specially with regard to a neighbor's alive/dead state. Squid normally declares a neighbor dead (and ceases sending requests to it) after 10 failed connections. In the case of CARP, however, Squid skips a parent that has one or more failed connections. Once Squid is working with CARP, you can monitor it with the carp cache manager page. See Section 14.2.1.49 for more information.
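The two alive/dead rules can be compared side by side. This sketch only restates the thresholds given above. The function names are hypothetical, and it doesn't model how Squid counts failures or resets the counter.

```python
DEAD_AFTER_FAILURES = 10  # threshold quoted in the text for ordinary neighbors

def usable_for_normal_selection(failed_connections: int) -> bool:
    """An ordinary neighbor stays usable until it reaches the dead threshold."""
    return failed_connections < DEAD_AFTER_FAILURES

def usable_for_carp(failed_connections: int) -> bool:
    """A CARP parent is skipped as soon as it has any failed connection."""
    return failed_connections == 0

for failures in (0, 1, 9, 10):
    print(failures, usable_for_normal_selection(failures), usable_for_carp(failures))
```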
Putting It All Together
As you probably realize by now, Squid has many different ways to decide how and where requests are forwarded. In many cases, you can employ more than one protocol or technique at a time. Just by looking at the configuration file, however, you'd probably have a hard time figuring out how Squid uses the different techniques in combination. In this section I'll explain how Squid actually makes the forwarding decision.
Obviously, it all starts with a cache miss. Any request that is satisfied as an unvalidated cache hit doesn't go through the following sequence of events.
The goal of the selection procedure is to create a list of appropriate next-hop locations. A next-hop location may be a neighbor cache or the origin server. Depending on your configuration, Squid may select up to three possible next-hops. If the request can't be satisfied by the first, Squid tries the second, and so on.
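The way Squid works through the list can be pictured with a short sketch. It illustrates the idea only and isn't taken from the Squid source: the forward and try_hop names, the DIRECT marker for the origin server, and the use of None to signal failure are all assumptions.

```python
def forward(request, next_hops, try_hop):
    """Try each next-hop in order.

    Returns (hop, response) for the first hop that answers, or None if
    every hop fails.
    """
    for hop in next_hops:
        response = try_hop(hop, request)
        if response is not None:
            return hop, response
    return None

# Hypothetical list: two neighbor caches, then the origin server.
next_hops = ["neighbor1.host.name", "neighbor2.host.name", "DIRECT"]

def try_hop(hop, request):
    # Pretend the first neighbor is unreachable and the others answer.
    return None if hop == "neighbor1.host.name" else f"response from {hop}"

print(forward("http://www.example.com/", next_hops, try_hop))
```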
Step 1: Determine Direct Options
The first step is to determine if the request may, must, or must not be sent directly to the origin server. Squid evaluates the never_direct and always_direct access rule lists for the request. The goal is to set a flag to one of three values: DIRECT_YES, DIRECT_MAYBE, or DIRECT_NO. This flag later determines whether Squid should, or should not, try to select a neighbor cache for the request. Squid checks the following conditions in order. If any condition is true, it sets the direct flag and proceeds to the next step. If you're following along in the source code, this