Squid_ The Definitive Guide - Duane Wessels [124]
1066036250.467 4 170.210.173.0 TCP_IMS_HIT/304 265 GET http://...
1066036250.762 102 163.11.255.0 TCP_IMS_HIT/304 265 GET http://...
1066036250.832 20 163.11.255.0 TCP_IMS_HIT/304 266 GET http://...
1066036251.026 74 203.91.150.0 TCP_CLIENT_REFRESH_MISS/304 267 GET http://...
strip_query_terms
This directive is another privacy feature. It removes query terms from URIs before logging them. If your log files somehow fall into the wrong hands, they won't be able to find any usernames and passwords. When this directive is enabled, all characters after a question mark (?) are removed. For example, a URI like this:
http://auto.search.msn.com/response.asp?MT=www.kimo.com.yw&srch=3
&prov=&utf8
is logged like this:
http://auto.search.msn.com/response.asp?
uri_whitespace
Earlier, I mentioned the problem with whitespace appearing in some URIs. The RFCs state that URIs must not contain whitespace, but in reality it happens all too often. The uri_whitespace directive dictates how Squid should handle such cases. The allowed settings are: strip (default), deny, allow, encode, and chop. Of these, strip, encode, and chop ensure that the URI field doesn't contain any whitespace (thus adding more fields to access.log).
The allow setting allows the request to pass through Squid unmodified. It is likely to cause trouble for redirectors and log file parsers. The deny setting, on the other hand, causes Squid to deny the request. The user receives an error message, but the request is still written to access.log with the whitespace characters.
If you set it to encode, Squid changes the whitespace characters to their RFC 1738 equivalents. This is probably what the user-agent should have done in the first place. The chop setting causes Squid to cut off the URI at the first whitespace character.
The default setting is strip, which makes Squid remove the whitespace characters from the URI. It ensures that your log-file parsers and redirectors will be happy, but it might break certain things, such as improperly encoded search engine queries.
buffered_logs
By default, Squid disables buffering for the cache.log file, which allows you to run tail -f and watch log file entries appear in real time. If you think this will cause an unnecessary overhead, you can disable buffering:
buffered_logs off
However, it probably doesn't matter unless you are running Squid with full debugging. Note that this option affects only cache.log. The others always use unbuffered writes.
access.log Analysis Tools
The access.log file contains a wealth of information, much more than you can see by just browsing through it. In order to get the big picture view, you'll need to use a third-party log-file analysis package. You can find a long list of them linked from the Squid web page, or by going directly to http://www.squid-cache.org/Scripts/.
One of the most popular tools is Calamaris—a Perl script that parses the log file and generates either text or HTML-based reports. It provides a breakdown of traffic by request method, client IP address, origin server domain name, content types, filename extensions, reply size, and more. Calamaris also reports on ICP query traffic and even understands log files from other caching products. Check it out by visiting http://calamaris.cord.de/.
Squeezer, and its derivative, Squeezer2, are Squid-specific analysis tools. They provide many statistics that can help you understand Squid's performance, especially when you have neighbors. Both generate HTML pages as output. Visit the Logfile Analysis page on the squid-cache.org site for links to these programs.
Webalyzer is another good utility. It is designed to be fast and produces HTML pages with tables and bar charts. It was originally designed for origin server access logs. Although it can parse Squid's logs, it doesn't report on such things as hit ratios and response times. It also uses some terms differently than I do. For example, Webalyzer calls any request a "hit," which isn't the same as a cache hit. It also makes a distinction between "pages" and "files."