Squid: The Definitive Guide - Duane Wessels [126]
5: cache key
Squid uses MD5 hash values for the primary index to locate cached objects. The key is based on the request method, URI, and possibly other information.
You might be able to use the cache key to match up store.log entries. Note, however, that an object's cache key can change. This happens, for example, whenever Squid logs a TCP_REFRESH_MISS request in access.log. In store.log, the sequence of entries looks like this:
1065837334.045 SWAPOUT ... 554BACBD2CB2A0C38FF9BF4B2239A9E5 ... http://blah
1066031047.925 RELEASE ... 92AE17121926106EB12FA8054064CABA ... http://blah
1066031048.074 SWAPOUT ... 554BACBD2CB2A0C38FF9BF4B2239A9E5 ... http://blah
So what's going on? The object is originally cached under one key (554B...). Some time later, Squid receives another request for the object and forwards a validation request to the origin server. When the response comes back with new content, Squid changes the cache key of the old object (to 92AE...) so that it can give the new object the correct key (554B...). The old object is then removed, and the new object is saved to disk.
6: status code
This field shows the HTTP status code of the response, just like access.log. See Table 13-1 for a list of status codes.
7: date
The value of the Date header in the HTTP response, expressed as seconds since the Unix epoch. The value -1 indicates an unparseable Date header, and -2 means the header was entirely absent.
8: last-modified
The value of the Last-Modified header in the HTTP response, expressed as seconds since the Unix epoch. The value -1 indicates an unparseable Last-Modified header, and -2 means the header was entirely absent.
9: expires
The value of the Expires header in the HTTP response, expressed as seconds since the Unix epoch. The value -1 indicates an unparseable Expires header, and -2 means the header was entirely absent.
10: content-type
The value of the Content-Type header in the HTTP response, excluding any media-type parameters. Squid inserts the value unknown if the Content-Type is missing.
11: content-length/size
This field contains two numbers, separated by a slash. The first is the value of the Content-Length header. A -1 indicates the Content-Length header is absent. The second is the actual size of the HTTP message body. You can use these two numbers to identify partially received responses and origin servers that incorrectly calculate the content length. In most cases, the two numbers are the same.
12: method
The HTTP request method for the object, as in access.log.
13: URI
The final field is the requested URI, as in access.log. This field also has the whitespace problem mentioned in the previous section. However, it is less worrisome here because you can safely ignore any extra fields.
For many of the RELEASE entries, you'll see question marks (?) for the last eight fields. This is because most of those field values come from what Squid calls the MemObject structure. This structure is present only for objects that have just been received, or are being stored entirely in memory. Most of the objects in Squid's cache don't have a MemObject because they exist only on disk. For these, Squid puts question marks in the fields with missing information.
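The field descriptions above can be turned into a small parser. What follows is a minimal Python sketch, not anything taken from Squid itself. It assumes that fields 1 through 4 are the timestamp, action, directory number, and file number, and that a missing value appears as a question-mark placeholder. The names StoreLogEntry, parse_store_log_line, header_time_state, and length_mismatch are made up for this illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholders treated as "no information" (an assumption about how the
# question marks appear in the log).
MISSING = {"?", "?/?", ""}


def _num(text: str) -> Optional[int]:
    """Convert a numeric field, returning None for a placeholder."""
    return None if text in MISSING else int(text)


def _text(text: str) -> Optional[str]:
    """Return a text field, or None for a placeholder."""
    return None if text in MISSING else text


@dataclass
class StoreLogEntry:
    timestamp: float
    action: str
    dir_number: str
    file_number: str
    cache_key: str
    status: Optional[int]
    date: Optional[int]
    last_modified: Optional[int]
    expires: Optional[int]
    content_type: Optional[str]
    content_length: Optional[int]
    size: Optional[int]
    method: Optional[str]
    uri: Optional[str]


def parse_store_log_line(line: str) -> StoreLogEntry:
    """Parse one whitespace-separated store.log line into its 13 fields."""
    fields = line.split()
    if len(fields) < 13:
        raise ValueError(f"expected at least 13 fields, got {len(fields)}")
    # Field 11 is "content-length/size"; partition also copes with a bare "?".
    length_text, _, size_text = fields[10].partition("/")
    return StoreLogEntry(
        timestamp=float(fields[0]),
        action=fields[1],
        dir_number=fields[2],
        file_number=fields[3],
        cache_key=fields[4],
        status=_num(fields[5]),
        date=_num(fields[6]),
        last_modified=_num(fields[7]),
        expires=_num(fields[8]),
        content_type=_text(fields[9]),
        content_length=_num(length_text),
        size=_num(size_text),
        method=_text(fields[11]),
        uri=_text(fields[12]),  # extra fields caused by whitespace are ignored
    )


def header_time_state(value: Optional[int]) -> str:
    """Classify a date, last-modified, or expires value."""
    if value is None:
        return "unknown"        # placeholder: no MemObject information
    if value == -1:
        return "unparseable"
    if value == -2:
        return "absent"
    return "present"


def length_mismatch(entry: StoreLogEntry) -> bool:
    """True when a Content-Length was sent but differs from the body size."""
    return (
        entry.content_length is not None
        and entry.content_length != -1
        and entry.size is not None
        and entry.size != entry.content_length
    )
```

Entries for which length_mismatch returns true are the candidates for partially received responses or miscalculated Content-Length headers. Grouping parsed entries by uri rather than by cache_key sidesteps the key change described under field 5.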
Mapping File Numbers to Pathnames
If you find you need to examine a particular cache file, you can, with some effort, turn a file number into a pathname. You'll also need the directory number, and the L1 and L2 values from the corresponding cache_dir line. In the Squid source code, the storeUfsDirFullPath() function does this. You can find it in the src/fs/ufs/store_dir_ufs.c file. This short Perl program mimics the current algorithm:
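#!/usr/bin/perl
# Convert hexadecimal file numbers (one per line on stdin)
# into pathnames relative to the cache directory.
use strict;
use warnings;

my $L1 = 16;     # number of first-level directories
my $L2 = 256;    # number of second-level directories

while (<>) {
    chomp;
    my $filn = hex($_);
    printf("%02X/%02X/%08X\n",
        int($filn / $L2 / $L2) % $L1,    # first-level directory
        int($filn / $L2) % $L2,          # second-level directory
        $filn);                          # filename
}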
And here's how you can use it:
% echo 000DCD06 | ./fileno-to-pathname.pl
0D/CD/000DCD06
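The arithmetic is perhaps easier to follow with explicit integer division. The following Python sketch repeats the same calculation and also prepends a cache directory. The function names fileno_to_path and full_path are made up for this illustration, and the defaults of 16 and 256 are simply the values used in the Perl program; they should match the L1 and L2 values of the cache_dir in question.

```python
def fileno_to_path(fileno: int, l1: int = 16, l2: int = 256) -> str:
    """Map a file number to its L1/L2/filename path, all in uppercase hex."""
    first = (fileno // l2 // l2) % l1   # first-level directory
    second = (fileno // l2) % l2        # second-level directory
    return f"{first:02X}/{second:02X}/{fileno:08X}"


def full_path(cache_dir: str, hex_fileno: str, l1: int = 16, l2: int = 256) -> str:
    """Join a cache directory with the path for a hexadecimal file number."""
    relative = fileno_to_path(int(hex_fileno, 16), l1, l2)
    return f"{cache_dir.rstrip('/')}/{relative}"


if __name__ == "__main__":
    # 0x000DCD06 is 904454: 904454 // 256 = 3533, 3533 // 256 = 13 (0D),
    # and 3533 % 256 = 205 (CD).
    print(fileno_to_path(0x000DCD06))
    print(full_path("/cache2", "000DCD06"))
```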
To find this file in the Nth cache_dir, simply go to the corresponding directory and list or view the file:
% cd /cache2