Squid_ The Definitive Guide - Duane Wessels [129]
Squid has a built-in feature for rotating log files. You can invoke it with the squid -k rotate command. The logfile_rotate directive tells Squid how many old copies of each file to keep. For example, if you set it to 7, you'll have eight versions of each log file: the current file and seven old ones.
Old log files are renamed with numeric extensions. For example, when you execute a rotation, Squid renames log.6 to log.7, then log.5 to log.6, and so on. The current log becomes log.0, and Squid creates a new, empty file named log.
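The renaming order can be sketched in a few lines of shell. This is only an illustration of the scheme, not Squid's own code: the rotate_log function and its arguments are hypothetical, and the sketch assumes that a count of N keeps the copies numbered .0 through .(N-1).

```sh
#!/bin/sh
# Illustration only: mimics the renaming order described above, not Squid's own code.
# Usage: rotate_log FILE COUNT   (COUNT plays the role of logfile_rotate, assumed >= 1)
rotate_log() {
    log=$1
    keep=$2
    [ "$keep" -ge 1 ] || return 0
    i=$((keep - 1))
    # Shift the old copies up by one, oldest first, so nothing is overwritten early.
    while [ "$i" -gt 0 ]; do
        prev=$((i - 1))
        if [ -f "$log.$prev" ]; then
            mv -f "$log.$prev" "$log.$i"
        fi
        i=$prev
    done
    # The current log becomes .0, and a new, empty log takes its place.
    mv -f "$log" "$log.0"
    : > "$log"
}
```

For example, calling rotate_log /tmp/demo.log 3 repeatedly never leaves more than demo.log, demo.log.0, demo.log.1, and demo.log.2 behind; the oldest copy is overwritten on each run.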
Each time you execute squid -k rotate, Squid rotates the following files: cache.log, access.log, store.log, useragent.log (if enabled), and referer.log (if enabled). Squid also creates up-to-date versions of the swap.state files. Note, however, that swap.state isn't archived with numeric extensions.
Squid doesn't rotate the log files automatically. The best way to make it happen is with a daily cron job. For example:
0 0 * * * /usr/local/squid/sbin/squid -k rotate
If you'd rather write your own scripts to manage the log files, Squid has a special mode that you'll find useful. Set the logfile_rotate directive to 0. Then, when you run squid -k rotate, Squid doesn't rename anything; it just closes the current log files and opens new ones. This is very useful when the operating system allows you to rename files opened by another process. The following shell script illustrates the idea:
#!/bin/sh
set -e
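# 43200 seconds is 12 hours; when run at midnight, this lands in yesterday
yesterday_secs=`perl -e 'print time -43200'`
# date -r takes seconds since the epoch on BSD; GNU date uses -d @seconds instead
yesterday_date=`date -r $yesterday_secs +%Y%m%d`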
cd /usr/local/squid/var/logs
# rename the current log file without interrupting the logging process
mv access.log access.log.$yesterday_date
# tell Squid to close the current logs and open new ones
/usr/local/squid/sbin/squid -k rotate
# give Squid some time to finish writing swap.state files
sleep 60
mv access.log.$yesterday_date /archive/location/
gzip -9 /archive/location/access.log.$yesterday_date
Privacy and Security
Squid's log files, especially access.log, contain a record of users' activities and, hence, are subject to privacy concerns. As the Squid administrator, you should take every precaution to keep the log files safe and secure. One of the best ways to do that is to limit the number of people who have access to the system on which Squid runs. If that isn't possible, carefully examine the file and directory permissions to make sure the log files can't be viewed by untrusted or unauthorized users.
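One way to examine the permissions is with find. The following is a minimal sketch: the function names are hypothetical, and the directory you pass in (for example, /usr/local/squid/var/logs, as used earlier in this chapter) depends on your installation.

```sh
#!/bin/sh
# Hypothetical helpers for auditing a log directory; adjust to your own policy.

# List regular files under DIR that users outside the owner and group can read.
world_readable_logs() {
    find "$1" -type f -perm -004 -print
}

# Remove all access for "other" from DIR and from the regular files below it.
restrict_logs() {
    chmod o-rwx "$1"
    find "$1" -type f -exec chmod o-rwx {} +
}
```

Running world_readable_logs first shows what would change before restrict_logs modifies anything. Whether group access should remain depends on who belongs to the group that owns the files.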
You can also help protect your users' privacy by taking advantage of the client_netmask and strip_query_terms directives. The former makes it harder to identify individual users in the access.log; the latter removes URI query terms that may contain personal information. See Section 13.2.4 for more information.
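As an illustration, a squid.conf line such as client_netmask 255.255.255.0 records only the first three octets of each client address, and strip_query_terms on removes query terms from logged URIs. A quick way to see what a configuration file currently sets is sketched below; the function name is hypothetical, and the default path is an assumption based on the /usr/local/squid prefix used in this chapter.

```sh
#!/bin/sh
# Hypothetical helper: print the active (uncommented) privacy-related lines
# of a squid.conf. The default path is an assumption; pass your own if it differs.
show_privacy_settings() {
    conf=${1:-/usr/local/squid/etc/squid.conf}
    grep -E '^[[:space:]]*(client_netmask|strip_query_terms)[[:space:]]' "$conf"
}
```

If the function prints nothing, neither directive is set explicitly and Squid's defaults apply.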
You may also want to develop a policy for keeping old log files. Obviously access.log helps keep users accountable for their activities, but how far back would you ever need to go searching for something? A week? A year? What would you do if presented with a court order to hand over your log files for the last three months?
If you'd like to keep historical data for a long time, perhaps you can make the log files anonymous or somehow reduce the dataset. If you are interested only in which URIs were accessed, but not by whom, you can extract only that field from access.log. This not only makes the file smaller, it also reduces the risk of a privacy violation. Another technique is to randomize the client IP addresses. In other words, create a filter that maps real IP addresses to fake ones, such that the same real address is always changed to the same fake address. If you are using RFC 1413 identification or HTTP authentication, consider making those fields anonymous as well.
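Both ideas can be sketched as small awk filters. The sketch assumes Squid's native access.log format, in which the client address is the third whitespace-separated field, the URI is the seventh, and the user name is the eighth. The function names are hypothetical. Note that the fake addresses are assigned in order of first appearance rather than at random, and the mapping is consistent only within a single run.

```sh
#!/bin/sh
# Sketch only: assumes the native access.log layout
# (time elapsed client action/code size method URI ident hierarchy/from type).

# Print just the URI field, dropping everything that identifies the client.
extract_uris() {
    awk '{ print $7 }'
}

# Replace each real client address with a fake one and blank the user name.
# The same real address always maps to the same fake address within one run.
anonymize_log() {
    awk '
    {
        if (!($3 in fake)) {
            n++
            fake[$3] = sprintf("10.%d.%d.%d", int(n / 65536) % 256, int(n / 256) % 256, n % 256)
        }
        $3 = fake[$3]
        $8 = "-"
        print    # fields are rejoined with single spaces, so column padding is lost
    }'
}
```

Both functions read from standard input, so a typical use is anonymize_log < access.log.0 > access.anon. To keep the mapping stable across separate runs, the table would need to be saved and reloaded, which this sketch doesn't attempt.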
Exercises
Configure Squid so that it doesn't create any log files, except for the swap.state file(s).
Write a simple Perl or awk