Squid: The Definitive Guide - Duane Wessels [45]
acl WebSite dstdom_regex -i ^www\.
Here is another useful regular expression that matches IP addresses given in URL hostnames:
acl IPaddr dstdom_regex [0-9]$
This works because Squid requires URL hostnames to be fully qualified. Since none of the global top-level domains end with a digit, this ACL matches only IP addresses, which do end with a number.
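One way to check this reasoning is to try the pattern against a few hostnames. The following is a minimal sketch in Python, whose re module stands in here for Squid's regular expression library (the two should agree on a pattern this simple). The hostnames are made up for illustration.

```python
import re

# Same pattern as the IPaddr ACL: the hostname must end with a digit.
ip_like = re.compile(r"[0-9]$")

# Hypothetical hostnames as they might appear in request URLs.
hosts = ["192.168.1.10", "www.example.com", "mirror2.example.org", "10.0.0.1"]

# Keep only the hostnames whose last character is a digit.
matches = [h for h in hosts if ip_like.search(h)]
print(matches)
```

Note that mirror2.example.org contains a digit but doesn't end with one, so it isn't matched.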
url_regex
You can use the url_regex ACL to match any part of a requested URL, including the transfer protocol and origin server hostname. For example, this ACL matches MP3 files requested from FTP servers:
acl FTPMP3 url_regex -i ^ftp://.*\.mp3$
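To see what the anchors and the -i option contribute, here is a small sketch that applies the same pattern in Python. It assumes that re.IGNORECASE behaves like Squid's -i option for this pattern; the URLs are invented examples.

```python
import re

# The -i option in the ACL corresponds to re.IGNORECASE here.
ftp_mp3 = re.compile(r"^ftp://.*\.mp3$", re.IGNORECASE)

# Hypothetical request URLs.
urls = [
    "ftp://ftp.example.com/pub/song.mp3",
    "FTP://ftp.example.com/pub/SONG.MP3",
    "http://www.example.com/song.mp3",         # wrong protocol
    "ftp://ftp.example.com/pub/song.mp3.txt",  # .mp3 is not at the end
]

# Map each URL to whether the pattern matches it.
results = {u: bool(ftp_mp3.search(u)) for u in urls}
for url, matched in results.items():
    print(matched, url)
```

Only the first two URLs match: the leading ^ rejects the HTTP request, and the trailing $ rejects the name that merely contains .mp3.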
urlpath_regex
The urlpath_regex ACL is very similar to url_regex, except that the transfer protocol and hostname aren't included in the comparison. This makes certain types of checks much easier. For example, let's say you need to deny requests with sex in the URL, but still possibly allow requests that have sex in their hostname:
acl Sex urlpath_regex sex
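The difference between the two ACL types can be sketched in Python. This is only an illustration: the url_path helper is hypothetical, and it assumes that the "path" is everything after the hostname, query string included. The URLs are made up.

```python
import re
from urllib.parse import urlsplit

pattern = re.compile(r"sex")

def url_path(url):
    """Return everything after the hostname (path plus any query string)."""
    parts = urlsplit(url)
    return parts.path + ("?" + parts.query if parts.query else "")

# Hypothetical URLs: the first has the word only in its hostname.
urls = [
    "http://www.essex.example/index.html",
    "http://www.example.com/sex/index.html",
]

# For each URL: (matches the whole URL, matches only the path).
results = [(bool(pattern.search(u)), bool(pattern.search(url_path(u)))) for u in urls]
print(results)
```

A whole-URL comparison flags both requests, while the path-only comparison leaves the first one alone.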
As another example, let's say you want to provide special treatment for cgi-bin requests. You can catch some of them with this ACL:
acl CGI1 urlpath_regex ^/cgi-bin
Of course, CGI programs aren't necessarily kept under /cgi-bin/, so you'd probably want to write additional ACLs to catch the others.
browser
Most HTTP requests include a User-Agent header. The value of this header is typically something strange like:
Mozilla/4.51 [en] (X11; I; Linux 2.2.5-15 i686)
The browser ACL performs regular expression matching on the value of the User-Agent header. For example, to deny requests that don't come from a Mozilla browser, you can use:
acl Mozilla browser Mozilla
http_access deny !Mozilla
Before using the browser ACL, be sure that you fully understand the User-Agent strings your cache receives. Some user-agents lie about their identity. Even Squid has a feature to rewrite User-Agent headers in requests that it forwards. With browsers such as Opera and KDE's Konqueror, users can send different User-Agent strings to different origin servers or omit them altogether.
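A quick experiment shows why this caution matters. The sketch below applies the Mozilla pattern to a few sample User-Agent values in Python; the first is the example shown earlier, and the others are illustrative strings in the style of other clients, not values taken from a real log.

```python
import re

# The browser ACL above uses a case-sensitive pattern (no -i option).
mozilla = re.compile(r"Mozilla")

# Sample User-Agent values (illustrative only).
agents = [
    "Mozilla/4.51 [en] (X11; I; Linux 2.2.5-15 i686)",
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    "Wget/1.8.2",
]

# Requests that would pass "http_access deny !Mozilla".
allowed = [ua for ua in agents if mozilla.search(ua)]
print(allowed)
```

The second string begins with Mozilla even though it identifies a different browser, so a rule based on this pattern admits more clients than its name suggests.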
req_mime_type
The req_mime_type ACL refers to the Content-Type header of the client's HTTP request. Content-Type headers usually appear only in requests with message bodies. POST and PUT requests might include the header, but GET requests don't. You might be able to use the req_mime_type ACL to detect certain file uploads and some types of HTTP tunneling requests.
The req_mime_type ACL values are regular expressions. To catch audio file types, you can use an ACL like this:
acl AudioFileUploads req_mime_type -i ^audio/
rep_mime_type
The rep_mime_type ACL refers to the Content-Type header of the origin server's HTTP response. It is really only meaningful when used in an http_reply_access rule. All other access control forms are based on aspects of the client's request. This one is based on the response.
If you want to try blocking Java code with Squid, you might use some access rules like this:
acl JavaDownload rep_mime_type application/x-java
http_reply_access deny JavaDownload
ident_regex
You saw the ident ACL earlier in this section. The ident_regex ACL simply allows you to use regular expressions, instead of exact string matching, on usernames returned by the ident protocol. For example, this ACL matches usernames that contain a digit:
acl NumberInName ident_regex [0-9]
proxy_auth_regex
As with ident_regex, the proxy_auth_regex ACL allows you to use regular expressions on proxy authentication usernames. For example, this ACL matches any username that begins with admin, such as admin, administrator, and administrators:
acl Admins proxy_auth_regex -i ^admin
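Because ^admin is anchored only at the start, it also accepts names you might not have intended. The following Python sketch makes that visible; it assumes re.IGNORECASE mirrors the -i option, and the usernames are hypothetical.

```python
import re

# Same pattern as the Admins ACL; -i becomes re.IGNORECASE.
admins = re.compile(r"^admin", re.IGNORECASE)

# Hypothetical proxy authentication usernames.
names = ["admin", "Administrator", "administrators", "adminbob", "sysadmin"]

# Usernames that start with "admin", ignoring case.
matched = [n for n in names if admins.search(n)]
print(matched)
```

Here adminbob matches as well, while sysadmin does not. If only exact names should match, a stricter pattern (or the plain proxy_auth ACL) is the safer choice.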
External ACLs
Squid Version 2.5 introduces a new feature: external ACLs. You instruct Squid to send certain pieces of information to an external process. This helper process then tells Squid whether the given data is a match or not.
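To make the idea concrete, here is a rough sketch of what such a helper might look like in Python. Squid writes one lookup per line to the helper, and the helper replies with OK for a match or ERR otherwise. Everything else here is an assumption for illustration: the GROUPS table, the answer and serve functions, and the choice of a username followed by a group name as the line format.

```python
import io

# Hypothetical group table; a real helper would query a directory or database.
GROUPS = {"staff": {"alice", "bob"}}

def answer(line, groups=GROUPS):
    """Return "OK" if the user named on the line is in the named group, else "ERR"."""
    fields = line.split()
    if len(fields) != 2:
        return "ERR"  # treat malformed requests as a non-match
    user, group = fields
    return "OK" if user in groups.get(group, set()) else "ERR"

def serve(requests, replies, groups=GROUPS):
    """Read one request per line and write one reply per line."""
    for line in requests:
        replies.write(answer(line, groups) + "\n")
        replies.flush()  # reply immediately so the caller is not left waiting

# Demonstration with in-memory streams instead of a running Squid.
demo_in = io.StringIO("alice staff\ncarol staff\n")
demo_out = io.StringIO()
serve(demo_in, demo_out)
print(demo_out.getvalue())
```

A helper launched by Squid would presumably call serve(sys.stdin, sys.stdout) instead of the in-memory demonstration.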
Squid comes with a number of external ACL helper programs; most determine whether or not the named user is a member of a particular group. See