Webbots, Spiders, and Screen Scrapers - Michael Schrenk [71]
}
Listing 16-4: A simple webbot that sends an email when a web page changes
When the webbot finds that the web page's signature has changed, it sends an email like the one in Listing 16-5.
www.trackrates.com has changed
Old signature = baf73f476aef13ae48bd7df5122d685b6d2be2dd
New signature = baf73f476aed685b6d2be2ddf13ae48bd7df5124
Webbot ran at: Mon, 20 Mar 2007 17:08:00 -0600
Listing 16-5: Email generated by the webbot in Listing 16-4
Keeping Legitimate Mail out of Spam Filters
Many spam filters automatically reject any email in which the domain of the sender doesn't match the domain of the mail server used to send the message. For this reason, it is wise to verify that the domains for the From and Reply-to addresses match the outgoing mail server's domain.
The idea here is not to fool spam filters into letting you send unwanted email, but rather to ensure that legitimate email makes it to the intended Inbox and not the Junk folder, where no one will read it.
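A webbot can check this itself before sending. The following is a minimal sketch, not part of LIB_mail: the function names address_domain() and domain_matches_server(), the server name smtp.example.com, and the example addresses are all hypothetical. It treats an address as matching when its domain equals the mail server's host name or is a parent domain of it.

```php
<?php
# Return the lowercase domain part of an email address, or null if there is none
function address_domain($email)
{
    $at = strrpos($email, "@");
    if ($at === false || $at === strlen($email) - 1) {
        return null;
    }
    return strtolower(substr($email, $at + 1));
}

# True if the address domain equals the mail server's host name
# or is a parent domain of it (example.com matches smtp.example.com)
function domain_matches_server($email, $mail_server)
{
    $domain = address_domain($email);
    if ($domain === null) {
        return false;
    }
    $server = strtolower($mail_server);
    if ($server === $domain) {
        return true;
    }
    $suffix = "." . $domain;
    return substr($server, -strlen($suffix)) === $suffix;
}

# Hypothetical values for illustration only
$mail_server        = "smtp.example.com";
$address['from']    = "webbot@example.com";
$address['replyto'] = $address['from'];

# Warn about any address that does not match the outgoing mail server
foreach (array('from', 'replyto') as $key) {
    if (!domain_matches_server($address[$key], $mail_server)) {
        echo "Warning: " . $key . " address does not match " . $mail_server . "\n";
    }
}
```

A suffix comparison like this is only an approximation; some organizations send mail through servers on unrelated domains, so treat a mismatch as a warning rather than an error.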
Sending HTML-Formatted Email
It's easy to send HTML-formatted email with images, hyperlinks, or any other media found in web pages. To send HTML-formatted emails with the formatted_mail() function, do the following:
Set the $content_type variable to text/html. This tells the routine to use the proper MIME type in the email header.
Use fully formed URLs to refer to any images or hyperlinks. Relative references are resolved relative to the mail client, not the website that hosts the media you want to use.
Since you never know the capabilities of the client reading the email, use standard formatting techniques. Tables work well.
Avoid CSS. Traditional font tags are more predictable in HTML email.
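To illustrate what the content type changes, here is a minimal sketch of the headers an HTML email needs when sent with PHP's built-in mail() function. The helper build_mail_headers() and the example.com addresses are hypothetical; this is not the implementation of formatted_mail() in LIB_mail, and the UTF-8 charset is an assumption.

```php
<?php
# Hypothetical helper, not part of LIB_mail: builds the header string
# that PHP's built-in mail() function accepts as its fourth argument
function build_mail_headers($address, $content_type = "text/plain")
{
    return implode("\r\n", array(
        "From: " . $address['from'],
        "Reply-To: " . $address['replyto'],
        "MIME-Version: 1.0",
        "Content-Type: " . $content_type . "; charset=UTF-8"   # charset is an assumption
    ));
}

# Hypothetical addresses for illustration only
$address['from']    = "webbot@example.com";
$address['replyto'] = $address['from'];
$address['to']      = "reader@example.com";

$headers = build_mail_headers($address, "text/html");
$message = "<table><tr><td><font face='arial'>Hello</font></td></tr></table>";

# Uncomment to send the message with PHP's built-in mail() function
# mail($address['to'], "Example of an HTML-formatted email", $message, $headers);
echo $headers . "\n";
```

Without the Content-Type header, most mail clients display the raw HTML tags as plain text.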
For debugging purposes, it's a good idea to build your message in a string, as shown in Listing 16-6.
<?php
# Get library
include("LIB_mail.php");    # Include mail library
# Define addresses
$address['from']    = "mikeSchrenk@yahoo.com";
$address['replyto'] = $address['from'];
$address['to']      = "mikeSchrenk@yahoo.com";
# Define subject line
$subject = "Example of an HTML-formatted email";
# Define message
# NOTE: The HTML tags in this listing are reconstructed; the table and font
# markup and the example.com image and link URLs are placeholders.
$message = "";
$message = $message . "<table>";
$message = $message . "<tr><td><img src='http://www.example.com/logo.gif'></td></tr>";
$message = $message . "<tr><td>";
$message = $message . "<font face='arial'>";
$message = $message . "Here is an example of a clean HTML-formatted email";
$message = $message . "</font>";
$message = $message . "</td></tr>";
$message = $message . "<tr><td>";
$message = $message . "<font face='arial'>";
$message = $message . "with an image and a <a href='http://www.example.com'>hyperlink</a>.";
$message = $message . "</font>";
$message = $message . "</td></tr>";
$message = $message . "</table>";
echo $message;
// Send email
formatted_mail($subject, $message, $address, $content_type="text/html");
?>
Listing 16-6: Sending HTML-formatted email
The email sent by Listing 16-6 looks like Figure 16-1.
Figure 16-1. HTML-formatted email sent by the script in Listing 16-6
Be aware that not all mail clients can render HTML-formatted email. For those clients, you should send either text-only emails or a multipart email that contains both an HTML version and a plain-text version of the message.
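One way to carry both formats in a single message is the standard multipart/alternative MIME type. The sketch below is an illustration under stated assumptions, not a feature of LIB_mail: the helper build_multipart_email(), the boundary format, the UTF-8 charset, and the example.com addresses are hypothetical. The plain-text part comes first and the HTML part last, because clients display the last part they are able to render.

```php
<?php
# Hypothetical helper: builds headers and body for a multipart/alternative
# email that carries a plain-text part followed by an HTML part
function build_multipart_email($from, $text_part, $html_part)
{
    # A boundary string that is unlikely to appear in either part
    $boundary = "boundary_" . md5(uniqid("", true));

    $headers = implode("\r\n", array(
        "From: " . $from,
        "MIME-Version: 1.0",
        "Content-Type: multipart/alternative; boundary=\"" . $boundary . "\""
    ));

    $body = implode("\r\n", array(
        "--" . $boundary,
        "Content-Type: text/plain; charset=UTF-8",
        "",
        $text_part,
        "--" . $boundary,
        "Content-Type: text/html; charset=UTF-8",
        "",
        $html_part,
        "--" . $boundary . "--",    # Closing boundary ends the message
        ""
    ));

    return array('headers' => $headers, 'body' => $body);
}

# Hypothetical content and addresses for illustration only
$email = build_multipart_email(
    "webbot@example.com",
    "Here is an example of an email with a text fallback.",
    "<table><tr><td>Here is an example of an email with a <b>text fallback</b>.</td></tr></table>"
);

# Uncomment to send the message with PHP's built-in mail() function
# mail("reader@example.com", "Example of a multipart email", $email['body'], $email['headers']);
```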
* * *
[55] For information on periodic and autonomous launching of webbots, read Chapter 23.
[56] This script makes use of LIB_mysql. If you haven't already done so, make sure you read Chapter 6 to learn how to use this library.
Further Exploration
If you think about all the ways you use email, you'll probably be able to come up with some very creative uses for your webbots. The following concepts should serve as starting points for your own webbot development.
Using Returned Emails to Prune Access Lists
You can design an email-wielding webbot to help you identify illegitimate members of a members-only website.
