Webbots, Spiders, and Screen Scrapers - Michael Schrenk [70]
Table 16-1. Email Addresses Used by LIB_mail
Address | Function | Required or Optional
To: | Defines the address of the main recipient of the email | Required
Reply-to: | Defines the address where replies to the email are sent | Optional
Return-path: | Indicates where notifications are sent if the email could not be delivered | Optional
From: | Defines the email address of the party sending the email | Required
Cc: | Refers to the address of another party, who receives a carbon copy of the email but is not the primary recipient of the message | Optional
Bcc: | Is similar to Cc: and stands for blind carbon copy; this address is hidden from the other parties receiving the same email | Optional
Configuring the Return-path address is also important because this is the address where notifications about undeliverable email messages are sent. If it is not defined, undeliverable email messages will bounce back to your system admin, and you won't know that an email wasn't delivered. For this reason, the function automatically uses the From address if a Return-path address isn't specified.
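The fallback described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the code inside LIB_mail: the helper name apply_address_defaults() and the array key 'returnpath' are hypothetical, while 'to' and 'from' follow the keys used elsewhere in this chapter.

```php
<?php
// Hypothetical helper (not part of LIB_mail): checks the required
// addresses from Table 16-1 and fills in a missing Return-path.
function apply_address_defaults(array $address): array
{
    // To: and From: are the two required addresses
    foreach (['to', 'from'] as $required) {
        if (empty($address[$required])) {
            throw new InvalidArgumentException("Missing required address: $required");
        }
    }

    // Assumed key name 'returnpath': fall back to the From address so
    // that bounce notifications reach the sender, not the system admin
    if (empty($address['returnpath'])) {
        $address['returnpath'] = $address['from'];
    }

    return $address;
}

$address = apply_address_defaults([
    'to'   => 'yourEmail@YourDomain.com',
    'from' => 'webbot@YourDomain.com',
]);
echo $address['returnpath'], "\n";   # prints webbot@YourDomain.com
```

With this arrangement, a caller who supplies only the two required addresses still gets bounce notices at the From address.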
* * *
[54] Spammers write webbots to discover mail servers that allow mail relaying.
Writing a Webbot That Sends Email Notifications
Here's a simple webbot that, when run, sends an email notification if a web page has changed since the last time it was checked.[55] Such a webbot could have many practical uses. For example, it could monitor online auctions or pages on your fantasy football league's website. A modified version of this webbot could even notify you when the balance of your checking account changes. The webbot simply downloads a web page and stores a page signature, a short value that identifies the content of the page, in a database. Such a signature is also known as a hash: a fixed-length series of characters that represents a text message or a file. In this case, a small hash serves as a signature that stands in for the file, so there is no need to store or compare its entire contents. If the signature of the page differs from the one in the database, the webbot saves the new value and sends you an email indicating that the page has changed. Listing 16-4 shows the script for this webbot.[56]
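Before the full listing, the signature idea can be seen in isolation. The sketch below is a minimal illustration, not part of the webbot: the helper page_has_changed() and the two sample pages are hypothetical, and only PHP's built-in sha1() is assumed.

```php
<?php
// Hypothetical helper: true when the content's sha1 signature differs
// from the stored one (or when nothing has been stored yet).
function page_has_changed(string $content, ?string $stored_signature): bool
{
    return sha1($content) !== $stored_signature;
}

// Two made-up versions of the same page
$old_page = "<html><body>Rate: 4.25%</body></html>";
$new_page = "<html><body>Rate: 4.50%</body></html>";

// The stored signature is 40 hexadecimal characters, whatever the page size
$stored_signature = sha1($old_page);
echo strlen($stored_signature), "\n";                      # prints 40

var_dump(page_has_changed($old_page, $stored_signature));  # bool(false)
var_dump(page_has_changed($new_page, $stored_signature));  # bool(true)
```

Only the 40-character signature needs to be kept between runs. The listing that follows applies the same comparison to a downloaded page and a value stored in a database.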
# Get libraries
include("LIB_http.php"); # include cURL library
include("LIB_mysql.php"); # include MySQL library
include("LIB_mail.php"); # include mail library
# Define parameters
$webbot_email_address = "webbot@YourDomain.com";
$notification_email_address = "yourEmail@YourDomain.com";
$target_web_site = "www.trackrates.com";
# Download the website
$download_array = http_get($target_web_site, $ref="");
$web_page = $download_array['FILE'];
# Calculate a 40-character sha1 hash for use as a simple signature
$new_signature = sha1($web_page);
# Compare this signature to the previously stored value in a database
$sql = "select SIGNATURE from signatures where WEB_PAGE='".$target_web_site."'";
list($old_signature) = exe_sql(DATABASE, $sql);
# If the new signature differs from the old one, update the database and
# send an email notifying someone that the web page changed.
# (Strict comparison keeps PHP from treating hex strings as numbers.)
if($new_signature !== $old_signature)
{
    // Update database
    if(isset($data_array)) unset($data_array);
    $data_array['SIGNATURE'] = $new_signature;
    update(DATABASE, $table="signatures",
        $data_array, $key_column="WEB_PAGE", $id=$target_web_site);
    // Send email
    $subject = $target_web_site." has changed";
    $message = $subject . "\n";
    $message = $message . "Old signature = ".$old_signature."\n";
    $message = $message . "New signature = ".$new_signature."\n";
    $message = $message . "Webbot ran at: ".date("r")."\n";
    $address['from'] = $webbot_email_address;
    $address['replyto'] = $webbot_email_address;
    $address['to'] = $notification_email_address;
    formatted_mail($subject, $message, $address, $content_type="text/plain");