Webbots, Spiders, and Screen Scrapers - Michael Schrenk [68]
Spam has degraded everyone's email experience.[52] Only a few years ago, nearly every email in one's inbox had some value and deserved to be read. Today, however, my spam filter (a proxy service that examines email headers and content to determine whether an email is legitimate or a potential scam) rejects roughly 80 percent of the email I receive, flagging it as unwanted solicitation at best and, at worst, a phishing attack: email that masquerades as legitimate and requests credit card or other personal information.
Nobody likes unsolicited email, and your webbot's effectiveness will be reduced if its messages are interpreted as spam by end readers or automated filters. When using your webbots to send volumes of mail, follow these guidelines:
Allow recipients to unsubscribe. If people can't remove themselves from a mailing list, they're subscribed involuntarily. Email that is part of a periodic mailing should include a link that allows the recipient to opt out of future mailings.[53]
Avoid multiple emails. Avoid sending multiple emails with similar content or intent to the same address.
Use a relevant subject line. Don't deceive email recipients (or try to avoid a spam filter) with misleading subject lines. If you're actually selling "herbal Via8r4," don't use a subject line like RE: Thanks!
Identify yourself. Don't spoof your email headers or the originator's actual email address in order to trick spam filters into delivering your email.
Obey the law. Depending on where you live, laws may prohibit sending specific types of email. For example, under the Children's Online Privacy Protection Act (COPPA), it is illegal in the United States to solicit personal information from children under 13 without verifiable parental consent. (More information is available at the COPPA website, http://www.coppa.org.) Laws regarding email ethics change constantly. If you have questions, talk to a lawyer who specializes in online law.
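The guidelines above can be sketched in code. The following is a minimal illustration, not a complete mailing system: the function name build_mailing(), the unsubscribe URL, and the message layout are all hypothetical, and it assumes email addresses can be compared case-insensitively (true for most providers, though not guaranteed by the standards).

```php
<?php
// Hypothetical helper illustrating the guidelines above; not from the original text.
// Returns one message per unique recipient, each with honest headers and an opt-out link.
function build_mailing(array $recipients, string $subject, string $body,
                       string $from, string $unsubscribeBaseUrl): array
{
    // Avoid multiple emails: collapse duplicate addresses (assumed case-insensitive).
    $unique = array_unique(array_map('strtolower', array_map('trim', $recipients)));

    $messages = [];
    foreach ($unique as $address) {
        // Skip anything that is not a syntactically valid address.
        if (filter_var($address, FILTER_VALIDATE_EMAIL) === false) {
            continue;
        }

        // Allow recipients to unsubscribe: build a per-recipient opt-out link.
        $optOut = $unsubscribeBaseUrl . '?email=' . rawurlencode($address);

        $messages[] = [
            'to'      => $address,
            // Use a relevant subject line: pass it through unchanged.
            'subject' => $subject,
            'body'    => $body
                       . "\r\n\r\nTo stop receiving these messages, visit:\r\n"
                       . $optOut,
            'headers' => [
                // Identify yourself: use your real sending address.
                'From'             => $from,
                'List-Unsubscribe' => '<' . $optOut . '>',
            ],
        ];
    }
    return $messages;
}
```

Each entry holds the pieces that mail() needs (recipient, subject, body, and headers), so a webbot could loop over the result and send one message per address. Passing the headers as an array requires PHP 7.2 or later.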
Note
Do not use any of the following techniques to test the resolve of people's spam filters. I recommend reading Chapter 28 and having a personal consultation with an attorney before doing anything remotely questionable.
* * *
[52] I would like to extend my sincerest apologies to the Hormel Foods Corporation for perpetuating the use of the word spam to describe unwanted email. I'd rather refer to the phenomenon of junk email with a clever term like eJunk or NetClutter. But unfortunately, no other synonym has the worldwide acceptance of spam. Hormel Foods deserves better treatment of its brand—and for this reason I want to stress the difference between SPAM and spam. For additional information on Hormel's take on the use of the word spam, please refer to http://www.spam.com/ci/ci_in.htm.
[53] Unfortunately, many spammers rely on people opting out of mailing lists to verify that an email address is actively used. For many, opting out of a mailing list ensures they will continue to receive unsolicited email.
Sending Mail with SMTP and PHP
Outgoing email is sent using the Simple Mail Transfer Protocol (SMTP). Fortunately, PHP's built-in mail() function handles all SMTP socket-level protocols and handshaking for you. The mail() function acts as your mail client, sending email messages just as Outlook or Thunderbird might.
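As a rough sketch of what a call to mail() involves, the example below prepares the four arguments the function commonly takes (recipient, subject, message, and additional headers) and then passes them along. The helper names prepare_mail_args() and send_plain_mail() and the example addresses are assumptions made for illustration, and the sketch assumes PHP's mail settings have already been configured.

```php
<?php
// Hypothetical helper: assembles the four arguments passed to mail().
function prepare_mail_args(string $to, string $subject, string $body, string $from): array
{
    // Remove CR/LF from single-line fields so a value cannot smuggle in extra headers.
    $oneLine = function (string $value): string {
        return trim(str_replace(["\r", "\n"], '', $value));
    };

    // The PHP manual asks for CRLF line endings and lines of at most 70 characters.
    $normalized = str_replace(["\r\n", "\r"], "\n", $body);
    $wrapped    = wordwrap($normalized, 70, "\n", true);
    $message    = str_replace("\n", "\r\n", $wrapped);

    $headers = 'From: ' . $oneLine($from) . "\r\n"
             . 'Content-Type: text/plain; charset=UTF-8';

    return [$oneLine($to), $oneLine($subject), $message, $headers];
}

// Hands the message to PHP's mail(). A return value of true only means the
// message was accepted for delivery, not that it reached the recipient.
function send_plain_mail(string $to, string $subject, string $body, string $from): bool
{
    return mail(...prepare_mail_args($to, $subject, $body, $from));
}

// Usage (commented out so that nothing is sent when this file is loaded):
// send_plain_mail('reader@example.com', 'Weekly report', 'Report text...', 'webbot@example.com');
```

Separating preparation from sending keeps the cleanup logic easy to inspect without actually delivering mail.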
Configuring PHP to Send Mail
Before you can use PHP as a mail client, you must edit PHP's configuration file, php.ini, to point PHP to the mail server's location. For example, Listing 16-1 shows the section of php.ini that configures PHP to work with sendmail, the mail server found on many Unix networks.
[mail function]
; For Win32 only.
SMTP = localhost
; For Win32 only.
;sendmail_from = me@example.com
; For Unix only. You may supply arguments as well (default: "sendmail -t -i").
sendmail_path = /usr/sbin/sendmail -t -i
Listing 16-1: Configuring PHP's mail() function
Note
Notice that the configuration differs slightly for Windows and Unix installations. On Windows servers, php.ini names the network location of the mail server you want to use (the SMTP setting); on Unix servers, it gives the path to the local sendmail program (the sendmail_path setting).
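To check which of these settings a given PHP installation will actually use, a script can read them back with ini_get(). The sketch below is one possible approach: describe_mail_transport() and current_mail_settings() are hypothetical helpers, and smtp_port is a standard php.ini directive that does not appear in Listing 16-1.

```php
<?php
// Reads the mail-related directives that php.ini currently supplies.
function current_mail_settings(): array
{
    $settings = [];
    foreach (['SMTP', 'smtp_port', 'sendmail_from', 'sendmail_path'] as $name) {
        $value = ini_get($name);
        // ini_get() returns false for an unknown directive; store '' in that case.
        $settings[$name] = ($value === false) ? '' : (string) $value;
    }
    return $settings;
}

// Hypothetical helper: summarizes which settings mail() relies on for an OS family.
function describe_mail_transport(string $osFamily, array $settings): string
{
    if ($osFamily === 'Windows') {
        // Windows: PHP connects to the SMTP server named in php.ini.
        $host = $settings['SMTP'] ?? '';
        $port = $settings['smtp_port'] ?? '';
        return $host === ''
            ? 'No SMTP host configured'
            : "SMTP server at {$host}" . ($port !== '' ? ":{$port}" : '');
    }

    // Unix: PHP runs the local program named by sendmail_path.
    $path = $settings['sendmail_path'] ?? '';
    return $path === '' ? 'No sendmail_path configured' : "Local program: {$path}";
}

// Usage:
// echo describe_mail_transport(PHP_OS_FAMILY, current_mail_settings()), PHP_EOL;
```

PHP_OS_FAMILY is available from PHP 7.2 onward; older installations would need a different way to detect the operating system.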