Webbots, Spiders, and Screen Scrapers - Michael Schrenk [60]
ftp_delete ($ftp, "file_name")
Deletes a file
ftp_get ($ftp, "local file", "remote file", MODE)
Copies the remote file to the local file, where MODE (FTP_ASCII or FTP_BINARY) sets the transfer mode for the remote file
ftp_mkdir($ftp, "directory name")
Creates a new directory
ftp_rename($ftp, "old name", "new name")
Renames a file or a directory on the FTP server
ftp_put ($ftp, "remote file", "local file", MODE)
Copies the local file to the remote file, where MODE (FTP_ASCII or FTP_BINARY) sets the transfer mode for the local file
ftp_rmdir($ftp, "directory/path")
Removes a directory
ftp_rawlist($ftp, "directory/path")
Returns an array with each array element containing directory information about a file
As shown in Table 13-1, the PHP FTP commands allow you to write webbots that create, delete, and rename directories and files. You may also use PHP/CURL to perform FTP tasks that require advanced authentication or encryption. Since FTP seldom uses these features, they are outside the scope of this book, but you can explore them on the official PHP website at http://www.php.net.
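To make the commands in Table 13-1 concrete, here is a minimal sketch that chains several of them together. The function names parse_rawlist_line() and upload_report(), the "reports" directory, and the temporary file name are hypothetical and chosen only for illustration. The parser also assumes a Unix-style directory listing; the format that ftp_rawlist() returns depends on the server, so treat this as a starting point, not a general solution.

```php
<?php
// Hypothetical helper: parse one Unix-style line returned by ftp_rawlist().
// Listing formats vary by server; this layout is an assumption.
function parse_rawlist_line(string $line): ?array
{
    // Fields: permissions, links, owner, group, size, month, day, time/year, name
    $parts = preg_split('/\s+/', trim($line), 9);
    if ($parts === false || count($parts) < 9) {
        return null; // e.g. a "total 12" summary line
    }
    return [
        'is_dir' => $parts[0][0] === 'd',
        'size'   => (int) $parts[4],
        'name'   => $parts[8], // the limit of 9 keeps spaces in the name
    ];
}

// Hypothetical routine: upload a file, rename it, and list the directory.
// Host, credentials, and paths are placeholders supplied by the caller.
function upload_report(string $host, string $user, string $pass, string $local_file): array
{
    $ftp = ftp_connect($host);
    if ($ftp === false || !ftp_login($ftp, $user, $pass)) {
        return [];
    }
    ftp_pasv($ftp, true);            // passive mode is often needed behind NAT
    @ftp_mkdir($ftp, "reports");     // fails (silenced) if the directory exists

    // Upload under a temporary name, then rename once the transfer succeeds
    if (ftp_put($ftp, "reports/upload.tmp", $local_file, FTP_BINARY)) {
        ftp_rename($ftp, "reports/upload.tmp", "reports/" . basename($local_file));
    }

    $listing = ftp_rawlist($ftp, "reports");
    ftp_close($ftp);
    if ($listing === false) {
        return [];
    }
    // Drop lines the parser did not recognize
    return array_values(array_filter(array_map('parse_rawlist_line', $listing)));
}
```

Uploading under a temporary name and renaming afterward is a design choice, not a requirement: it keeps other programs that watch the directory from picking up a partially transferred file.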
Further Exploration
Since FTP is often the only application-level protocol that computer systems share, it is a convenient communication bridge between new and old computer systems. Besides serving as a common path between disparate (or obsolete) systems, FTP is still the most common method for uploading files to websites. With the information in this chapter, you should be able to write webbots that update websites with information gathered from a variety of sources. Here are some ideas to get you started.
Write a webbot that updates your corporate server with information gathered from sales reports.
Develop a security webbot that uses a webcam to take pictures of your warehouse or parking lot, timestamps the images, and uploads the pictures to an archival server.
Design a webbot that creates archives of your company's internal forums on an FTP server.
Create a webbot that photographically logs the progress of a construction site and uploads these pictures to an FTP server. Once construction is complete, compile the individual photos into an animation showing the construction process.
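The camera-based ideas above both depend on timestamped file names, so that uploaded images sort chronologically on the server. The following is a minimal sketch of that one step; the function name timestamped_remote_name(), the default "archive" directory, and the naming pattern are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: build a sortable, timestamped remote path for an image.
// The naming pattern (base_YYYYMMDD_HHMMSS.ext, in UTC) is an assumption.
function timestamped_remote_name(string $local_file, int $unix_time, string $remote_dir = "archive"): string
{
    $ext   = pathinfo($local_file, PATHINFO_EXTENSION);
    $base  = pathinfo($local_file, PATHINFO_FILENAME);
    $stamp = gmdate("Ymd_His", $unix_time); // UTC avoids daylight-saving gaps
    $name  = $base . "_" . $stamp . ($ext !== "" ? "." . $ext : "");
    return rtrim($remote_dir, "/") . "/" . $name;
}
```

With an open connection in $ftp, the result could be passed directly to ftp_put(), for example ftp_put($ftp, timestamped_remote_name($file, time()), $file, FTP_BINARY).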
If you don't have access to an FTP server on the Internet, you can still experiment with FTP bots. An FTP server is probably already on your computer if your operating system is Unix, Linux, or Mac OS X. If you have a Windows computer, you can find free FTP servers on many shareware sites. Once you locate FTP server software, you can set up your own local server by following the instructions accompanying your FTP installation.
Chapter 14. NNTP NEWS WEBBOTS
Another non-web protocol your webbots can use is the Network News Transfer Protocol (NNTP). Before modern applications like MySpace, Facebook, and topic-specific web forums, NNTP was used to build online communities where people with common interests exchanged information in newsgroups. Members of newsgroups contribute articles (announcements, questions, or answers relating to one of thousands of subject-specific topics). Collectively, these articles are referred to as news. While NNTP is an older Internet protocol, it is still in wide use today, and it provides a valuable source of information for certain webbot projects. I've recently found NNTP useful when working on projects for private investigators, the hospitality industry, and financial institutions.
NNTP Use and History
NNTP originated in 1986[44] and was designed for a network much different from the one we use today. When NNTP was conceived, broadband and always-on access to networks were virtually unheard of. To utilize the network as it existed, NNTP employed a non-centralized server configuration, similar to what email uses. Users logged in to one of the many news servers on the network where they read articles, posted new articles, and replied to old ones. Behind the scenes, NNTP servers periodically synchronized to distribute updated news to all servers hosting specific newsgroups. Today,