Online Book Reader


Webbots, Spiders, and Screen Scrapers - Michael Schrenk [125]

CURLOPT_HTTPHEADER expects to receive data in an array.

$header_array[] = "Mime-Version: 1.0";

$header_array[] = "Content-type: text/html; charset=iso-8859-1";

$header_array[] = "Accept-Encoding: compress, gzip";

curl_setopt($curl_session, CURLOPT_HTTPHEADER, $header_array);

Listing A-11: Configuring an outgoing header
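
When the outgoing headers come from configuration rather than being typed by hand, the array can be assembled from name/value pairs. The following is a minimal sketch and not one of the original listings; the helper name build_header_array() is hypothetical.

```php
<?php
// Hypothetical helper: turn name => value pairs into the list of
// "Name: value" strings that CURLOPT_HTTPHEADER expects.
function build_header_array(array $headers): array
{
    $header_array = [];
    foreach ($headers as $name => $value) {
        $header_array[] = $name . ": " . $value;
    }
    return $header_array;
}

$header_array = build_header_array([
    "Mime-Version"    => "1.0",
    "Content-type"    => "text/html; charset=iso-8859-1",
    "Accept-Encoding" => "compress, gzip",
]);

// With an open PHP/CURL session in $curl_session, pass the array as in Listing A-11:
// curl_setopt($curl_session, CURLOPT_HTTPHEADER, $header_array);
```

The result is the same three-element array built line by line in Listing A-11.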

CURLOPT_SSL_VERIFYPEER

You only need to use this option if the target website uses SSL encryption and the protocol in CURLOPT_URL is https:. An example is shown in Listing A-12.

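curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE); // Don't verify the server's certificate

Listing A-12: Configuring PHP/CURL not to verify the target server's SSL certificate

Depending on the version of PHP/CURL you use, this option may be required; if you don't use it, PHP/CURL will attempt to verify the target server's certificate against a list of trusted certificate authorities, and the transfer will fail if no such list is available. Be aware that turning verification off also means PHP/CURL will no longer detect a server that is impersonating the target.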

CURLOPT_USERPWD and CURLOPT_UNRESTRICTED_AUTH

As shown in Listing A-13, you may use the CURLOPT_USERPWD option with a valid username and password to access websites that use basic authentication. In contrast to using a browser, you will have to submit the username and password to every page accessed within the basic authentication realm.

curl_setopt($s, CURLOPT_USERPWD, "username:password");

curl_setopt($s, CURLOPT_UNRESTRICTED_AUTH, TRUE);

Listing A-13: Configuring PHP/CURL for basic authentication schemes

If you use this option in conjunction with CURLOPT_FOLLOWLOCATION, you may also need the CURLOPT_UNRESTRICTED_AUTH option, which tells PHP/CURL to keep sending the username and password when it follows a redirect, even if the redirect leads to a different hostname.

Exercise caution when using CURLOPT_USERPWD, as you can inadvertently send username and password information to the wrong server, where it may appear in access log files.
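
One way to reduce that risk is to attach credentials only when the target URL's hostname is one you expect. The following is a minimal sketch, not from the original text; the helper credentials_allowed(), the allow-list, and the target URL are all hypothetical. It checks only the initial URL, not any pages the webbot is later redirected to.

```php
<?php
// Hypothetical helper: return TRUE only if the URL's host is on the allow-list.
function credentials_allowed(string $url, array $allowed_hosts): bool
{
    // parse_url() yields NULL when there is no host and FALSE for a malformed URL
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return false;
    }
    return in_array(strtolower($host), $allowed_hosts, true);
}

$allowed_hosts = ["www.schrenk.com"];                    // assumption: hosts trusted with the password
$target = "http://www.schrenk.com/protected/page.php";   // illustrative URL

if (credentials_allowed($target, $allowed_hosts)) {
    // With an open session in $s:
    // curl_setopt($s, CURLOPT_USERPWD, "username:password");
}
```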

CURLOPT_POST and CURLOPT_POSTFIELDS

The CURLOPT_POST and CURLOPT_POSTFIELDS options configure PHP/CURL to emulate forms with the POST method. Since the default method is GET, you must first tell PHP/CURL to use the POST method. Then you must specify the POST data that you want to be sent to the target webserver. An example is shown in Listing A-14.

curl_setopt($s, CURLOPT_POST, TRUE); // Use POST method

$post_data = "var1=1&var2=2&var3=3"; // Define POST data values

curl_setopt($s, CURLOPT_POSTFIELDS, $post_data);

Listing A-14: Configuring POST method transfers

Notice that the POST data looks like a standard query string sent in a GET method. Incidentally, to send form information with the GET method, simply attach the query string to the target URL.
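
If the form values contain spaces or characters such as & and =, the string has to be URL-encoded before it is sent. The following is a minimal sketch using PHP's built-in http_build_query() function; the field names and the target URL are illustrative assumptions.

```php
<?php
// Form fields as an associative array
$form = ["var1" => 1, "var2" => 2, "var3" => 3];

// URL-encode and join the fields; "&" is passed explicitly as the separator
$post_data = http_build_query($form, "", "&");    // "var1=1&var2=2&var3=3"

// For the GET method, the same string is attached to the target URL
$target  = "http://www.example.com/search.php";   // hypothetical URL
$get_url = $target . "?" . $post_data;

// Reserved characters in values are encoded automatically
$encoded = http_build_query(["q" => "web bots & spiders"], "", "&");

// With an open session in $s, the POST data is passed as in Listing A-14:
// curl_setopt($s, CURLOPT_POSTFIELDS, $post_data);
```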

CURLOPT_VERBOSE

The CURLOPT_VERBOSE option controls the quantity of status messages created during a file transfer. You may find this helpful during debugging, but it is best to turn off this option during the production phase, because it produces many entries in your server log file. A typical succession of log messages for a single file download looks like Listing A-15.

* About to connect() to www.schrenk.com port 80

* Connected to www.schrenk.com (66.179.150.101) port 80

* Connection #0 left intact

* Closing connection #0

Listing A-15: Typical messages from a verbose PHP/CURL session

If you're in verbose mode on a busy server, you'll create very large log files. Listing A-16 shows how to turn off verbose mode.

curl_setopt($s, CURLOPT_VERBOSE, FALSE); // Minimal logs

Listing A-16: Turning off verbose mode reduces the size of server log files.
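
While debugging, it can help to send the verbose messages to a file of their own. PHP/CURL writes verbose output to STDERR unless CURLOPT_STDERR names another file handle. The following is a minimal sketch; the $debug flag and the log file path are hypothetical.

```php
<?php
$debug = false;                                   // hypothetical flag: TRUE only while debugging
$s = curl_init("http://www.schrenk.com");

if ($debug) {
    $log = fopen("/tmp/webbot_curl.log", "a");    // hypothetical log file path
    curl_setopt($s, CURLOPT_VERBOSE, TRUE);
    curl_setopt($s, CURLOPT_STDERR, $log);        // verbose messages go to this file
} else {
    curl_setopt($s, CURLOPT_VERBOSE, FALSE);      // Minimal logs
}
```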

CURLOPT_PORT

By default, PHP/CURL uses port 80 for all HTTP sessions, unless you are connecting to an SSL encrypted server, in which case port 443 is used.[95] These are the standard port numbers for the HTTP and HTTPS protocols, respectively. If the target server listens on a nonstandard port, or if you wish to connect to a non-web protocol, use CURLOPT_PORT to set the desired port number, as shown in Listing A-17.

curl_setopt($s, CURLOPT_PORT, 234); // Use port number 234

Listing A-17: Using nonstandard communication ports
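
When the port number is already written into the target URL, it can be read from there instead of being hard-coded. The following is a minimal sketch built on PHP's parse_url() function; the helper name port_for_url() is hypothetical, and the fallback values are the standard ports described above.

```php
<?php
// Hypothetical helper: return the port written in the URL, or the
// standard port for its scheme (80 for http, 443 for https).
function port_for_url(string $url): ?int
{
    $port = parse_url($url, PHP_URL_PORT);
    if (is_int($port)) {
        return $port;                         // explicit port, e.g. :8080
    }
    $scheme   = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    $defaults = ["http" => 80, "https" => 443];
    return $defaults[$scheme] ?? null;        // NULL: unknown scheme, caller decides
}

$port = port_for_url("http://www.schrenk.com:8080/index.php");

// With an open session in $s:
// if ($port !== null) { curl_setopt($s, CURLOPT_PORT, $port); }
```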

Note

Configuration settings must be capitalized, as shown in the previous examples. This is
