Online Book Reader


Webbots, Spiders, and Screen Scrapers - Michael Schrenk [26]

passed in a URL (GET method)

Since GET form variables may be combined with the URL, the web page that accepts the form cannot tell the difference between the form submitted in Listing 5-3 and the form emulation techniques shown in Listings 5-4 and 5-5. In each case, the variables term and sort are submitted to the web page http://www.schrenk.com/search.php with the GET method.[18]

Listing 5-3: A GET method performed by a form submission

Alternatively, you could use LIB_http to emulate the form, as in Listing 5-4.

include("LIB_http.php");

$action = "http://www.schrenk.com/search.php"; // Address of form handler
$method = "GET"; // GET method
$ref = ""; // Referer variable
$data_array['term'] = "hello"; // Define term
$data_array['sort'] = "up"; // Define sort

$response = http($target=$action, $ref, $method, $data_array, EXCL_HEAD);

Listing 5-4: Using LIB_http to emulate the form in Listing 5-3 with data passed in an array

Since the GET method places form information in the URL's query string, you could also emulate the form with a script like Listing 5-5.

include("LIB_http.php");

$action = "http://www.schrenk.com/search.php?term=hello&sort=up"; // Form handler plus query string
$method = "GET"; // GET method
$ref = ""; // Referer variable

$response = http($target=$action, $ref, $method, $data_array="", EXCL_HEAD);

Listing 5-5: Emulating the form in Listing 5-3 by combining the URL with the form data

The reason we might choose Listing 5-4 over Listing 5-5 is that the code is cleaner when form data is treated as array elements, especially when many form values are passed to the form handler. Passing form variables to the form's handler with an array is also more symmetrical, meaning that the procedure is nearly identical to the one required to pass values to a form handler expecting the POST method.
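To see why the two listings are interchangeable, it can help to build the query string from the array by hand. The following is a minimal sketch that uses PHP's built-in http_build_query(); it does not reproduce how LIB_http works internally, and the helper name build_get_url is hypothetical.

```php
<?php
// Hypothetical helper (not part of LIB_http): append form data to a URL
// as a query string, the way a browser does for a GET form.
function build_get_url(string $action, array $data): string
{
    if (empty($data)) {
        return $action;
    }
    // Pass '&' explicitly so the result does not depend on php.ini settings.
    $query = http_build_query($data, '', '&');
    // Use '&' if the action already carries a query string, '?' otherwise.
    $glue = (strpos($action, '?') === false) ? '?' : '&';
    return $action . $glue . $query;
}

// The same data array as in Listing 5-4.
$data_array = [];
$data_array['term'] = "hello";
$data_array['sort'] = "up";

$url = build_get_url("http://www.schrenk.com/search.php", $data_array);
echo $url, "\n"; // prints http://www.schrenk.com/search.php?term=hello&sort=up
```

The resulting string is the same URL that Listing 5-5 writes out by hand, which is why the form handler cannot distinguish the two approaches.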

The POST Method

While the GET method tacks form data onto the end of the URL, the POST method sends the data separately, in the body of the HTTP request. The POST method has these advantages over the GET method:

POST methods can send more data to servers than GET methods can. The HTTP specification sets no maximum URL length, but browsers and servers impose practical limits, often a few thousand characters or fewer. POST methods, in contrast, can easily upload several megabytes of information during a single form upload.

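Since GET data is part of the URL, it is liable to be recorded in browser histories, server logs, and Referer headers sent to other sites, so sensitive data should always be transferred with POST methods. POST methods carry form data in the request body rather than in the URL. On encrypted (HTTPS) connections the URL path, headers, and body are all encrypted in transit, so this advantage concerns where the data is recorded rather than whether it can be encrypted.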

GET method requests are always visible on the location bar of the browser. POST requests only show the actual URL in the location bar.
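The difference in where form data travels can be made concrete by writing out the text of each request. The sketch below is illustrative only: sketch_request is a hypothetical helper, not part of LIB_http, and a real client sends more headers than the few shown here.

```php
<?php
// Illustrative only: build the text of a minimal HTTP/1.1 request to show
// where form data travels for each method.
function sketch_request(string $method, string $url, array $data): string
{
    $parts = parse_url($url);
    $path  = $parts['path'] ?? '/';
    $query = http_build_query($data, '', '&');

    if ($method === 'GET') {
        // GET: form data rides in the request line as a query string.
        if ($query !== '') {
            $path .= '?' . $query;
        }
        return "GET $path HTTP/1.1\r\n"
             . "Host: {$parts['host']}\r\n"
             . "\r\n";
    }

    // POST: form data travels in the message body, after the blank line.
    return "POST $path HTTP/1.1\r\n"
         . "Host: {$parts['host']}\r\n"
         . "Content-Type: application/x-www-form-urlencoded\r\n"
         . "Content-Length: " . strlen($query) . "\r\n"
         . "\r\n"
         . $query;
}

$data = ['term' => 'hello', 'sort' => 'up'];
$get  = sketch_request('GET',  'http://www.schrenk.com/search.php', $data);
$post = sketch_request('POST', 'http://www.schrenk.com/search.php', $data);

echo $get, "\n", $post, "\n";
```

In the GET request the variables appear on the first line, as part of the URL; in the POST request the first line carries only the path, and the same encoded string follows the headers.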

Regardless of the advantages of POST over GET, you must match your method to the method of the form you are emulating. Keep in mind that methods may also be combined in the same form. For example, forms with POST methods may also use form handlers that contain query strings.
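As a rough illustration of a combined form, the sketch below separates the two channels a handler would see when a POST form's action URL also carries a query string. The source=home parameter is an assumption made up for this example; it does not come from any listing in this chapter.

```php
<?php
// Hypothetical form: method="POST" with a query string in its action URL.
// The "source=home" parameter is invented for illustration.
$action = "http://www.schrenk.com/search.php?source=home";
$post_fields = ['term' => 'hello', 'sort' => 'up'];

// Variables the handler would receive via the URL (GET-style).
parse_str((string) parse_url($action, PHP_URL_QUERY), $get_vars);

// Variables the handler would receive via the request body (POST-style).
$body = http_build_query($post_fields, '', '&');
parse_str($body, $post_vars);

print_r($get_vars);
print_r($post_vars);
```

When emulating such a form, the query string stays in the action URL exactly as the form defines it, while the form fields go in the data array.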

To submit a form using the POST method with LIB_http, simply specify the POST method, as shown in Listing 5-6.

include("LIB_http.php");

$action = "http://www.schrenk.com/search.php"; // Address of form handler
$method = "POST"; // POST method
$ref = ""; // Referer variable
$data_array['term'] = "hello"; // Define term
$data_array['sort'] = "up"; // Define sort

$response = http($target=$action, $ref, $method, $data_array, EXCL_HEAD);

Listing 5-6: Using LIB_http to emulate a form with the POST method

Regardless of the number of data elements, the process is the same. Some form handlers, however, access the form elements as an array, so it's always a good idea to match the order of the data elements as defined in the HTML form.
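PHP arrays keep their insertion order, so one way to follow this advice is to rebuild the data array in the order the fields appear in the form before submitting it. The sketch below assumes the sender serializes fields in array order, as PHP's built-in http_build_query() does; the $form_order list is a hypothetical stand-in for the order read from the HTML form.

```php
<?php
// Field names in the order they appear in the (hypothetical) HTML form.
$form_order = ['term', 'sort'];

// Values that happen to have been collected in a different order.
$values = ['sort' => 'up', 'term' => 'hello'];

// Rebuild the data array so its keys follow the form's field order.
$data_array = [];
foreach ($form_order as $name) {
    if (array_key_exists($name, $values)) {
        $data_array[$name] = $values[$name];
    }
}

$encoded = http_build_query($data_array, '', '&');
echo $encoded, "\n"; // prints term=hello&sort=up
```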

Event Triggers

A submit button typically acts as the event trigger, which causes the form data to be sent to the form handler using the defined form method. While the submit button is
