Internet Marketing - Matt Bailey [163]
A Site Redesign Catch-22
Every day, companies develop websites with newer technology and more complex programming, using search-friendly architecture, Ajax, CSS, and other technologies in an attempt to make the experience better for their users. However, because the means of information retrieval are so outdated, these same companies are sometimes penalized for changing a site that would have been better left alone.
Essentially, Google has its own rules of website development, redevelopment, and innovation. If a company is not aware of those rules, or does not invest the time and money to reverse-engineer its new website to accommodate outdated technology, then it is effectively penalized.
In short, the rule of “Would I do this if search engines didn’t exist?” (Google Webmaster Guidelines: Quality Guidelines; basic principles) is nonsensical. Developers are left struggling with increasingly outdated search engine technology while trying to launch a new website (one that is ideally better for their users) without losing rankings.
Domain Management
Typically, your website will have a www prefix before the domain name. As mentioned in the prior section, the information retrieval algorithm used by search engines is still based on a very old method. As a result, a search engine treats your domain with the www as a different page than your domain without it. In other words, because www.website.com and website.com are different URLs, the search engines consider them to be different pages, even though the two URLs deliver the same page.
This is what is considered “duplicate content”: the same page indexed by the search engines under two or more URLs. Duplicate content confuses the search engines about which URL should receive the relevance credit. The best method of gaining rankings is to have unique content that distinguishes your website from others; duplicate content undermines that goal because multiple pages, with the same content, are found at different URLs.
One of the primary places this happens is at the domain level, where the home page can show up under multiple URLs. To humans, they all look the same, but to machines (search engines) they are duplicate pages, saying the same thing at different locations. The search engine must then choose which URL will be the primary URL, and it may not be the one that the website owner prefers. This is where redirects help: they manage the domain and direct the search engines to your preferred home page URL, rather than allowing the search engine to determine it for you.
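The redirect logic itself is simple: if a request arrives at any host other than the preferred one, answer with a permanent (301) redirect to the same path on the preferred host. The following minimal Python sketch shows that decision; the host name `www.website.com` and the function name are hypothetical, chosen for illustration rather than taken from any particular server configuration.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical preferred domain for this illustration
PREFERRED_HOST = "www.website.com"

def redirect_target(url: str):
    """Return the URL a 301 redirect should point to if the request
    host differs from the preferred host; return None when the URL
    already uses the preferred host (no redirect needed)."""
    parts = urlsplit(url)
    if parts.netloc == PREFERRED_HOST:
        return None
    # Keep path, query, and fragment; swap in the preferred host only
    return urlunsplit((parts.scheme, PREFERRED_HOST, parts.path,
                       parts.query, parts.fragment))
```

In practice this rule usually lives in the web server configuration (for example, a rewrite rule issuing a 301), but the decision it encodes is exactly the one above.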
Potentially, a home page can be considered a duplicated page in many ways; for example, the following home page URLs all point to the same page but appear as four different pages at four different locations:
http://www.homepage.com
http://homepage.com
http://www.homepage.com/index.php (or home.php, .asp, and so on)
http://homepage.com/index.php
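To make the equivalence concrete, here is a small Python sketch that normalizes all four variants above to a single canonical form: it adds the www prefix when missing and collapses default document names to the root path. The set of default document names is an assumption for illustration; a real site would list whichever filenames its server actually serves as the home page.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical default-document filenames (adjust per server setup)
DEFAULT_DOCS = {"/index.php", "/index.asp", "/home.php", "/home.asp"}

def canonical_home(url: str) -> str:
    """Map the common home page URL variants to one canonical URL."""
    parts = urlsplit(url)
    host = parts.netloc
    if not host.startswith("www."):
        host = "www." + host
    path = parts.path
    if path in DEFAULT_DOCS or path == "":
        path = "/"
    # Drop query and fragment for the home page comparison
    return urlunsplit((parts.scheme, host, path, "", ""))
```

Running this function over the four example URLs yields the same canonical URL each time, which is precisely what a redirect strategy should enforce for visitors and search engines alike.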
This is known as the canonicalization problem; a different page could reside at each of these locations, but traditionally it is the same page. The search engines then attempt to choose the most relevant URL to designate as the default domain URL.
Major search engines such as Google provide tools that allow site owners to set a preferred domain, since the domain with and without the www are unique URLs and are therefore treated as two unique pages. Google Webmaster Tools also includes settings that influence how Google handles your URLs. In the Parameter Handling section, users can give Google specific guidance to ignore long URL parameters and index a shorter, logical URL instead (see Figure 14-9).
Figure 14-9: Google Webmaster Tools domain and parameter settings
Although many webmasters are very happy for the ability