Online Book Reader

Internet Marketing - Matt Bailey [19]

gather the content of those sites is called a spider. It is also called a bot, short for “robot” but much cooler. It is also called a crawler, because it “crawls” the Web, searching for documents.

Our very first lesson poses difficulties for many people, because it involves several words that mean essentially the same thing. I’ve been in many conversations where people refer to the spider and then call it a bot in the very same sentence!

Understanding the behavior of this software is the first critical component of understanding how search engines work. The spider requests documents and downloads them for processing. By documents, I am referring to web pages, PDF files, Word documents, spreadsheets, image files, and so on. Anything that is linked on the Web is considered a document. The most common form of document that we are working with in this book is the web page.
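The request-and-download behavior described above can be sketched in a few lines of Python. This is only a toy illustration, not how any particular search engine's spider is built: the `TOY_WEB` dictionary, the `LinkExtractor` class, and the `crawl` function are all hypothetical names, and the "Web" here is an in-memory dictionary rather than real HTTP requests.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical in-memory "web": URL -> document body. A real spider would
# issue HTTP requests instead of reading from this dictionary.
TOY_WEB = {
    "http://example.com/": '<a href="/about.html">About</a> <a href="/guide.pdf">Guide</a>',
    "http://example.com/about.html": '<a href="/">Home</a>',
    "http://example.com/guide.pdf": "",
}


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, web):
    """Request each reachable document once and return what was downloaded."""
    downloaded = {}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if url in downloaded or url not in web:
            continue  # already fetched, or the document does not exist
        body = web[url]  # stands in for "request and download"
        downloaded[url] = body
        parser = LinkExtractor()
        parser.feed(body)
        for href in parser.links:
            # Resolve relative links against the page they were found on.
            queue.append(urljoin(url, href))
    return downloaded


pages = crawl("http://example.com/", TOY_WEB)
print(sorted(pages))
```

Note that the PDF is downloaded just like the web pages: to the spider, anything it can reach through a link is a document.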

Information Retrieval

People tend toward anthropomorphism when dealing with search engines. Anthropomorphism is a fancy way of saying that we label nonhuman or abstract concepts with human-type emotions, actions, or features in order to better understand them. This is great if you understand the concepts initially. If you don’t, well, then you will be confused. By giving search engines human characteristics, we blur the line between the technical and the emotional, and the process then takes even more complicated explanations to understand.

The concept being communicated when someone says a search engine “sees you” is information retrieval. A search engine uses the spider (or bot) to request documents across the Internet. It then stores these documents in the data center, as explained earlier. The process of finding, cataloging, and then presenting those documents in a search result is called information retrieval.

Simply put, if someone tells you that the search engine can’t see your website, it means the search engine spiders have not found and cataloged your website and stored it in their data center. This can be the result of multiple issues, most of which will be presented in Part IV of this book, but the main point is that something may be preventing the search engine spider from finding, indexing, and storing your website’s pages.

Index

The index is where the requested documents are stored. It is the search engine’s database, kept in a data center full of computers dedicated to maintaining copies of all documents found online. Search engines have multiple data centers around the world, because they have cataloged millions of documents from the Internet and they need someplace to store those documents for the retrieval process.
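To make the idea of an index as a database more concrete, here is a minimal sketch of one common way to catalog stored documents so they can be retrieved by keyword, usually called an inverted index. It is an assumption for illustration only; the source does not say how any search engine structures its index, and `STORED_DOCUMENTS`, `build_index`, and `search` are hypothetical names.

```python
import re
from collections import defaultdict

# Hypothetical stored copies of downloaded documents: URL -> text.
STORED_DOCUMENTS = {
    "http://example.com/": "Welcome to our marketing guide",
    "http://example.com/seo.html": "Search engine marketing basics",
    "http://example.com/contact.html": "Contact our team",
}


def build_index(documents):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word of the query, sorted."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)


index = build_index(STORED_DOCUMENTS)
print(search(index, "marketing"))
print(search(index, "search marketing"))
print(search(index, "pricing"))
```

A page that was never downloaded never makes it into `STORED_DOCUMENTS`, so no query can ever return it. That is the practical meaning of a page "not being in the index."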

The first part of search engine optimization is to ensure that the website and all of its pages are being indexed, that is, that each page has been requested and cataloged in the search engine’s database. You can verify this by searching for the page in a search engine and confirming that it has been downloaded and stored.

The second part of this process is to ensure that the indexed document is in a complete form, that is, that the version the search engine has cataloged contains the same information as the web page itself.
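One rough way to picture this completeness check is to compare the words on the live page with the words in the copy the search engine stored. The sketch below assumes you have already obtained both texts somehow; the function names and the sample strings are hypothetical, and real pages would need their markup stripped first.

```python
import re


def words(text):
    """Return the set of lowercase words and numbers in a piece of text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def missing_from_index(live_text, indexed_text):
    """List the words that appear on the live page but not in the stored copy."""
    return sorted(words(live_text) - words(indexed_text))


# Hypothetical example: the stored copy captured only part of the page.
live_page = "Spring sale: 20 percent off all widgets"
indexed_copy = "Spring sale"

print(missing_from_index(live_page, indexed_copy))
```

An empty list would suggest the stored version is complete, at least at the level of individual words. Anything in the list is content the search engine cannot match against a query.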

All search engines need the three elements shown in Figure 2-9—architecture, links, and content—to properly find your website, follow links, and read the content. It is a mix of all three elements that will provide a search engine–friendly website. The architecture of a website has to be built properly, allowing the search engine to find and follow the internal links from page to page. As the search engine downloads all of the pages, the content has to be on the page and in the programming in a format that will be easily read and evaluated by the search engine. Without these three elements, your site will not be found by the search engines. We will deal with specific technical issues in Part IV of this book, along with some checklists for discovering your current situation in Part II.

Figure 2-9: What search engines need
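The architecture and links requirement can be illustrated with a small check for pages that no internal link leads to. This is a sketch under stated assumptions: `SITE_LINKS` is a hypothetical, hand-written map of a site's internal links, and the spider is assumed to start from the home page and find pages only by following those links.

```python
from collections import deque

# Hypothetical site map: each page -> the internal pages it links to.
SITE_LINKS = {
    "/": ["/products", "/about"],
    "/products": ["/products/widget"],
    "/products/widget": ["/"],
    "/about": [],
    "/old-landing": ["/products"],  # no other page links to this one
}


def reachable_from(start, links):
    """Return every page a spider could reach by following links from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def orphan_pages(start, links):
    """Return the pages that exist on the site but cannot be reached from start."""
    return sorted(set(links) - reachable_from(start, links))


print(orphan_pages("/", SITE_LINKS))
```

Here `/old-landing` exists and even links out to other pages, but because nothing links to it, a spider that relies on internal links alone would never request it.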

Snippets

The snippet is a critical piece of information
