Search Engine 101

April 17th

Nuts & Bolts of Search Engines

There are only a handful of search engines you truly need to focus on: Google, Yahoo!, MSN, Ask, and AOL. All of them share the same critical functions that allow them to provide relevant web results:

How Search Engines Operate

  1. Crawl the Web. Using automated programs called “bots” or “spiders”, search engines crawl through 8 - 10 billion web pages.
  2. Index Pages. Once a page is crawled, its contents are stored in a gigantic database that makes up the search engine’s “index”.
  3. Process Queries. When the search engine receives a request for information, it pulls all the documents that match that request from its index. It does this in two parts:
    • Findall Mode - Google returns all the documents that match a term
    • Second Search - only those pages with the exact phrase are returned
  4. Rank Results. Once it has read the query and pulled the matching documents, the search engine uses special algorithms to determine which are most relevant.
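
The four steps above can be sketched in a few lines of Python. This is a toy illustration, not how any real engine is built: the page URLs and texts are made up, the first pass is assumed to require every query word, and the ranking rule (count how often the query words appear) is only a stand-in for the far more complex algorithms real engines use.

```python
from collections import defaultdict

# Hypothetical "crawled" pages: URL -> page text (made-up data).
PAGES = {
    "example.com/a": "cheap flights to paris and cheap hotels",
    "example.com/b": "paris flights are cheap in winter",
    "example.com/c": "hotel reviews for london",
}

def build_index(pages):
    """Step 2: map each word to the set of URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def find_all(index, query):
    """Step 3, first pass: every page that contains all the query words."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

def exact_phrase(pages, candidates, query):
    """Step 3, second pass: keep pages with the exact phrase (simple substring test)."""
    phrase = query.lower()
    return {url for url in candidates if phrase in pages[url].lower()}

def rank(pages, urls, query):
    """Step 4: order pages by how often the query words appear (toy rule)."""
    words = query.lower().split()

    def score(url):
        tokens = pages[url].lower().split()
        return sum(tokens.count(w) for w in words)

    # Highest score first; ties are broken alphabetically by URL.
    return sorted(urls, key=lambda u: (-score(u), u))

index = build_index(PAGES)
matches = find_all(index, "cheap flights")
phrase_matches = exact_phrase(PAGES, matches, "cheap flights")
ranked = rank(PAGES, matches, "cheap flights")
```

Here the first pass finds pages a and b, the exact-phrase pass keeps only page a, and the ranking puts page a ahead of page b because the query words appear in it more often.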

Things Search Engines Don’t Like

There are certain pages that search engine spiders and bots are reluctant to crawl. As they crawl the web, they may skip or only partly crawl the following:

  1. Complex URLs. Spiders may be reluctant to crawl complex URLs because they are hard to read and can result in errors.
  2. Pages with more than 100 unique links. Spiders may only follow a few of the links.
  3. Pages more than 3 clicks from the home page. Unless there are many other external links pointing to the site, spiders will often ignore deep pages.
  4. Pages with a Session ID or Cookies for navigation. Spiders may not be able to hold these elements in the same manner as a browser.
  5. Pages with frames. Frames can confuse the spider as to which page to rank.
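
Some of the items above can be checked automatically. The sketch below flags URLs that look complex or that carry a session ID. The parameter names and the threshold of two query parameters are assumptions chosen for illustration; search engines do not publish limits like these.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical parameter names that often carry a session ID.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}
MAX_QUERY_PARAMS = 2  # assumed threshold, not a documented limit

def crawl_warnings(url):
    """Return a list of reasons a spider might be reluctant to crawl this URL."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query, keep_blank_values=True)
    warnings = []
    if len(params) > MAX_QUERY_PARAMS:
        warnings.append("complex URL")
    if any(name.lower() in SESSION_PARAMS for name in params):
        warnings.append("session ID in URL")
    return warnings

clean = crawl_warnings("http://example.com/shoes/red")
messy = crawl_warnings("http://example.com/p?id=7&cat=3&sort=asc&sid=abc123")
```

The first URL produces no warnings, while the second is flagged both for its four query parameters and for the `sid` parameter.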

Things Search Engines Won’t Find

Spiders and bots won’t find pages that sit behind any of the following:

  1. A select form or submit button
  2. A drop-down menu
  3. A search box
  4. A purposeful block (using a robots.txt file - more on this later)
  5. A required login
  6. A redirect before the content is shown

In a nutshell, if a page cannot be accessed from the home page, it most likely won’t be indexed. The best way around this is a sitemap, which we’ll talk about later. Sitemaps help search engines find their way around your site.
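
To see which pages a spider could reach from the home page, you can walk the site's links the way a crawler would. The sketch below uses a breadth-first search over a made-up site structure (the paths are hypothetical) and reports pages that are unreachable or more than 3 clicks deep.

```python
from collections import deque

# Hypothetical site structure: page -> pages it links to.
SITE_LINKS = {
    "/": ["/products", "/about"],
    "/products": ["/products/shoes"],
    "/products/shoes": ["/products/shoes/red"],
    "/products/shoes/red": ["/products/shoes/red/size-9"],
    "/about": [],
    "/orphan": ["/"],  # no page links here, so a spider never reaches it
}

def click_depths(links, home="/"):
    """Breadth-first search: fewest clicks needed to reach each page from home."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

depths = click_depths(SITE_LINKS)
unreachable = sorted(set(SITE_LINKS) - set(depths))
too_deep = sorted(page for page, depth in depths.items() if depth > 3)
```

In this example `/orphan` is never found because nothing links to it, and the size-9 page sits 4 clicks from home, so both are good candidates for a sitemap entry or a more direct link.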

Relevance & Popularity

Search engines care about two things: is the content relevant, and is it credible?

  • Relevance - does the site’s content match the user’s query?
  • Popularity - how many other credible sources are linking to the content?

Search engines first look to see if the user’s search terms are found in important areas of the site such as the title, meta data, heading tags, or body text.

Search engines also measure who is looking at a site or page, as well as what they are saying about it. Clever as they are, they also keep track of who is affiliated with whom and how credible all the sites are.

All of these factors go into an algorithm that tells the engine how much importance to assign to each of these elements. This then determines a score for the page, and the results are listed in order of importance.
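
As a rough sketch of how such a score might be put together, the example below weights query matches by where they appear on the page and then adds a popularity component. The weights, the field names, and the popularity numbers are all assumptions made up for illustration; real engines do not publish theirs.

```python
# Assumed weights: a match in the title counts more than one in the body.
FIELD_WEIGHTS = {"title": 3.0, "meta": 2.0, "headings": 2.0, "body": 1.0}

def relevance(page, query):
    """Weighted count of query words found in each important area of the page."""
    words = query.lower().split()
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        tokens = page.get(field, "").lower().split()
        score += weight * sum(1 for w in words if w in tokens)
    return score

def page_score(page, query, popularity_weight=0.5):
    """Combine relevance with a popularity figure (hypothetical scale)."""
    return relevance(page, query) + popularity_weight * page.get("popularity", 0.0)

# Two made-up pages competing for the query "running shoes".
page_a = {
    "title": "Running Shoes Guide",
    "meta": "guide to running shoes",
    "headings": "Choosing Shoes",
    "body": "our running shoes are tested by runners",
    "popularity": 2.0,
}
page_b = {
    "title": "Our Store",
    "body": "we sell running shoes",
    "popularity": 6.0,
}

results = sorted(
    [("page_a", page_a), ("page_b", page_b)],
    key=lambda item: page_score(item[1], "running shoes"),
    reverse=True,
)
```

Page b is more popular, but page a still comes first because the search terms appear in its title, meta data, and headings as well as in its body.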

The Value of a Trustworthy Site

If hundreds of thousands of websites link to you, your site must be popular and therefore have high value. If those links come from very credible sites (such as .gov sites or BBC News), their power is multiplied.

On the other hand, search engines place a lower value on links from link farms (automated links or interlinked sites).
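
One well-known way to model this idea is a PageRank-style calculation, in which each link passes along a share of the linking page's own score. The sketch below is a simplified version under stated assumptions: the link graph and site names are made up, the damping value of 0.85 is a conventional choice, and nothing here claims to be what any particular engine actually runs.

```python
def link_scores(links, damping=0.85, iterations=50):
    """Simplified PageRank-style scores for a small link graph."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    scores = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_scores = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly among the pages it links to.
                share = damping * scores[page] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                for other in pages:
                    new_scores[other] += damping * scores[page] / n
        scores = new_scores
    return scores

# Hypothetical graph: each target site has exactly one inbound link,
# but the linking pages differ in how many sites link to *them*.
LINKS = {
    "hub-1": ["trusted-news"],
    "hub-2": ["trusted-news"],
    "hub-3": ["trusted-news"],
    "trusted-news": ["your-site"],
    "unknown-blog": ["rival-site"],
}

scores = link_scores(LINKS)
```

Both `your-site` and `rival-site` have a single inbound link, yet `your-site` ends up with the higher score because its link comes from a page that is itself well linked.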

How to Increase the Link Value

  1. Quality Anchor Text. The words in the hyperlink account for a lot. Search engines use this text to help them determine the subject matter of the linked page. “Click Here” should be replaced with your keywords.
  2. Site Popularity. This accounts not only for the number of links to your site, but also for the quality of the source. Highly credible sources linking to your site are extremely valuable.
  3. Text Directly Surrounding the Link. A link from inside a paragraph may carry greater weight than a link in the sidebar or footer.
  4. Links from Sites with Like Subject Matter. It’s more valuable to have links from pages that are related to the site’s subject matter.

These are only a few of the many factors search engines use to measure and weigh links. Remember, search engines are there to provide quality, usable results to the user. That’s their first concern.

What next?

Next: Six Steps to Effective Keyword Research
