Curated by: Luigi Canali De Rossi
 


Friday, June 24, 2005

Insight Into Google's Patent: Jill Whalen Reports


You may have already heard or read about Google's latest patent application regarding "information retrieval based on historical data", but if you're like me, you probably didn't bother to read it all.

Photo: Jill Whalen

Patents are not easy to read, that's for sure!

I had skimmed it and glanced at a few forum posts and articles that discussed it, but until today, I hadn't actually read it completely.

I wasn't surprised by the stuff in the patent that corresponded with Google's aging delay and its "sandbox," as I had already seen a lot of discussion on this.

For those who aren't familiar with the aging delay and the sandbox, you'll want to note that there is a lot of disagreement over what causes a site to be thrown in the sandbox. However, based on my own observations and the experiences of some trusted SEO friends, it's my belief that the sandbox is basically a purgatory database where Google places certain URLs based on a variety of predetermined criteria. (Much of this is spelled out in the first part of the patent application.)

The aging delay, on the other hand, is actually a subset of the sandbox. In other words, the aging delay is just *one* reason why a URL might get placed in the sandbox.

Basically, if you have a brand new domain/website, it will automatically land in the sandbox regardless of anything that you do with it.

Your new website will be stuck there for an unspecified period of time (averaging around 9 months these days) and it will not rank highly in Google for any keyword phrases that might bring it any decent traffic.

Yes, it can sometimes rank highly for the company name, or the names of the people who run the company. It may also show up in Google for a few additional phrases that other sites are not focusing on within their content. But new domains will not show up in Google's natural results for even slightly competitive keyword phrases until they are removed from the sandbox.

Other reasons why a site might be placed in the sandbox go beyond the aging delay. Google's major algorithm upheavals, such as the recent one dubbed "Bourbon" by the folks at WebmasterWorld, show all too clearly that old domains can also be placed in the sandbox under the right (or in this case wrong) circumstances. Nobody can really say for sure what those criteria are, but Google's patent does give us some insight into what some of them might be.

For instance, did you know that Google might use traffic data from sites when determining how to rank them?

The patent application specifically states in part "...information relating to traffic associated with a document over time may be used to generate (or alter) a score associated with the document." Since the application was filed in 2003, it would be a pretty safe bet to say that they are in fact using that information in today's ranking algorithm.
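To make that idea a bit more concrete, here is a rough, purely hypothetical sketch (in Python) of how a traffic trend over time could nudge a document's score up or down. The function name, the trend calculation and the 0.8-1.2 clamp are all my own illustrative assumptions; the patent describes the idea only in general terms, and nothing here should be read as Google's actual algorithm.

```python
# Hypothetical sketch only: one way an engine *could* fold traffic history
# into a document's score, as the patent language suggests. Every name and
# weight below is an illustrative assumption, not anything from the patent.

def traffic_adjusted_score(base_score, monthly_visits):
    """Nudge a relevance score up or down based on the traffic trend.

    base_score     -- relevance score from content/link analysis (0..1)
    monthly_visits -- visit counts per month, oldest first
    """
    if len(monthly_visits) < 4:
        return base_score  # not enough history yet; leave the score untouched

    half = len(monthly_visits) // 2
    earlier = sum(monthly_visits[:half]) / half
    recent = sum(monthly_visits[half:]) / (len(monthly_visits) - half)
    trend = recent / earlier if earlier else 1.0  # >1 means traffic is growing

    # Clamp the multiplier so traffic history nudges, but never dominates, the score.
    multiplier = max(0.8, min(1.2, trend))
    return base_score * multiplier


# A site whose traffic is climbing gets a modest boost over its base score.
print(traffic_adjusted_score(0.62, [100, 120, 150, 400, 500, 650]))  # roughly 0.744
```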

You might be wondering how they get information about your site's traffic since you're not providing them with your log files or traffic reports.

Well, Google has some nifty big brother spyware installed on tons and tons of people's browsers in the form of the "Google Toolbar." In order to use certain functions of the toolbar, users have to agree to allow data to be transferred back to Google, which includes which sites they've visited, and how long they were there.

Now, this isn't any cause for alarm if you're a Google toolbar user, as they're not actually identifying you personally (as far as I know). They are simply taking the aggregate data that they receive and then using it for whatever purposes they see fit. It actually makes perfect sense that they'd use this data to refine their ranking algorithm. Highly trafficked sites are popular sites, and Google would want to ensure that their searchers easily find popular sites.

Another factor used in Google's ranking algorithm is clickthroughs from the search results pages.

Google's patent states, in part: "[Google] may monitor the number of times that a document is selected from a set of search results and/or the amount of time one or more users spend accessing the document. [Google] may then score the document based, at least in part, on this information."

Google has had tracking URLs on most of the links appearing in their search results for quite some time.

With these in place, they can study which pages are getting clicked for which queries. They can also figure out whether people are satisfied with the page they clicked on by making note of whether the user came back to the results page and clicked on additional results.
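Nobody outside Google knows exactly how those click signals are weighed. Purely as an illustration, here is a hypothetical Python sketch of how click logs from a results page could be boiled down to a per-result satisfaction figure, treating a quick return to the results page as a sign of dissatisfaction. The record layout and the 30-second threshold are my own assumptions, not anything taken from the patent.

```python
# Hypothetical sketch only: turning clickthrough logs into a per-result
# satisfaction signal. The field names and the 30-second "quick return"
# threshold are assumptions made for this example.

QUICK_RETURN_SECONDS = 30  # clicked, then bounced back to the results page fast

def satisfaction_rate(click_log, url):
    """Share of clicks on `url` where the searcher did NOT bounce straight back."""
    clicks = [c for c in click_log if c["url"] == url]
    if not clicks:
        return None  # never clicked; no signal either way

    satisfied = [
        c for c in clicks
        if not (c["returned_to_results"] and c["dwell_seconds"] < QUICK_RETURN_SECONDS)
    ]
    return len(satisfied) / len(clicks)


log = [
    {"query": "seo tips", "url": "a.com", "dwell_seconds": 5, "returned_to_results": True},
    {"query": "seo tips", "url": "a.com", "dwell_seconds": 90, "returned_to_results": False},
]
print(satisfaction_rate(log, "a.com"))  # 0.5 -- one of the two clicks bounced back quickly
```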

There's lots more in the patent regarding links and anchor text, including the length of time it takes for links to show up, and whether they fit the profile for being artificial or natural.

Suffice it to say that as long as you're not attempting to artificially inflate your link popularity, then you have nothing to worry about.

I cannot stress enough that the ideas in this patent have mainly been put forth as spam-fighting measures.

Unfortunately, as soon as the search engines start giving things like links any kind of prominence in their ranking algorithm, they get abused by those whose only goal is to "game" the engines. There will always be people who set out to obtain high rankings through exploiting weaknesses in the algorithms. They create numerous websites based on the algorithm of the day, and make as much money as they can until their sites are caught. Then they simply figure out the next loophole and start the process all over again. It's an interesting and exciting business model, but certainly not one that a company in business for the long haul should be interested in.

If you have a real company that is looking to establish a real brand and a long-term customer base, then you'll want to stick with the basic SEO techniques which have been proven to work time and again. In other words, the stuff I've been teaching and doing for years.

Yes, doing things the right way can be time-consuming and can demand a great deal of hard work and/or money, but the reward is long-term search engine success.

It is true that even for those who practice what I preach, there have been occasions when a search engine has mistakenly thrown the baby out with the bathwater.

That is, you may do everything by the book, but something somewhere trips a spam filter and your site may mistakenly get sandboxed, penalized or banned.

This is certainly rare, but not as rare as it used to be.

Each new search engine update brings new cries of "Where's my site?!" from people who did nothing to deceive the engines. One can only hope that the engines move quickly to let these sites back into the rankings.

At any rate, you should never count on your natural search results as your sole method of bringing in business.



***********
About the author:

Jill Whalen of High Rankings is an internationally recognized search engine optimization consultant and host of the free weekly High Rankings Advisor search engine marketing newsletter.

She specializes in search engine optimization, SEO consultations and seminars. Jill's handbook, "The Nitty-gritty of Writing for the Search Engines" teaches business owners how and where to place relevant keyword phrases on their Web sites so that they make sense to users and gain high rankings in the major search engines.

 
 
 
posted by Robin Good on Friday, June 24 2005, updated on Tuesday, May 5 2015

