
June 24, 2005



Insight Into Google's Patent: Jill Whalen Reports

 

You may have already heard or read about Google's latest patent application regarding "information retrieval based on historical data", but if you're like me, you probably didn't bother to read it all.

Jill Whalen

Patents are not easy to read, that's for sure!

I had skimmed it and glanced at a few forum posts and articles that discussed it, but until today, I hadn't actually read it completely.

I wasn't surprised about the stuff in the patent that corresponded with Google's aging delay and its "sandbox" as I had already seen a lot of discussion on this.

For those who aren't familiar with the aging delay and the sandbox, you'll want to note that there is a lot of disagreement over what causes a site to be thrown in the sandbox. However, based on my own observations and the experiences of some trusted SEO friends, it's my belief that the sandbox is basically a purgatory database where Google places certain URLs based on a variety of predetermined criteria. (Much of this is spelled out in the first part of the patent application.)


The aging delay, on the other hand, is actually a subset of the sandbox. In other words, the aging delay is just *one* reason why a URL might get placed in the sandbox.

Basically, if you have a brand new domain/website, it will automatically land in the sandbox regardless of anything that you do with it.

Your new website will be stuck there for an unspecified period of time (averaging around 9 months these days) and it will not rank highly in Google for any keyword phrases that might bring it any decent traffic.

Yes, it can sometimes rank highly for the company name, or the names of the people who run the company. It may also show up in Google for a few additional phrases that other sites are not focusing on within their content. But new domains will not show up in Google's natural results for even slightly competitive keyword phrases until they are removed from the sandbox.

Other reasons why a site might be placed in the sandbox go beyond the aging delay. Google's major algorithm upheavals, such as the recent one dubbed "Bourbon" by the folks at WebmasterWorld, show all too clearly that old domains can also be placed in the sandbox under the right (or in this case wrong) circumstances. Nobody can really say for sure what the criteria are, but Google's patent does give us some insight into what some of them might be.

For instance, did you know that Google might use traffic data from sites when determining how to rank them?

The patent application specifically states, in part, "...information relating to traffic associated with a document over time may be used to generate (or alter) a score associated with the document." Since the application was filed in 2003, it's a pretty safe bet that they are in fact using that information in today's ranking algorithm.
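To make the idea of traffic "over time" altering a score more concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: the function names, the half-versus-half trend ratio, and the clamping range are hypothetical and are not taken from the patent application or from anything Google has published.

```python
def traffic_trend(monthly_visits):
    """Ratio of mean visits in the later half of the period to the earlier half.

    Hypothetical measure: a value above 1.0 means traffic is growing,
    below 1.0 means it is shrinking.
    """
    half = len(monthly_visits) // 2
    if half == 0:
        return 1.0  # not enough history to say anything
    earlier_mean = sum(monthly_visits[:half]) / half
    later_mean = sum(monthly_visits[-half:]) / half
    if earlier_mean == 0:
        return 1.0  # avoid dividing by zero; treat as neutral
    return later_mean / earlier_mean


def adjust_score(base_score, monthly_visits, low=0.5, high=1.5):
    """Scale a base score by the traffic trend, clamped to [low, high].

    The clamp limits are arbitrary illustration values.
    """
    factor = min(high, max(low, traffic_trend(monthly_visits)))
    return base_score * factor


growing = [100, 120, 150, 200, 260, 300]
declining = [300, 260, 200, 150, 120, 100]
flat = [100, 100, 100, 100]

print(adjust_score(10.0, growing))
print(adjust_score(10.0, declining))
print(adjust_score(10.0, flat))
```

The point of the sketch is only that the same base score can end up higher or lower depending on the direction of traffic, which is one reading of "generate (or alter) a score."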

You might be wondering how they get information about your site's traffic since you're not providing them with your log files or traffic reports.

Well, Google has some nifty big brother spyware installed on tons and tons of people's browsers in the form of the "Google Toolbar." In order to use certain functions of the toolbar, users have to agree to allow data to be transferred back to Google, which includes which sites they've visited, and how long they were there.

Now, this isn't any cause for alarm if you're a Google toolbar user, as they're not actually identifying you personally (as far as I know). They are simply taking the aggregate data that they receive and then using it for whatever purposes they see fit. It actually makes perfect sense that they'd use this data to perfect their ranking algorithm. Highly trafficked sites are popular sites, and Google would want to ensure that their searchers easily find popular sites.

Another factor used in Google's ranking algorithm is clickthroughs from the search results pages.

Google's patent says that "[Google] may monitor the number of times that a document is selected from a set of search results and/or the amount of time one or more users spend accessing the document. [Google] may then score the document based, at least in part, on this information."

Google has had tracking URLs on most of the links appearing in their search results for quite some time.

With these in place, they can study which pages are getting clicked for which queries. They can also figure out whether people are satisfied with the page they clicked on by making note of whether the user came back to the results page and clicked on additional results.
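The clickthrough signals described above (how often a result is clicked, how long users stay, and whether they come back to the results page) can be sketched as a simple aggregation over a click log. This is an illustrative sketch only: the `Click` record, its field names, and the sample URLs are hypothetical, and nothing here is claimed to match how Google actually stores or scores this data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Click:
    query: str
    url: str
    dwell_seconds: float        # time spent on the page (assumed to be known)
    returned_to_results: bool   # user came back and clicked another result


def summarize_clicks(clicks):
    """Aggregate click count, mean dwell time, and return rate per (query, url)."""
    totals = defaultdict(lambda: {"clicks": 0, "dwell": 0.0, "returns": 0})
    for c in clicks:
        t = totals[(c.query, c.url)]
        t["clicks"] += 1
        t["dwell"] += c.dwell_seconds
        t["returns"] += int(c.returned_to_results)
    return {
        key: {
            "clicks": t["clicks"],
            "mean_dwell": t["dwell"] / t["clicks"],
            "return_rate": t["returns"] / t["clicks"],
        }
        for key, t in totals.items()
    }


sample_log = [
    Click("blue widgets", "a.example/widgets", 120.0, False),
    Click("blue widgets", "a.example/widgets", 60.0, False),
    Click("blue widgets", "b.example/page", 5.0, True),
    Click("blue widgets", "b.example/page", 3.0, True),
]

summary = summarize_clicks(sample_log)
for key, stats in summary.items():
    print(key, stats)
```

In this made-up log, the first page keeps visitors for a while and nobody returns to the results, while the second page is abandoned within seconds every time. A ranking system could plausibly treat the second pattern as a sign of dissatisfaction.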

There's lots more in the patent regarding links and anchor text, including the length of time it takes for links to show up, and whether they fit the profile for being artificial or natural.
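One way to picture a link profile that looks "artificial" rather than "natural" is to ask how concentrated in time the new links are. The sketch below is a guess at the general idea, not the patent's method: the function names, the per-month bucketing, and the 0.6 threshold are all hypothetical choices made for illustration.

```python
from collections import Counter
from datetime import date


def largest_monthly_share(link_dates):
    """Fraction of all links first seen in the single busiest calendar month."""
    if not link_dates:
        return 0.0
    per_month = Counter((d.year, d.month) for d in link_dates)
    return max(per_month.values()) / len(link_dates)


def looks_bursty(link_dates, threshold=0.6):
    """Flag a profile where most links appeared in one month (arbitrary threshold)."""
    return largest_monthly_share(link_dates) >= threshold


# Ten links acquired gradually, one per month.
steady = [date(2004, month, 15) for month in range(1, 11)]

# Ten links, eight of which appeared in the same month.
burst = [date(2005, 3, day) for day in range(1, 9)] + [
    date(2004, 11, 2),
    date(2005, 1, 20),
]

print(looks_bursty(steady))
print(looks_bursty(burst))
```

A real system would need far more context (a popular news story can produce a perfectly natural burst of links), so treat this purely as a picture of the concept.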

Suffice it to say that as long as you're not attempting to artificially inflate your link popularity, then you have nothing to worry about.

I cannot stress enough that the ideas in this patent have been mainly put forth as spam-fighting measures.

Unfortunately, as soon as the search engines start giving things like links any kind of prominence in their ranking algorithm, they get abused by those whose only goal is to "game" the engines. There will always be people who set out to obtain high rankings through exploiting weaknesses in the algorithms. They create numerous websites based on the algorithm of the day, and make as much money as they can until their sites are caught. Then they simply figure out the next loophole and start the process all over again. It's an interesting and exciting business model, but certainly not one that a company in business for the long haul should be interested in.

If you have a real company that is looking to establish a real brand and a long-term customer base, then you'll want to stick with the basic SEO techniques which have been proven to work time and again. In other words, the stuff I've been teaching and doing for years.

Yes, it can be time-consuming and take a huge amount of hard work and/or money to do things the right way, but the reward is long-term search engine success.

It is true that even for those who do practice what I preach, there have been occasions when some search engines mistakenly throw the baby out with the bathwater.

That is, you may do everything by the book, but something somewhere trips a spam filter and your site may mistakenly get sandboxed, penalized or banned.

This is certainly rare, but not as rare as it used to be.

Each new search engine update brings new cries of "Where's my site?!" from people who didn't do anything to deceive the engines. One can only hope that the engines work to get these sites back into the rankings as quickly as possible.

At any rate, you should never count on your natural search results as your sole method of bringing you business.



***********
About the author:

Jill Whalen of High Rankings is an internationally recognized search engine optimization consultant and host of the free weekly High Rankings Advisor search engine marketing newsletter.

She specializes in search engine optimization, SEO consultations and seminars. Jill's handbook, "The Nitty-gritty of Writing for the Search Engines" teaches business owners how and where to place relevant keyword phrases on their Web sites so that they make sense to users and gain high rankings in the major search engines.

posted by Robin Good on Friday, June 24 2005, updated on Tuesday, February 21 2006


 

 

 

 
