Curated by: Luigi Canali De Rossi

Wednesday, May 25, 2011

The Google Panda Guide - Part 2: Machine Learning And The New Mindset

If you have been hit heavily by the Google Panda penalization, like MasterNewMedia has, one of the hardest things to do is to understand the appropriate frame of mind to get into before starting to fix, modify or correct possible problems.

Photo credit: Eric Isselée - iStockphoto and Ashwin KA

Rushing to prune and modify content on your web site may not really get you the results you need, unless you a) understand what is really happening and b) adopt a new attitude about the way you are going to create value for your readers, before worrying about what the search engines want.

With Google Panda looming over the heads of many international webmasters, the best advice I can share on how to prepare for and recover from it is really not about the nitty-gritty of deleting this or avoiding that (though there is a good chunk of that to do too). It is about helping you understand how deeply different this new Panda thing is from anything you have seen before, and "how" to "think" when it comes to deciding what to do with your site.

Let's start from the basics:

Google has realized that its search results have been eroded by low-quality sites, MFAs (made-for-AdSense sites), spammers and scrapers, and by those who have invested a great deal of time and resources (SEO work) in testing and exploiting Google's existing search algorithms for the benefit of their sites.

To resolve this issue, Google has decided to do three things:

1) To "filter" automatically all of the sites that do not fit its new Panda algorithm.

2) To give increasingly greater credit to indicators and data that cannot be easily gamed. For example: if you look at the quartet of variables made up of Bounce rate, Time on site, Actions on page (e.g.: scroll, print, bookmark, click on ads, no-action, etc.) and Actions after leaving the page, you can get a much better idea of whether someone has derived a benefit from visiting that page or not.

3) To devise an algorithm that builds itself by continuously learning from new data and objectives it is set to go after. This way no-one at Google can really tell what specific variables determine the success or devaluation of a site, because these will be the fruit of software calculations and not of a specific rule and input provided by human beings.
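To make the "quartet of variables" from point 2 more concrete, here is a minimal Python sketch of how such signals could be summarized from visit data. Everything in it is an assumption made for illustration: the `Visit` record, its field names, the `engagement_summary` function and the sample numbers are hypothetical, and nothing here describes how Google actually collects or weighs this data.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    """One hypothetical visit to a page (all fields are illustrative)."""
    seconds_on_site: float
    pages_viewed: int
    actions_on_page: int       # scrolls, prints, bookmarks, ad clicks, ...
    returned_to_search: bool   # assumed proxy for "actions after leaving"


def engagement_summary(visits):
    """Summarize the four engagement signals for a list of visits."""
    n = len(visits)
    if n == 0:
        raise ValueError("need at least one visit")
    return {
        # A "bounce" is treated here as a visit that saw a single page.
        "bounce_rate": sum(v.pages_viewed == 1 for v in visits) / n,
        "avg_seconds_on_site": sum(v.seconds_on_site for v in visits) / n,
        # Share of visits with no recorded action on the page.
        "no_action_rate": sum(v.actions_on_page == 0 for v in visits) / n,
        # Share of visitors who went straight back to the search results.
        "return_to_search_rate": sum(v.returned_to_search for v in visits) / n,
    }


# Made-up sample data, for illustration only.
visits = [
    Visit(seconds_on_site=5, pages_viewed=1, actions_on_page=0, returned_to_search=True),
    Visit(seconds_on_site=240, pages_viewed=3, actions_on_page=4, returned_to_search=False),
    Visit(seconds_on_site=95, pages_viewed=1, actions_on_page=2, returned_to_search=False),
    Visit(seconds_on_site=8, pages_viewed=1, actions_on_page=0, returned_to_search=True),
]

summary = engagement_summary(visits)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Looked at together, these numbers say more than any one of them alone: a high bounce rate combined with a long time on site and several on-page actions may well describe a page that fully answered the visitor's question.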

These three points represent profound changes in how Google has decided to manage low-quality content, and in how Google is going to move gradually away from traditional gameable "signals" and toward reading the rich flux of user data it has been collecting through cookies, toolbars, Google Analytics and AdSense accounts, YouTube accesses and more.

If you are curious and courageous enough to want to find out why these changes are so revolutionary and what would be the best attitude to face them, read on:


Google Panda and The Machine Learning Algorithm


Yesterday, Eric Enge published an interview on Google Panda with SEOmoz's Rand Fishkin, in which you can read the following:

Rand Fishkin - "They are using the aggregated opinions of their quality raters, in combination with machine learning algorithms, to filter and reorder the results for a better user experience.

That's a mouthful, but essentially what it means is that Google has this huge cadre of human workers who search all the time and rate what they find.

What they want to do is find ways to show things they like and suppress things they don't like. Google has previously been reticent to do this across the board and use it as a primary signal, and has historically used this data only as a quality control check on the algorithms they write."

If you read those words again carefully, you can see how much the "search game" is changing and how much more difficult it will be for anyone to be successful by using the old classic SEO approach.

By looking under the hood at what this new machine learning algorithm is all about, you can also understand why there has been so much collateral damage and why it is taking time for Google to further fine-tune Panda.

Even Google is aware of these dangers and is likely worried about them, as Rand Fishkin further reports in this interview:

"Machine learning takes a bunch of predictive metrics and uses a neural network, or some other machine learning model, to try and come up with a best fit to the desired result.

I think one reason machine learning is slowly making its way into Google's algorithmic updates is they are uncomfortable with not knowing what is in the algorithm.

It's not as if you target specific sites like Ezinearticles and eHow, but sites that the quality raters identified as fitting into the eHow profile.

The challenge is to find metrics that will push those sites down, but keep deserving sites high. The machine learning algorithm will search across all data points it can, but it may use weird derivatives, for example, the number of times the page uses the letter x may have a super high correlation to whether people didn't like its quality so the machine learning algorithm pushes down pages that use the letter x. That's not an actual example but you get my point.

[This is why, from now on] You can no longer dig into the code and figure out which engineer coded into the algorithm that the letter x in pages means lower rankings. An engineer did not do it, the machine learning system did it. So, you know they have to be careful with how they implement it."

"I got the sense from the Wired interview and other writings that even Amit and Matt were a little nervous about how this works. I think they recognized that they hit some sites unintentionally.

The most frustrating part for them is that they don't know why the algorithm hit sites they didn't want to."

Source: Eric Enge
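Rand Fishkin's "letter x" example can be illustrated with a small Python sketch. It is not Google's method and not a neural network: as a much simpler stand-in for a learned model, it ranks a few candidate metrics by how strongly they correlate with quality rater labels. The pages, the metric names and the numbers are all invented for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no signal
    return cov / sqrt(var_x * var_y)


# Invented pages: rater_disliked is 1 when the (hypothetical) quality
# raters flagged the page as low quality, 0 otherwise.
pages = [
    {"letter_x_count": 9, "word_count": 300, "ad_blocks": 2, "rater_disliked": 1},
    {"letter_x_count": 8, "word_count": 900, "ad_blocks": 1, "rater_disliked": 1},
    {"letter_x_count": 7, "word_count": 500, "ad_blocks": 3, "rater_disliked": 1},
    {"letter_x_count": 1, "word_count": 400, "ad_blocks": 2, "rater_disliked": 0},
    {"letter_x_count": 2, "word_count": 800, "ad_blocks": 1, "rater_disliked": 0},
    {"letter_x_count": 0, "word_count": 600, "ad_blocks": 3, "rater_disliked": 0},
]

labels = [p["rater_disliked"] for p in pages]
metrics = ["letter_x_count", "word_count", "ad_blocks"]

# Score every candidate metric against the rater labels.
correlations = {m: pearson([p[m] for p in pages], labels) for m in metrics}

# Rank by absolute correlation: the procedure has no idea which metric
# "makes sense", it only sees which one fits the labels best.
ranked = sorted(correlations.items(), key=lambda kv: abs(kv[1]), reverse=True)

for name, r in ranked:
    print(f"{name}: {r:+.3f}")
```

With this made-up data, the count of the letter x ends up as the strongest predictor of what the raters disliked, even though no engineer decided it should matter. That is the situation the interview describes: the rule comes out of the data, so nobody can point to the person who wrote it.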

If the above is really the case, which I think it is, you now understand how major this change really is, and why I think it marks a critical departure from what Google has been doing to organize web sites since its very beginning.


Why You Need a New Frame of Mind To Face the Panda


If what I have learned in these three months is of any use, the first thing you need to do is to change the way you think about how you should optimize your web site and about the use of SEO practices.

Many of the things I have learned or reframed inside my head in this time were triggered by my intense daily reading of independent Panda reports, and above all of the official thread on Google Panda inside the Google Webmaster Central Forum.

Though the reporting from relevant web sites has gradually vanished and the main thread is overloaded by quite useless discussions, in the previous weeks and months there has been a lot of very valuable information published in this thread.

In particular, I was struck by one poster writing under the nickname of Lyrical Question, who caught my attention with her numerous answers to webmasters complaining about the effects of Google Panda on their sites.

While I do not agree with several suggestions she made elsewhere in that thread, a few of her answers contained a beautiful set of strategic tips, which I thought offered the best mindset from which to engage a Panda defense or recovery plan.

They all pointed in one specific direction: stop optimizing your site for Google. Optimize it for your readers!

Here it is in her own words:
N.B.: The text that follows was posted by Lyrical Question inside the Google Webmaster Central Forum. I am quoting it as is, with only a few grammar corrections. In the same thread I asked Lyrical Question for permission to republish her content, but I have received no reply. I hope that the good work she has done can be made accessible to many others via this article without restrictions.

"Now - about building your site for your consumer and ranking well.

Why do you think algorithms happen? Or are created?

Because SEO people learned what moved sites to the top - things that really had nothing to do with the QUALITY of the site.

Things like... Oh say ---- Link exchanges... Or like Spun Blog posts... Or random comments on blogs that were spam...

etc. etc. ad infinitum.


The whole CONCEPT of being at the top of the line in Google is that your site is valuable - is relevant to the search and is worthy of being there.

Otherwise - in all honesty - Casinos and Porno, mesothelioma, JC Penney and Amazon would rank in the top ten thousand spots regardless of the input search request - because they paid for FORCED cheating.

Now - Google will continuously change its parameters. To continuously stop cheaters or blackhat seo practices.

Which means... one of two things....

a) You can spend all your time and your money racking your brain to figure out the steps to beat the algorithm every time it changes...


b) You can build your site to entice your visitors to remain, stay and enjoy your site once they land there.

If you choose b), you will ALWAYS come out ahead."


The Story of Queen SEO


by Lyrical Question

"Queen SEO wants to stay young and beautiful... She relishes her beauty in the mirror daily... She wants to be the ONLY young and beautiful queen ever....

And she wants to be the only one to show up in the mirror.

Queen SEO demands that MAGIC MIRROR MAKER give her the magic scroll that will make the mirror hers - and hers alone.

And she wants the magical potion to keep her young and beautiful.

The Magic Mirror Maker cannot give only one person the mirror - for that would be ludicrous. And if he should reveal the magic potion to QUEEN SEO - then the secret to the potion would be given away - and everyone would eventually be making the potion... and alas... if EVERYONE was young and beautiful then - youth and beauty would hold no meaning.

Sadly - the Magic Mirror Maker shakes his head gently and says - "My Queen - you must simply exercise, eat well and take care of yourself - allowing your beautiful personality to shine through... So that OTHERS may see your beauty... and all will flock to you..."

Unhappy that she must do work herself - QUEEN SEO screams ---- "OFF WITH HIS HEAD!"

Strangely - in your fairytale - you're demanding Google give you the recipe for the ways to rank.

If you have them - then everyone has them. Then ranking is done based on the "RANKING REQUIREMENTS" (or mirror or potion) --- instead of the quality of the site.


Make your website. Build it for your consumers/ viewers/ friends. Make it the best you can be.

Google is going to change the potion - on a continuous basis.

Unless you intend to sit staring in the mirror - to constantly get feedback - and recalculate the potion, then just make your site the best site there is.

If you do that - WITHOUT trying to decipher the potion content --- which IS going to change - because people are abusing it, then in truth, you should rank fairly well.

Google has listed most of the determining factors. You're not going to get the full recipe. So screaming about it isn't going to do any good.

They own the company - you are using it.

Simple math."

Lyrical Question - Google Webmaster Central Forum


The Lesson from Panda Is This One


"The lesson from Panda is simple.

Develop your brand, bother your viewers, collect emails, put people on contact lists, re-market heavily to folks. Stop depending on Google. Diversify into everything that isn't Google."


Don't DO SEO work.

Make your site for your visitors.

Stop trying to beat Google to get a ranking placement.

Build your site for those who come to it - and be the best that you can be.

Source: LyricalQuestion - Google Webmaster Central Forums


Does This Make SEO Useless? Should I Stop Optimizing Then?


"Now... does this mean you shouldn't optimize?

No... It means - that if you're de-ranked - there is a reason... and finding out what that reason may be, is really important.

However, chances are that your site was de-ranked because of previous attempts to rank in Google. Not because of previous attempts at making your site better for your consumers.



No matter what Google does - the true test of a good site is whether the visitor comes to the site - stays on the site - buys from the site - or returns to the site.

Make your site for your visitors and not for the search engines - and in the end --- in the LONG RUN - your site will rank.

That's the bottom line."

Lyrical Question - Google Webmaster Central Forum



Remember that most advice about Panda, this included, remains very speculative for now, as there are still very few reports of web sites re-emerging from its penalty.

That said, if I were asked what to do or how to best prepare for it, what you have read above would be my sincere advice ...even if Google Panda didn't exist.

Whether you like this or not, if your web site largely depends on Google for its traffic, advertising or business needs, I strongly urge you to look in depth at what Panda has done so far, and to prepare adequately for its upcoming arrival in your language territory.

In the next part of this Guide to Google Panda, I will share the details of what I have specifically done over the course of these three months to improve the quality of MasterNewMedia and recover from the Panda penalization.

From "thin" pages, to whole blocks of thousands of news articles, Panda has caused a little-seen but deep revolution inside my content, and in the next part, you will learn what steps I took to find and then to change, modify or get rid of all the problematic things I have found.


Originally written by Robin Good for MasterNewMedia and first published on May 24th 2011 as "The Google Panda Guide - Part 2: Machine Learning And The New Mindset".

posted by Robin Good on Wednesday, May 25 2011, updated on Tuesday, May 5 2015
