The Google Panda Guide: Part 4 - The Future I Would Like To See
Once I started realizing that the consequences of Google Panda on MasterNewMedia were not going to go away anytime soon, I completely stopped worrying about Google and began devoting my time and resources to my most important asset: readers and fans. I also started to wonder whether there could be a way, maybe in the future, to escape from this apparent Google stranglehold without also becoming a victim of your own virtue.
Photo credit: Eric Isselée and Adisa - iStockphoto, Andres Rodriguez
Google Panda, the new Google algorithm designed to eliminate "thin" content sites from Google search results, has been a disastrous adventure for many hundreds of legitimate web publishers. Many of them really did not deserve to nearly disappear from search engine results, yet they were inadvertently caught by this new search ranking algorithm, which is still much in need of refinement.
In the name of a crusade against low-quality sites, built and automated in many cases just to monetize Google AdSense ads, hundreds of quality web sites have been jolted out of Google search results, while being in some - not so rare - cases replaced by the very low-quality, "thin" sites that Panda had been designed to punish.
This is not a new story. For anyone who has been a web publisher for some time, the notion that Google has secret rules, and that these are often changed and improved, is nothing new. But Google Panda has opened a new era and, at least from my viewpoint, it is the largest and deepest change Google has made to how it organizes and ranks information inside its search engine result pages.
Unfortunately, in many cases the results have not been truly impressive.
The key problem, which Panda has only helped to exacerbate, is in fact a long-standing and well-known one. The only difference is that now, after Panda, there are likely a lot more people thinking about it.
Even if all the traffic magically reappeared and my lost revenues started soaring again, this key problem would not go away. And it is twofold:
1) It gets more and more difficult to trust Google and to see it as a friendly ally. Especially if you are not a big company. Google should be more upfront and transparent about how it ranks results, especially when it decides to dramatically change such criteria and mechanisms.
2) Google has become a central hub that tells millions of people what counts and what to look at when they express a request via a search. As such, it provides a critical and very valuable service to the global community, which increasingly needs to find the information it is looking for rapidly and efficiently. We, as users, should realize that the time has come for us to take control of this valuable mechanism, instead of relying exclusively on secret algorithms.
This is why I decided to put on the table a small, probably utopian idea, but at least one which can serve the purpose of:
a) Contributing a possible new direction
b) Proposing something that is technically feasible
c) Questioning what we take too much for granted
d) Following the patterns we have seen at work elsewhere: from centralized to distributed; from top-down expert secrecy to crowdsourced, open-sourced and distributed co-operative participation.
This is the future of online search. Or at least, the one I would like to see. Please read on.
The Panda Situation
Matt Cutts of Google explains Google Panda and what it does in its most recent update.
As I have reported in previous parts of this multi-part Panda Guide series, while Google's officially stated goal is certainly an honorable one, namely to provide more high-quality search results to its users, not all of the results obtained so far have really shown a marked improvement.
More than anything else, the most worrisome aspect, while possibly temporary, is the number of legitimate web sites that have apparently been whacked out of Google results for no valid reason at all.
In some cases and for extended periods of time, the new Panda algorithm has caused valuable content to disappear from the top search results, replaced by sites of dubious quality and in many cases by blatant scrapers.
"One thing with this Farmer / Panda Google update that seems to be impacting a lot of sites, is that there are many examples of sites producing original content and other sites taking their content and outranking them. In short, Google is confusing the site that created the content and labeling them as a low quality site, and then ranking a site that stole the content above the original content creator.
This has always been an issue for some webmasters. But since the Farmer / Panda update, it has become more of an issue for these webmasters."
Matt Cutts of Google admits this issue and explains how to manage it
"A large portion of those scraper sites are monetized via Google AdSense and would not even exist if it were not for AdSense.
So Google whacks your site, tells you to clean up your act (and to increase your operating costs while decreasing your margins), lumps you in the bad actors group, offers no information about when the pain will (or even could) end, pays someone to steal your content, then ranks that stolen copy of your content above you in the search results."
Source: Aaron Wall - SEOBook http://www.seobook.com/how-google-creates-black-hats
At the gut level, what Aaron wrote is what I have been feeling deep inside of me, as Google prolonged for so many weeks the gigantic collateral damage it had unleashed with the first two Panda iterations.
Notwithstanding my optimism and desire to see the good beyond the immediate disaster, I also have to recognize the significant amount of damage Google has done to valid content publishers, while - inadvertently or not - compensating and rewarding the worst scum sites of the web: those that scrape content from your site and republish it without credit as if it were theirs.
Google PR Fail
Yes, I can understand that it takes time to refine and adjust this new toy Google has created, but I have the impression that this time Google should have tested it much more extensively than it has, while providing web publishers, ahead of time, with clear indications of the deep changes coming and of the consequences they might have.
Given the gigantic small business ecosystem that has grown around Google, and which is made up of both good and bad companies, Google should have felt a greater responsibility in informing its many web publishing partners and stakeholders of its intentions, as such changes can destroy overnight years of work and love.
So while I agree that Google is providing a free service, and that it is a private commercial company, I still think that what I have seen happen, given the number of respectable and value-providing sites that have been "pandalized", is akin to having, at least in part, thrown the baby out with the bathwater.
Yes, it is OK to refine and drastically change algorithms to eliminate bad and useless web sites, but this does not mean that it is fine to irresponsibly wipe out hundreds of other valid content publishers, possibly because this new algo still needs a lot more refinement.
I am not advocating that Google should guarantee business and revenues to anyone, but being aware of its power and of the many good web publishers that could have been mistakenly penalized by these changes, I think it should have taken a little extra care this time to inform them, at least in general terms, of the coming changes (Google Panda).
And it is on this very point - call it PR or external communications - that Google really failed. I don't know why so few US-based web publishers have had the courage to stand up and say this, but since I have always been open about my feelings on Google, I felt this was an appropriate situation in which to raise some of these issues.
The Problem: Google Credibility
Panda, and the uneven results it has produced, have generated a ton of criticism toward Google, especially from those who have been worst hit by this new search algorithm.
But even outside this large circle of thousands of "pandalized" web sites, there seems to be a periodically recurring and growing concern. Now that more people have started to see first-hand the type of negative consequences that Google can cause on a large scale (when its algorithmic changes are not carefully tested beforehand), I once again see a growing number of people complaining about and criticizing Google.
Their key concerns are, in particular, Google's credibility, ethics and transparency.
Such secrecy, for a private and commercial online search engine company, shouldn't, in theory, be a problem. But the issue here is that this company is now the world's information intelligence hub, which anyone taps to find out "who is who", "where I find this or that" and "who I can rely upon" on any given matter.
On top of this, Google is also the most used Internet search engine and the top search advertising provider. And the money Google makes from advertising is by far the largest slice of its present revenue pie.
EricLegge - Level 1 - 5/5/11
...Google finances itself mostly from advertising on websites. In its early days it was its only source of finance.
It is a free service as is the information provided by most information websites, but it is made possible by advertising.
Google is the dominant search engine and the dominant web-advertising company. It can keep a stranglehold on this position by designing its search algorithm in such a way so that other advertisers cannot compete with it.
There is an extremely questionable lack of integrity in a dominant search engine that creates a search algorithm that is aimed at improving its search results so that only quality sites come up in the first search pages, but which does not make it clear exactly what it requires of a site to meet those search requirements - moreover a change that has made many quality sites unviable.
Where there is secrecy like this there is usually a lack of integrity.
Everything should be above board.
Especially where one company dominates both the search engine market and the web advertising market...
Source: Google Webmaster Central
I think that these concerns are not completely inappropriate and that Google needs to start addressing them as soon as it can.
The Value of Accessing and Searching Information
Given that we live in the information age and that a very large part of the world economy revolves around "information", accessing and having the ability to search through it should be considered akin to accessing a "public" commons. Not a private one, driven exclusively by commercial, market interests and measured solely on the basis of its financial performance.
Public information available on the Internet is there for anyone to access and to use to create extra value, understanding and new solutions for others. For this reason, how information is organized, ranked and filtered inside a worldwide search engine serving hundreds of millions of people is not, in my humble opinion, just a private Google matter.
As a matter of fact, I really think search engines such as Google should be considered public utility services, should be kept cost-free and made public, and users should be the owners of such "commons". (Possibly, Google, or whoever provided such a search facility, could be publicly financed by everyone in a fashion similar to an invisible tax added to the cost of any Internet connection or to any electricity or phone bill).
When so much of our life depends on having proper and rapid access to the right information, don't you think it would be very risky to depend on a centralized and secret system, driven exclusively by financial gains, to continuously influence how information is organized, ranked and classified inside a major search engine like Google?
Everything should be transparent. No one company or brand should be able to game the system without being vulnerable to everyone seeing it.
To achieve this, it would be necessary to require transparency of both the filtering and ranking rules used, and to put into the hands of the user the control and final choice of what is ranked higher and what is filtered out.
There should not be a Google deciding this for you in a secret way.
Now, since Google is indeed in control of what information you see when you search for something and of how these bits of information are ranked and organized, how can Google remain credible if:
a) It is totally secretive about it, even in the face of its own mistakes?
b) It is also the world's dominant search advertising platform, which pivots 100% around the Google search engine, and which generates the majority of Google's earnings?
That is: if you have a monopoly of the search market and control the search results, whether for the good or the bad, you can control the advertising market to your own benefit.
How credible can you then be in such a situation? And for how long can you get away with it?
An Alternative Future for Search: That's How It Could Be
This is how I would like to imagine the future of online search, which may also be a possible solution-opportunity for Google … or for the next "Google".
This is also an open invitation to the Open Source and P2P world communities, as in my eyes, they are the ones who could more easily appreciate any possible value there may be and who would also have the skill and vision to transform it into a reality.
Mine is a simple idea.
A search engine for the users, and by the users.
Open, user-controlled, distributed.
a) To make search results more useful, while becoming more trusted and much less vulnerable to being reverse-engineered and gamed by unscrupulous marketers, I don't think there is a need to make your search engine and your ranking algorithms secret.
b) Secrecy promotes and breeds black markets, underground work and a well-defined objective for everyone: uncover the secret. Reverse-engineer it. Game it.
My proposal: Why not, then, turn the search ranking mechanism upside down by giving control back to whoever is searching and needs to make decisions based on that information?
Not by personalizing their results or differentiating them from those of others based on history, preferences or the social graph, but by allowing users to see, at all times, what is under the hood and by giving them the option to modify it.
What is really best for us? A centralized, secret and proprietary search engine driven by Wall Street, or a distributed, fully transparent and open-sourced one that places each and every USER / searcher in the driver's seat?
What would happen if it were you and me, individually, selecting the criteria, ranking algorithms and penalization approaches used to make up our search results?
a) Search becomes a full public service.
b) Users start seeing ranking and filtering factors and, if they want, they can change them according to their needs and preferences.
d) "Trusted search curators" for specific vertical information niches start to become themselves the new relevant results. They provide the needed "trust" and transparency to search results by creating their own curated collections for the topics they cover.
e) An ecosystem of open-source algorithms, filters, curated collections of sites and resources on specific topics emerges.
f) An ungameable system. If everyone can individually select and rank results either according to their own preferences, or by utilizing user-defined filtering pre-sets and ranking plugins made by experts and niche curators, it becomes much more difficult for anyone to game search engine results, as now there would be many thousands of different ranking systems at work. Not just Google and Bing.
N.B.: Those who wouldn’t want to bother with setting up their individual search preferences could be offered a choice among alternative ranking algorithms (e.g.: Pre-Panda, 1998-style, etc.), or among pre-set ones designed by users, groups or even other search engines themselves, or could fall back on the preferences set by their close network of friends (on Facebook, Twitter, etc.). If you had no "online" friends and did not want to set preferences, paradoxically you could be given an alphabetically or chronologically indexed set of results, and then you could move on to refine and distill what you need out of it by applying, on the fly, your own criteria.
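To make the idea a little more concrete, here is a minimal sketch in Python of what user-visible, user-adjustable ranking could look like. Everything in it is an assumption made purely for illustration: the signal names (original_content, curator_trust, ad_density), the pre-set names and the example URLs are hypothetical, and this is not how Google or any existing search engine works.

```python
from dataclasses import dataclass, field


@dataclass
class Result:
    url: str
    title: str
    published: str  # ISO date string, e.g. "2011-06-09"
    # Hypothetical ranking factors, fully visible to the searcher.
    signals: dict = field(default_factory=dict)


def score(result, weights):
    """Weighted sum of the signals the user has chosen to care about."""
    return sum(w * result.signals.get(name, 0.0) for name, w in weights.items())


def rank(results, weights=None, fallback="alphabetical"):
    """Rank with the user's own weights; with none set, use a neutral order."""
    if weights:
        return sorted(results, key=lambda r: score(r, weights), reverse=True)
    if fallback == "chronological":
        # ISO dates sort correctly as plain strings; newest first.
        return sorted(results, key=lambda r: r.published, reverse=True)
    return sorted(results, key=lambda r: r.title.lower())


# Hypothetical pre-sets that a curator or a group of users could publish.
PRESETS = {
    "originality_first": {"original_content": 3.0, "curator_trust": 1.0, "ad_density": -2.0},
    "curated_only": {"curator_trust": 5.0},
}

results = [
    Result("http://example.org/original", "Original guide", "2011-03-01",
           {"original_content": 1.0, "curator_trust": 0.8, "ad_density": 0.1}),
    Result("http://example.net/scraper", "Copied guide", "2011-04-15",
           {"original_content": 0.0, "curator_trust": 0.0, "ad_density": 0.9}),
]

for r in rank(results, PRESETS["originality_first"]):
    print(r.title, round(score(r, PRESETS["originality_first"]), 2))
```

With the "originality_first" pre-set the original article comes out ahead of the scraper, while a searcher who sets no preferences at all simply gets the alphabetical or chronological list. The point of the sketch is only that the weights are plain data anyone can read, swap or share.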
Moral: Turn it upside down. Legalize it.
Put the choice of how to rank Internet search results in the hands of the searchers, not in the hands of those who control both the search and advertising marketplace.
Let users index, refine, develop and improve search engine ranking algorithms by applying the filters and metrics that serve THEM best, and not only Google's stock.
Move from listing titles-URLs-descriptions to curated search results, in which "trusted search curators" will provide bundles of high-quality results, selected and organized together in new emerging formats.
- Content indexing could also become a distributed activity in the hands of the users. With this approach, individual users contribute to indexing and adding information into a shared database that aggregates each user's personal index. In this fashion, users not ONLY have greater control of what is actually indexed, but they are actually creating a real search commons index - a collaborative effort by all users that is available to all. (An example of a distributed search engine - where peers collaborate to construct their search database - is the YaCy project).
- New media literacy becomes more and more important, because if individuals are not educated to appreciate the difference between different types of information, the role of search engines and the value of appropriate filtering, this proposal of mine may just remain a utopian dream.
- Search results as such may be seeing the start of a slow but unstoppable demise. Why? People increasingly want quality information, selected and curated by someone they trust, or organized in ways that they can control and modify.
- Searching information is not a consumer need. It is a vital part of our new emerging ecosystem on this planet. Anything that has such immense value and potential, for the whole of humanity, should not in my humble opinion be regulated by the interests of those owning stock in that company.
- If Google and the other major search engines are not willing to be transparent about how they organize, filter and rank information, how can you trust that the answers you are given do really provide you with the best option possible?
- Access to information should not be based on some “social” secret recipe of what is good and what is bad - that is taken care of by the religions of this world - as there is no objective metric that can measure the different needs and information requirements of each human being.
Unless I can check it.
That's the provocative future of online search I would like to see.
Open-source gurus, P2P evangelists, independents and futurists, what do you say? Is this really a possible future direction we can take?
If you have missed them, here are the other parts of this guide:
- The Google Panda Guide - Part 1: What It Is, How It Works, Collateral Damage
- The Google Panda Guide - Part 2: Machine Learning And The New Mindset
- The Google Panda Guide - Part 3: What To Look For, What To Clean
- The Google Panda Guide - Part 5: The AdSense Dilemma
Originally written by Robin Good for MasterNewMedia and first published on June 9th 2011 as "The Google Panda Guide: Part 4 - The Future I Would Like To See".
Google PR Fail - barsik
The Problem: Google Credibility - Clipart
The Value of Accessing and Searching Information - Sean Gladwell
An Alternative Future for Search: That's How It Could Be - gvictoria