Blogs vs. MSM
The multi-lingual web
Almost like a good Swiss watch, Dave Sifry, Technorati CEO, is back with full stats and trends on what he now calls the "State of the Live Web" and which includes "all forms of social media on the Web, including blogs, of course, but also video, photos, audio such as podcasts and much more."
As always, Dave's report is full of interesting data confirming the ongoing explosion of blogs and personal publishing tools as well as the increasing amount of spam (though with some recent slowdowns) and the changing relevance of different languages online.
Before sharing the full report with you, here are a few items that caught my personal interest:
a) The total number of blogs keeps growing. We are at 70 million now, and while there is some sign of a slowdown in the adoption rate, blogs remain the most interesting and powerful opportunity for communication, education, news reporting, personal independent publishing and grassroots democracy that has ever appeared.
b) The Top 100 index of the most linked sites shows a strong increase in the number of blogs now present: from twelve to twenty-two. Not only do blogs present unique opportunities, but they are also consolidating their position as serious news and information sources among traditional, established media publishers.
c) Cloned sites, splogs and blog spam sites are an increasing problem, with hundreds of unscrupulous online marketers firing up spam blogs and copy sites by the thousands daily. While the problem remains very serious both for publishers and for search engine aggregators like Technorati, Dave Sifry expresses confidence in the growing ability to manage and counter this huge wave of content junk.
d) Tags have become a staple of the social, live web. About 35% of all posts tracked by Technorati use tags, and more and more bloggers are adopting them every month. If you have not started using tags yet, it is time to start thinking about adopting them. Tags are an easy and effective way to give more exposure, visibility and relevance to your content by allowing content publishers to attach simple text labels to their articles and posts.
e) The multi-language, global web shows signs of change and growth as well, with Italian overtaking Spanish in blog content. The language distribution also confirms the impact that the economic digital divide has on a large part of the world population, which, although very large in numbers, has little or no access to these new technologies.
Hey, it's that time again, time to slow down, take a deep breath, and dig into the data!
About this Report, and the Obligatory Plug for Technorati
Technorati is known widely for its quarterly State of the Blogosphere reports, analyzing the trends around blogs and blogging.
With this report, we expand on this tradition by introducing information and analysis relating to the broader range of social media on the Web -- what we and many others call the Live Web (another good definition).
Technorati continues to grow well beyond its roots as the leading blog search engine; increasingly, we are the main aggregation point for all forms of social media on the Web, including blogs, of course, but also video, photos, audio such as podcasts and much more. What makes this possible is the rise in the use of tags across all forms of social media and the increasing implementation of tags by the publishing platforms supporting each form of media.
Increasingly, tags have become a lingua franca of the Live Web, helping to categorize social media while also indicating where people’s attention might be at any given moment. But because each form of media is published from unique platforms with their own established communities, the audience found itself hopping from platform to platform to get a sense of what might be hot at any given moment. Which is why our social media aggregation service -- made manifest on our tagged media pages -- is growing at a torrid pace.
While we still have substantial reporting on the State of the Blogosphere, we are now expanding the report to provide information about the State of Tags. Admittedly, the information we have on this new area of focus for our report isn’t as deep or as expansive as our State of the Blogosphere, and we expect that over time, this and other new sections will expand, but we believe this is a good first step in trying to provide a more comprehensive snapshot of the Live Web.
OK, on to the numbers!
The state of the Blogosphere is strong, and is maturing as an influential and important part of the web.
For nearly four years, we’ve been tracking and enabling the growth of this phenomenon, and there is much in our data to indicate that the medium is “growing up.”
Technorati is now tracking over 70 million weblogs, and we're seeing about 120,000 new weblogs being created worldwide each day. That's about 1.4 blogs created every second of every day.
Spam and splogs (spam blogs) continue to be a problem in the blogosphere, and there was a marked increase in splogs that coincided with the holiday season last year. Technorati has been tracking between 3,000 and 7,000 new splogs created each day, but there was a significant spike in splog creation starting in early December: we tracked over 11,000 splogs created each day during December, for a total of 341,000 splogs that we removed from our indexes during that period.
Fortunately, spam rates have decreased somewhat since then, as blog hosting providers have responded to the issue during the months of January and February. My personal take on the issue of spam is that all healthy ecosystems have parasites - the only question is whether or not the system is structurally vulnerable to being overwhelmed. Thankfully, because of the accountability that is built into the web itself (the URL structure is fundamentally accountable), I believe that while the vulnerability of the live web to spam is real, it is manageable.
Since our last State of the Blogosphere report in October 2006, we’ve seen a slowing in the doubling of the size of the blogosphere. This shouldn't be surprising, as we're dealing with the law of large numbers - it takes a lot more growth to double from 35 million blogs to 70 million (which took about 320 days) than when it doubled from 5 million to 10 million blogs (which took about 180 days).
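As a rough illustration of the slowdown, the two doubling times quoted above can be turned into implied growth figures. This is only a sketch: it assumes smooth compound growth within each period, which the report does not claim, and the function names are made up for the example.

```python
def implied_daily_growth(doubling_days: float) -> float:
    """Average compound daily growth rate that doubles a quantity in `doubling_days`."""
    return 2 ** (1 / doubling_days) - 1


def average_added_per_day(start: int, end: int, days: float) -> float:
    """Average number of blogs added per day over the period."""
    return (end - start) / days


# Figures quoted in the report: blog counts and the days each doubling took.
early_rate = implied_daily_growth(180)    # 5 million -> 10 million blogs
recent_rate = implied_daily_growth(320)   # 35 million -> 70 million blogs

early_added = average_added_per_day(5_000_000, 10_000_000, 180)
recent_added = average_added_per_day(35_000_000, 70_000_000, 320)

print(f"early:  {early_rate:.3%} per day, ~{early_added:,.0f} blogs added per day")
print(f"recent: {recent_rate:.3%} per day, ~{recent_added:,.0f} blogs added per day")
```

Under these assumptions the relative growth rate falls between the two periods even though the average number of blogs added per day rises, which is the point of the comparison above.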
We also see a slowing in growth in the rate of posts created per day; while there are spikes in blog posts during times of significant world crisis -- for instance, last summer’s conflict between Israel and Hezbollah -- the overall trend is that posting volume is growing more slowly, at about 1.5 million postings per day. That's about 17 posts per second. In October 2006, Technorati was tracking about 1.3 million postings per day, about 15 posts per second.
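The per-second figures quoted in this report follow from dividing each daily count by the 86,400 seconds in a day. A minimal sketch of that conversion, using only the daily counts given in the report (the labels and the helper name are illustrative):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(per_day: float) -> float:
    """Convert a daily count into an average per-second rate."""
    return per_day / SECONDS_PER_DAY


# Daily counts as quoted in the report.
daily_counts = {
    "new blogs (this report)": 120_000,
    "posts (this report)": 1_500_000,
    "posts (October 2006)": 1_300_000,
}

rates = {label: per_second(count) for label, count in daily_counts.items()}
for label, rate in rates.items():
    print(f"{label}: {rate:.2f} per second")
```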
In previous reports, we looked at the popularity of mainstream media compared to blog sites. One interesting item to note in April 2007: the number of blogs in the top 100 most popular sites has risen substantially.
During Q3 2006 there were only 12 blogs in the Top 100 most popular sites.
In Q4, however, there were 22 blogs on the list -- further evidence of the continuing maturation of the Blogosphere.
Blogs continue to become more and more viable news and information outlets. For instance, information not shown in our data but revealed in our own user testing in Q1 2007 indicates that the audience is less and less likely to distinguish a blog from, say, nytimes.com -- for a growing base of users, these are all sites for news, information, entertainment, gossip, etc. and not a “blog” or a “MSM site”.
Further, there is a wider diversity of languages represented here, specifically Farsi with TodayLink.ir, Persian Blog Fans Club, and Giliran.com making the Top 100. More on that in a moment, as we discuss the international growth of the Blogosphere.
In terms of blog posts by language, Japanese retakes the top spot from our last report, with 37% (up from 33%) of the posts followed closely by English at 36% (down from 39%).
Additionally there was movement in the middle of the top 10 languages, highlighted by Italian overtaking Spanish for the number four spot.
The newcomer to the top 10 languages is Farsi, just joining the list at #10. It has been very interesting to watch the growth of the blogging world in the Middle East, especially in countries like Iran, and it is reflected in the language distribution above.
English, Japanese and Chinese look almost identical to our last report in their posting distribution. With Italian overtaking Spanish, we get to see another language with a different distribution, which contrasts with both the extreme geographic correlations of the Asian languages and the relative lack of geographic correlations of English.
Again it would appear that both English and Spanish are more global languages based on consistency of posting through a 24 hour period, whereas other top languages, specifically Japanese, Chinese, and Italian, are more geographically correlated.
It would also appear that a significant number of people who are blogging are doing it during work hours.
The explosive growth that we see in the Technorati index is mirrored in social media sites throughout the Web, including Flickr, YouTube, and the like. This shared phenomenon allows us to marry the wealth of information in our index with the wealth of that stored on social media sites across the Live Web through the shared construct of tags.
For the uninitiated, a tag is a category or descriptor that someone (often the creator) assigns to a piece of media. This descriptor literally hangs off the media that’s published to the Web much in the same way a luggage tag hangs off your suitcase -- easily identifying the bag.
The bottom line: we’re seeing explosive growth in the tags index. People are clicking on tags, people are using tags, and Google features tagged media in its results pages.
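To make the idea concrete, here is a small sketch of how shared tags can bring different kinds of media together in one place. It is not Technorati's implementation; the item fields, the sample data and the `build_tag_index` helper are all hypothetical.

```python
from collections import defaultdict


def build_tag_index(items):
    """Map each tag (case-insensitive) to the items that carry it, whatever their media type."""
    index = defaultdict(list)
    for item in items:
        for tag in item["tags"]:
            index[tag.lower()].append(item)
    return index


# Made-up items, for illustration only.
items = [
    {"kind": "blog post", "title": "Trip notes", "tags": ["travel", "Italy"]},
    {"kind": "photo", "title": "Harbor at dusk", "tags": ["italy", "sunset"]},
    {"kind": "video", "title": "Street market", "tags": ["travel"]},
]

index = build_tag_index(items)
kinds = [item["kind"] for item in index["italy"]]
print(kinds)
```

Looking up a single tag returns every item labeled with it, regardless of the platform or media type it came from.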
Tags adoption has become a phenomenon across the Live Web, and we are seeing a correlative explosive growth at Technorati.
On to the numbers:
Technorati is now tracking over 230 million posts using tags or categories, and the number of people who are using tags is growing:
As of February 2007, about 35% of all posts Technorati tracks use tags.
The number of bloggers that are using tags is also increasing month over month. About 2.5 million blogs posted at least one tagged post in February 2007.
Back in 2002 when Technorati started tracking the blogosphere, social mores and community practices were still forming, and its growth was primarily through the written word. It was a fledgling medium that was initially reviled, then feared, and, now, embraced as mainstream.
The blogosphere started well before Technorati was founded, and its growth was fostered by many people and organizations that brought openness and cooperation to the medium.
One of those people, Dave Winer, just celebrated the tenth anniversary of his weblog. Given this auspicious anniversary, I wanted to give my thanks and support to Dave and to all of the other early pioneers in the world of blogging, RSS, and the Live Web. Without Dave's efforts, the web wouldn't look the way it does today. His creation and support for systems like weblogs.com and open formats like RSS were critical in building the early infrastructure that Technorati relies upon and helps to support. Thanks, Dave!
As a result of this work and the cultural mores of openness, we also have photo sharing, podcasting, online music publishing, online video publishing, user-generated games, and, increasingly, we have structured data-sharing such as upcoming events. All of this seething, lively activity constitutes the Live Web and Technorati is its hub -- thanks in large part to the growing use and ubiquity of tags.
Through the social constructs of tags, we help people find unique voices and points of view. We also help social media publishers to find the people formerly known as their audience. And they all converge, as a result, on Technorati. We’re proud of this position, of course, but also humbled by the responsibility it imposes.
As we continue to bring more and more of the Live Web to the fore, and to organize it and present it in ways that are useful, entertaining, and informative to you all, I hope you’ll continue to tell us your opinions (as if I could stop you!) and provide us your guidance. Our credo has been and will always remain: “Be of Service.” Your voice helps us to do this, so please continue to tell us what we can do better.
Read the full State of the Live Web report in its original version, with additional graphs and stats, here:
The State of the Live Web, April 2007
Getting All the Reports
You can get all of the State of the Blogosphere and State of the Live Web reports, going back to my first report in October 2004, at http://www.sifry.com/stateoftheliveweb/. All of this material is licensed under a Creative Commons attribution license, and all I ask in addition is that you please keep the Technorati logo and links to the original reports in any use of the charts or data.
About the author:
Dave Sifry is the CEO of Technorati, the major blog search engine online. He has been publishing the State of the Blogosphere reports for the last few quarters by collecting, aggregating and analyzing the huge amount of traffic and usage data coming from Technorati's own search engine, which now tracks over 70 million blogs.
Previous Blogosphere reports by Technorati:
Blog Usage Statistics And Trends: State Of The Blogosphere - Q3 - 2006
by Dave Sifry - November 8th, 2006.
Blog Usage Statistics And Trends: State Of The Blogosphere - Q2 - 2006
by Dave Sifry - April 27th, 2006.