First, a very sincere kudos to Pete Cashmore of Mashable for digging up hard data that shows how Google Reader's "feed bundles" can skew results. It's clear that subscribing to a bunch of feeds with a single click isn't the same as choosing individual feeds by hand. What Pete added: real numbers.
(I haven't dug into the specific examples to see if Pete's conclusions are the most likely explanation, though his results aren't that surprising.)

Where Pete went wrong: concluding that these specific errors make the whole dataset worthless. In his words ("**" added):

"Google Reader stats, in case you don't know, are bulls**t. In fact, all Feedburner stats for most top blogs are bulls**t due to the effect of default feeds."

I've looked at lots of data over the years, including as part of corporate data quality projects. Even without gaming, every non-trivial dataset I've seen has flaws. The answer is to understand the flaws and attempt to fix or work around them, not to throw out the whole thing or wait for flawless data. The perfect is the enemy of the good.

Even if the data is flawed for some or all of the 91 blogs included in the feed bundles, much of that can be corrected for based on analysis like Pete's. And there's a whole world of blogs that aren't affected.

Sure, there are similar problems with other feed readers, home pages with RSS subscriptions, etc. Switching from feeds to sites, there's also no shortage of problems measuring page views (bots!), unique visitors (cookie deletion!), time spent on site (tabs!), etc. But data is useful and important, so people make do with what's available given time and budget constraints. (Mashable isn't shy about mentioning some stats on every page of the site: "in excess of 5 million monthly pageviews" and "ranks among the Top 100 blogs worldwide".)

Has anyone automated the process of harvesting the subscriber stats? If so, please send details. I'll bet there's meaningful data to be found.

One more bit of overreach:

"The easiest way to get a default feed on one of these startpages is to own it, promise to promote it on your blog or be friends with the person who runs it."

Did the post include any evidence? (I'm sure these things happen, but making implied accusations without backing them up doesn't lend credibility to the post.)
On Crunchnotes, Mike Arrington presents another side to the story, including a useful data point: the complainers rallied around the notion that the stats are somehow fixed. In particular, some of the feeds are included in bundles that users can add to the reader, jacking up their stats.