The first time I played with StumbleUpon, I was mesmerized. After a week or so of rating sites, StumbleUpon seemingly knew my tastes perfectly. I was so excited about the very concept of crowdsourcing that I even wrote a blog post suggesting that crowdsourcing could end up replacing algorithmic search engines like Google as the preferred online searching (or “discovery”) technology.
But as impressed as I was at the time by StumbleUpon, I also recognized the potential problems that could arise if a crowdsourcing application ever truly became a major phenomenon. I wrote: “Granted, as with any technology, the more popular it becomes, the more likely it is susceptible to manipulation. Just as the rise of search engines created an entire industry of ‘search engine optimizers’, so too will the rise of ‘social media’ create an entire industry of ‘social media optimizers.’”
And, of course, that prediction has come true (to be fair, I wrote that about a year ago, after people far more into social media than myself had probably already declared themselves social media optimizers). As a result, the utility of social media “discovery engines” seems to diminish every day. To wit, a recent story on Sphinn – the crowdsourcing discovery engine for the search engine marketing industry – was about outraged StumbleUpon users complaining about an SEO company that had social media optimized itself into heavy rotation on StumbleUpon.
It turns out, however, that there was one other cause of social media manipulation that hadn’t occurred to me – the “heavy user” (a term that I borrow from the fast food industry, where it refers to a very frequent visitor to a fast food joint). These aren’t necessarily people who are purposely trying to manipulate the popularity of a website or story on a social media site for profit; rather, they are just passionate users who happen to spend far, far more time than other users on the social media site. As a result, the influence of a small group of heavy users becomes disproportionately large.
This has already happened on Digg.com, the discovery engine for news stories. Rand Fishkin of SEOMoz describes this phenomenon aptly: “When folks think of Digg, they’re often misled into believing that the content seen on the homepage is representative of what a wide base of Internet users think is news-worthy and important. The numbers tell a different story – that of all stories that make it to the front page of Digg, more than 20% come from a select group of 20 users.” Rand goes on to note that the top 100 Digg users drive 56% of front page content. Out of hundreds of thousands (or millions?) of Digg users, this truly represents a “monarchy” of Digg users who control what stories shall and shall not achieve popularity.
All of this is bad news for discovery engines. To combat social media optimizers, social media sites must invent rules to reduce the impact of social media optimization (SMO). And what does one call a series of rules designed to improve search results? Why, an “algorithm”, of course (see, for example, Digg’s recent announcement on changes to their ranking system). At some point, if a crowdsourcing discovery engine keeps on adding more and more algorithmic inputs into the determination of search results, it stops being a discovery engine and becomes an algorithmic search engine.
Over time – as we have seen happen with Google – the only way to combat optimizers is to hire a huge army of very smart engineers (and perhaps an even bigger army of temps to manually rate sites!) who continually try to keep one step ahead of the optimizers. This is an expensive and never-ending process. It’s hard to imagine websites like Digg, del.icio.us, or Sphinn having the resources to do this.
But even assuming you are able to keep the SMOs at bay, a potentially more difficult issue is what to do with the heavy users. If you change your algorithm to reduce the impact of heavy users, you take the risk that you alienate the very people who love your site the most. It would be like an airline penalizing frequent fliers for using their service too much!
Ultimately, I suspect that the only solution to this conundrum is to encourage/force users to not only rate stories but also rate their fellow users. Like a “negative keyword” on a search engine, users will need to create “negative users” that discount any recommendations that come from users unlike them. Results will then become different for different groups of users, depending on the ‘micro-crowd’ that they have selected to use for their results.
Again, quoting my own post from last year: “In the current social media world, it is possible to “Digg bomb” and generate buzz around a news story simply by spamming the results and voting a site up the ranks. But if users have the ability to approve or reject members of their ‘crowd’, you could truly end up with a spam-free world where you really trust the results that you get back from the social media engine.”
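To make the “negative users” idea a little more concrete, here is a rough sketch in Python of how a personalized ranking might work. To be clear, this is purely my own illustration and not how Digg, StumbleUpon, or anyone else actually ranks stories: the function names, the `trusted_weight` parameter, and the sample votes are all made up for the example. The assumption is simply that each user keeps a list of people they have rejected from their crowd and, optionally, a list of people they have approved.

```python
from collections import defaultdict


def personalized_scores(votes, blocked, trusted=None, trusted_weight=2.0):
    """Score stories for one reader (hypothetical scheme, for illustration only).

    votes   -- iterable of (voter, story) pairs
    blocked -- the reader's "negative users"; their votes are ignored
    trusted -- voters the reader has approved; their votes count extra
    """
    trusted = trusted or set()
    scores = defaultdict(float)
    for voter, story in votes:
        if voter in blocked:
            continue  # a negative user contributes nothing to this reader's results
        scores[story] += trusted_weight if voter in trusted else 1.0
    return dict(scores)


def rank(scores):
    """Order stories by score, highest first; ties are broken alphabetically."""
    return sorted(scores, key=lambda story: (-scores[story], story))


# Made-up votes: three accounts pile onto one page, two readers like another.
votes = [
    ("spammer1", "seo-page"),
    ("spammer2", "seo-page"),
    ("spammer3", "seo-page"),
    ("carol", "seo-page"),
    ("alice", "spider-story"),
    ("bob", "spider-story"),
]

# With no personalization, the piled-on page wins.
global_ranking = rank(personalized_scores(votes, blocked=set()))

# A reader who has rejected the three accounts gets a different front page.
my_ranking = rank(
    personalized_scores(votes, blocked={"spammer1", "spammer2", "spammer3"})
)
```

In this toy example `global_ranking` puts the SEO page first, while `my_ranking` puts the spider story first, because the three rejected accounts no longer count for that reader. Note that nobody gets penalized site-wide: a heavy user’s votes still carry full weight for everyone who hasn’t rejected them.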
In other words, crowdsourcing by itself is not enough. To succeed, you need to develop crowdsourcing 2.0 – a combination of crowdsourcing and personalization. Call it my-crowdsourcing if you will. This sort of technology is far less complex to develop and maintain than the ongoing algorithmic improvements required to maintain quality search results in Google, and I still believe there’s a chance that such results could outperform algorithmic results over time. In the meantime, I’ll continue to use sites like StumbleUpon to find stories about spiders using drugs, and wait for the day when discovery engines evolve to the point of making Google obsolete.