Crowdsourcing made better

765140960_735722ddf8_zRead an article the other day in MIT News (Better wisdom from crowds) about a new approach to drawing out better information from crowdsourced surveys. It’s based on something the researchers have named the “surprising popularity” algorithm.

Normally, when someone performs a crowdsourced survey, the results of the survey are typically some statistically based (simple or confidence weighted) average of all the responses. But this may not be correct because, if the majority are ill-informed then any average of their responses will most likely be incorrect.

Surprisingly popular?

10955401155_89f0f3f05a_zWhat surprising popularity does, is it asks respondents what they believe will be the most popular answer to a question and then asks what the respondent believes the correct answer to the question. It’s these two answers that they then use to choose the most surprisingly popular answer.

For example, lets say the answer the surveyors are looking for is the capital of Pennsylvania (PA, a state in the eastern USA) Philadelphia or not. They ask everyone what answer would be the most popular answer. In this case yes, because Philadelphia is large and well known and historically important. But they then ask for a yes or no on whether Philadelphia is the capital of PA. Of course the answer they get back from the crowd here is also yes.

But, a sizable contingent would answer that the capital of PA is  Philadelphia wrong (it is actually Harisburg). And because there’s a (knowledgeable) group that all answers the same (no) this becomes the “surprisingly popular” answer and this is the answer the surprisingly popular algorithm would choose.

What it means

The MIT researchers indicated that their approach reduced errors by 21.3% over a simple majority and 24.2% over a confidence weighted average.

What the researchers have found, is that surprisingly popular algorithm can be used to identify a knowledgeable subset of individuals in the respondents that knows the correct answer.  By knowing the most popular answer, the algorithm can discount this and then identify the surprisingly popular (next most frequent) answer and use this as the result of the survey.

Where might this be useful?

In our (USA) last election there were quite a few false news stories that were sent out via social media (Facebook and Twitter). If there were a mechanism to survey the readers of these stories that asked both whether this story was false/made up or not and asked what the most popular answer would be, perhaps the new story truthfulness could be completely established by the crowd.

In the past, there were a number of crowdsourced markets that were being used to predict stock movements, commodity production and other securities market values. Crowd sourcing using surprisingly popular methods might be used to better identify the correct answer from the crowd.

Problems with surprisingly popular methods

The one issue is that this approach could be gamed. If a group wanted some answer (lets say that a news story was true), they could easily indicate that the most popular answer would be false and then the method would fail. But it would fail in any case if the group could command a majority of responses, so it’s no worse than any other crowdsourced approach.

Comments?

Photo Credit(s): Crowd shot by Andrew WestLost in the crowd by Eric Sonstroem