A talk on my experiences building crowdsourcing applications, both at the Guardian newspaper and for my own personal projects. Presented at Web Directions @media 2010 on June 9th.
1. Building crowdsourcing applications
Simon Willison - simonwillison.net - @simonw
@media - 9th June 2010
4. Crowdsourcing?
Let me just cop to the fact that “crowdsourcing” is a stupid buzzword. But like “blog” before it, sometimes it’s the stupid term that sticks. For my purposes, it means collaborating with the people who used to be the silent audience to make something better than you could make alone.
- Derek Powazek
http://powazek.com/posts/2443
35. Background
June 2009
450,000 pages of expenses documents released
“Transparency” = dodgy scanned PDFs
One week's notice, so one week to build it!
42. Stuff that worked
The progress bar
Photos of the MPs
Releasing a small group of documents at first
Score boards (once we finally added them)
Especially the “top in last 48 hours” one
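The “top in last 48 hours” score board can be sketched in a few lines. This is a minimal illustration, not the code behind the real application: the `top_contributors` function and the `(username, reviewed_at)` record shape are assumptions made for the example.

```python
from collections import Counter
from datetime import datetime, timedelta


def top_contributors(reviews, now, window=timedelta(hours=48), limit=10):
    """Return [(username, count), ...] for reviews made inside the window.

    `reviews` is an iterable of (username, reviewed_at) pairs. That shape is
    a hypothetical one chosen for this sketch.
    """
    cutoff = now - window
    counts = Counter(
        user for user, reviewed_at in reviews if reviewed_at >= cutoff
    )
    # most_common() sorts by count, highest first
    return counts.most_common(limit)


now = datetime(2009, 6, 20, 12, 0)
reviews = [
    ("alice", now - timedelta(hours=1)),
    ("alice", now - timedelta(hours=30)),
    ("bob", now - timedelta(hours=47)),
    ("carol", now - timedelta(days=5)),  # outside the window, not counted
    ("carol", now - timedelta(days=6)),
    ("carol", now - timedelta(days=7)),
]

print(top_contributors(reviews, now))  # [('alice', 2), ('bob', 1)]
```

A short window like this gives newcomers a realistic chance of appearing on the board, which an all-time table dominated by early contributors does not.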
43. Stuff that didn't
Releasing everything else at once
Asking the wrong questions
Line items!
Too much time fighting scalability fires
Reporting tools were 24 hours too late
48. Goals
Find stuff our journalists cared about
Less boring data entry
Data coming out again from the start
Visible rewards for contributors
More digestible tasks
Better sense of activity by other people
56. Lessons learned
Use Redis for random selections, not MySQL
Assignments made a huge improvement
The most important logic in a crowdsourcing system is the “next thing to review” button
“Oldest first” pagination is critical
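One way to read the Redis lesson above: keep the IDs of unreviewed pages in a Redis set and let `SRANDMEMBER` pick one, instead of running `ORDER BY RAND()` in MySQL, which has to assign a random value to every candidate row before it can return one. The slides do not say how Redis was actually used, so the sketch below is an assumption. The key name `pages:unreviewed` and both helper functions are hypothetical, and `InMemorySets` is a small stand-in that mirrors the `sadd`, `srem` and `srandmember` methods of the redis-py client so the example runs without a server.

```python
import random


class InMemorySets:
    """Stand-in for a Redis client, exposing only the set commands used below."""

    def __init__(self):
        self._sets = {}

    def sadd(self, key, *members):
        self._sets.setdefault(key, set()).update(members)

    def srem(self, key, *members):
        self._sets.get(key, set()).difference_update(members)

    def srandmember(self, key):
        members = self._sets.get(key)
        return random.choice(sorted(members)) if members else None


UNREVIEWED = "pages:unreviewed"  # hypothetical key name


def next_page_to_review(client):
    """Pick a random unreviewed page ID, or None when nothing is left."""
    return client.srandmember(UNREVIEWED)


def mark_reviewed(client, page_id):
    """Remove a page from the pool so it is not handed out again."""
    client.srem(UNREVIEWED, page_id)


client = InMemorySets()
client.sadd(UNREVIEWED, "page-1", "page-2", "page-3")

page = next_page_to_review(client)
mark_reviewed(client, page)
```

With a real server, `redis.Redis(decode_responses=True)` could be passed in place of `InMemorySets()`; without `decode_responses`, redis-py returns the member as bytes.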
69. Lessons learned
Be flexible: your users may not share your precise goals
Optimise for the fat head of your user base
Expose recent activity to site staff
Users will do almost anything for a medal!
70. Final thoughts
Don’t be afraid: even flawed crowdsourcing systems produce fascinating results
Think hard about the questions you ask
Have a minimal barrier to entry
Get the next task logic right. Seriously.