This document discusses human computation and its uses in natural resource management. It defines human computation and describes various forms it can take including crowdsourcing, games with a purpose, solution space exploration, and social mobilization. Examples are given of human computation being used to monitor waterways, predict population dynamics from Twitter data, and predict snow levels from Flickr photos. Open issues and research projects exploring uses of human computation for problems in areas like predicting water demand are also discussed.
Putting Humans in the Loop for Natural Resource Management
1. Putting Humans in the Loop:
Human Computation for Natural
Resources Management
New Developments in IT & Water
Amsterdam, Nov 5 2012
Piero Fraternali
Politecnico di Milano, Italy
piero.fraternali@polimi.it
2. Outline
• Human Computation
– Origins
– Forms
• Crowdsourcing
• Games with a purpose
• Solution space exploration
• Social Mobilization
• Examples in Natural Resource & Water Management
– Open Issues
– Research Projects
– Conclusions and outlook
3. Human Computation: a definition
• According to von Ahn:
• Combine humans and computers to solve large-scale problems
that neither can solve alone, taking advantage of spare human
processing cycles
• According to Wikipedia:
• Human-based computation is a computer science technique in
which a computational process performs its function by
outsourcing certain steps to humans. This approach uses
differences in abilities and alternative costs between humans and
computer agents to achieve symbiotic human-computer
interaction.
4. Early example: CAPTCHA
• Stands for “Completely Automated Public
Turing test to tell Computers and Humans
Apart”
• Luis von Ahn et al. coined the term in 2000
• A Program that can tell
whether a user is a human
or a computer
• Humans and machines
have complementary
skills
6. Forms of HC: crowdsourcing
• Crowdsourcing is a distributed problem-solving model that
assigns tasks traditionally performed by employees or
contractors to an undefined crowd
– Split the task into micro-tasks
– Assign them to performers in the crowd
– Collect partial results into the final one
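The three steps above can be sketched in Python. This is a minimal illustration, not a real platform: the worker functions are hypothetical stand-ins for human performers, and majority voting is one common way to merge partial answers into the final result.

```python
from collections import Counter

def crowdsource(task_items, workers, redundancy=3):
    """Split a task into micro-tasks (one per item), assign each to
    several workers, and merge partial answers by majority vote."""
    results = {}
    for item in task_items:
        answers = [worker(item) for worker in workers[:redundancy]]
        results[item] = Counter(answers).most_common(1)[0][0]
    return results

# Toy "workers": functions standing in for human performers labeling
# water-condition reports as clean or polluted.
workers = [
    lambda s: "polluted" if "oil" in s else "clean",
    lambda s: "polluted" if "oil" in s or "foam" in s else "clean",
    lambda s: "clean",
]
results = crowdsource(["oil sheen on creek", "clear water"], workers)
```

With three redundant answers per micro-task, a single careless worker (the third one here) is outvoted by the other two.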
9. Forms of HC: GWAPs
• Games with a Purpose (GWAPs)
– Exploiting the billions of hours that people spend
online playing with computer games to solve complex
problems that involve human intelligence
[vA06,LvA09].
– Useful tasks are embedded in a playful experience
where human judgment is exploited consciously or
unconsciously
10. Types of Games
[Luis von Ahn and Laura Dabbish, CACM 2008]
Three generic game structures
• Output agreement:
– Type same output
• Input agreement:
– Decide if having same input
• Inversion problem:
– P1 generates output from input
– P2 looks at P1-output and guesses P1-input
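The output-agreement rule (the structure behind the ESP Game) can be sketched as a single scoring function; the taboo-list behavior is an assumption modeled on that game:

```python
def esp_round(label1, label2, taboo=()):
    """Output agreement: two players see the same image and independently
    type labels; a label is accepted only when both typed it and it is
    not on the taboo list of already-known labels."""
    l1, l2 = label1.strip().lower(), label2.strip().lower()
    if l1 == l2 and l1 not in taboo:
        return l1  # the agreed label becomes a new tag for the image
    return None

tag = esp_round("River", "river")    # agreement: accepted
miss = esp_round("river", "creek")   # no agreement: rejected
```

Because the players cannot communicate, agreement on a label is strong evidence that the label really describes the shared input.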
12. Input Agreement: TagATune
• Sometimes difficult to type identical output
(e.g., “describe this song”)
• Show same or different input, let users
describe, ask players if they have same
input
13. Inversion Problem: Peekaboom
• Non-symmetric players
• Input: Image with word
• Player 1 slowly reveals pic
• Player 2 tries to guess word
14. Sketchness
• Puzzle Game, Guess and
Draw (Pictionary,
iSketch…)
• Players take turns
drawing the shapes of
objects inside an image
to make the other players
guess the object
• Two roles: Sketcher &
Guesser
• Objectives: Object
detection, garment
segmentation and tagging
15. Forms of HC: space exploration
• Combinatorial problems with
intractable solutions spaces, in
which humans can help the
heuristic core in pruning
– Protein folding: Proteins fold
from long chains into small
balls, each in a very specific
shape
– Shape is the lowest-energy setting, which is the most
stable
– Fold shape is very important
to understand interactions with
other molecules
– Extremely expensive
computationally! (too many
degrees of freedom)
• A Mason-Pfizer monkey virus
retroviral protease was
modeled by FoldIT gamers in
just three weeks
16. Forms of HC: social mobilization
• Social Mobilization
– Problems with time constraints, where the
efficiency of task spreading and of solution
finding is essential
– An example of the problem and of the
techniques employed to face it is the Darpa
Network Challenge [PRP+10]
– The solution comes from the
nature of the reward
mechanism and social
ties of humans
17. HC & Natural Resource
Management
• Objectives
– Collect and validate data
– Extract information from data
– Involve people in resource usage planning and management
– Change people’s behavior
• Approaches
– Passive: mine information from users’ existing activity traces
– Active: engage people in ad hoc tasks
• Ultimate goals
– Obtain “better data” for predictive models, planning and
management tools: more accurate, at finer time/space resolution,
in real time …
– Take “better decisions”: more participative, less conflicting,
capable of promoting social change
18. Monitoring waterways: CreekWatch
• Problem: obtain simple yet useful parameters on watershed
conditions in a vast territory at low cost
• Solution: geo-localized mobile + Web application
– Developed at IBM Research Almaden, 4000+ users, 25 countries
– The city of San Jose, CA, uses it to prioritize pollution cleanup efforts
• Collected data are found to have good quality
19. Predicting population dynamics
with twitter data
• Problem: obtaining impact of population on territory at high temporal
resolution
• Can be used to detect events, estimate water consumption bursts,
waste production, etc
• Solution: using low cost geo-localized data sources (e.g., tweets)
together with structured and high cost sources (e.g., mobile phone
traces)
http://www.streamreasoning.org/demos/london2012
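A minimal sketch of the passive approach: bin geo-tagged tweets into space–time cells, so the count per cell becomes a cheap proxy for population presence. The `lat`/`lon`/`hour` fields and the cell size are assumptions for illustration, not the actual schema used by the project.

```python
from collections import Counter

def tweet_density(tweets, cell_deg=0.01):
    """Count geo-tagged tweets per (lat, lon, hour) grid cell.
    Each tweet is a dict with hypothetical 'lat', 'lon', 'hour' keys."""
    counts = Counter()
    for t in tweets:
        cell = (round(t["lat"] / cell_deg),
                round(t["lon"] / cell_deg),
                t["hour"])
        counts[cell] += 1
    return counts

tweets = [
    {"lat": 51.503, "lon": -0.119, "hour": 18},
    {"lat": 51.503, "lon": -0.120, "hour": 18},  # same cell, same hour
    {"lat": 51.530, "lon": -0.125, "hour": 9},
]
density = tweet_density(tweets)
```

Peaks in such a grid can then be correlated with events or with bursts in water consumption and waste production.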
20. Predicting snow level with Flickr
images
• Problem: predicting the incidence of natural
phenomena using user generated content
• Solution: using Flickr photos tagged with “snow”
to estimate snow fall (precision 100% with 7
snow photos)
– H Zhang, M Korayem, DJ Crandall, G LeBuhn: Mining
photo-sharing websites to study ecological
phenomena. WWW 2012
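The quality of such a tag-based detector is naturally measured as precision: among photos carrying the tag, the fraction that really show the phenomenon. A minimal sketch, where the `verify` callback stands in for manual (human) inspection and the photo records are invented for illustration:

```python
def tag_precision(photos, tag, verify):
    """Precision of a tag-based detector: fraction of tagged photos
    that a verifier confirms actually show the phenomenon."""
    tagged = [p for p in photos if tag in p["tags"]]
    if not tagged:
        return 0.0
    confirmed = sum(1 for p in tagged if verify(p))
    return confirmed / len(tagged)

photos = [
    {"id": 1, "tags": {"snow", "mountain"}, "shows_snow": True},
    {"id": 2, "tags": {"snow"}, "shows_snow": True},
    {"id": 3, "tags": {"beach"}, "shows_snow": False},
]
p = tag_precision(photos, "snow", lambda ph: ph["shows_snow"])
```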
21. Using social deliberation tools for
participatory planning
• Problem: letting a large
crowd of citizens propose
solutions or deliberate on
proposals about public
goods
• Solution: large scale
deliberation and idea
management tools
– IdeaScale.com,
MIT’s Deliberatorium
…
22. Open problems
• Humans, like machines, can make errors
– Cognitive bias, fatigue
• Unlike machines humans can cheat
– Classification of attacks
– Spammer detection
• Techniques to improve output quality are in use
– Voting schemes
– Worker quality modeling and vote weighting (requires ground
truth or machine-learning models and iterative / selective
labeling of data)
– Micro-flows, worker pre-task testing
– Task-to-worker assignment, active learning
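Worker quality modeling and vote weighting can be sketched as follows: each worker's vote is weighted by an accuracy estimate (obtained, e.g., from gold-standard test questions), so one reliable worker can outweigh several unreliable ones. The worker IDs and accuracy figures are invented for illustration.

```python
from collections import defaultdict

def weighted_vote(labels, worker_accuracy):
    """Return the label with the highest total weight, where each
    worker's vote counts as their estimated accuracy (default 0.5
    for unknown workers, i.e., no better than chance)."""
    scores = defaultdict(float)
    for worker, label in labels:
        scores[label] += worker_accuracy.get(worker, 0.5)
    return max(scores, key=scores.get)

votes = [("w1", "polluted"), ("w2", "clean"), ("w3", "clean")]
accuracy = {"w1": 0.95, "w2": 0.45, "w3": 0.40}
winner = weighted_vote(votes, accuracy)
```

Here the plain majority says "clean" (2 votes to 1), but the weighted scheme picks "polluted" (0.95 vs. 0.85) because the dissenting worker has a much better track record.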
27. CUbRIK Project
• FP7 Integrating Project
• Goals:
– Advance the architecture
of multimedia search
– Exploit the human
contribution in
multimedia search
– Use open-source
components provided by
the community
– Start up a search
business ecosystem
• http://www.cubrikproject.eu/
31. Experimental evaluation
• [Figure: precision (x-axis) vs. recall (y-axis) scatter plot
comparing Crowd, No Crowd, and Experts results on three logos
(Aleve, Chunky, Shout)]
• Precision decreases; reasons for the wrong inclusions:
– Geographical location of the users
– Expertise of the involved users
32. Experimental evaluation
• [Figure: precision vs. recall scatter plot, same setting as the
previous slide]
• Precision decreases; reason:
– Similarity between two logos in the data set
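For reference, the two metrics plotted in the evaluation slides are computed as follows; the retrieved/relevant sets below are invented for illustration:

```python
def precision_recall(retrieved, relevant):
    """Precision: fraction of retrieved items that are relevant.
    Recall: fraction of relevant items that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    tp = len(retrieved & relevant)  # true positives
    precision = tp / len(retrieved) if retrieved else 0.0
    recall = tp / len(relevant) if relevant else 0.0
    return precision, recall

p, r = precision_recall(["a", "b", "c"], ["a", "b", "d", "e"])
```

Adding crowd validation typically trades some recall for precision (or vice versa), which is what the Crowd vs. No Crowd points on the plots show.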
33. Future directions &amp; outlook
• Find problems where crowd support can be
useful, e.g.,
– Urban water demand prediction: smart meters are
costly and not widely deployed. Household data can be used
to build models
• Design crowd interaction
– Not only IT: engagement, incentives, ethical and
legal issues
• Collect and clean-up data
• Integrate crowd model and data with (e.g.,
water) system models
• Check validity
34. References
• Panos Ipeirotis (New York University) and Praveen Paritosh
(Google). Managing Crowdsourced Human Computation.
• [LvA09] Edith Law and Luis von Ahn. Input-agreement:
a new mechanism for collecting data using human
computation games. In Proc. CHI 2009, 2009.
• [vA06] Luis von Ahn. Games with a purpose.
Computer, 39:92–94, 2006.
• [vAMM+08] Luis von Ahn, Ben Maurer, Colin
McMillen, David Abraham, and Manuel Blum.
reCAPTCHA: Human-based character recognition via
web security measures.
Science, 321(5895):1465–1468, 2008.
35. References
• [PRP+10] Galen Pickard, Iyad Rahwan, Wei Pan, Manuel Cebrian,
Riley Crane, Anmol Madan, and Alex Pentland. Time-critical social
mobilization: the DARPA Network Challenge winning strategy. CoRR,
abs/1008.3172, 2010.
• J. Trant. Exploring the potential for social tagging and folksonomy in
art museums: proof of concept. New Rev. Hypermed. Multimed.,
12(1):83–105.
• Firas Khatib et al. Crystal structure of a monomeric retroviral
protease solved by protein folding game players. Nature, 2011.
• S. Kim, C. Robson, T. Zimmerman, J. Pierce, and E. M. Haber.
Creek watch: pairing usefulness and usability for successful citizen
science. In Proceedings of the 29th Int Conf on Human Factors in
Computing Systems, pages 2125–2134, New York, NY, 2011.