Crowdsourcing satellite imagery (Talk at Giscience2012)
Talk given at the GIScience2012 conference (http://www.giscience.org/)


More info about this work on my blog http://goo.gl/giouF



Usage Rights

CC Attribution-NonCommercial License

Presentation Transcript

• Crowdsourcing satellite imagery: a study of iterative vs. parallel models. Nicolas Maisonneuve, Bastien Chopard (Twitter: nmaisonneuve).
• Damage assessment after a humanitarian crisis.
• Port-au-Prince: 300K buildings assessed in 3 months by 8 UNOSAT experts.
• Organizational challenges: how to organize non-trained volunteers, especially to enforce quality?
• Organizational challenges: how to organize non-trained volunteers, especially to enforce quality? Investigated scope:
  - A qualitative + quantitative study of 2 collaborative models inspired by computer science: iterative vs. parallel information processing.
  - A controlled experiment to isolate quality = F(organisation), removing other parameters, e.g. training and task difficulty.
  - This research is not a study of real-world collaborative practices, but of more extreme/symbolic cases meant to guide collaborative system designers.
• Tested Collaborative Models (1/2): the iterative model, e.g. Wikipedia, OpenStreetMap, assembly lines.
• Tested Collaborative Models (2/2): the parallel model with aggregation, e.g. voting systems in society, distributed computing.
• Tested Collaborative Models (2/2): an older version of the parallel model (17th to mid-20th century), when computers were humans, mostly women (Mathematical Tables Project, 1938-1948).
• Qualitative comparison of the two models:
  - Problem divisibility: iterative needs no division of the complex problem; parallel requires dividing the complex problem into easier pieces.
  - Optimization tradeoff: iterative copies previous work, emphasizing exploitation; parallel isolates contributors, emphasizing exploration.
  - Quality mechanism: iterative relies on sequential improvement; parallel relies on redundancy + diversity of opinions.
  - Side effects: iterative suffers from a path dependency effect + sensitivity to vandalism; parallel suffers from useless redundancy for obvious decisions + the problem of aggregation.
• Controlled experiment: a web platform; interface/instructions for the parallel model.
• Run on 3 maps with different topologies (each annotated by 1 UNITAR expert).
• Participants used for the experiments: Mechanical Turk as a simulator.
• Data Quality Metrics: quality of the collective output
  - Type I errors = p(wrong annotation)
  - Type II errors = p(missing a building)
  - Consistency
  By analogy with the information retrieval field:
  - Precision = p(an annotation is a building)
  - Recall = p(a building is annotated)
  - F-measure = a score combining recall + precision
  - (All metrics adjusted with a tolerance distance)
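A minimal sketch of how such point-based precision/recall/F-measure could be computed against the expert gold standard, assuming each annotation is a 2D point; the greedy matching rule and the tolerance parameter `tol` are illustrative assumptions, not the authors' exact implementation:

```python
import numpy as np

def count_matches(annotations, gold, tol):
    """Greedily match annotated points to gold-standard buildings within `tol` (illustrative rule)."""
    matched_gold = set()
    true_pos = 0
    for ax, ay in annotations:
        best_i, best_d = None, tol
        for i, (gx, gy) in enumerate(gold):
            if i in matched_gold:
                continue
            d = np.hypot(ax - gx, ay - gy)
            if d <= best_d:
                best_i, best_d = i, d
        if best_i is not None:
            matched_gold.add(best_i)
            true_pos += 1
    return true_pos

def quality(annotations, gold, tol=5.0):
    """Return (precision, recall, F-measure) for one set of point annotations."""
    tp = count_matches(annotations, gold, tol)
    precision = tp / len(annotations) if annotations else 0.0  # p(an annotation is a building)
    recall = tp / len(gold) if gold else 0.0                   # p(a building is annotated)
    f = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f
```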
• Methodology for the parallel model, step 1: collecting independent contributions; N for (map1, map2, map3) = (121, 120, 113).
• Methodology for the parallel model, step 2: for each map, generating the set of groups of m = 1 to N participants (m = 1, m = 2, m = 3, ...).
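A minimal sketch of this group-generation step, assuming that when the number of combinations C(N, m) gets too large one samples random groups instead of enumerating them all; the sampling budget `n_groups` is an illustrative parameter, not something stated in the talk:

```python
import math
import random
from itertools import combinations

def generate_groups(n_participants, m, n_groups=200, seed=0):
    """Return groups (tuples of participant indices) of size m.

    Enumerate all combinations when there are few of them; otherwise draw
    random groups, since C(N, m) explodes for N > 100 participants.
    """
    if math.comb(n_participants, m) <= n_groups:
        return list(combinations(range(n_participants), m))
    rng = random.Random(seed)
    return [tuple(rng.sample(range(n_participants), m)) for _ in range(n_groups)]
```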
• Methodology for the parallel model, step 3: for each group (e.g. groups of m = 2), aggregating the contributions via spatial clustering of points + a quorum, then computing data quality (precision, recall, F-measure) against the gold standard.
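A minimal sketch of this aggregation step, assuming a simple incremental clustering of nearby points and a quorum on the fraction of participants contributing to each cluster; the clustering rule and the `tol` and `quorum` values are assumptions, not the exact algorithm from the paper:

```python
import numpy as np

def aggregate(group_annotations, tol=5.0, quorum=0.5):
    """Aggregate the annotations of one group of participants.

    group_annotations: list of per-participant lists of (x, y) points.
    A cluster of nearby points becomes one collective annotation only if
    at least a `quorum` fraction of the participants contributed to it.
    """
    points = [(x, y, pid) for pid, pts in enumerate(group_annotations) for (x, y) in pts]
    clusters = []  # each cluster is a list of (x, y, participant_id)
    for x, y, pid in points:
        placed = False
        for cluster in clusters:
            cx = np.mean([p[0] for p in cluster])
            cy = np.mean([p[1] for p in cluster])
            if np.hypot(x - cx, y - cy) <= tol:
                cluster.append((x, y, pid))
                placed = True
                break
        if not placed:
            clusters.append([(x, y, pid)])
    n = len(group_annotations)
    result = []
    for cluster in clusters:
        voters = {pid for _, _, pid in cluster}
        if len(voters) / n >= quorum:  # quorum rule on distinct contributors
            result.append((np.mean([p[0] for p in cluster]),
                           np.mean([p[1] for p in cluster])))
    return result
```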
• The more, the better? (parallel model, avg. F-measure): yes, but only up to a point.
  - Adding more people won't change the consensus.
  - A limitation of Linus' law (compared to the iterative model, e.g. OpenStreetMap).
  - Wisdom != skill: we can't replace training with more people.
• Methodology for the iterative model: a sample of an iterative process for map 3.
• Methodology for the iterative model: n instances of about m iterations each; collected data for (map1, map2, map3) = (13, 21, 25) instances of about 10 iterations.
• Methodology for the iterative model, step 2: for each iteration, we compute the precision, recall, and F-measure of all the instances.
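A minimal sketch of this per-iteration scoring, assuming each instance is stored as a list of annotation sets (the state of the map after each contributor's pass) and that a scoring callable like the hypothetical quality() sketched earlier is passed in:

```python
def quality_per_iteration(instances, gold, score_fn, tol=5.0):
    """Average (precision, recall, F-measure) at each iteration index over all instances.

    instances: list of instances; each instance is a list of annotation sets,
    one per iteration. score_fn: a callable such as the hypothetical quality()
    above, returning (precision, recall, f_measure) for one annotation set.
    """
    n_iter = min(len(inst) for inst in instances)
    curves = []
    for t in range(n_iter):
        scores = [score_fn(inst[t], gold, tol) for inst in instances]
        avg = tuple(sum(s[i] for s in scores) / len(scores) for i in range(3))
        curves.append(avg)  # (precision, recall, F-measure) at iteration t
    return curves
```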
• Interpretation of results / comparison on data quality:
  - Accuracy (wrong annotations): the parallel model yields consensual results (*), while the iterative model suffers from error propagation.
  - Accuracy (missing buildings): the parallel model wastes redundancy on obvious buildings, while the iterative model accumulates knowledge, driving attention to uncovered areas.
  - Consistency: the parallel model relies on redundancy, while the iterative model naively assumes last = best.
  - (*) But parallel < iterative in difficult cases (map 2), due to a lack of consensus.
• Side objective: measuring how the crowd spatially agrees. Method: randomly take 2 participants, measure their spatial inter-agreement (e.g. the ratio of matching points), and repeat the process N times.
• This inter-agreement is also a way to measure the intrinsic difficulty of a task (map 1 = easy, map 2 = quite hard).
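A minimal sketch of this inter-agreement measurement, assuming point annotations and the same illustrative tolerance-based matching as above; the symmetric matching ratio and `n_samples` are assumptions, not the paper's exact definition:

```python
import random
import numpy as np

def pairwise_agreement(annotations_by_participant, tol=5.0, n_samples=1000, seed=0):
    """Estimate the crowd's spatial inter-agreement.

    Repeatedly draw 2 random participants and compute the ratio of their
    points that match within the tolerance distance, then average the ratios.
    """
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_samples):
        a, b = rng.sample(annotations_by_participant, 2)
        if not a and not b:
            continue
        matched, used = 0, set()
        for ax, ay in a:
            for j, (bx, by) in enumerate(b):
                if j not in used and np.hypot(ax - bx, ay - by) <= tol:
                    used.add(j)
                    matched += 1
                    break
        ratios.append(2 * matched / (len(a) + len(b)))  # symmetric matching ratio
    return sum(ratios) / len(ratios) if ratios else 0.0
```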
• Future tracks: impact of the organization beyond data quality
  - Energy/footprint to collectively solve a problem
  - Participation sustainability
  - Impact on individual behavior (skill learning & enjoyment)
  Skill complementarity: is the best group of 3 people made of the 3 best people at the individual level? The data says no!
  Other symbolic organizations/mechanisms:
  - Human cellular automata (cell = 1 person, who resubmits a task at time t because they are influenced by peers' results generated at time t-1)
  - Integration of game design / gamification