Crowdsourcing satellite imagery:
study of iterative vs. parallel models
Nicolas Maisonneuve, Bastien Chopard

Twitter: nmaisonneuve

Friday, September 21, 2012

Damage assessment after a humanitarian crisis




Port-au-Prince: 300K buildings assessed in 3 months by 8 UNOSAT experts




Organizational challenges:
                           How to organize non-trained volunteers,
                                especially to enforce quality?



        Investigated scope:
        • Qualitative + quantitative study of 2 collaborative models inspired by
        computer science: iterative vs. parallel information processing

        • Controlled experiment to isolate quality = F(organisation), removing
        other parameters, e.g. training, task difficulty

        • This research != a study of real-world collaborative practices; it examines
        more extreme/symbolic cases to guide designers of collaborative systems



Tested Collaborative Models (1/2)
                                  iterative model




                       e.g. Wikipedia, OpenStreetMap, assembly lines
Tested Collaborative Models (2/2)
                                   parallel model




                                                 aggregation




             e.g. voting systems in society, distributed computing




            old version (17th to mid-20th century): when computers were humans/women
            (Mathematical Tables Project, 1938-1948)
Qualitative comparison

                        Iterative                        Parallel

problem divisibility    No need to divide a complex      Complex problem needs to be
                        problem                          divided into easier pieces

optimization tradeoff   copy, emphasizing                isolation, emphasizing
                        exploitation                     exploration

quality mechanism       sequential improvement           redundancy + diversity of
                                                         opinions

side effect             path dependency effect +         useless redundancy for obvious
                        sensitivity to vandalism         decisions + problem of aggregation

Controlled Experiment: web platform




                           Interface/instruction for the Parallel model

on 3 maps with different topologies
                    (annotated by 1 UNITAR expert)




Participants used for the experiments:
              Mechanical Turk as simulator




Data Quality Metrics

                 Quality of the collective output
                 • type I errors = p(wrong annotation)
                 • type II errors = p(missing a building)
                 • Consistency

                 Analogy with the information retrieval field:
                 • Precision = p(an annotation is a building)
                 • Recall = p(a building is annotated)
                 • F-measure = score mixing recall + precision
                 • (metrics adjusted with tolerance distance)
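
The metrics above can be sketched as follows. This is a minimal illustration, not the authors' code: the greedy one-to-one matching rule, the function name `score_annotations` and the `tolerance` value are all assumptions, and the coordinates are made-up toy points. With these definitions, the type I error rate corresponds to 1 - precision and the type II error rate to 1 - recall.

```python
import math

def score_annotations(annotations, buildings, tolerance):
    """Precision, recall and F-measure of point annotations against gold-standard points.

    Assumed rule: an annotation is correct if it lies within `tolerance` of a
    gold-standard building that has not already been matched.
    """
    unmatched = list(buildings)
    true_positives = 0
    for ax, ay in annotations:
        # Look for the closest still-unmatched building within the tolerance.
        best, best_dist = None, tolerance
        for i, (bx, by) in enumerate(unmatched):
            d = math.hypot(ax - bx, ay - by)
            if d <= best_dist:
                best, best_dist = i, d
        if best is not None:
            unmatched.pop(best)
            true_positives += 1
    precision = true_positives / len(annotations) if annotations else 0.0
    recall = true_positives / len(buildings) if buildings else 0.0
    if precision + recall > 0:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return precision, recall, f_measure

# Toy data (hypothetical): 3 gold-standard buildings, 4 annotations.
gold = [(0, 0), (10, 0), (20, 0)]
marks = [(1, 0), (10, 1), (50, 50), (51, 50)]
precision, recall, f_measure = score_annotations(marks, gold, tolerance=2.0)
print(precision, recall, f_measure)
```

Here two of the four annotations fall within the tolerance of a building, so precision is 2/4 and recall is 2/3.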



Methodology for parallel model
Step 1 - collecting independent contributions:
N for (map1, map2, map3) = (121, 120, 113)

Step 2 - for each map, generating the set of groups of m = 1 to N participants
[Figure: example groups for m = 1, m = 2, m = 3]

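
Step 2 above could be sketched as below. The slides do not say whether all groups of size m were enumerated or only a sample was drawn; this sketch assumes random sampling, and `sample_groups`, `n_groups` and the seed are hypothetical names and values chosen for illustration.

```python
import random

def sample_groups(participants, m, n_groups, seed=0):
    """Draw `n_groups` random groups of `m` distinct participants (assumed sampling scheme)."""
    rng = random.Random(seed)
    return [sorted(rng.sample(participants, m)) for _ in range(n_groups)]

# N = 121 independent contributions were collected for map1.
participants = list(range(121))

# One set of groups per group size m (only m = 1..5 here, to keep the example small).
groups_by_size = {
    m: sample_groups(participants, m, n_groups=100)
    for m in range(1, 6)
}
print(len(groups_by_size[3]), groups_by_size[3][0])
```

Sampling keeps the number of groups per size constant, which makes averages across group sizes comparable.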
Step 3 - for each group: aggregating + computing quality
(example: groups of m = 2)
• Spatial clustering of points + quorum
• Compute data quality with gold standard: precision, recall, F-measure

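
The aggregation in Step 3 is only named on the slide ("spatial clustering of points + quorum"). The sketch below is one possible reading, assuming a greedy centroid-based clustering; `aggregate_group`, `radius` and `quorum` are hypothetical names, and the points are made-up toy data.

```python
import math

def aggregate_group(contributions, radius, quorum):
    """Merge the points of several participants into consensus points.

    contributions: dict mapping participant id -> list of (x, y) points.
    A cluster is kept only if at least `quorum` distinct participants
    contributed a point to it (assumed quorum rule).
    """
    clusters = []  # each cluster: {"points": [...], "who": set of participant ids}
    for who, points in contributions.items():
        for x, y in points:
            for c in clusters:
                n = len(c["points"])
                cx = sum(p[0] for p in c["points"]) / n
                cy = sum(p[1] for p in c["points"]) / n
                # Attach the point to the first cluster whose centroid is close enough.
                if math.hypot(x - cx, y - cy) <= radius:
                    c["points"].append((x, y))
                    c["who"].add(who)
                    break
            else:
                clusters.append({"points": [(x, y)], "who": {who}})
    consensus = []
    for c in clusters:
        if len(c["who"]) >= quorum:
            n = len(c["points"])
            consensus.append((sum(p[0] for p in c["points"]) / n,
                              sum(p[1] for p in c["points"]) / n))
    return consensus

# Toy group of m = 3 participants (hypothetical coordinates).
group = {
    "a": [(0.0, 0.0), (10.0, 10.0)],
    "b": [(1.0, 0.0), (30.0, 30.0)],
    "c": [(0.0, 1.0), (10.0, 11.0)],
}
consensus_points = aggregate_group(group, radius=2.0, quorum=2)
print(consensus_points)
```

The resulting consensus points can then be scored against the gold standard with precision, recall and F-measure.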
The more = the better?
(parallel model)
[Figure: avg. F-measure]

Yes, but only up to a point:
• (Adding more people won't change the consensus panel)
• Limitation of Linus' law (compared to the iterative model, e.g. OpenStreetMap)
• Wisdom != skill: we can't replace training with more people

Methodology for Iterative model




[Figure: sample of an iterative process for map3]

Collected data: n instances of about m iterations
(map1, map2, map3) = (13, 21, 25) instances of about 10 iterations

Step 2 - for each iteration, we compute the precision, recall and F-measure of all the instances
[Figure: precision, recall and F-measure per iteration]

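
One way to summarize Step 2 of the iterative methodology is to average a score per iteration across instances. The slide does not state how the instances were combined, so the averaging, the function name `average_by_iteration` and the scores below are assumptions for illustration only. Since instances have "about" the same number of iterations, the sketch tolerates instances of different lengths.

```python
from statistics import mean

def average_by_iteration(instances):
    """Average a quality score per iteration across instances.

    instances: list of lists; instances[i][k] is the score of instance i
    after iteration k. Instances may have different lengths.
    """
    longest = max(len(scores) for scores in instances)
    averages = []
    for k in range(longest):
        # Only instances that reached iteration k contribute to its average.
        values = [scores[k] for scores in instances if len(scores) > k]
        averages.append(mean(values))
    return averages

# Made-up F-measure values for 3 instances (not results from the study).
f_scores = [
    [0.2, 0.4, 0.5],
    [0.4, 0.6],
    [0.3, 0.5, 0.7],
]
avg_f = average_by_iteration(f_scores)
print(avg_f)
```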
Interpretation of results / Comparison on data quality

                               Parallel                   Iterative

Accuracy - wrong annotations   consensual results (*)     error propagation

Accuracy - missing buildings   useless redundancy on      accumulation of knowledge driving
                               obvious buildings          attention on uncovered area

Consistency                    redundancy                 naive last = best

(*) but parallel < iterative in difficult cases (map 2) (lack of consensus)

Side-objective: Measuring how the crowd spatially agrees
Method: randomly take 2 participants, measure their spatial inter-agreement
(e.g. ratio of matching points), and repeat the process N times

A way to measure the intrinsic difficulty of a task
(map 1 = easy, map 2 = quite hard)

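
The pairwise agreement method above can be sketched as follows. The exact matching rule is not given on the slide; this version assumes a point "matches" when the other participant has a point within a `tolerance` distance, and `matching_ratio`, `crowd_agreement` and `n_pairs` are hypothetical names.

```python
import math
import random

def matching_ratio(points_a, points_b, tolerance):
    """Share of points that have a counterpart within `tolerance` in the other set."""
    if not points_a and not points_b:
        return 1.0

    def matched(src, dst):
        return sum(
            1 for (x, y) in src
            if any(math.hypot(x - u, y - v) <= tolerance for (u, v) in dst)
        )

    total = len(points_a) + len(points_b)
    return (matched(points_a, points_b) + matched(points_b, points_a)) / total

def crowd_agreement(contributions, tolerance, n_pairs, seed=0):
    """Mean matching ratio over `n_pairs` randomly drawn pairs of participants."""
    rng = random.Random(seed)
    ids = list(contributions)
    scores = []
    for _ in range(n_pairs):
        a, b = rng.sample(ids, 2)  # two distinct participants
        scores.append(matching_ratio(contributions[a], contributions[b], tolerance))
    return sum(scores) / len(scores)

# Toy example (hypothetical points): one of two points matches in each direction.
pair_score = matching_ratio([(0, 0), (5, 5)], [(0, 1), (9, 9)], tolerance=2.0)
print(pair_score)
```

Under this reading, a low average ratio on a map would indicate that participants disagree on where the buildings are, which is the sense in which the score reflects task difficulty.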
Future tracks
Impact of the organization beyond data quality:
• Energy / footprint to collectively solve a problem
• Participation sustainability
• On individual behavior (skill learning & enjoyment)
Skill complementarity:
Is the best group of 3 people made up of the best 3 people at the individual level? Data says no!
Other symbolic organisations / mechanisms:
• Human cellular automata (cell = 1 person, who resubmits a task at time t after being influenced by peers' results generated at time t-1)
• Integration of game design / gamification

Crowdsourcing satellite imagery (Talk at Giscience2012)
