SlideShare a Scribd company logo
Linking Smart Cities Datasets
with Human Computation:
the case of UrbanMatch
Irene Celino, Simone Contessa, Marta Corubolo,
Daniele Dell’Aglio, Emanuele Della Valle,
Stefano Fumeo and Thorsten Krüger

CEFRIEL – Politecnico di Milano – SIEMENS
UrbanMatch - ISWC 2012
Citizen Computation
Human Computation

Citizen Science

exploiting human capabilities
to solve computational tasks
difficult for machines

exploiting volunteers
to collect scientific data or
to conduct experiments
"in the world"

Citizen Computation
exploiting human capabilities
to contribute to a mixed
computational system
by living "in the world"

UrbanMatch - ISWC 2012

2

Boston, 13th November 2012
UrbanMatch: Citizen Computation GWAP
Photos from social media

Places & POIs from
OpenStreetMap

e.g. Duomo

e.g. equestrian statue

Wikimedia
Commons

?

?
?

Which photos actually represent the urban POIs?
Can we link POIs to their most representative photos?
UrbanMatch - ISWC 2012

3

Boston, 13th November 2012
The Record Linkage Problem
Γ=A×B



B

Comparison space




A





M (set of matches) and
U (set of non-matches)

Each link γ has a score 𝑠 𝛾


B

Γ= 𝐴× 𝐵

Purpose is to split Γ in




[Fellegi&Sunter,1969]

𝑠 𝛾 = 𝑃 𝛾 ∈ Γ 𝑀 /𝑃(𝛾 ∈ Γ|𝑈)

Partitioning based on
lower/upper thresholds:


UrbanMatch - ISWC 2012

𝑠 𝛾 < 𝐿𝑂𝑊𝐸𝑅 ⇒ 𝛾 ∈ 𝑈



∈M
∈U
∈C

𝑠 𝛾 > 𝑈𝑃𝑃𝐸𝑅 ⇒ 𝛾 ∈ 𝑀



A

𝐿𝑂𝑊𝐸𝑅 ≤ 𝑠 𝛾 ≤ 𝑈𝑃𝑃𝐸𝑅 ⇒ 𝛾 ∈ 𝐶
(candidate links to be assessed by experts)

4

Boston, 13th November 2012
UrbanMatch Linkage Problem
B (photos)
A (POIs)
Piazza
del Duomo






Piazza della
Scala

A contains POIs
B contains photos
Γ contains POI-photo links
"Playable places" group
POIs  partitioning of Γ
(problem space reduction)



UrbanMatch links

Input data in UrbanMatch:






UrbanMatch - ISWC 2012

5

34 POIs in 14 playable places
11,089 photos
196 links in M
382 links in U
37,413 links in C
Boston, 13th November 2012
Showing links to UrbanMatch players


The POI-photo links to be evaluated could be shown
directly to players...
Santa Maria
delle Grazie



correct?
???

...but the player could not know the name of the POI
On the other hand, the player can easily "see"

(Santa Maria
delle Grazie)

correct?
NO!

(Sant'Ambrogio)

thus we use a "sure" photo as proxy for the POI, thus the
UrbanMatch game is designed as a photo-pairing challenge
UrbanMatch - ISWC 2012

6

Boston, 13th November 2012
UrbanMatch level definition
Set B of photos

Set A of POIs

a

b

c

POI 1

d

(Duomo)

h

POI 2
(statue)






g

f

e

Photos d and g depict POI 1 (and do not depict POI 2)
Photos c and h depict POI 2 (and do not depict POI 1)
Photos b and f do not depict POI 1 nor POI 2 (distractors)
Photos a and e maybe depict POI 1 and/or POI 2
UrbanMatch - ISWC 2012

7

Boston, 13th November 2012
Feedbacks to UrbanMatch players
If the selected pair
is correct (two sure
photos) or plausible,
UrbanMatch gives a
positive feedback

d

e

b

If the selected pair is
wrong (at least one
selected photo is a
"distractor" option),
UrbanMatch gives a
negative feedback
(and counts the player's mistake)

f
UrbanMatch - ISWC 2012

8

Boston, 13th November 2012
UrbanMatch: gameplay
Video at: http://bit.ly/um-video

UrbanMatch - ISWC 2012

9

Boston, 13th November 2012
Link score in UrbanMatch


Effect of players' evidences on the score sγ






If there is a positive evidence (an UrbanMatch player says that the
link γ is correct), sγ is increased at most by Kpos
If there is a negative evidence (an UrbanMatch player says that the
link γ is incorrect), sγ is decreased at most by Kneg
Each evidence is weighted with a factor that represents UrbanMatch
player's reliability

𝑟𝑝 =

𝜀
− 𝑝 2
𝑒

where 𝜀 𝑝 is the number of errors made by the player, thus 𝑟 𝑝 ∈ [0. . 1]
(reliability is 1 if the player makes no mistake, 0.6 if he makes 1 error,
almost zero if he makes 6 mistakes or more)


For each link γ in C, its score varies as follows


Positive evidence:

𝑠′ 𝛾 = 𝑠 𝛾 + 𝐾 𝑝𝑜𝑠 ∗ 𝑟 𝑝



Negative evidence:

𝑠′ 𝛾 = 𝑠 𝛾 − 𝐾 𝑛𝑒𝑔 ∗ 𝑟 𝑝

UrbanMatch - ISWC 2012

10

Boston, 13th November 2012
Evaluating UrbanMatch precision


Accuracy definition
(metrics specific to Data Quality)
candidate links correctly assessed to
be either trustable or incorrect

game ability to correctly
assess candidate links

𝐴𝑐𝑐𝑢𝑟𝑎𝑐𝑦 =

𝐶𝑀 − 𝐹𝑃 + (𝐶𝑈 − 𝐹𝑁)
𝐶𝑀 + 𝐶𝑈
candidate links assessed to be
either trustable or incorrect







CM = right links, i.e. candidate links moved from C to M (because 𝑠 𝛾 > 𝑈𝑃𝑃𝐸𝑅)
CU = wrong links, i.e. candidate links moved from C to U (because 𝑠 𝛾 < 𝐿𝑂𝑊𝐸𝑅)
FP = false positives (candidate links assessed as trustable but actually incorrect)
FN = false negatives (candidate links assessed as incorrect but actually trustable)

accuracy requires ground truth  computed only on a
manually analyzed set of links for evaluation's sake !
UrbanMatch - ISWC 2012

11

Boston, 13th November 2012
Evaluating UrbanMatch as a GWAP


Throughput and ALP definition
(metrics specific to Games with a Purpose)

game ability to assess
candidate links

game ability to
engage players

𝑆𝑜𝑙𝑣𝑒𝑑𝑃𝑟𝑜𝑏𝑙𝑒𝑚𝑠
𝑇ℎ𝑟𝑜𝑢𝑔ℎ𝑝𝑢𝑡 =
𝑃𝑙𝑎𝑦𝑒𝑑𝑇𝑖𝑚𝑒
𝑃𝑙𝑎𝑦𝑒𝑑𝑇𝑖𝑚𝑒
𝐴𝐿𝑃 =
𝐴𝑐𝑡𝑖𝑣𝑒𝑃𝑙𝑎𝑦𝑒𝑟𝑠

where 𝑆𝑜𝑙𝑣𝑒𝑑𝑃𝑟𝑜𝑏𝑙𝑒𝑚𝑠 = 𝐶𝑀 + 𝐶𝑈 in which:


CM = right links, i.e. candidate links moved from C to M
(because 𝑠 𝛾 > 𝑈𝑃𝑃𝐸𝑅)



CU = wrong links, i.e. candidate links moved from C to U
(because 𝑠 𝛾 < 𝐿𝑂𝑊𝐸𝑅)

UrbanMatch - ISWC 2012

12

Boston, 13th November 2012
Setting the thresholds for link score 𝒔 𝜸
Accuracy and Throughput as a function of 𝐿𝑂𝑊𝐸𝑅



LOWER

0.05

0.10

0.15

0.20

CU

321

348

1152

1216

FN

4

5

7

8

98.75 %

98.56 %

99.39 %

99.34 %

108.08

117.17

387.87

409.42

Accuracy
Throughput

Accuracy and Throughput as a function of 𝑈𝑃𝑃𝐸𝑅



UPPER

0.60

0.65

0.70

0.75

0.80

0.85

0.90

CM

227

225

68

65

61

60

49

FP

48

47

4

4

3

3

1

Accuracy
Throughput

78.85 % 79.11 % 94.11 % 95.38 % 95.08 % 95.00 % 97.95 %
76.43

UrbanMatch - ISWC 2012

75.75

22.89

13

21.88

20.53

20.20

Boston, 13th November 2012

16.49
UrbanMatch evaluation results
Accuracy, Throughput and ALP results
 Evaluation data






Accuracy (on the manually-checked subset)




54 unique players, 290 games (781 game levels)
2006 input links, 1284 assessed (trustable/incorrect)
upper threshold 0.70, lower threshold 0.20

99.06%

 very good result 

Throughput and ALP (on all processed links)



485 links/h
3m 17s

UrbanMatch - ISWC 2012

 very good result 
 one-time game, needs improvement 

14

Boston, 13th November 2012
Evaluation UrbanMatch as a whole
User-based evaluation: actual "engagement"





Based on game evaluation literature and integrated with our
research-specific questions
Questionnaire at http://bit.ly/um-survey
Findings:

UrbanMatch - ISWC 2012

15

Boston, 13th November 2012
Conclusions




Citizen Computation games are effective to leverage
linking capabilities of "humans"
Game is to be designed to keep in consideration users'
capabilities and usage context:





(lack of) background knowledge
small mobile-phone screen

Different tasks to be solved need different game "missions"




Linking task  pairing mission
Verifying task  rating/selecting mission
Collecting task  selecting/editing mission


 Advertisement  try out Urbanopoly game here in Boston,
to collect and assess geo-spatial linked data
participant to Semantic Web Challenge http://bit.ly/urbanopoly

UrbanMatch - ISWC 2012

16

Boston, 13th November 2012
Thanks for your attention!
Any question?

Irene Celino – CEFRIEL, ICT Institute Politecnico di Milano
email: Irene.Celino@cefriel.it – web: http://swa.cefriel.it
UrbanMatch - ISWC 2012

More Related Content

Similar to Linking Smart Cities Datasets with Human Computation: the case of UrbanMatch

Real-world News Recommender Systems
Real-world News Recommender SystemsReal-world News Recommender Systems
Real-world News Recommender Systems
kib_83
 
Recommender Systems with Implicit Feedback Challenges, Techniques, and Applic...
Recommender Systems with Implicit Feedback Challenges, Techniques, and Applic...Recommender Systems with Implicit Feedback Challenges, Techniques, and Applic...
Recommender Systems with Implicit Feedback Challenges, Techniques, and Applic...
NAVER Engineering
 
EXPERIMENTAL STUDY OF MINUTIAE BASED ALGORITHM FOR FINGERPRINT MATCHING
EXPERIMENTAL STUDY OF MINUTIAE BASED ALGORITHM FOR FINGERPRINT MATCHING EXPERIMENTAL STUDY OF MINUTIAE BASED ALGORITHM FOR FINGERPRINT MATCHING
EXPERIMENTAL STUDY OF MINUTIAE BASED ALGORITHM FOR FINGERPRINT MATCHING
cscpconf
 
Experimental study of minutiae based algorithm for fingerprint matching
Experimental study of minutiae based algorithm for fingerprint matchingExperimental study of minutiae based algorithm for fingerprint matching
Experimental study of minutiae based algorithm for fingerprint matching
csandit
 
An analytical framework for formulating metrics for evaluating multi-dimensio...
An analytical framework for formulating metrics for evaluating multi-dimensio...An analytical framework for formulating metrics for evaluating multi-dimensio...
An analytical framework for formulating metrics for evaluating multi-dimensio...
Rei Takami
 
How can we use Data Science to measure people's perception of safety
How can we use Data Science to measure people's perception of safetyHow can we use Data Science to measure people's perception of safety
How can we use Data Science to measure people's perception of safety
Institute of Contemporary Sciences
 

Similar to Linking Smart Cities Datasets with Human Computation: the case of UrbanMatch (6)

Real-world News Recommender Systems
Real-world News Recommender SystemsReal-world News Recommender Systems
Real-world News Recommender Systems
 
Recommender Systems with Implicit Feedback Challenges, Techniques, and Applic...
Recommender Systems with Implicit Feedback Challenges, Techniques, and Applic...Recommender Systems with Implicit Feedback Challenges, Techniques, and Applic...
Recommender Systems with Implicit Feedback Challenges, Techniques, and Applic...
 
EXPERIMENTAL STUDY OF MINUTIAE BASED ALGORITHM FOR FINGERPRINT MATCHING
EXPERIMENTAL STUDY OF MINUTIAE BASED ALGORITHM FOR FINGERPRINT MATCHING EXPERIMENTAL STUDY OF MINUTIAE BASED ALGORITHM FOR FINGERPRINT MATCHING
EXPERIMENTAL STUDY OF MINUTIAE BASED ALGORITHM FOR FINGERPRINT MATCHING
 
Experimental study of minutiae based algorithm for fingerprint matching
Experimental study of minutiae based algorithm for fingerprint matchingExperimental study of minutiae based algorithm for fingerprint matching
Experimental study of minutiae based algorithm for fingerprint matching
 
An analytical framework for formulating metrics for evaluating multi-dimensio...
An analytical framework for formulating metrics for evaluating multi-dimensio...An analytical framework for formulating metrics for evaluating multi-dimensio...
An analytical framework for formulating metrics for evaluating multi-dimensio...
 
How can we use Data Science to measure people's perception of safety
How can we use Data Science to measure people's perception of safetyHow can we use Data Science to measure people's perception of safety
How can we use Data Science to measure people's perception of safety
 

More from PlanetData Network of Excellence

Dl2014 slides
Dl2014 slidesDl2014 slides
A Contextualized Knowledge Repository for Open Data about Trentino
A Contextualized Knowledge Repository for Open Data about TrentinoA Contextualized Knowledge Repository for Open Data about Trentino
A Contextualized Knowledge Repository for Open Data about Trentino
PlanetData Network of Excellence
 
On Leveraging Crowdsourcing Techniques for Schema Matching Networks
On Leveraging Crowdsourcing Techniques for Schema Matching NetworksOn Leveraging Crowdsourcing Techniques for Schema Matching Networks
On Leveraging Crowdsourcing Techniques for Schema Matching Networks
PlanetData Network of Excellence
 
Privacy-Preserving Schema Reuse
Privacy-Preserving Schema ReusePrivacy-Preserving Schema Reuse
Privacy-Preserving Schema Reuse
PlanetData Network of Excellence
 
Pay-as-you-go Reconciliation in Schema Matching Networks
Pay-as-you-go Reconciliation in Schema Matching NetworksPay-as-you-go Reconciliation in Schema Matching Networks
Pay-as-you-go Reconciliation in Schema Matching Networks
PlanetData Network of Excellence
 
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstream
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstreamDemo: tablet-based visualisation of transport data in Madrid using SPARQLstream
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstream
PlanetData Network of Excellence
 
On the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream ProcessingOn the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream Processing
PlanetData Network of Excellence
 
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
PlanetData Network of Excellence
 
SciQL, Bridging the Gap between Science and Relational DBMS
SciQL, Bridging the Gap between Science and Relational DBMSSciQL, Bridging the Gap between Science and Relational DBMS
SciQL, Bridging the Gap between Science and Relational DBMS
PlanetData Network of Excellence
 
Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce
Scalable Nonmonotonic Reasoning over RDF Data Using MapReduceScalable Nonmonotonic Reasoning over RDF Data Using MapReduce
Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce
PlanetData Network of Excellence
 
Data and Knowledge Evolution
Data and Knowledge Evolution  Data and Knowledge Evolution
Data and Knowledge Evolution
PlanetData Network of Excellence
 
Evolution of Workflow Provenance Information in the Presence of Custom Infere...
Evolution of Workflow Provenance Information in the Presence of Custom Infere...Evolution of Workflow Provenance Information in the Presence of Custom Infere...
Evolution of Workflow Provenance Information in the Presence of Custom Infere...
PlanetData Network of Excellence
 
Access Control for RDF graphs using Abstract Models
Access Control for RDF graphs using Abstract ModelsAccess Control for RDF graphs using Abstract Models
Access Control for RDF graphs using Abstract Models
PlanetData Network of Excellence
 
Arrays in Databases, the next frontier?
Arrays in Databases, the next frontier?Arrays in Databases, the next frontier?
Arrays in Databases, the next frontier?
PlanetData Network of Excellence
 
Abstract Access Control Model for Dynamic RDF Datasets
Abstract Access Control Model for Dynamic RDF DatasetsAbstract Access Control Model for Dynamic RDF Datasets
Abstract Access Control Model for Dynamic RDF Datasets
PlanetData Network of Excellence
 
Towards Parallel Nonmonotonic Reasoning with Billions of Facts
Towards Parallel Nonmonotonic Reasoning with Billions of FactsTowards Parallel Nonmonotonic Reasoning with Billions of Facts
Towards Parallel Nonmonotonic Reasoning with Billions of Facts
PlanetData Network of Excellence
 
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
PlanetData Network of Excellence
 
Heuristic based Query Optimisation for SPARQL
Heuristic based Query Optimisation for SPARQLHeuristic based Query Optimisation for SPARQL
Heuristic based Query Optimisation for SPARQL
PlanetData Network of Excellence
 
Building a Front End for a Sensor Data Cloud
Building a Front End for a Sensor Data CloudBuilding a Front End for a Sensor Data Cloud
Building a Front End for a Sensor Data Cloud
PlanetData Network of Excellence
 
OntoGen Extension for Exploring Image Collections
OntoGen Extension for Exploring Image CollectionsOntoGen Extension for Exploring Image Collections
OntoGen Extension for Exploring Image Collections
PlanetData Network of Excellence
 

More from PlanetData Network of Excellence (20)

Dl2014 slides
Dl2014 slidesDl2014 slides
Dl2014 slides
 
A Contextualized Knowledge Repository for Open Data about Trentino
A Contextualized Knowledge Repository for Open Data about TrentinoA Contextualized Knowledge Repository for Open Data about Trentino
A Contextualized Knowledge Repository for Open Data about Trentino
 
On Leveraging Crowdsourcing Techniques for Schema Matching Networks
On Leveraging Crowdsourcing Techniques for Schema Matching NetworksOn Leveraging Crowdsourcing Techniques for Schema Matching Networks
On Leveraging Crowdsourcing Techniques for Schema Matching Networks
 
Privacy-Preserving Schema Reuse
Privacy-Preserving Schema ReusePrivacy-Preserving Schema Reuse
Privacy-Preserving Schema Reuse
 
Pay-as-you-go Reconciliation in Schema Matching Networks
Pay-as-you-go Reconciliation in Schema Matching NetworksPay-as-you-go Reconciliation in Schema Matching Networks
Pay-as-you-go Reconciliation in Schema Matching Networks
 
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstream
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstreamDemo: tablet-based visualisation of transport data in Madrid using SPARQLstream
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstream
 
On the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream ProcessingOn the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream Processing
 
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
 
SciQL, Bridging the Gap between Science and Relational DBMS
SciQL, Bridging the Gap between Science and Relational DBMSSciQL, Bridging the Gap between Science and Relational DBMS
SciQL, Bridging the Gap between Science and Relational DBMS
 
Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce
Scalable Nonmonotonic Reasoning over RDF Data Using MapReduceScalable Nonmonotonic Reasoning over RDF Data Using MapReduce
Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce
 
Data and Knowledge Evolution
Data and Knowledge Evolution  Data and Knowledge Evolution
Data and Knowledge Evolution
 
Evolution of Workflow Provenance Information in the Presence of Custom Infere...
Evolution of Workflow Provenance Information in the Presence of Custom Infere...Evolution of Workflow Provenance Information in the Presence of Custom Infere...
Evolution of Workflow Provenance Information in the Presence of Custom Infere...
 
Access Control for RDF graphs using Abstract Models
Access Control for RDF graphs using Abstract ModelsAccess Control for RDF graphs using Abstract Models
Access Control for RDF graphs using Abstract Models
 
Arrays in Databases, the next frontier?
Arrays in Databases, the next frontier?Arrays in Databases, the next frontier?
Arrays in Databases, the next frontier?
 
Abstract Access Control Model for Dynamic RDF Datasets
Abstract Access Control Model for Dynamic RDF DatasetsAbstract Access Control Model for Dynamic RDF Datasets
Abstract Access Control Model for Dynamic RDF Datasets
 
Towards Parallel Nonmonotonic Reasoning with Billions of Facts
Towards Parallel Nonmonotonic Reasoning with Billions of FactsTowards Parallel Nonmonotonic Reasoning with Billions of Facts
Towards Parallel Nonmonotonic Reasoning with Billions of Facts
 
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
 
Heuristic based Query Optimisation for SPARQL
Heuristic based Query Optimisation for SPARQLHeuristic based Query Optimisation for SPARQL
Heuristic based Query Optimisation for SPARQL
 
Building a Front End for a Sensor Data Cloud
Building a Front End for a Sensor Data CloudBuilding a Front End for a Sensor Data Cloud
Building a Front End for a Sensor Data Cloud
 
OntoGen Extension for Exploring Image Collections
OntoGen Extension for Exploring Image CollectionsOntoGen Extension for Exploring Image Collections
OntoGen Extension for Exploring Image Collections
 

Recently uploaded

AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
IndexBug
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc
 
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development ProvidersYour One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
akankshawande
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
Wouter Lemaire
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
panagenda
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
Matthew Sinclair
 
CAKE: Sharing Slices of Confidential Data on Blockchain
CAKE: Sharing Slices of Confidential Data on BlockchainCAKE: Sharing Slices of Confidential Data on Blockchain
CAKE: Sharing Slices of Confidential Data on Blockchain
Claudio Di Ciccio
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
Brandon Minnick, MBA
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
shyamraj55
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
Edge AI and Vision Alliance
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
DianaGray10
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
Pixlogix Infotech
 
OpenID AuthZEN Interop Read Out - Authorization
OpenID AuthZEN Interop Read Out - AuthorizationOpenID AuthZEN Interop Read Out - Authorization
OpenID AuthZEN Interop Read Out - Authorization
David Brossard
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
Zilliz
 
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdfAI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
Techgropse Pvt.Ltd.
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Speck&Tech
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
Octavian Nadolu
 

Recently uploaded (20)

AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
 
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development ProvidersYour One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
CAKE: Sharing Slices of Confidential Data on Blockchain
CAKE: Sharing Slices of Confidential Data on BlockchainCAKE: Sharing Slices of Confidential Data on Blockchain
CAKE: Sharing Slices of Confidential Data on Blockchain
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
 
OpenID AuthZEN Interop Read Out - Authorization
OpenID AuthZEN Interop Read Out - AuthorizationOpenID AuthZEN Interop Read Out - Authorization
OpenID AuthZEN Interop Read Out - Authorization
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
 
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdfAI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
 

Linking Smart Cities Datasets with Human Computation: the case of UrbanMatch

  • 1. Linking Smart Cities Datasets with Human Computation: the case of UrbanMatch Irene Celino, Simone Contessa, Marta Corubolo, Daniele Dell’Aglio, Emanuele Della Valle, Stefano Fumeo and Thorsten Krüger CEFRIEL – Politecnico di Milano – SIEMENS UrbanMatch - ISWC 2012
  • 2. Citizen Computation Human Computation Citizen Science exploiting human capabilities to solve computational tasks difficult for machines exploiting volunteers to collect scientific data or to conduct experiments "in the world" Citizen Computation exploiting human capabilities to contribute to a mixed computational system by living "in the world" UrbanMatch - ISWC 2012 2 Boston, 13th November 2012
  • 3. UrbanMatch: Citizen Computation GWAP Photos from social media Places & POIs from OpenStreetMap e.g. Duomo e.g. equestrian statue Wikimedia Commons ? ? ? Which photos actually represent the urban POIs? Can we link POIs to their most representative photos? UrbanMatch - ISWC 2012 3 Boston, 13th November 2012
  • 4. The Record Linkage Problem Γ=A×B  B Comparison space   A   M (set of matches) and U (set of non-matches) Each link γ has a score 𝑠 𝛾  B Γ= 𝐴× 𝐵 Purpose is to split Γ in   [Fellegi&Sunter,1969] 𝑠 𝛾 = 𝑃 𝛾 ∈ Γ 𝑀 /𝑃(𝛾 ∈ Γ|𝑈) Partitioning based on lower/upper thresholds:  UrbanMatch - ISWC 2012 𝑠 𝛾 < 𝐿𝑂𝑊𝐸𝑅 ⇒ 𝛾 ∈ 𝑈  ∈M ∈U ∈C 𝑠 𝛾 > 𝑈𝑃𝑃𝐸𝑅 ⇒ 𝛾 ∈ 𝑀  A 𝐿𝑂𝑊𝐸𝑅 ≤ 𝑠 𝛾 ≤ 𝑈𝑃𝑃𝐸𝑅 ⇒ 𝛾 ∈ 𝐶 (candidate links to be assessed by experts) 4 Boston, 13th November 2012
  • 5. UrbanMatch Linkage Problem B (photos) A (POIs) Piazza del Duomo     Piazza della Scala A contains POIs B contains photos Γ contains POI-photo links "Playable places" group POIs  partitioning of Γ (problem space reduction)  UrbanMatch links Input data in UrbanMatch:      UrbanMatch - ISWC 2012 5 34 POIs in 14 playable places 11,089 photos 196 links in M 382 links in U 37,413 links in C Boston, 13th November 2012
  • 6. Showing links to UrbanMatch players  The POI-photo links to be evaluated could be shown directly to players... Santa Maria delle Grazie  correct? ??? ...but the player could not know the name of the POI On the other hand, the player can easily "see" (Santa Maria delle Grazie) correct? NO! (Sant'Ambrogio) thus we use a "sure" photo as proxy for the POI, thus the UrbanMatch game is designed as a photo-pairing challenge UrbanMatch - ISWC 2012 6 Boston, 13th November 2012
  • 7. UrbanMatch level definition Set B of photos Set A of POIs a b c POI 1 d (Duomo) h POI 2 (statue)     g f e Photos d and g depict POI 1 (and do not depict POI 2) Photos c and h depict POI 2 (and do not depict POI 1) Photos b and f do not depict POI 1 nor POI 2 (distractors) Photos a and e maybe depict POI 1 and/or POI 2 UrbanMatch - ISWC 2012 7 Boston, 13th November 2012
  • 8. Feedbacks to UrbanMatch players If the selected pair is correct (two sure photos) or plausible, UrbanMatch gives a positive feedback d e b If the selected pair is wrong (at least one selected photo is a "distractor" option), UrbanMatch gives a negative feedback (and counts the player's mistake) f UrbanMatch - ISWC 2012 8 Boston, 13th November 2012
  • 9. UrbanMatch: gameplay Video at: http://bit.ly/um-video UrbanMatch - ISWC 2012 9 Boston, 13th November 2012
  • 10. Link score in UrbanMatch  Effect of players' evidences on the score sγ    If there is a positive evidence (an UrbanMatch player says that the link γ is correct), sγ is increased at most by Kpos If there is a negative evidence (an UrbanMatch player says that the link γ is incorrect), sγ is decreased at most by Kneg Each evidence is weighted with a factor that represents UrbanMatch player's reliability 𝑟𝑝 = 𝜀 − 𝑝 2 𝑒 where 𝜀 𝑝 is the number of errors made by the player, thus 𝑟 𝑝 ∈ [0. . 1] (reliability is 1 if the player makes no mistake, 0.6 if he makes 1 error, almost zero if he makes 6 mistakes or more)  For each link γ in C, its score varies as follows  Positive evidence: 𝑠′ 𝛾 = 𝑠 𝛾 + 𝐾 𝑝𝑜𝑠 ∗ 𝑟 𝑝  Negative evidence: 𝑠′ 𝛾 = 𝑠 𝛾 − 𝐾 𝑛𝑒𝑔 ∗ 𝑟 𝑝 UrbanMatch - ISWC 2012 10 Boston, 13th November 2012
  • 11. Evaluating UrbanMatch precision  Accuracy definition (metrics specific to Data Quality) candidate links correctly assessed to be either trustable or incorrect game ability to correctly assess candidate links 𝐴𝑐𝑐𝑢𝑟𝑎𝑐𝑦 = 𝐶𝑀 − 𝐹𝑃 + (𝐶𝑈 − 𝐹𝑁) 𝐶𝑀 + 𝐶𝑈 candidate links assessed to be either trustable or incorrect     CM = right links, i.e. candidate links moved from C to M (because 𝑠 𝛾 > 𝑈𝑃𝑃𝐸𝑅) CU = wrong links, i.e. candidate links moved from C to U (because 𝑠 𝛾 < 𝐿𝑂𝑊𝐸𝑅) FP = false positives (candidate links assessed as trustable but actually incorrect) FN = false negatives (candidate links assessed as incorrect but actually trustable) accuracy requires ground truth  computed only on a manually analyzed set of links for evaluation's sake ! UrbanMatch - ISWC 2012 11 Boston, 13th November 2012
  • 12. Evaluating UrbanMatch as a GWAP  Throughput and ALP definition (metrics specific to Games with a Purpose) game ability to assess candidate links game ability to engage players 𝑆𝑜𝑙𝑣𝑒𝑑𝑃𝑟𝑜𝑏𝑙𝑒𝑚𝑠 𝑇ℎ𝑟𝑜𝑢𝑔ℎ𝑝𝑢𝑡 = 𝑃𝑙𝑎𝑦𝑒𝑑𝑇𝑖𝑚𝑒 𝑃𝑙𝑎𝑦𝑒𝑑𝑇𝑖𝑚𝑒 𝐴𝐿𝑃 = 𝐴𝑐𝑡𝑖𝑣𝑒𝑃𝑙𝑎𝑦𝑒𝑟𝑠 where 𝑆𝑜𝑙𝑣𝑒𝑑𝑃𝑟𝑜𝑏𝑙𝑒𝑚𝑠 = 𝐶𝑀 + 𝐶𝑈 in which:  CM = right links, i.e. candidate links moved from C to M (because 𝑠 𝛾 > 𝑈𝑃𝑃𝐸𝑅)  CU = wrong links, i.e. candidate links moved from C to U (because 𝑠 𝛾 < 𝐿𝑂𝑊𝐸𝑅) UrbanMatch - ISWC 2012 12 Boston, 13th November 2012
  • 13. Setting the thresholds for link score 𝒔 𝜸 Accuracy and Throughput as a function of 𝐿𝑂𝑊𝐸𝑅  LOWER 0.05 0.10 0.15 0.20 CU 321 348 1152 1216 FN 4 5 7 8 98.75 % 98.56 % 99.39 % 99.34 % 108.08 117.17 387.87 409.42 Accuracy Throughput Accuracy and Throughput as a function of 𝑈𝑃𝑃𝐸𝑅  UPPER 0.60 0.65 0.70 0.75 0.80 0.85 0.90 CM 227 225 68 65 61 60 49 FP 48 47 4 4 3 3 1 Accuracy Throughput 78.85 % 79.11 % 94.11 % 95.38 % 95.08 % 95.00 % 97.95 % 76.43 UrbanMatch - ISWC 2012 75.75 22.89 13 21.88 20.53 20.20 Boston, 13th November 2012 16.49
  • 14. UrbanMatch evaluation results Accuracy, Throughput and ALP results  Evaluation data     Accuracy (on the manually-checked subset)   54 unique players, 290 games (781 game levels) 2006 input links, 1284 assessed (trustable/incorrect) upper threshold 0.70, lower threshold 0.20 99.06%  very good result  Throughput and ALP (on all processed links)   485 links/h 3m 17s UrbanMatch - ISWC 2012  very good result   one-time game, needs improvement  14 Boston, 13th November 2012
  • 15. Evaluation UrbanMatch as a whole User-based evaluation: actual "engagement"    Based on game evaluation literature and integrated with our research-specific questions Questionnaire at http://bit.ly/um-survey Findings: UrbanMatch - ISWC 2012 15 Boston, 13th November 2012
  • 16. Conclusions   Citizen Computation games are effective to leverage linking capabilities of "humans" Game is to be designed to keep in consideration users' capabilities and usage context:    (lack of) background knowledge small mobile-phone screen Different tasks to be solved need different game "missions"    Linking task  pairing mission Verifying task  rating/selecting mission Collecting task  selecting/editing mission   Advertisement  try out Urbanopoly game here in Boston, to collect and assess geo-spatial linked data participant to Semantic Web Challenge http://bit.ly/urbanopoly UrbanMatch - ISWC 2012 16 Boston, 13th November 2012
  • 17. Thanks for your attention! Any question? Irene Celino – CEFRIEL, ICT Institute Politecnico di Milano email: Irene.Celino@cefriel.it – web: http://swa.cefriel.it UrbanMatch - ISWC 2012