EKAW 2016
ACRyLIQ: Leveraging DBpedia for Adaptive
Crowdsourcing in Linked Data Quality Assessment
Umair ul Hassan, Amrapali Zaveri, Edgard Marx, Edward Curry, Jens Lehmann
Background
• Linked Data Quality Assessment (LDQA)
– Incomplete, inaccurate, inconsistent data in LOD
• Crowdsourcing LDQA
1. Generate micro-tasks to assess the quality of a Linked Data dataset
2. Recruit crowd workers to perform LDQA tasks
3. Update the dataset based on crowd answers
Zaveri, Amrapali, et al. "Quality assessment for linked data: A survey." Semantic Web 7.1 (2015): 63-93.
Acosta, Maribel, et al. "Crowdsourcing linked data quality assessment." International Semantic Web Conference. Springer Berlin Heidelberg, 2013.
[Figure: crowdsourcing loop. The linked dataset yields LDQA tasks for crowd workers; their answers produce updates to the dataset.]
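The slide does not say how crowd answers are combined before the dataset is updated. As a minimal sketch, assuming simple majority voting (a common baseline, not necessarily what the authors use), the answers for each task could be aggregated as follows. The function name `aggregate_answers` and the `(task_id, worker_id, answer)` tuple layout are hypothetical.

```python
from collections import Counter


def aggregate_answers(answers):
    """Return the majority answer per task, or None when the top two answers tie.

    `answers` is an iterable of (task_id, worker_id, answer) tuples.
    This layout is an assumption made for illustration only.
    """
    grouped = {}
    for task_id, _worker_id, answer in answers:
        grouped.setdefault(task_id, []).append(answer)

    decisions = {}
    for task_id, votes in grouped.items():
        ranked = Counter(votes).most_common(2)
        if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
            decisions[task_id] = None  # tie: leave the task undecided
        else:
            decisions[task_id] = ranked[0][0]
    return decisions
```

A task that ends up as `None` would need more answers, or an expert check, before the dataset is updated.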
Research Challenge
• Workers have varying reliability and expertise depending on the domain and topics of a dataset
• How can we estimate the reliability of crowd workers to achieve high accuracy of LDQA tasks through adaptive task assignment?
[Figure: a linked dataset feeding crowdsourced LDQA tasks.]
Existing Approach
• Use experts to create gold-standard tasks (GST)
• Estimate worker reliability and assign tasks
[Figure: domain experts supply correct responses for gold-standard LDQA tasks drawn from the linked dataset. Steps: 1) GST selection, 2) task assignment of the crowdsourced LDQA tasks.]
Proposed Approach
• Leverage DBpedia to generate knowledge-based questions (KBQs)
• Estimate worker reliability and assign tasks
[Figure: facts (i.e. triples) are turned into KBQs. Steps: 1) KBQ selection, 2) task assignment of the crowdsourced LDQA tasks from the linked dataset.]
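The two steps above (estimate reliability from KBQ answers, then assign tasks) can be illustrated with a small sketch. It is not the ACRyLIQ algorithm, whose details are not given on the slides. It assumes a smoothed per-topic accuracy as the reliability estimate and a greedy choice of the most reliable available worker. All names (`estimate_reliability`, `assign_task`) and the prior values are hypothetical.

```python
from collections import defaultdict


def estimate_reliability(kbq_results, prior_correct=1.0, prior_total=2.0):
    """Estimate per-topic reliability from KBQ outcomes.

    `kbq_results` is an iterable of (worker_id, topic, is_correct) tuples.
    Returns {(worker_id, topic): smoothed accuracy}. The additive prior is an
    assumption that keeps estimates away from 0 and 1 when a worker has
    answered only a few KBQs.
    """
    correct = defaultdict(float)
    total = defaultdict(float)
    for worker_id, topic, is_correct in kbq_results:
        total[(worker_id, topic)] += 1
        if is_correct:
            correct[(worker_id, topic)] += 1
    return {
        key: (correct[key] + prior_correct) / (total[key] + prior_total)
        for key in total
    }


def assign_task(task_topic, available_workers, reliability, default=0.5):
    """Greedily pick the available worker with the highest estimate for the topic.

    Workers without an estimate for the topic fall back to `default`.
    """
    return max(
        available_workers,
        key=lambda worker: reliability.get((worker, task_topic), default),
    )
```

A purely greedy rule like this never explores workers whose estimates are still uncertain; an adaptive scheme would normally balance the two, which this sketch leaves out.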
Evaluation Methodology
| | Languages | Interlinks |
|---|---|---|
| LDQA tasks | Verify language tags for entities in LinkedSpending dataset | Verify relationships between entities as generated by OAEI |
| Topics | Chinese, English, French, Japanese, Russian | Anatomy, Books, Economics, Geography, Nature |
| KBQs | Verify language of DBpedia facts | Verify DBpedia facts based on SKOS relationships |
| No. of tasks | 25 | 25 |
| No. of KBQs | 10 | 10 |
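To make the Languages setup more concrete, the sketch below shows one way a KBQ could be built from a language-tagged fact. The slides do not show the actual question format or how facts are retrieved from DBpedia, so the fact layout `(subject, literal, language)`, the question wording, and the function name `make_language_kbq` are all assumptions.

```python
import random


def make_language_kbq(fact, topics, rng):
    """Build a multiple-choice KBQ from one language-tagged fact.

    `fact` is a hypothetical (subject, literal, language) tuple whose language
    is one of `topics`. Returns (question, options, correct_option). The
    correct option is already known from the fact, so no expert is needed to
    grade the worker's answer.
    """
    subject, literal, language = fact
    question = f'In which language is the label "{literal}" of {subject} written?'
    options = list(topics)
    rng.shuffle(options)  # avoid a fixed position for the correct option
    return question, options, language


topics = ["Chinese", "English", "French", "Japanese", "Russian"]
question, options, correct = make_language_kbq(
    ("dbr:Paris", "Париж", "Russian"), topics, random.Random(0)
)
```

The topic of each KBQ matches a topic of the LDQA tasks, which is what lets a worker's KBQ score stand in for reliability on the corresponding tasks.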
Evaluation Methodology
• Crowd Workers
– 60 workers from Amazon Mechanical Turk
– $1.50 for 30 minutes
– Provided answers to 10 KBQs and 25 tasks for both datasets
– Diverse reliability on Languages tasks
– Low reliability on Interlinks tasks
Results: Compared Approaches
KBQ approach generates reliability estimates similar to the GST approach
Results: Algorithm Parameters
Summary
• Strengths
– KBQs provide a quick and inexpensive method of estimating the reliability and expertise of workers
– Our approach is particularly suited for complex and knowledge-intensive tasks
• Limitations
– Assumes that LDQA tasks and KBQs are partitioned according to the same set of topics
– Assumes that all facts in DBpedia are correct
– Assumes that dataset topics are mutually exclusive
• Future work
– Scalability of the proposed approach needs to be validated
– Evaluate on a wide range of tasks and datasets
Thank you
Umair ul Hassan, Amrapali Zaveri, Edgard Marx, Edward Curry, and Jens Lehmann. "ACRyLIQ: Leveraging DBpedia for Adaptive Crowdsourcing in Linked Data Quality Assessment". In: 20th International Conference on Knowledge Engineering and Knowledge Management. Springer International Publishing, 2016.
Questions:
umair.ulhassan@insight-centre.org

Leveraging DBpedia for Adaptive Crowdsourcing in Linked Data Quality Assessment

  • 1. EKAW 2016 ACRyLIQ: Leveraging DBpedia for Adaptive Crowdsourcing in Linked Data Quality Assessment Umair ul Hassan, Amrapali Zaveri, Edgard Marx, Edward Curry, Jens Lehmann
  • 2. Background
      • Linked Data Quality Assessment (LDQA)
          – Incomplete, inaccurate, inconsistent data in LOD
      • Crowdsourcing LDQA
          1. Generate micro-tasks to assess the quality of a Linked Data dataset
          2. Recruit crowd workers to perform LDQA tasks
          3. Update the dataset based on crowd answers
      Diagram labels: Linked Dataset, LDQA tasks, Crowd Workers, Answers, Updates
      References:
      Zaveri, Amrapali, et al. "Quality assessment for linked data: A survey." Semantic Web 7.1 (2015): 63-93.
      Acosta, Maribel, et al. "Crowdsourcing linked data quality assessment." International Semantic Web Conference. Springer Berlin Heidelberg, 2013.
  • 3. Research Challenge
      • Workers have varying reliability and expertise depending on the domain and topics of a dataset
      Question: How can we estimate the reliability of crowd workers to achieve high accuracy of LDQA tasks through adaptive task assignment?
      Diagram labels: Linked Dataset, Crowdsourced LDQA tasks
  • 4. Existing Approach
      • Use experts to create gold-standard tasks (GST)
      • Estimate worker reliability and assign tasks
      Diagram labels: Domain Experts, Correct Responses, Gold-standard LDQA tasks, 1) GST Selection, 2) Task Assignment, Linked Dataset, Crowdsourced LDQA tasks
  • 5. Proposed Approach
      • Leverage DBpedia to generate knowledge-based questions (KBQs)
      • Estimate worker reliability and assign tasks
      Diagram labels: Facts (i.e. triples), KBQs, 1) KBQ Selection, 2) Task Assignment, Linked Dataset, Crowdsourced LDQA tasks
  • 6. Evaluation Methodology
      Dataset: Languages
          LDQA tasks: Verify language tags for entities in the LinkedSpending dataset
          Topics: Chinese, English, French, Japanese, Russian
          KBQs: Verify language of DBpedia facts
          No. of tasks: 25
          No. of KBQs: 10
      Dataset: Interlinks
          LDQA tasks: Verify relationships between entities as generated by OAEI
          Topics: Anatomy, Books, Economics, Geography, Nature
          KBQs: Verify DBpedia facts based on SKOS relationships
          No. of tasks: 25
          No. of KBQs: 10
  • 7. Evaluation Methodology
      • Crowd Workers
          – 60 workers from Amazon Mechanical Turk
          – $1.5 for 30 mins
          – Provided answers to 10 KBQs and 25 tasks for both datasets
          – Diverse reliability on Languages tasks
          – Low reliability on Interlinks tasks
  • 8. Results: Compared Approaches
      • KBQ approach generates reliability estimates similar to the GST approach
  • 10. Summary
      • Strengths
          – KBQs provide a quick and inexpensive method of estimating the reliability and expertise of workers
          – Our approach is particularly suited for complex and knowledge-intensive tasks
      • Limitations
          – Assumption that LDQA tasks and KBQs are partitioned according to the same set of topics
          – Assumption that all facts in DBpedia are correct
          – Assumption that dataset topics are mutually exclusive
      • Future work
          – Scalability of the proposed approach needs to be validated
          – Evaluate on a wide range of tasks and datasets
  • 11. Thank you
      Umair Ul Hassan, Amrapali Zaveri, Edgard Marx, Edward Curry, and Jens Lehmann. “ACRyLIQ: Leveraging DBpedia for Adaptive Crowdsourcing in Linked Data Quality Assessment”. In: 20th International Conference on Knowledge Engineering and Knowledge Management. Springer International Publishing. 2016
      Questions: umair.ulhassan@insight-centre.org
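
The two steps named on slide 5 (estimate worker reliability from KBQ answers, then assign tasks) can be illustrated with a minimal Python sketch. The slides do not state the estimation formula or the assignment algorithm, so everything here is an assumption made for illustration: reliability is taken to be a smoothed fraction of correctly answered KBQs per worker and topic, assignment is a simple "most reliable worker per task" ranking, and all names (`estimate_reliability`, `assign_tasks`, the worker and question identifiers) are hypothetical. It is not the authors' ACRyLIQ implementation.

```python
from collections import defaultdict


def estimate_reliability(kbq_answers, kbq_truth, kbq_topic,
                         prior_correct=1.0, prior_total=2.0):
    """Per-(worker, topic) reliability from KBQ answers.

    Assumption: reliability = (correct + prior_correct) / (answered + prior_total),
    i.e. add-one smoothing, so a single answer does not yield 0.0 or 1.0.
    """
    correct = defaultdict(float)
    total = defaultdict(float)
    for (worker, kbq), answer in kbq_answers.items():
        key = (worker, kbq_topic[kbq])
        total[key] += 1
        if answer == kbq_truth[kbq]:
            correct[key] += 1
    return {key: (correct[key] + prior_correct) / (total[key] + prior_total)
            for key in total}


def assign_tasks(task_topic, reliability, workers, per_task=1, default=0.5):
    """Assign each task to the workers with the highest estimated
    reliability on that task's topic (assumed greedy ranking)."""
    assignment = {}
    for task, topic in task_topic.items():
        ranked = sorted(workers,
                        key=lambda w: reliability.get((w, topic), default),
                        reverse=True)
        assignment[task] = ranked[:per_task]
    return assignment


# Hypothetical toy data: three KBQs over two topics, two workers.
kbq_topic = {"q1": "French", "q2": "French", "q3": "Japanese"}
kbq_truth = {"q1": True, "q2": False, "q3": True}  # taken from DBpedia facts
kbq_answers = {
    ("alice", "q1"): True, ("alice", "q2"): False, ("alice", "q3"): False,
    ("bob", "q1"): False, ("bob", "q2"): False, ("bob", "q3"): True,
}

reliability = estimate_reliability(kbq_answers, kbq_truth, kbq_topic)
assignment = assign_tasks({"t1": "French", "t2": "Japanese"},
                          reliability, ["alice", "bob"])
print(assignment)  # {'t1': ['alice'], 't2': ['bob']}
```

The sketch relies on the limitation listed on slide 10: KBQs and LDQA tasks must share the same set of topics, otherwise the per-topic estimates cannot be looked up when ranking workers for a task.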