Institute for Web Science and Technologies · University of Koblenz-Landau, Germany
Programmatic Access to Crowdsourced Human Computation for Designing and Enhancing Interlinking
Cristina Sarasua
csarasua@uni-koblenz.de
ESWC2015 Developers Workshop
CROWDKI · Cristina Sarasua
The Problem
Current LOD
• Mostly identity links and a few prominent interlinking hubs with a high
number of in-links; 44% of the analyzed datasets still contain no
out-links [Schmachtenberg et al., 2014]
• To improve the heterogeneity and quantity of links, we need:
– To overcome the computational limitations of automatic link
discovery methods
– Methods that assist data publishers in deciding which datasets to
target and how to define the interlinks
CROWDKI: Crowd-Powered Knowledge Integration
Crowdsourced Human Computation for Interlinking
• Humans are systematically involved in processing data for interlinking
• Human input is collected via microtask crowdsourcing:
– Online marketplaces (e.g. Clickworker)
– Anyone registered can take part (workers all around the world)
– Economic reward
– Work is divided into simple tasks (e.g. reviewing a particular link
between two resources)
– Large and dedicated workforce → fast completion time (hours / days)
• Software to manage the generation and completion of microtasks
related to interlinking: https://github.com/criscod/CROWDKI
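The idea of dividing interlinking work into simple tasks can be sketched as follows. This is only an illustration in Python (CROWDKI itself is written in Java). The `Microtask` class, the `make_microtasks` function, the batch size, and the number of judgments per task are all hypothetical names and values, not taken from the CROWDKI code base.

```python
from dataclasses import dataclass
from itertools import islice
from typing import Iterable, List, Tuple

# A candidate link as (source URI, predicate, target URI).
Link = Tuple[str, str, str]


@dataclass(frozen=True)
class Microtask:
    """One small unit of work shown to a crowd worker (hypothetical model)."""
    task_id: int
    links: Tuple[Link, ...]   # links to be reviewed in this task
    judgments_wanted: int     # how many different workers should answer


def make_microtasks(links: Iterable[Link],
                    links_per_task: int = 5,
                    judgments_wanted: int = 3) -> List[Microtask]:
    """Split candidate links into small review tasks of at most links_per_task links."""
    if links_per_task < 1 or judgments_wanted < 1:
        raise ValueError("links_per_task and judgments_wanted must be >= 1")
    iterator = iter(links)
    tasks: List[Microtask] = []
    while True:
        batch = tuple(islice(iterator, links_per_task))
        if not batch:
            break
        tasks.append(Microtask(len(tasks), batch, judgments_wanted))
    return tasks
```

Asking several workers for a judgment on the same task (here `judgments_wanted`) is a common way to compensate for individual errors. The right value depends on budget and task difficulty.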
UC1: Assessing the relevance of different interlinking possibilities
• Input: list of interlinking possibilities and a particular context
– Car wasDesigned by Person
– Car wasDriven by Person
– Car wasRecommended by Person
• Output: contextual relevance assessment by the crowd
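One way to turn individual crowd answers for UC1 into a contextual relevance assessment is to average the ratings per interlinking possibility and rank the results. The sketch below is in Python for illustration. It assumes workers rate each possibility on a numeric scale (1 to 5 here); both the scale and the function name `rank_by_relevance` are assumptions, and the slides do not state how CROWDKI aggregates answers.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def rank_by_relevance(judgments: Iterable[Tuple[str, int]]) -> List[Tuple[str, float]]:
    """Average the ratings per interlinking possibility, most relevant first.

    judgments: (possibility, rating) pairs, one per worker answer.
    Ties are broken alphabetically so the result is deterministic.
    """
    ratings: Dict[str, List[int]] = defaultdict(list)
    for possibility, rating in judgments:
        ratings[possibility].append(rating)
    scored = [(possibility, mean(values)) for possibility, values in ratings.items()]
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Hypothetical worker answers for one context (made-up ratings).
example_judgments = [
    ("Car wasDesigned by Person", 5),
    ("Car wasDriven by Person", 2),
    ("Car wasDesigned by Person", 4),
    ("Car wasRecommended by Person", 3),
    ("Car wasDriven by Person", 1),
]
ranking = rank_by_relevance(example_judgments)
```

A mean is the simplest aggregate. A median or a weighting by worker reliability would be reasonable alternatives when ratings are noisy.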
UC2: Validating and enhancing automatically computed links
• Input: set of candidate links, RDF data
• Output: set of final links (extended / post-processed)
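For UC2, a simple way to derive the set of final links from crowd answers is a majority vote per candidate link. The Python sketch below assumes each worker answers yes or no for a link. The function name `validate_links` and the agreement threshold are hypothetical, and this is not necessarily the aggregation that CROWDKI applies.

```python
from typing import Dict, List, Set, Tuple

# A candidate link as (source URI, predicate, target URI).
Link = Tuple[str, str, str]


def validate_links(votes: Dict[Link, List[bool]],
                   min_agreement: float = 0.5) -> Set[Link]:
    """Keep the links that more than min_agreement of the workers confirmed.

    Links without any answer are left out rather than guessed.
    """
    accepted: Set[Link] = set()
    for link, answers in votes.items():
        if not answers:
            continue
        share_confirmed = sum(answers) / len(answers)
        if share_confirmed > min_agreement:
            accepted.add(link)
    return accepted
```

With three answers per link and the default threshold, a link is kept when at least two workers confirm it.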
Architecture
• CrowdFlower REST API
• Microtask templates & config
• Java
• Jena, SPARQL
• JSON
• Guava IO
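To illustrate how SPARQL and JSON could fit together in such an architecture, the sketch below builds a SPARQL query that fetches a human-readable label for a resource and serialises one microtask row as JSON. It is written in Python for brevity, whereas the slide lists Java and Jena. The function names and the JSON field names are hypothetical and do not reflect the CrowdFlower API or the CROWDKI templates.

```python
import json

# Characters that are not allowed inside an IRI reference in SPARQL.
_UNSAFE_IRI_CHARS = set('<>" {}|\\^`')


def label_query(resource_uri: str, lang: str = "en") -> str:
    """Build a SPARQL SELECT query for the rdfs:label of a resource."""
    if any(ch in _UNSAFE_IRI_CHARS for ch in resource_uri):
        raise ValueError("resource_uri contains characters not allowed in an IRI")
    if not lang.replace("-", "").isalnum():
        raise ValueError("lang must be a plain language tag such as 'en'")
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?label WHERE {\n"
        f"  <{resource_uri}> rdfs:label ?label .\n"
        f'  FILTER (lang(?label) = "{lang}")\n'
        "}"
    )


def task_row(source_uri: str, source_label: str,
             target_uri: str, target_label: str) -> str:
    """Serialise one link-review row as JSON (field names are illustrative)."""
    row = {
        "source_uri": source_uri,
        "source_label": source_label,
        "target_uri": target_uri,
        "target_label": target_label,
    }
    return json.dumps(row, sort_keys=True)
```

Showing labels instead of raw URIs is one plausible reason to query the RDF data before generating tasks, since workers usually cannot judge a link from identifiers alone.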
Starting CROWDKI
Lessons Learned
• Communication is key: tasks that involve data with typos may be
interpreted wrongly
• Communities of crowd workers are emerging
• Not all interlinking scenarios require human computation (e.g.
country ISO codes); the challenge is to decide automatically
when it is really worthwhile
• Drawbacks: no real-time crowdsourcing, and crowd workers
cannot be selected accurately
• CrowdFlower (the crowdsourcing platform used) exposes
more features via the UI than via the API
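Deciding when human computation is worthwhile is described above as an open challenge. One simple heuristic, shown here purely as an assumption and not as the approach taken in CROWDKI, is to route candidate links by the confidence score of the automatic link discovery tool: accept very confident links, discard very unlikely ones, and send only the uncertain middle range to the crowd. The thresholds and the name `route_candidates` are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

# A candidate link as (source URI, predicate, target URI).
Link = Tuple[str, str, str]


def route_candidates(candidates: Iterable[Tuple[Link, float]],
                     accept_at: float = 0.95,
                     reject_below: float = 0.30) -> Dict[str, List[Link]]:
    """Split scored candidate links into accept / crowd / reject buckets."""
    if not 0.0 <= reject_below <= accept_at <= 1.0:
        raise ValueError("expected 0 <= reject_below <= accept_at <= 1")
    buckets: Dict[str, List[Link]] = {"accept": [], "crowd": [], "reject": []}
    for link, score in candidates:
        if score >= accept_at:
            buckets["accept"].append(link)   # confident enough without humans
        elif score < reject_below:
            buckets["reject"].append(link)   # too unlikely to pay for review
        else:
            buckets["crowd"].append(link)    # uncertain: ask the crowd
    return buckets
```

Such fixed thresholds would need tuning per dataset, since a score that is reliable for matching ISO codes may not be reliable for matching person names.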
Conclusions
• Hybrid approaches (automatic + crowd interlinking) can achieve
better precision and recall (P, R) than purely automatic interlinking methods
• CROWDKI could be used in combination with dataset
recommendation methods that analyze the way data is
already interlinked
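Precision and recall (P, R) of a set of links are usually computed against a gold standard of known correct links. The Python sketch below shows the standard definitions. The function name and the example links are made up for illustration and are not results from the talk.

```python
from typing import Set, Tuple

# A link as (source URI, predicate, target URI).
Link = Tuple[str, str, str]


def precision_recall(found: Set[Link], gold: Set[Link]) -> Tuple[float, float]:
    """Return (precision, recall) of the found links against a gold standard.

    precision = correct found links / all found links
    recall    = correct found links / all gold links
    Empty sets yield 0.0 to avoid division by zero.
    """
    correct = len(found & gold)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall
```

Running the same function on the output of an automatic tool and on the crowd-validated output, against the same gold standard, gives a direct comparison of the two approaches.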
Questions for the audience
• Which challenges did you face with state-of-the-art
link discovery tools? What went wrong?
• How do you think human computation can further
help in interlinking?
• In your opinion, what are the pros and cons of this
approach?
References
• Max Schmachtenberg, Christian Bizer, Heiko Paulheim: Adoption of the
Linked Data Best Practices in Different Topical Domains.
13th International Semantic Web Conference (ISWC 2014), RDB
Track, Riva del Garda, Italy, October 2014.
