SlideShare a Scribd company logo
1 of 16
Crowdsourcing tasks in Linked Data management
 Elena Simperl,1 Barry Norton,2 Denny Vrandecic1
 1Institute         AIFB, Karlsruhe Institute of Technology, Germany
 2Ontotext          AD, Bulgaria
 Institute of Applied Informatics and Formal Description Methods (AIFB)
Institute of Applied Informatics and Formal Description Methods (AIFB)




 KIT – University of the State of Baden-Wuerttemberg and
 National Research Center of the Helmholtz Association                    www.kit.edu
Motivation

        Various aspects of Linked Data management
       naturally rely on human intelligence to yield
       optimal results
        But reaching a critical mass of useful contributions
       from all relevant stakeholders is still more an art
       than an engineering exercise




2   23.10.2011   Seminar - Die Rolle von Ontologien in Linked Data – Kickoff   Institut für Angewandte Informatik und Formale
                                                                                                Beschreibungsverfahren (AIFB)
Microtask platforms



                                                                 Break task
                                                                               Evaluate the
                    Define task                                 into smaller
                                                                                 results
                                                                    units




3   23.10.2011   Seminar - Die Rolle von Ontologien in Linked Data – Kickoff   Institut für Angewandte Informatik und Formale
                                                                                                Beschreibungsverfahren (AIFB)
Approach
        Formal, declarative description of the data and tasks
       using SPARQL patterns as a basis for the automatic
       design of HITs

         Integral part of Linked Data tools and applications
                 At design time application developer specifies which data
                 portions workers can process and via which types of HITs
                 At run time
                      The system materializes the data
                      Workers process it
                      Data and application are updated to reflect crowdsourcing results


4   23.10.2011   Crowdsourcing tasks in Linked Data management    Institut für Angewandte Informatik und Formale
                                                                                   Beschreibungsverfahren (AIFB)
Examples of Linked Data tasks
    amenable to crowdsourcing

         Identity resolution
         Metadata completion and checking/correction
         Classification
         Ordering
           Quantitative
           Qualitative
         Translation




5   23.10.2011   Crowdsourcing tasks in Linked Data management   Institut für Angewandte Informatik und Formale
                                                                                  Beschreibungsverfahren (AIFB)
Running Example




6   23.10.2011   Crowdsourcing tasks in Linked Data management   Institut für Angewandte Informatik und Formale
                                                                                  Beschreibungsverfahren (AIFB)
Identity resolution

    Identity Resolution “involves the creation of sameAs
    links, either by comparison of metadata or by
    investigation of links on the human Web.”
    Input: {?station a metar:Station;
                      rdfs:label ?slabel;
                      wgs84:lat ?slat;
                      wgs84:long ?slong .
             ?airport a dbp-owl:Airport;
                      rdfs:label ?alabel;
                      wgs84:lat ?alat;
                      wgs84:long ?along}
    Output: {OPTIONAL
             {?airport owl:sameAs ?station}}



7   23.10.2011   Crowdsourcing tasks in Linked Data management   Institut für Angewandte Informatik und Formale
                                                                                  Beschreibungsverfahren (AIFB)
Metadata completion & correction

    “Certain properties, necessary for a given query,
    may not be uniformly populated. Manually conducted
    research might be necessary to transfer this
    information from the human-readable Web”
     Input: {?station a metar:Station;
                      rdfs:label ?label;
                      wgs84:lat ?lat;
                      wgs84:long ?long;
                      dbp:icao ?badicao}


     Output: {?station dbp:icao ?goodicao}




8   23.10.2011   Crowdsourcing tasks in Linked Data management   Institut für Angewandte Informatik und Formale
                                                                                  Beschreibungsverfahren (AIFB)
Classification

    “Linked Data emphasis[es…] relationships between
    resources [over classification]. [D]ue to the promoted
    use of generic vocabularies, is it not always possible
    to infer classification from […] properties”
    Input: {?station a metar:Station;
                     rdfs:label ?label;
                     wgs84:lat ?lat;
                     wgs84:long ?long}



    Output: {?station a ?type.
             ?type rdfs:subClassOf
            metar:Station}


9   23.10.2011   Crowdsourcing tasks in Linked Data management   Institut für Angewandte Informatik und Formale
                                                                                  Beschreibungsverfahren (AIFB)
Ordering

     “Having means to rank Linked Data content along
     specific dimensions is typically deemed useful for
                                                          quantitative
     querying and browsing […both] “specific” ordering
     [(e.g. timestamps) … and] orderings […] via           qualitative
     “less straightforward” built-ins [(e.g. pref/alt labels)]”

 Input: {?station foaf:depiction ?x, ?y}




 Output: {{(?x ?y) a rdf:List}
          UNION {(?y ?x) a rdf:List}}



10   23.10.2011   Crowdsourcing tasks in Linked Data management   Institut für Angewandte Informatik und Formale
                                                                                   Beschreibungsverfahren (AIFB)
Translation

     “[An important] aspect of the labeling of resources for
     humans is multi-linguality […] actual provision of labels
     in non-English languages is currently rather low”


     Input: {?station rdfs:label ?enlabel.
             FILTER (LANG(?label) = "EN")}




     Output: {?station rdfs:label ?bglabel.
              FILTER (LANG(?label) = "BG")}




11   23.10.2011   Crowdsourcing tasks in Linked Data management   Institut für Angewandte Informatik und Formale
                                                                                   Beschreibungsverfahren (AIFB)
Open query answering

          Query a FOAF-file using the vCard vocabulary

     hp:Harry foaf:mbox <mailto:scarface@hogwarts.ac.uk> ;
        foaf:nick "Harry" ; foaf:familyName "Potter" .


     SELECT ?name ?email WHERE
     { ?p vcard:email ?email ; vcard:fn ?name }



          In order to answer the query as intended
                  Vocabulary mapping and entity resolution (foaf to vcard)
                  Metadata completion (full name is Harry Potter)
12   23.10.2011   Crowdsourcing tasks in Linked Data management   Institut für Angewandte Informatik und Formale
                                                                                   Beschreibungsverfahren (AIFB)
Limitations of microtask crowdsourcing

          Decomposability
          Verifiability
          Expertise

         Compositions to deal with tasks with
        underspecified workflow and/or multiple correct
        answers




13   23.10.2011   Crowdsourcing tasks in Linked Data management   Institut für Angewandte Informatik und Formale
                                                                                   Beschreibungsverfahren (AIFB)
Challenges
          Decomposition of user-visible queries:
                  SPARQL
                       Easy: Low quality (meta)data can be subject to automated
                       checking (even if not fixing)
                       Medium: Missing data (and translation) can be automatically
                       identified (but knowing to which dataset it should belong is not
                       necessarily clear)
                       Difficult:
                            Interlinking (at least sameAs) is somewhat implicit (using
                            entailment) and knowing where user expects
                            Query optimisation obfuscates what is used and should
                            involve costs for human tasks
                  Pig might be somewhat easier in latter regard
          Caching
                  Naively we can materialise HIT results into datasets
                  How to deal with partial coverage and dynamic datasets
14   23.10.2011   Crowdsourcing tasks in Linked Data management      Institut für Angewandte Informatik und Formale
                                                                                      Beschreibungsverfahren (AIFB)
Further Challenges

         Appropriate level of granularity for HITs design for
        specific SPARQL constructs and typical
        functionality of Linked Data management
        components
         Optimal user interfaces of graph-like content
                  (Contextual) Rendering of LOD entities and tasks
          Pricing and workers’ assignment
                  Can we connect the end-users of an application and
                  their wish for specific data to be consumed with the
                  payment of workers and prioritization of HITs?
                  Dealing with spam / gaming
15   23.10.2011   Crowdsourcing tasks in Linked Data management   Institut für Angewandte Informatik und Formale
                                                                                   Beschreibungsverfahren (AIFB)
QUESTIONS



16   23.10.2011   Crowdsourcing tasks in Linked Data management   Institut für Angewandte Informatik und Formale
                                                                                   Beschreibungsverfahren (AIFB)

More Related Content

Viewers also liked

Linked-Data based Data Management for data.gov.sg
Linked-Data based Data Management for data.gov.sgLinked-Data based Data Management for data.gov.sg
Linked-Data based Data Management for data.gov.sgAravind Sesagiri Raamkumar
 
Crowdsourcing for Information Retrieval: From Statistics to Ethics
Crowdsourcing for Information Retrieval: From Statistics to EthicsCrowdsourcing for Information Retrieval: From Statistics to Ethics
Crowdsourcing for Information Retrieval: From Statistics to EthicsMatthew Lease
 
Crowdsourcing Linked Data management
Crowdsourcing Linked Data managementCrowdsourcing Linked Data management
Crowdsourcing Linked Data managementElena Simperl
 
Crowdsourcing & Human Computation Labeling Data & Building Hybrid Systems
Crowdsourcing & Human Computation Labeling Data & Building Hybrid SystemsCrowdsourcing & Human Computation Labeling Data & Building Hybrid Systems
Crowdsourcing & Human Computation Labeling Data & Building Hybrid SystemsMatthew Lease
 
Managing Crowdsourced Human Computation: A Tutorial
Managing Crowdsourced Human Computation: A TutorialManaging Crowdsourced Human Computation: A Tutorial
Managing Crowdsourced Human Computation: A TutorialPanos Ipeirotis
 
ResearchSpace Platform in Use
ResearchSpace Platform in UseResearchSpace Platform in Use
ResearchSpace Platform in UseBarry Norton
 
European Data Science Academy: Training the Next Generation of Data Scientists
European Data Science Academy: Training the Next Generation of Data ScientistsEuropean Data Science Academy: Training the Next Generation of Data Scientists
European Data Science Academy: Training the Next Generation of Data ScientistsElena Simperl
 

Viewers also liked (7)

Linked-Data based Data Management for data.gov.sg
Linked-Data based Data Management for data.gov.sgLinked-Data based Data Management for data.gov.sg
Linked-Data based Data Management for data.gov.sg
 
Crowdsourcing for Information Retrieval: From Statistics to Ethics
Crowdsourcing for Information Retrieval: From Statistics to EthicsCrowdsourcing for Information Retrieval: From Statistics to Ethics
Crowdsourcing for Information Retrieval: From Statistics to Ethics
 
Crowdsourcing Linked Data management
Crowdsourcing Linked Data managementCrowdsourcing Linked Data management
Crowdsourcing Linked Data management
 
Crowdsourcing & Human Computation Labeling Data & Building Hybrid Systems
Crowdsourcing & Human Computation Labeling Data & Building Hybrid SystemsCrowdsourcing & Human Computation Labeling Data & Building Hybrid Systems
Crowdsourcing & Human Computation Labeling Data & Building Hybrid Systems
 
Managing Crowdsourced Human Computation: A Tutorial
Managing Crowdsourced Human Computation: A TutorialManaging Crowdsourced Human Computation: A Tutorial
Managing Crowdsourced Human Computation: A Tutorial
 
ResearchSpace Platform in Use
ResearchSpace Platform in UseResearchSpace Platform in Use
ResearchSpace Platform in Use
 
European Data Science Academy: Training the Next Generation of Data Scientists
European Data Science Academy: Training the Next Generation of Data ScientistsEuropean Data Science Academy: Training the Next Generation of Data Scientists
European Data Science Academy: Training the Next Generation of Data Scientists
 

Similar to Crowdsourcing tasks in Linked Data management

Crowdsourcing-enabled Linked Data management architecture
Crowdsourcing-enabled Linked Data management architectureCrowdsourcing-enabled Linked Data management architecture
Crowdsourcing-enabled Linked Data management architectureElena Simperl
 
Considerations for Abstracting Complexities of a Real-Time ML Platform, Zhenz...
Considerations for Abstracting Complexities of a Real-Time ML Platform, Zhenz...Considerations for Abstracting Complexities of a Real-Time ML Platform, Zhenz...
Considerations for Abstracting Complexities of a Real-Time ML Platform, Zhenz...HostedbyConfluent
 
International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)inventionjournals
 
20100512 Workflow Ramage
20100512 Workflow Ramage20100512 Workflow Ramage
20100512 Workflow RamageSteven Ramage
 
Hibernate training at HarshithaTechnologySolutions @ Nizampet
Hibernate training at HarshithaTechnologySolutions @ NizampetHibernate training at HarshithaTechnologySolutions @ Nizampet
Hibernate training at HarshithaTechnologySolutions @ NizampetJayarajus
 
Sem tech 2011 v8
Sem tech 2011 v8Sem tech 2011 v8
Sem tech 2011 v8dallemang
 
How to Find a Needle in the Haystack
How to Find a Needle in the HaystackHow to Find a Needle in the Haystack
How to Find a Needle in the HaystackAdrian Stevenson
 
An Overview of VIEW
An Overview of VIEWAn Overview of VIEW
An Overview of VIEWShiyong Lu
 
ETL big data with apache hadoop
ETL big data with apache hadoopETL big data with apache hadoop
ETL big data with apache hadoopMaulik Thaker
 
Adcom2006 Full 6
Adcom2006 Full 6Adcom2006 Full 6
Adcom2006 Full 6umavanth
 
Application-Aware Big Data Deduplication in Cloud Environment
Application-Aware Big Data Deduplication in Cloud EnvironmentApplication-Aware Big Data Deduplication in Cloud Environment
Application-Aware Big Data Deduplication in Cloud EnvironmentSafayet Hossain
 
Compositional AI: Fusion of AI/ML Services
Compositional AI: Fusion of AI/ML ServicesCompositional AI: Fusion of AI/ML Services
Compositional AI: Fusion of AI/ML ServicesDebmalya Biswas
 
Possible Worlds Explorer: Datalog & Answer Set Programming for the Rest of Us
Possible Worlds Explorer: Datalog & Answer Set Programming for the Rest of UsPossible Worlds Explorer: Datalog & Answer Set Programming for the Rest of Us
Possible Worlds Explorer: Datalog & Answer Set Programming for the Rest of UsBertram Ludäscher
 

Similar to Crowdsourcing tasks in Linked Data management (20)

Crowdsourcing-enabled Linked Data management architecture
Crowdsourcing-enabled Linked Data management architectureCrowdsourcing-enabled Linked Data management architecture
Crowdsourcing-enabled Linked Data management architecture
 
Considerations for Abstracting Complexities of a Real-Time ML Platform, Zhenz...
Considerations for Abstracting Complexities of a Real-Time ML Platform, Zhenz...Considerations for Abstracting Complexities of a Real-Time ML Platform, Zhenz...
Considerations for Abstracting Complexities of a Real-Time ML Platform, Zhenz...
 
International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)
 
A02620109
A02620109A02620109
A02620109
 
A02620109
A02620109A02620109
A02620109
 
20100512 Workflow Ramage
20100512 Workflow Ramage20100512 Workflow Ramage
20100512 Workflow Ramage
 
Hibernate training at HarshithaTechnologySolutions @ Nizampet
Hibernate training at HarshithaTechnologySolutions @ NizampetHibernate training at HarshithaTechnologySolutions @ Nizampet
Hibernate training at HarshithaTechnologySolutions @ Nizampet
 
Sem tech 2011 v8
Sem tech 2011 v8Sem tech 2011 v8
Sem tech 2011 v8
 
ThesisProposal
ThesisProposalThesisProposal
ThesisProposal
 
How to Find a Needle in the Haystack
How to Find a Needle in the HaystackHow to Find a Needle in the Haystack
How to Find a Needle in the Haystack
 
An Overview of VIEW
An Overview of VIEWAn Overview of VIEW
An Overview of VIEW
 
Software design
Software designSoftware design
Software design
 
Interoperability
InteroperabilityInteroperability
Interoperability
 
Task Complexity Metrics - Ben Colborn
Task Complexity Metrics - Ben ColbornTask Complexity Metrics - Ben Colborn
Task Complexity Metrics - Ben Colborn
 
ETL big data with apache hadoop
ETL big data with apache hadoopETL big data with apache hadoop
ETL big data with apache hadoop
 
Adcom2006 Full 6
Adcom2006 Full 6Adcom2006 Full 6
Adcom2006 Full 6
 
Application-Aware Big Data Deduplication in Cloud Environment
Application-Aware Big Data Deduplication in Cloud EnvironmentApplication-Aware Big Data Deduplication in Cloud Environment
Application-Aware Big Data Deduplication in Cloud Environment
 
Compositional AI: Fusion of AI/ML Services
Compositional AI: Fusion of AI/ML ServicesCompositional AI: Fusion of AI/ML Services
Compositional AI: Fusion of AI/ML Services
 
Search Approach - ES, GraphDB
Search Approach - ES, GraphDBSearch Approach - ES, GraphDB
Search Approach - ES, GraphDB
 
Possible Worlds Explorer: Datalog & Answer Set Programming for the Rest of Us
Possible Worlds Explorer: Datalog & Answer Set Programming for the Rest of UsPossible Worlds Explorer: Datalog & Answer Set Programming for the Rest of Us
Possible Worlds Explorer: Datalog & Answer Set Programming for the Rest of Us
 

More from Barry Norton

Knowledge Graphs and Milestone
Knowledge Graphs and MilestoneKnowledge Graphs and Milestone
Knowledge Graphs and MilestoneBarry Norton
 
ResearchSpace Collaborative Features
ResearchSpace Collaborative FeaturesResearchSpace Collaborative Features
ResearchSpace Collaborative FeaturesBarry Norton
 
Book of the Dead Project
Book of the Dead ProjectBook of the Dead Project
Book of the Dead ProjectBarry Norton
 
Data Culture / Culture Data
Data Culture / Culture DataData Culture / Culture Data
Data Culture / Culture DataBarry Norton
 
Querying Cultural Heritage
Querying Cultural HeritageQuerying Cultural Heritage
Querying Cultural HeritageBarry Norton
 
A Data API with Security and Graph-Level Access Control
A Data API with Security and Graph-Level Access ControlA Data API with Security and Graph-Level Access Control
A Data API with Security and Graph-Level Access ControlBarry Norton
 
GLAMorous LOD and ResearchSpace introduction
GLAMorous LOD and ResearchSpace introductionGLAMorous LOD and ResearchSpace introduction
GLAMorous LOD and ResearchSpace introductionBarry Norton
 
Linked Data, Ontologies and Inference
Linked Data, Ontologies and InferenceLinked Data, Ontologies and Inference
Linked Data, Ontologies and InferenceBarry Norton
 
Integrating Drupal with a Triple Store
Integrating Drupal with a Triple StoreIntegrating Drupal with a Triple Store
Integrating Drupal with a Triple StoreBarry Norton
 
Linked Data and Services
Linked Data and ServicesLinked Data and Services
Linked Data and ServicesBarry Norton
 
Towards Linked Open Services and Processes
Towards Linked Open Services and ProcessesTowards Linked Open Services and Processes
Towards Linked Open Services and ProcessesBarry Norton
 
Geospatial Linked Open Services
Geospatial Linked Open ServicesGeospatial Linked Open Services
Geospatial Linked Open ServicesBarry Norton
 
Linked Open Services @ SemData2010
Linked Open Services @ SemData2010Linked Open Services @ SemData2010
Linked Open Services @ SemData2010Barry Norton
 

More from Barry Norton (15)

Knowledge Graphs and Milestone
Knowledge Graphs and MilestoneKnowledge Graphs and Milestone
Knowledge Graphs and Milestone
 
GRAVITATE Search
GRAVITATE SearchGRAVITATE Search
GRAVITATE Search
 
ResearchSpace Collaborative Features
ResearchSpace Collaborative FeaturesResearchSpace Collaborative Features
ResearchSpace Collaborative Features
 
Book of the Dead Project
Book of the Dead ProjectBook of the Dead Project
Book of the Dead Project
 
Data Culture / Culture Data
Data Culture / Culture DataData Culture / Culture Data
Data Culture / Culture Data
 
Querying Cultural Heritage
Querying Cultural HeritageQuerying Cultural Heritage
Querying Cultural Heritage
 
A Data API with Security and Graph-Level Access Control
A Data API with Security and Graph-Level Access ControlA Data API with Security and Graph-Level Access Control
A Data API with Security and Graph-Level Access Control
 
GLAMorous LOD and ResearchSpace introduction
GLAMorous LOD and ResearchSpace introductionGLAMorous LOD and ResearchSpace introduction
GLAMorous LOD and ResearchSpace introduction
 
GLAMorous LOD
GLAMorous LODGLAMorous LOD
GLAMorous LOD
 
Linked Data, Ontologies and Inference
Linked Data, Ontologies and InferenceLinked Data, Ontologies and Inference
Linked Data, Ontologies and Inference
 
Integrating Drupal with a Triple Store
Integrating Drupal with a Triple StoreIntegrating Drupal with a Triple Store
Integrating Drupal with a Triple Store
 
Linked Data and Services
Linked Data and ServicesLinked Data and Services
Linked Data and Services
 
Towards Linked Open Services and Processes
Towards Linked Open Services and ProcessesTowards Linked Open Services and Processes
Towards Linked Open Services and Processes
 
Geospatial Linked Open Services
Geospatial Linked Open ServicesGeospatial Linked Open Services
Geospatial Linked Open Services
 
Linked Open Services @ SemData2010
Linked Open Services @ SemData2010Linked Open Services @ SemData2010
Linked Open Services @ SemData2010
 

Recently uploaded

"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfjimielynbastida
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsHyundai Motor Group
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsAndrey Dotsenko
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 

Recently uploaded (20)

"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdf
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 

Crowdsourcing tasks in Linked Data management

  • 1. Crowdsourcing tasks in Linked Data management Elena Simperl,1 Barry Norton,2 Denny Vrandecic1 1Institute AIFB, Karlsruhe Institute of Technology, Germany 2Ontotext AD, Bulgaria Institute of Applied Informatics and Formal Description Methods (AIFB) Institute of Applied Informatics and Formal Description Methods (AIFB) KIT – University of the State of Baden-Wuerttemberg and National Research Center of the Helmholtz Association www.kit.edu
  • 2. Motivation Various aspects of Linked Data management naturally rely on human intelligence to yield optimal results But reaching a critical mass of useful contributions from all relevant stakeholders is still more an art than an engineering exercise 2 23.10.2011 Seminar - Die Rolle von Ontologien in Linked Data – Kickoff Institut für Angewandte Informatik und Formale Beschreibungsverfahren (AIFB)
  • 3. Microtask platforms Break task Evaluate the Define task into smaller results units 3 23.10.2011 Seminar - Die Rolle von Ontologien in Linked Data – Kickoff Institut für Angewandte Informatik und Formale Beschreibungsverfahren (AIFB)
  • 4. Approach Formal, declarative description of the data and tasks using SPARQL patterns as a basis for the automatic design of HITs Integral part of Linked Data tools and applications At design time application developer specifies which data portions workers can process and via which types of HITs At run time The system materializes the data Workers process it Data and application are updated to reflect crowdsourcing results 4 23.10.2011 Crowdsourcing tasks in Linked Data management Institut für Angewandte Informatik und Formale Beschreibungsverfahren (AIFB)
  • 5. Examples of Linked Data tasks amenable to crowdsourcing Identity resolution Metadata completion and checking/correction Classification Ordering Quantitative Qualitative Translation 5 23.10.2011 Crowdsourcing tasks in Linked Data management Institut für Angewandte Informatik und Formale Beschreibungsverfahren (AIFB)
  • 6. Running Example 6 23.10.2011 Crowdsourcing tasks in Linked Data management Institut für Angewandte Informatik und Formale Beschreibungsverfahren (AIFB)
  • 7. Identity resolution Identity Resolution “involves the creation of sameAs links, either by comparison of metadata or by investigation of links on the human Web.” Input: {?station a metar:Station; rdfs:label ?slabel; wgs84:lat ?slat; wgs84:long ?slong . ?airport a dbp-owl:Airport; rdfs:label ?alabel; wgs84:lat ?alat; wgs84:long ?along} Output: {OPTIONAL {?airport owl:sameAs ?station}} 7 23.10.2011 Crowdsourcing tasks in Linked Data management Institut für Angewandte Informatik und Formale Beschreibungsverfahren (AIFB)
  • 8. Metadata completion & correction “Certain properties, necessary for a given query, may not be uniformly populated. Manually conducted research might be necessary to transfer this information from the human-readable Web” Input: {?station a metar:Station; rdfs:label ?label; wgs84:lat ?lat; wgs84:long ?long; dbp:icao ?badicao} Output: {?station dbp:icao ?goodicao} 8 23.10.2011 Crowdsourcing tasks in Linked Data management Institut für Angewandte Informatik und Formale Beschreibungsverfahren (AIFB)
  • 9. Classification “Linked Data emphasis[es…] relationships between resources [over classification]. [D]ue to the promoted use of generic vocabularies, is it not always possible to infer classification from […] properties” Input: {?station a metar:Station; rdfs:label ?label; wgs84:lat ?lat; wgs84:long ?long} Output: {?station a ?type. ?type rdfs:subClassOf metar:Station} 9 23.10.2011 Crowdsourcing tasks in Linked Data management Institut für Angewandte Informatik und Formale Beschreibungsverfahren (AIFB)
  • 10. Ordering “Having means to rank Linked Data content along specific dimensions is typically deemed useful for quantitative querying and browsing […both] “specific” ordering [(e.g. timestamps) … and] orderings […] via qualitative “less straightforward” built-ins [(e.g. pref/alt labels)]” Input: {?station foaf:depiction ?x, ?y} Output: {{(?x ?y) a rdf:List} UNION {(?y ?x) a rdf:List}} 10 23.10.2011 Crowdsourcing tasks in Linked Data management Institut für Angewandte Informatik und Formale Beschreibungsverfahren (AIFB)
  • 11. Translation “[An important] aspect of the labeling of resources for humans is multi-linguality […] actual provision of labels in non-English languages is currently rather low” Input: {?station rdfs:label ?enlabel. FILTER (LANG(?label) = "EN")} Output: {?station rdfs:label ?bglabel. FILTER (LANG(?label) = "BG")} 11 23.10.2011 Crowdsourcing tasks in Linked Data management Institut für Angewandte Informatik und Formale Beschreibungsverfahren (AIFB)
  • 12. Open query answering Query a FOAF-file using the vCard vocabulary hp:Harry foaf:mbox <mailto:scarface@hogwarts.ac.uk> ; foaf:nick "Harry" ; foaf:familyName "Potter" . SELECT ?name ?email WHERE { ?p vcard:email ?email ; vcard:fn ?name } In order to answer the query as intended Vocabulary mapping and entity resolution (foaf to vcard) Metadata completion (full name is Harry Potter) 12 23.10.2011 Crowdsourcing tasks in Linked Data management Institut für Angewandte Informatik und Formale Beschreibungsverfahren (AIFB)
  • 13. Limitations of microtask crowdsourcing Decomposability Verifiability Expertise Compositions to deal with tasks with underspecified workflow and/or multiple correct answers 13 23.10.2011 Crowdsourcing tasks in Linked Data management Institut für Angewandte Informatik und Formale Beschreibungsverfahren (AIFB)
  • 14. Challenges Decomposition of user-visible queries: SPARQL Easy: Low quality (meta)data can be subject to automated checking (even if not fixing) Medium: Missing data (and translation) can be automatically identified (but knowing to which dataset it should belong is not necessarily clear) Difficult: Interlinking (at least sameAs) is somewhat implicit (using entailment) and knowing where user expects Query optimisation obfuscates what is used and should involve costs for human tasks Pig might be somewhat easier in latter regard Caching Naively we can materialise HIT results into datasets How to deal with partial coverage and dynamic datasets 14 23.10.2011 Crowdsourcing tasks in Linked Data management Institut für Angewandte Informatik und Formale Beschreibungsverfahren (AIFB)
  • 15. Further Challenges Appropriate level of granularity for HITs design for specific SPARQL constructs and typical functionality of Linked Data management components Optimal user interfaces of graph-like content (Contextual) Rendering of LOD entities and tasks Pricing and workers’ assignment Can we connect the end-users of an application and their wish for specific data to be consumed with the payment of workers and prioritization of HITs? Dealing with spam / gaming 15 23.10.2011 Crowdsourcing tasks in Linked Data management Institut für Angewandte Informatik und Formale Beschreibungsverfahren (AIFB)
  • 16. QUESTIONS 16 23.10.2011 Crowdsourcing tasks in Linked Data management Institut für Angewandte Informatik und Formale Beschreibungsverfahren (AIFB)