
A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Metadata Annotations


Traditional approaches to ontology development leave a long gap between the moment a user finds a need to extend the ontology and the moment it actually gets extended. For scientists, this delay can be weeks or months and can be a significant barrier to adoption. We present a new approach to ontology development and data annotation that lets users add new metadata properties on the fly as they describe their datasets, creating terms that can be immediately adopted by others and eventually become standardized. This approach combines a traditional, consensus-based approach to ontology development with a crowdsourced approach in which expert users (the crowd) can dynamically add terms as needed to support their work. We have implemented this approach as a socio-technical system that includes: 1) a crowdsourcing platform to support metadata annotation and the addition of new terms, 2) a range of social editorial processes to make standardization decisions for those new terms, and 3) a framework for ontology revision and for updating the metadata created with the previous version of the ontology. We present a prototype implementation for the paleoclimate community, the Linked Earth Framework, currently containing 700 datasets and engaging over 50 active contributors. Users exploit the platform to do science while extending the metadata vocabulary, thereby producing useful and practical metadata.
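
The third component above, updating metadata that was created with an earlier version of the ontology, can be pictured with a small sketch. It is an illustration under stated assumptions, not the Linked Earth implementation: the function names `only_adds_terms` and `apply_renames` and every property name in the example are hypothetical. The idea is that a revision which only adds terms leaves existing annotations valid, while a revision that renames terms requires rewriting them.

```python
from typing import Dict, Iterable


def only_adds_terms(old_terms: Iterable[str], new_terms: Iterable[str]) -> bool:
    """Return True when every old term survives in the new vocabulary."""
    return set(old_terms) <= set(new_terms)


def apply_renames(metadata: Dict[str, str], renames: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of one dataset's metadata with renamed properties rewritten."""
    return {renames.get(prop, prop): value for prop, value in metadata.items()}


# Hypothetical vocabularies and dataset annotation, for illustration only.
old_vocab = ["archiveType", "collectedFrom"]
new_vocab = ["archiveType", "collectedAt", "measuredBy"]
renames = {"collectedFrom": "collectedAt"}

dataset = {"archiveType": "coral", "collectedFrom": "Palmyra"}
updated = apply_renames(dataset, renames)
```

Here `only_adds_terms(old_vocab, new_vocab)` is false because `collectedFrom` was renamed, so the dataset annotation has to be rewritten; `apply_renames` returns a new dictionary and leaves the original untouched.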

Published in: Engineering


  1. A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Metadata Annotations
     Yolanda Gil (1), Daniel Garijo (1), Varun Ratnakar (1), Deborah Khider (2), Julien Emile-Geay (2) and Nicholas McKay (3)
     (1) Information Sciences Institute, University of Southern California
     (2) Department of Earth Sciences, University of Southern California
     (3) School of Earth Sciences and Environmental Sustainability, Northern Arizona University
     @yolandagil, @dgarijov, {gil,dgarijo}@isi.edu
     ISWC In-Use Track, Vienna, 2017
  2. Data reuse in paleoclimate and environmental sciences
     • Data is collected using idiosyncratic notation and protocols by independent scientists.
     • Hundreds of types of observations
       • Physical samples may be from ice, tree, coral, marine sediment, etc.
     • Hundreds of types of measures
       • Temperature, rainfall, pH, etc.
     • Diversity is so great that no one dares to embark on standards.
     • Typical situation for environmental sciences (water modeling, hydrology, etc.)
     A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Metadata Annotations (Gil et al., ISWC In-Use Track, Vienna, 2017)
  3. Challenges
     • How can we leverage basic core agreements?
     • How can scientists create new properties that they want to use to describe their data?
     • How can we facilitate consensus on new extensions to core agreements?
     • How can the scientific community immediately benefit from this continued expansion of core agreements?
     • How can new extensions to core agreements be coordinated and maintained?
  4. Approach: Controlled crowdsourcing
     • A metadata crowdsourcing platform
     • Controlled standardization process for new metadata properties
     • Framework for updating metadata of previously annotated datasets
  5. A Framework for Controlled Crowdsourcing
     [Diagram with three parts: Annotation Framework, Revision Framework, Update Framework. Labels: Data Annotation; Core ontology; Crowd vocabulary; Extended crowd vocabulary; Dataset metadata; Dataset metadata store; Datasets; Snapshot; Snapshot Repository; Ontology Repository; Version 0; Version 1; Core ontology revision; Crowd vocabulary revision; Requests & issues (core ontology); Requests & issues; Changes (monotonic changes, non-monotonic changes); Changes to crowd vocabulary; Update; Load/reload; Reload datasets; Basic editor; Advanced editor; Editorial Board.]
  6. Specifying metadata for a dataset
     [Screenshot of the annotation interface. Labels: Data Download; Completed properties; Missing properties; Crowd Properties; Category; Annotation.]
  7. Fostering standardization
     • Suggestion of renames
     • Autocompletion
  8. Implementation: The Linked Earth Platform
     • Dynamic map-based visualizations
     • Dataset annotation interface
     • Author credit
     • Polls for decision making
     • Community discussions
  9. The Linked Earth Ontology - Overview
     • Modular design (Core modules + crowd extensions)
     • Namespace: http://linked.earth/ontology#
     • Core Ontology: ProxyArchive, ProxyObservation, ProxySensor, Instrument, InferredVariable
     • Linked Paleo Data Ontology (LiPD)
     • Reused vocabularies: Schema.org (Dataset), Wgs_84 (Position), Geosparql (Position), SSN (Observation), FOAF (Person), PROV (Derivation), DC (Publication)
     • Extensions: (Coral, Wood, Lake Sediment…), (Spectral, Chemical …), (Rock, Snow, Tree …), (Spectrometer, Spectroscope …), (Precipitation, time …)
     • Crowd Vocabulary Extension
  10. The Linked Earth Ontology - versioning
     • Working Groups discuss new changes to the ontology
     • Once a new version is approved, the core vocabulary is released and versioned outside the wiki:
       • Naming schema: http://linked.earth/ontology/module/version
       • Example: http://linked.earth/ontology/core/1.2.0
     • The latest version preserves its URI (aggregates all modules):
       • http://linked.earth/ontology#
     • Each version is documented and published in both machine-readable and human-readable form
  11. Organizing the community
     • Basic editors
     • Advanced editor
     • Editorial board
     • Working group
     • Periodic face-to-face events for community engagement
     • Engagement through Twitter polls and online surveys
     • Editorial board requests votes for candidate standard properties
  12. Current Situation
     Page Distribution
       Datasets            699
       ProxyArchive        207
       ProxyObservation     76
       ProxySensor          63
       Instrument           45
       InferredVariable   1207
       MeasuredVariable   3348
       Working Group        12
       Location            659
       Person              524
       Publication         875
     • More than 14000 pages
     • More than 150 registered users (50 active)
     • One full iteration and revision of the ontology
     • Identified leaders for working groups
  13. Conclusions and Future Work
     Approach for on-the-fly ontology extensions for scientific metadata annotations
     • Foster standardization through renaming, autocompletion and voting
     • Editorial process to review core standard with new crowd terms
     • Framework for updating dataset properties when a new standard is released
     Ongoing work:
     • Support editorial process for core ontology revisions
     • Automating the ontology documentation updates
     • Further automation of the update framework
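
The naming schema on slide 10 (http://linked.earth/ontology/module/version) lends itself to a short helper. The sketch below is only an illustration: the base URI and the `core/1.2.0` example come from the slides, while the function names `version_uri` and `parse_version_uri` are hypothetical and not part of any Linked Earth tooling.

```python
from typing import Tuple

# Base URI taken from the slides; everything else here is illustrative.
BASE = "http://linked.earth/ontology"


def version_uri(module: str, version: str) -> str:
    """Build a versioned URI following the module/version naming schema."""
    return f"{BASE}/{module}/{version}"


def parse_version_uri(uri: str) -> Tuple[str, Tuple[int, ...]]:
    """Split a versioned URI into (module, numeric version tuple)."""
    prefix = BASE + "/"
    if not uri.startswith(prefix):
        raise ValueError(f"not a versioned ontology URI: {uri}")
    module, version = uri[len(prefix):].split("/")
    return module, tuple(int(part) for part in version.split("."))


core_uri = version_uri("core", "1.2.0")
module, numbers = parse_version_uri(core_uri)
```

Parsing the version into a tuple of integers means releases compare numerically, so `(1, 10, 0)` sorts after `(1, 2, 0)`, which plain string comparison would get wrong.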

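
Slide 7 names two mechanisms for fostering standardization: autocompletion and suggestion of renames. The slides do not say how the platform implements them, so the following is a minimal sketch using prefix matching and Python's standard `difflib` as stand-in techniques. The function names and the similarity cutoff are assumptions; the vocabulary reuses class names from slides 9 and 12 purely as sample data.

```python
from difflib import get_close_matches
from typing import List, Sequence

# Sample vocabulary for illustration, reusing class names from the slides.
VOCAB = [
    "ProxyArchive",
    "ProxyObservation",
    "ProxySensor",
    "Instrument",
    "InferredVariable",
    "MeasuredVariable",
]


def autocomplete(prefix: str, vocabulary: Sequence[str] = VOCAB) -> List[str]:
    """Return existing terms that start with the typed prefix, ignoring case."""
    lowered = prefix.lower()
    return sorted(term for term in vocabulary if term.lower().startswith(lowered))


def suggest_renames(new_term: str, vocabulary: Sequence[str] = VOCAB,
                    cutoff: float = 0.8) -> List[str]:
    """Return existing terms similar enough to suggest in place of a new term."""
    return get_close_matches(new_term, vocabulary, n=3, cutoff=cutoff)


completions = autocomplete("proxy")
suggestions = suggest_renames("ProxyAcrhive")  # misspelled on purpose
```

Offering an existing term before a near-duplicate is created keeps the crowd vocabulary from fragmenting; the cutoff trades missed suggestions against noisy ones and would need tuning on real terms.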