A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Metadata Annotations

Yolanda Gil1, Daniel Garijo1, Varun Ratnakar1,
Deborah Khider2, Julien Emile-Geay2 and Nicholas McKay3
1Information Sciences Institute, University of Southern California,
2Department of Earth Sciences, University of Southern California,
3School of Earth Sciences and Environmental Sustainability,
Northern Arizona University
@yolandagil, @dgarijov
{gil,dgarijo}@isi.edu
ISWC In-Use Track, Vienna, 2017
Data reuse in paleoclimate and environmental sciences
A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Metadata Annotations
(Gil et al., ISWC In-Use Track, Vienna, 2017)
• Data is collected by independent scientists using idiosyncratic notations and protocols.
• Hundreds of types of observations
  • Physical samples may come from ice, trees, corals, marine sediments, etc.
• Hundreds of types of measurements
  • Temperature, rainfall, pH, etc.
• The diversity is so great that no one dares to embark on standards.
• This situation is typical of the environmental sciences (water modeling, hydrology, etc.)
Challenges
• How can we leverage basic core agreements?
• How can scientists create the new properties that they want to use to describe their data?
• How can we facilitate consensus on new extensions to core agreements?
• How can the scientific community benefit immediately from the continued expansion of core agreements?
• Coordination and maintenance of new extensions to core agreements
Approach: Controlled crowdsourcing
• A metadata crowdsourcing platform
• Controlled standardization process for new metadata properties
• Framework for updating metadata of previously annotated datasets
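The third element, updating the metadata of previously annotated datasets, can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each dataset's metadata is a dictionary from property names to values and that a vocabulary revision is expressed as a mapping from old to new property names. The function names, dataset name, and properties below are hypothetical and are not taken from the Linked Earth platform.

```python
from typing import Any, Dict

Metadata = Dict[str, Any]


def apply_renames(metadata: Metadata, renames: Dict[str, str]) -> Metadata:
    """Return a copy of one dataset's metadata with renamed properties.

    Properties that do not appear in `renames` are kept unchanged.
    """
    return {renames.get(prop, prop): value for prop, value in metadata.items()}


def update_datasets(
    datasets: Dict[str, Metadata], renames: Dict[str, str]
) -> Dict[str, Metadata]:
    """Apply the same rename mapping to every annotated dataset."""
    return {name: apply_renames(meta, renames) for name, meta in datasets.items()}


# Hypothetical example: the crowd term "temp" is standardized as "temperature".
datasets = {"dataset_a": {"archiveType": "coral", "temp": 25.1}}
updated = update_datasets(datasets, {"temp": "temperature"})
```

Returning new dictionaries rather than editing in place keeps the earlier annotations available, which is one simple way to compare metadata before and after a revision.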
A Framework for Controlled Crowdsourcing
[Figure: framework diagram with three components: Annotation Framework, Revision Framework, and Update Framework.
Labeled elements: basic editor, advanced editor, Editorial Board; datasets, data annotation, dataset metadata, dataset metadata store; core ontology, crowd vocabulary, extended crowd vocabulary; requests & issues (core ontology); core ontology revision, crowd vocabulary revision; changes (monotonic and non-monotonic), changes to crowd vocabulary; Ontology Repository (Version 0, Version 1); Snapshot Repository (snapshot, update); load/reload and reload datasets.]
Specifying metadata for a dataset
[Screenshot: dataset annotation page, with callouts for data download, completed properties, missing properties, crowd properties, category, and category annotation.]
Fostering standardization
• Suggestion of renames
• Autocompletion
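The slides do not say how rename suggestions or autocompletion are computed. As a minimal sketch, assuming simple prefix matching for autocompletion and string similarity (Python's standard `difflib`) for rename suggestions, the two features could look like the following. The function names and the `cutoff` value are hypothetical; the vocabulary terms are the core ontology class names shown elsewhere in this deck.

```python
import difflib
from typing import List, Optional


def autocomplete(prefix: str, vocabulary: List[str], limit: int = 5) -> List[str]:
    """Return up to `limit` existing terms that start with `prefix` (case-insensitive)."""
    wanted = prefix.casefold()
    return sorted(t for t in vocabulary if t.casefold().startswith(wanted))[:limit]


def suggest_rename(
    new_term: str, vocabulary: List[str], cutoff: float = 0.8
) -> Optional[str]:
    """Suggest an existing term that closely resembles `new_term`, if any.

    Returns None when no existing term is at least `cutoff` similar.
    """
    by_folded = {t.casefold(): t for t in vocabulary}
    matches = difflib.get_close_matches(
        new_term.casefold(), list(by_folded), n=1, cutoff=cutoff
    )
    return by_folded[matches[0]] if matches else None


vocabulary = [
    "ProxyArchive",
    "ProxyObservation",
    "ProxySensor",
    "Instrument",
    "InferredVariable",
]
completions = autocomplete("proxy", vocabulary)
suggestion = suggest_rename("ProxyAcrhive", vocabulary)
```

Here a misspelled new property such as "ProxyAcrhive" is steered toward the existing "ProxyArchive" instead of creating a near-duplicate term.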
Implementation: The Linked Earth Platform
• Dynamic map-based visualizations
• Dataset annotation interface
• Author credit
• Polls for decision making
• Community discussions
The Linked Earth Ontology - Overview
• Modular design (Core modules + crowd extensions)
http://linked.earth/ontology#
[Figure: ontology modules.
Linked Paleo Data Ontology (LiPD).
Core ontology: ProxyArchive, ProxyObservation, ProxySensor, Instrument, InferredVariable.
Reused vocabularies: Schema.org (Dataset), WGS84 (Position), GeoSPARQL (Position), SSN (Observation), FOAF (Person), PROV (Derivation), DC (Publication).
Crowd vocabulary extensions: (Coral, Wood, Lake Sediment…), (Spectral, Chemical…), (Rock, Snow, Tree…), (Spectrometer, Spectroscope…), (Precipitation, time…).]
The Linked Earth Ontology - versioning
• Working groups discuss new changes to the ontology
• Once a new version is approved, the core vocabulary is released and versioned outside the wiki:
  • Naming scheme: http://linked.earth/ontology/module/version
  • Example: http://linked.earth/ontology/core/1.2.0
• The latest version preserves its URI (and aggregates all modules):
  • http://linked.earth/ontology#
• Each version is documented and published in both machine-readable and human-readable form
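The naming scheme above can be exercised with a short Python sketch that builds and parses versioned URIs. It assumes, based only on the single example given, that versions are three-part numeric strings such as 1.2.0. The helper names are hypothetical and are not part of any Linked Earth tooling.

```python
import re
from typing import Iterable, NamedTuple, Tuple

BASE = "http://linked.earth/ontology"

# Assumption: versions look like "major.minor.patch", as in the 1.2.0 example.
_VERSIONED_URI = re.compile(
    rf"^{re.escape(BASE)}/([^/#]+)/(\d+)\.(\d+)\.(\d+)$"
)


class OntologyVersion(NamedTuple):
    module: str
    version: Tuple[int, int, int]


def version_uri(module: str, version: str) -> str:
    """Build a URI of the form <base>/<module>/<version>."""
    return f"{BASE}/{module}/{version}"


def parse_version_uri(uri: str) -> OntologyVersion:
    """Split a versioned URI into its module name and numeric version."""
    match = _VERSIONED_URI.match(uri)
    if match is None:
        raise ValueError(f"not a versioned ontology URI: {uri}")
    module, major, minor, patch = match.groups()
    return OntologyVersion(module, (int(major), int(minor), int(patch)))


def latest(uris: Iterable[str]) -> str:
    """Return the URI with the highest version, compared numerically."""
    return max(uris, key=lambda u: parse_version_uri(u).version)


core_uri = version_uri("core", "1.2.0")
parsed = parse_version_uri(core_uri)
```

Comparing versions as integer tuples rather than strings avoids ordering mistakes once a component reaches two digits.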
Organizing the community
• Basic editors
• Advanced editors
• Editorial board
• Working groups
• Periodic face-to-face events for community engagement
• Engagement through Twitter polls and online surveys
• Editorial board requests votes for candidate standard properties
Current Situation
Page distribution

Page type           Pages
Datasets              699
ProxyArchive          207
ProxyObservation       76
ProxySensor            63
Instrument             45
InferredVariable     1207
MeasuredVariable     3348
Working Group          12
Location              659
Person                524
Publication           875
• More than 14000 pages
• More than 150 registered users (50 active)
• One full iteration and revision of the ontology
• Identified leaders for working groups
Conclusions and Future Work
An approach for on-the-fly ontology extensions for scientific metadata annotations:
• Foster standardization through renaming, autocompletion and voting
• Editorial process to review core standard with new crowd terms
• Framework for updating dataset properties when a new standard is released
Ongoing work:
• Support editorial process for core ontology revisions
• Automating the ontology documentation updates
• Further automations of update framework

A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Metadata Annotations

  • 1. A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Metadata Annotations
      Yolanda Gil (1), Daniel Garijo (1), Varun Ratnakar (1), Deborah Khider (2), Julien Emile-Geay (2) and Nicholas McKay (3)
      (1) Information Sciences Institute, University of Southern California
      (2) Department of Earth Sciences, University of Southern California
      (3) School of Earth Sciences and Environmental Sustainability, Northern Arizona University
      @yolandagil, @dgarijov
      {gil,dgarijo}@isi.edu
      ISWC In-Use Track, Vienna, 2017
  • 2. Data reuse in paleoclimate and environmental sciences
      • Data is collected by independent scientists using idiosyncratic notation and protocols.
      • Hundreds of types of observations
          • Physical samples may be from ice, tree, coral, marine sediment, etc.
      • Hundreds of types of measures
          • Temperature, rainfall, pH, etc.
      • Diversity is so great that no one dares to embark on standards.
      • Typical situation for environmental sciences (water modeling, hydrology, etc.)
      A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Metadata Annotations (Gil et al., ISWC In-Use Track, Vienna, 2017)
  • 3. Challenges
      • How can we leverage basic core agreements?
      • How can scientists create the new properties they want to use to describe their data?
      • How can we facilitate consensus on new extensions to core agreements?
      • How can the scientific community benefit immediately from this continued expansion of core agreements?
      • Coordination and maintenance of new extensions to core agreements
  • 4. Approach: Controlled crowdsourcing
      • A metadata crowdsourcing platform
      • Controlled standardization process for new metadata properties
      • Framework for updating metadata of previously annotated datasets
  • 5. A Framework for Controlled Crowdsourcing
      [Architecture diagram; labels recovered from the figure:]
      • Frameworks: Annotation Framework, Revision Framework, Update Framework
      • Repositories: Ontology Repository, Snapshot Repository, Dataset metadata store
      • Participants: Basic editor, Advanced editor, Editorial Board
      • Artifacts: Core ontology, Crowd vocabulary, Extended crowd vocabulary, Datasets, Dataset metadata, Requests & issues, Requests & issues (core ontology)
      • Versions: Version 0, Version 1
      • Flows: Data Annotation, Snapshot, Core ontology revision, Crowd vocabulary revision, Changes to crowd vocabulary, Update, Load/reload, Reload datasets
      • Changes: monotonic changes, non-monotonic changes
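The framework diagram distinguishes monotonic from non-monotonic changes but does not define them. The sketch below assumes the usual reading: a change is monotonic when it only adds terms, and non-monotonic when it removes or renames terms, so previously annotated datasets must be reloaded. The function names and the property names in the usage lines are hypothetical and are not taken from the Linked Earth Platform.

```python
def classify_change(old_terms: set, new_terms: set) -> str:
    """Label a vocabulary revision (assumed definition, not from the slides).

    'monotonic'     : every old term survives, terms were only added.
    'non-monotonic' : at least one old term was removed or renamed.
    """
    return "monotonic" if old_terms <= new_terms else "non-monotonic"


def update_metadata(metadata: dict, renames: dict) -> dict:
    """Return a copy of a dataset's metadata with renamed property keys.

    `renames` maps an old property name to its new standard name.
    Properties that are not renamed are copied unchanged.
    """
    return {renames.get(key, key): value for key, value in metadata.items()}


# Hypothetical example: the crowd term "archive" is standardized as "archiveType".
old_vocab = {"archive", "latitude"}
new_vocab = {"archiveType", "latitude"}
kind = classify_change(old_vocab, new_vocab)

dataset = {"archive": "coral", "latitude": 12.5}
if kind == "non-monotonic":
    dataset = update_metadata(dataset, {"archive": "archiveType"})
```

Under this reading, only non-monotonic revisions require rewriting existing annotations; additions leave the stored metadata valid as it stands.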
  • 6. Specifying metadata for a dataset
      [Annotated screenshot; callouts: Data Download, Completed properties, Missing properties, Crowd Properties, Category, Annotation]
  • 7. Fostering standardization
      [Annotated screenshots; callouts: Suggestion of renames, Autocompletion]
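The slide names autocompletion and rename suggestions without saying how they are computed. The sketch below is one plausible way to get both, assuming prefix matching for autocompletion and string similarity (Python's standard difflib) for renames. The 0.8 cutoff and both function names are illustrative assumptions; the vocabulary reuses class names from the ontology overview slide.

```python
import difflib


def autocomplete(prefix: str, vocabulary: list, limit: int = 5) -> list:
    """Return up to `limit` existing terms that start with `prefix` (case-insensitive)."""
    p = prefix.lower()
    return sorted(t for t in vocabulary if t.lower().startswith(p))[:limit]


def suggest_rename(new_term: str, vocabulary: list, cutoff: float = 0.8):
    """Suggest an existing term that a newly typed term probably duplicates.

    Returns the closest existing term when its similarity ratio is at least
    `cutoff` (an exact match returns the term itself), otherwise None.
    """
    by_lower = {t.lower(): t for t in vocabulary}
    matches = difflib.get_close_matches(
        new_term.lower(), list(by_lower), n=1, cutoff=cutoff
    )
    return by_lower[matches[0]] if matches else None


vocab = ["ProxyArchive", "ProxyObservation", "ProxySensor",
         "Instrument", "InferredVariable"]

completions = autocomplete("proxy", vocab)
suggestion = suggest_rename("ProxyArchiv", vocab)   # near-duplicate of an existing term
no_suggestion = suggest_rename("Temperature", vocab)  # nothing similar enough
```

Steering contributors toward an existing term before they create a near-duplicate keeps the crowd vocabulary small, which is the stated goal of the slide.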
  • 8. Implementation: The Linked Earth Platform
      [Screenshots; callouts: Dynamic map-based visualizations, Dataset annotation interface, Author credit, Polls for decision making, Community discussions]
  • 9. The Linked Earth Ontology - Overview
      • Modular design (core modules + crowd extensions): http://linked.earth/ontology#
      [Diagram; labels recovered from the figure:]
      • Linked Paleo Data Ontology (LiPD)
      • Core ontology: ProxyArchive, ProxyObservation, ProxySensor, Instrument, InferredVariable
      • Crowd vocabulary extensions: (Coral, Wood, Lake Sediment…), (Spectral, Chemical…), (Rock, Snow, Tree…), (Spectrometer, Spectroscope…), (Precipitation, time…)
      • External vocabularies: Schema.org (Dataset), Wgs_84 (Position), Geosparql (Position), SSN (Observation), FOAF (Person), PROV (Derivation), DC (Publication)
  • 10. The Linked Earth Ontology - versioning
      • Working groups discuss new changes to the ontology.
      • Once a new version is approved, the core vocabulary is released and versioned outside the wiki:
          • Naming scheme: http://linked.earth/ontology/module/version
          • Example: http://linked.earth/ontology/core/1.2.0
      • The latest version preserves its URI (aggregates all modules):
          • http://linked.earth/ontology#
      • Each version is documented and published in both machine-readable and human-readable form.
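The naming scheme on this slide can be sketched as a small helper. The base URI and the `core/1.2.0` example come from the slide. The assumption that every version is a three-part MAJOR.MINOR.PATCH number is inferred from that single example, and the function names are hypothetical.

```python
BASE = "http://linked.earth/ontology"


def parse_version(version: str) -> tuple:
    """Split 'MAJOR.MINOR.PATCH' into integers (format assumed from '1.2.0')."""
    parts = version.split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected MAJOR.MINOR.PATCH, got {version!r}")
    return tuple(int(p) for p in parts)


def versioned_uri(module: str, version: str) -> str:
    """Build <base>/<module>/<version>, validating the version first."""
    parse_version(version)
    return f"{BASE}/{module}/{version}"


def latest_uri() -> str:
    """URI that always points at the latest version (aggregates all modules)."""
    return f"{BASE}#"


def newest(versions: list) -> str:
    """Pick the most recent version by numeric, not alphabetical, order."""
    return max(versions, key=parse_version)


core_uri = versioned_uri("core", "1.2.0")
most_recent = newest(["1.2.0", "1.10.0", "1.9.3"])
```

Comparing versions as integer tuples avoids the string-ordering trap in which "1.9.3" would sort after "1.10.0".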
  • 11. Organizing the community
      • Basic editors
      • Advanced editors
      • Editorial board
      • Working groups
      • Periodic face-to-face events for community engagement
      • Engagement through Twitter polls and online surveys
      • The editorial board requests votes on candidate standard properties
  • 12. Current Situation
      Page distribution:
          Datasets            699
          ProxyArchive        207
          ProxyObservation     76
          ProxySensor          63
          Instrument           45
          InferredVariable   1207
          MeasuredVariable   3348
          Working Group        12
          Location            659
          Person              524
          Publication         875
      • More than 14,000 pages
      • More than 150 registered users (50 active)
      • One full iteration and revision of the ontology
      • Identified leaders for working groups
  • 13. Conclusions and Future Work
      Approach for on-the-fly ontology extensions for scientific metadata annotations:
      • Foster standardization through renaming, autocompletion and voting
      • Editorial process to review the core standard with new crowd terms
      • Framework for updating dataset properties when a new standard is released
      Ongoing work:
      • Support the editorial process for core ontology revisions
      • Automate the ontology documentation updates
      • Further automation of the update framework