How Portable Are the
Metadata Standards for
Scientific Data?
Jian Qin, Kai Li
School of Information Studies
Syracuse University
Syracuse, NY, USA
Why study metadata portability?
Complex, very large metadata
standards
•  are “…unwieldy to apply…”
•  are “difficult to understand
and enact in its entirety…”
•  require customization to
tailor to specific needs
•  are costly in time and personnel
Each standard has its own
schema and tools…
•  EPA Metadata Editor
•  Metavist2
•  Morpho
•  Protocol Registration System
•  AVM Tagging Tool
…that lead to duplicated efforts
and interoperability problems
Metadata for scientific data is
at the juncture of:
•  Data-Driven Science
•  Technical Standards (RDF, RDFa, ORCID, XML, OWL, DOI)
•  Infrastructure
A few big questions
•  What action should and can we take at this
juncture as a community of metadata
practices?
•  How much do we know about metadata
standards for scientific data?
•  How can we transform the current
metadata standards into an infrastructure-
driven service?
•  Portable
•  Customizable
•  Extendable
•  Reusable
•  Easy to use
An infrastructure
perspective for
metadata
An attempt to define “metadata
infrastructure”
Semantically:
• metadata elements, vocabularies, entities, and
other metadata artifacts as the underlying
foundation to build tools, software and
applications
Technically:
• “a data model for describing the resources,
aspects of metadata encoding and storage
formats, metadata for web services, metadata
tools, usage, modification, transformation,
interoperability, and metadata crosswalk” (CLARIN)
Portability is the key
Building blocks of
metadata
Metadata for data citation
Metadata for data discovery
Metadata for data archiving
Metadata for data quality
Metadata for data
provenance
Metadata for data
management
Metadata generation
output
How portable are metadata
standards for scientific data?
Two measures of metadata portability:
• Co-occurrence of semantic elements: the
number of times semantically identical elements
are used in multiple standards
• Degree of modularity: the degree of
independence and self-descriptiveness of a
sub-structure of concept/entity in metadata
standards
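The co-occurrence measure can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the study: the standard names and element lists are hypothetical, and matching on lower-cased names is a stand-in assumption for the semantic matching described above.

```python
from collections import defaultdict


def co_occurrence(standards):
    """Map each normalized element name to the set of standards that use it."""
    usage = defaultdict(set)
    for standard, elements in standards.items():
        for element in elements:
            # Assumption: identical lower-cased names count as the same element.
            usage[element.strip().lower()].add(standard)
    return usage


def shared_elements(usage):
    """Keep only the elements that appear in more than one standard."""
    return {name: used_in for name, used_in in usage.items() if len(used_in) > 1}


# Hypothetical toy data, not taken from the study.
standards = {
    "StandardA": ["title", "creator", "instrument"],
    "StandardB": ["Title", "abstract", "creator"],
    "StandardC": ["title", "units"],
}

usage = co_occurrence(standards)
shared = shared_elements(usage)
share = len(shared) / len(usage)  # fraction of unique elements that co-occur
```

With this toy data, two of the five unique elements ("title" and "creator") occur in more than one standard.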
Data
•  Element collection: 5,800 elements from 16
scientific metadata standards
•  Element de-duplication: 4,434 unique elements
in terms of semantics
•  Categorization: 9 categories based on the
functionalities of the elements
Element distribution by standard
(chart; labeled standards: NetCDF Climate and Forecast Metadata
Convention (CF), 2,427 elements; Ecological Metadata Language)
Element distribution by standard and
category
Frequency of occurrences by
category
Top occurring elements
Element co-occurrences across
metadata standards
Of 4,434 unique elements,
539 (12.16%) occurred in
more than one standard
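As a quick check on the percentage, the arithmetic can be reproduced from the two counts reported on the slide. The variable names are illustrative only.

```python
unique_elements = 4434  # unique elements after de-duplication
co_occurring = 539      # elements found in more than one standard

share = co_occurring / unique_elements
single_standard = unique_elements - co_occurring

print(f"{share:.2%} of unique elements co-occur")
print(f"{single_standard} elements appear in only one standard")
```

The remaining elements, roughly seven out of eight, appear in a single standard only.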
(Chart axes: frequency of co-occurrences, number of elements)
Elements that most frequently
co-occurred
Modularity
Two levels of modularity:
• Level 1: having multiple XML schema files for
the whole standard;
• Level 2: having separate schemas for entities
such as person/organization, dataset, study,
instrument, and subject
All 6 standards with schema files belong to
Level 1 modularity.
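One way to make the two levels concrete is a small classifier over a standard's schema file names. This is a rough sketch under stated assumptions, not the method used in the study: the keyword list, the file names, and the extra "level 0" for a single schema file are all hypothetical, and matching on file names is only a crude proxy for inspecting the schemas themselves.

```python
# Assumption: entity-specific schemas can be spotted by keywords in file names.
ENTITY_KEYWORDS = ("person", "organization", "dataset", "study", "instrument", "subject")


def modularity_level(schema_files):
    """Return 0, 1, or 2 for a list of schema file names.

    0: zero or one schema file (hypothetical label, not defined on the slide)
    1: multiple schema files for the whole standard
    2: multiple files, at least one dedicated to a named entity
    """
    names = [name.lower() for name in schema_files]
    if len(names) < 2:
        return 0
    has_entity_schema = any(
        keyword in name for name in names for keyword in ENTITY_KEYWORDS
    )
    return 2 if has_entity_schema else 1


# Hypothetical file lists, not taken from any real standard.
single = modularity_level(["core.xsd"])
split = modularity_level(["core.xsd", "types.xsd"])
by_entity = modularity_level(["core.xsd", "person.xsd", "dataset.xsd"])
```

A real assessment would need to open the schema files and check whether each one describes a self-contained entity, which file names alone cannot confirm.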
Discussion
Portable metadata standards
• Possible?
• Feasible?
• Advantages over the one-covers-all
approach?
A metadata infrastructure for scientific data
• Bridge the gap between existing semantic and
entity resources and metadata generation
• Much to be researched…
Further research
More questions than answers from this study:
• What should constitute a metadata
infrastructure?
• How can the gaps be filled or narrowed
between the infrastructure resources and
metadata applications?
• Is it possible or is there a need to streamline
the metadata scheme design practice toward
a metadata infrastructure?
• …and the list can go on
Questions and
comments?

More Related Content

What's hot

IEEE and ORCID Implementation
IEEE and ORCID ImplementationIEEE and ORCID Implementation
IEEE and ORCID Implementation
ORCID, Inc
 
MetadataTheory: Introduction to Metadata (5th of 10)
MetadataTheory: Introduction to Metadata (5th of 10)MetadataTheory: Introduction to Metadata (5th of 10)
MetadataTheory: Introduction to Metadata (5th of 10)
Nikos Palavitsinis, PhD
 
Trust and Accountability: experiences from the FAIRDOM Commons Initiative.
Trust and Accountability: experiences from the FAIRDOM Commons Initiative.Trust and Accountability: experiences from the FAIRDOM Commons Initiative.
Trust and Accountability: experiences from the FAIRDOM Commons Initiative.
Carole Goble
 
Linked Open Data and Schema.org Panel
Linked Open Data and Schema.org PanelLinked Open Data and Schema.org Panel
Linked Open Data and Schema.org Panel
Adrian Paschke
 
Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...
Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...
Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...
Jenn Riley
 
FAIRy stories: tales from building the FAIR Research Commons
FAIRy stories: tales from building the FAIR Research CommonsFAIRy stories: tales from building the FAIR Research Commons
FAIRy stories: tales from building the FAIR Research Commons
Carole Goble
 
How are we Faring with FAIR? (and what FAIR is not)
How are we Faring with FAIR? (and what FAIR is not)How are we Faring with FAIR? (and what FAIR is not)
How are we Faring with FAIR? (and what FAIR is not)
Carole Goble
 
Data Citation Implementation Guidelines By Tim Clark
Data Citation Implementation Guidelines By Tim ClarkData Citation Implementation Guidelines By Tim Clark
Data Citation Implementation Guidelines By Tim Clark
datascienceiqss
 
FAIRsharing and FAIRmetrics - RDA, March 2018
FAIRsharing and FAIRmetrics - RDA, March 2018FAIRsharing and FAIRmetrics - RDA, March 2018
FAIRsharing and FAIRmetrics - RDA, March 2018
Susanna-Assunta Sansone
 
Knowledge Discovery in an Agents Environment
Knowledge Discovery in an Agents EnvironmentKnowledge Discovery in an Agents Environment
Knowledge Discovery in an Agents Environment
ManjulaPatel
 
The eCrystals Federation
The eCrystals FederationThe eCrystals Federation
The eCrystals Federation
ManjulaPatel
 
RDAP 16 Lightning: An Open Science Framework for Solving Institutional Challe...
RDAP 16 Lightning: An Open Science Framework for Solving Institutional Challe...RDAP 16 Lightning: An Open Science Framework for Solving Institutional Challe...
RDAP 16 Lightning: An Open Science Framework for Solving Institutional Challe...
ASIS&T
 
Completepresentation
CompletepresentationCompletepresentation
Completepresentation
Andrew Wesolek
 
FAIRsharing Keynote - International Workshop on Sharing, Citation and Publica...
FAIRsharing Keynote - International Workshop on Sharing, Citation and Publica...FAIRsharing Keynote - International Workshop on Sharing, Citation and Publica...
FAIRsharing Keynote - International Workshop on Sharing, Citation and Publica...
Peter McQuilton
 
HKU Data Curation MLIM7350 Class 9
HKU Data Curation MLIM7350 Class 9 HKU Data Curation MLIM7350 Class 9
HKU Data Curation MLIM7350 Class 9
Scott Edmunds
 
A FAIR Data Sharing Framework for Large-Scale Human Cancer Proteogenomics
A FAIR Data Sharing Framework for Large-Scale Human Cancer ProteogenomicsA FAIR Data Sharing Framework for Large-Scale Human Cancer Proteogenomics
A FAIR Data Sharing Framework for Large-Scale Human Cancer Proteogenomics
Brett Tully
 
Preservation of Research Data: Dataverse / Archivematica Integration by Allan...
Preservation of Research Data: Dataverse / Archivematica Integration by Allan...Preservation of Research Data: Dataverse / Archivematica Integration by Allan...
Preservation of Research Data: Dataverse / Archivematica Integration by Allan...
datascienceiqss
 
Researh data management
Researh data managementResearh data management
Researh data management
Nikesh Narayanan
 
Rots RDAP11 Data Archives in Federal Agencies
Rots RDAP11 Data Archives in Federal AgenciesRots RDAP11 Data Archives in Federal Agencies
Rots RDAP11 Data Archives in Federal Agencies
ASIS&T
 

What's hot (20)

IEEE and ORCID Implementation
IEEE and ORCID ImplementationIEEE and ORCID Implementation
IEEE and ORCID Implementation
 
MetadataTheory: Introduction to Metadata (5th of 10)
MetadataTheory: Introduction to Metadata (5th of 10)MetadataTheory: Introduction to Metadata (5th of 10)
MetadataTheory: Introduction to Metadata (5th of 10)
 
Trust and Accountability: experiences from the FAIRDOM Commons Initiative.
Trust and Accountability: experiences from the FAIRDOM Commons Initiative.Trust and Accountability: experiences from the FAIRDOM Commons Initiative.
Trust and Accountability: experiences from the FAIRDOM Commons Initiative.
 
Linked Open Data and Schema.org Panel
Linked Open Data and Schema.org PanelLinked Open Data and Schema.org Panel
Linked Open Data and Schema.org Panel
 
Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...
Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...
Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...
 
FAIRy stories: tales from building the FAIR Research Commons
FAIRy stories: tales from building the FAIR Research CommonsFAIRy stories: tales from building the FAIR Research Commons
FAIRy stories: tales from building the FAIR Research Commons
 
How are we Faring with FAIR? (and what FAIR is not)
How are we Faring with FAIR? (and what FAIR is not)How are we Faring with FAIR? (and what FAIR is not)
How are we Faring with FAIR? (and what FAIR is not)
 
Data Citation Implementation Guidelines By Tim Clark
Data Citation Implementation Guidelines By Tim ClarkData Citation Implementation Guidelines By Tim Clark
Data Citation Implementation Guidelines By Tim Clark
 
FAIRsharing and FAIRmetrics - RDA, March 2018
FAIRsharing and FAIRmetrics - RDA, March 2018FAIRsharing and FAIRmetrics - RDA, March 2018
FAIRsharing and FAIRmetrics - RDA, March 2018
 
Knowledge Discovery in an Agents Environment
Knowledge Discovery in an Agents EnvironmentKnowledge Discovery in an Agents Environment
Knowledge Discovery in an Agents Environment
 
The eCrystals Federation
The eCrystals FederationThe eCrystals Federation
The eCrystals Federation
 
RDAP 16 Lightning: An Open Science Framework for Solving Institutional Challe...
RDAP 16 Lightning: An Open Science Framework for Solving Institutional Challe...RDAP 16 Lightning: An Open Science Framework for Solving Institutional Challe...
RDAP 16 Lightning: An Open Science Framework for Solving Institutional Challe...
 
Completepresentation
CompletepresentationCompletepresentation
Completepresentation
 
FAIRsharing Keynote - International Workshop on Sharing, Citation and Publica...
FAIRsharing Keynote - International Workshop on Sharing, Citation and Publica...FAIRsharing Keynote - International Workshop on Sharing, Citation and Publica...
FAIRsharing Keynote - International Workshop on Sharing, Citation and Publica...
 
HKU Data Curation MLIM7350 Class 9
HKU Data Curation MLIM7350 Class 9 HKU Data Curation MLIM7350 Class 9
HKU Data Curation MLIM7350 Class 9
 
A FAIR Data Sharing Framework for Large-Scale Human Cancer Proteogenomics
A FAIR Data Sharing Framework for Large-Scale Human Cancer ProteogenomicsA FAIR Data Sharing Framework for Large-Scale Human Cancer Proteogenomics
A FAIR Data Sharing Framework for Large-Scale Human Cancer Proteogenomics
 
292 daniel dollar ssp yale_28_may2008
292 daniel dollar ssp yale_28_may2008292 daniel dollar ssp yale_28_may2008
292 daniel dollar ssp yale_28_may2008
 
Preservation of Research Data: Dataverse / Archivematica Integration by Allan...
Preservation of Research Data: Dataverse / Archivematica Integration by Allan...Preservation of Research Data: Dataverse / Archivematica Integration by Allan...
Preservation of Research Data: Dataverse / Archivematica Integration by Allan...
 
Researh data management
Researh data managementResearh data management
Researh data management
 
Rots RDAP11 Data Archives in Federal Agencies
Rots RDAP11 Data Archives in Federal AgenciesRots RDAP11 Data Archives in Federal Agencies
Rots RDAP11 Data Archives in Federal Agencies
 

Viewers also liked

EBA and BPM
EBA and BPMEBA and BPM
EBA and BPM
billpoole
 
Adsplay Sociality Rocks 22 02 2012
Adsplay Sociality Rocks 22 02 2012 Adsplay Sociality Rocks 22 02 2012
Adsplay Sociality Rocks 22 02 2012
Christine Vatutina
 
República bolivariana de venezuela
República bolivariana de venezuelaRepública bolivariana de venezuela
República bolivariana de venezuela
Deliana Aguero
 
Presentation SmaileX.com
Presentation SmaileX.comPresentation SmaileX.com
Presentation SmaileX.comSmaileX Project
 
Infrastructure, Standards, and Policies for Research Data Management
Infrastructure, Standards, and Policies for Research Data Management Infrastructure, Standards, and Policies for Research Data Management
Infrastructure, Standards, and Policies for Research Data Management
Jian Qin
 
Data Science: An Emerging Field for Future Jobs
Data Science: An Emerging Field for Future JobsData Science: An Emerging Field for Future Jobs
Data Science: An Emerging Field for Future Jobs
Jian Qin
 
Lineamientos
 Lineamientos Lineamientos
Lineamientos
Deliana Aguero
 
República bolivariana de venezuela
República bolivariana de venezuelaRepública bolivariana de venezuela
República bolivariana de venezuela
Deliana Aguero
 
Atlas mundial del medio ambiente
Atlas mundial del medio ambienteAtlas mundial del medio ambiente
Atlas mundial del medio ambiente
Deliana Aguero
 
Educating a New Breed of Data Scientists for Scientific Data Management
Educating a New Breed of Data Scientists for Scientific Data Management Educating a New Breed of Data Scientists for Scientific Data Management
Educating a New Breed of Data Scientists for Scientific Data Management
Jian Qin
 
Kpi digital проектов
Kpi digital проектовKpi digital проектов
Kpi digital проектов
Christine Vatutina
 
SDL/SSDL для руководителей
SDL/SSDL для руководителейSDL/SSDL для руководителей
SDL/SSDL для руководителей
Valery Boronin
 
Archives à conserver Associations loi 1901 .compressed 1
Archives à conserver Associations loi 1901 .compressed 1Archives à conserver Associations loi 1901 .compressed 1
Archives à conserver Associations loi 1901 .compressed 1
Dominique Gayraud
 

Viewers also liked (20)

Transmedia Ailove conference
Transmedia Ailove conferenceTransmedia Ailove conference
Transmedia Ailove conference
 
EBA and BPM
EBA and BPMEBA and BPM
EBA and BPM
 
Adsplay Sociality Rocks 22 02 2012
Adsplay Sociality Rocks 22 02 2012 Adsplay Sociality Rocks 22 02 2012
Adsplay Sociality Rocks 22 02 2012
 
República bolivariana de venezuela
República bolivariana de venezuelaRepública bolivariana de venezuela
República bolivariana de venezuela
 
Kinerja dan daya saing indonesia
Kinerja dan daya saing indonesiaKinerja dan daya saing indonesia
Kinerja dan daya saing indonesia
 
Sentiments d'una vall
Sentiments d'una vallSentiments d'una vall
Sentiments d'una vall
 
Presentation SmaileX.com
Presentation SmaileX.comPresentation SmaileX.com
Presentation SmaileX.com
 
Infrastructure, Standards, and Policies for Research Data Management
Infrastructure, Standards, and Policies for Research Data Management Infrastructure, Standards, and Policies for Research Data Management
Infrastructure, Standards, and Policies for Research Data Management
 
Viatge
ViatgeViatge
Viatge
 
Data Science: An Emerging Field for Future Jobs
Data Science: An Emerging Field for Future JobsData Science: An Emerging Field for Future Jobs
Data Science: An Emerging Field for Future Jobs
 
Lineamientos
 Lineamientos Lineamientos
Lineamientos
 
República bolivariana de venezuela
República bolivariana de venezuelaRepública bolivariana de venezuela
República bolivariana de venezuela
 
Atlas mundial del medio ambiente
Atlas mundial del medio ambienteAtlas mundial del medio ambiente
Atlas mundial del medio ambiente
 
Peranan ilmuwan
Peranan ilmuwanPeranan ilmuwan
Peranan ilmuwan
 
42 peranan ilmuwan
42 peranan ilmuwan42 peranan ilmuwan
42 peranan ilmuwan
 
Educating a New Breed of Data Scientists for Scientific Data Management
Educating a New Breed of Data Scientists for Scientific Data Management Educating a New Breed of Data Scientists for Scientific Data Management
Educating a New Breed of Data Scientists for Scientific Data Management
 
Kpi digital проектов
Kpi digital проектовKpi digital проектов
Kpi digital проектов
 
SDL/SSDL для руководителей
SDL/SSDL для руководителейSDL/SSDL для руководителей
SDL/SSDL для руководителей
 
Zara1
Zara1Zara1
Zara1
 
Archives à conserver Associations loi 1901 .compressed 1
Archives à conserver Associations loi 1901 .compressed 1Archives à conserver Associations loi 1901 .compressed 1
Archives à conserver Associations loi 1901 .compressed 1
 

Similar to How Portable Are the Metadata Standards for Scientific Data?

RDAP 15: Beyond Metadata: Leveraging the “README” to support disciplinary Doc...
RDAP 15: Beyond Metadata: Leveraging the “README” to support disciplinary Doc...RDAP 15: Beyond Metadata: Leveraging the “README” to support disciplinary Doc...
RDAP 15: Beyond Metadata: Leveraging the “README” to support disciplinary Doc...
ASIS&T
 
Essentials 4 Data Support: a fine course in FAIR Data Support
Essentials 4 Data Support: a fine course in FAIR Data SupportEssentials 4 Data Support: a fine course in FAIR Data Support
Essentials 4 Data Support: a fine course in FAIR Data Support
Ellen Verbakel
 
10-15-13 “Metadata and Repository Services for Research Data Curation” Presen...
10-15-13 “Metadata and Repository Services for Research Data Curation” Presen...10-15-13 “Metadata and Repository Services for Research Data Curation” Presen...
10-15-13 “Metadata and Repository Services for Research Data Curation” Presen...
DuraSpace
 
FSCI Data Discovery
FSCI Data DiscoveryFSCI Data Discovery
FSCI Data Discovery
ARDC
 
Umesha naik metadata
Umesha naik metadataUmesha naik metadata
Umesha naik metadata
Umesha Naik
 
Rdap12 wrap up reagan moore
Rdap12 wrap up reagan mooreRdap12 wrap up reagan moore
Rdap12 wrap up reagan moore
ASIS&T
 
Caplan and York, 'What It Takes To Make It Last: E-Resources Preservation"
Caplan and York, 'What It Takes To Make It Last:  E-Resources Preservation"Caplan and York, 'What It Takes To Make It Last:  E-Resources Preservation"
Caplan and York, 'What It Takes To Make It Last: E-Resources Preservation"
National Information Standards Organization (NISO)
 
Semantic Similarity and Selection of Resources Published According to Linked ...
Semantic Similarity and Selection of Resources Published According to Linked ...Semantic Similarity and Selection of Resources Published According to Linked ...
Semantic Similarity and Selection of Resources Published According to Linked ...
Riccardo Albertoni
 
Duraspace Hot Topics Series 6: Metadata and Repository Services
Duraspace Hot Topics Series 6: Metadata and Repository ServicesDuraspace Hot Topics Series 6: Metadata and Repository Services
Duraspace Hot Topics Series 6: Metadata and Repository Services
Matthew Critchlow
 
eROSA Stakeholder WS1: Data discovery through federated dataset catalogues
eROSA Stakeholder WS1: Data discovery through federated dataset catalogueseROSA Stakeholder WS1: Data discovery through federated dataset catalogues
eROSA Stakeholder WS1: Data discovery through federated dataset catalogues
e-ROSA
 
Hansen Metadata for Institutional Repositories
Hansen Metadata for Institutional RepositoriesHansen Metadata for Institutional Repositories
Hansen Metadata for Institutional Repositories
National Information Standards Organization (NISO)
 
Managing your data paget
Managing your data pagetManaging your data paget
Managing your data paget
TERN Australia
 
A theory of Metadata enriching & filtering
A theory of  Metadata enriching & filteringA theory of  Metadata enriching & filtering
A theory of Metadata enriching & filtering
Cuerpo Academico 'Estudios de la Información'
 
Impact of Covid-19 on Learning and Education
Impact of Covid-19 on Learning and EducationImpact of Covid-19 on Learning and Education
Impact of Covid-19 on Learning and Education
MANENDRASINGH30
 
Metadata mapping
Metadata mappingMetadata mapping
Metadata mappingVlad Vega
 
The Future of Semantics on the Web
The Future of Semantics on the WebThe Future of Semantics on the Web
The Future of Semantics on the Web
John Domingue
 
Semantics-enhanced Cyberinfrastructure for ICMSE : Interoperability, Analyti...
Semantics-enhanced Cyberinfrastructure for ICMSE :  Interoperability, Analyti...Semantics-enhanced Cyberinfrastructure for ICMSE :  Interoperability, Analyti...
Semantics-enhanced Cyberinfrastructure for ICMSE : Interoperability, Analyti...
Artificial Intelligence Institute at UofSC
 
MetadataTheory: Metadata Standards (6th of 10)
MetadataTheory: Metadata Standards (6th of 10)MetadataTheory: Metadata Standards (6th of 10)
MetadataTheory: Metadata Standards (6th of 10)
Nikos Palavitsinis, PhD
 
Data discovery through federated dataset catalogs
Data discovery through federated dataset catalogsData discovery through federated dataset catalogs
Data discovery through federated dataset catalogs
Valeria Pesce
 

Similar to How Portable Are the Metadata Standards for Scientific Data? (20)

RDAP 15: Beyond Metadata: Leveraging the “README” to support disciplinary Doc...
RDAP 15: Beyond Metadata: Leveraging the “README” to support disciplinary Doc...RDAP 15: Beyond Metadata: Leveraging the “README” to support disciplinary Doc...
RDAP 15: Beyond Metadata: Leveraging the “README” to support disciplinary Doc...
 
Essentials 4 Data Support: a fine course in FAIR Data Support
Essentials 4 Data Support: a fine course in FAIR Data SupportEssentials 4 Data Support: a fine course in FAIR Data Support
Essentials 4 Data Support: a fine course in FAIR Data Support
 
10-15-13 “Metadata and Repository Services for Research Data Curation” Presen...
10-15-13 “Metadata and Repository Services for Research Data Curation” Presen...10-15-13 “Metadata and Repository Services for Research Data Curation” Presen...
10-15-13 “Metadata and Repository Services for Research Data Curation” Presen...
 
FSCI Data Discovery
FSCI Data DiscoveryFSCI Data Discovery
FSCI Data Discovery
 
Presentation 16 may keynote karin bredenberg
Presentation 16 may keynote karin bredenbergPresentation 16 may keynote karin bredenberg
Presentation 16 may keynote karin bredenberg
 
Umesha naik metadata
Umesha naik metadataUmesha naik metadata
Umesha naik metadata
 
Rdap12 wrap up reagan moore
Rdap12 wrap up reagan mooreRdap12 wrap up reagan moore
Rdap12 wrap up reagan moore
 
Caplan and York, 'What It Takes To Make It Last: E-Resources Preservation"
Caplan and York, 'What It Takes To Make It Last:  E-Resources Preservation"Caplan and York, 'What It Takes To Make It Last:  E-Resources Preservation"
Caplan and York, 'What It Takes To Make It Last: E-Resources Preservation"
 
Semantic Similarity and Selection of Resources Published According to Linked ...
Semantic Similarity and Selection of Resources Published According to Linked ...Semantic Similarity and Selection of Resources Published According to Linked ...
Semantic Similarity and Selection of Resources Published According to Linked ...
 
Duraspace Hot Topics Series 6: Metadata and Repository Services
Duraspace Hot Topics Series 6: Metadata and Repository ServicesDuraspace Hot Topics Series 6: Metadata and Repository Services
Duraspace Hot Topics Series 6: Metadata and Repository Services
 
eROSA Stakeholder WS1: Data discovery through federated dataset catalogues
eROSA Stakeholder WS1: Data discovery through federated dataset catalogueseROSA Stakeholder WS1: Data discovery through federated dataset catalogues
eROSA Stakeholder WS1: Data discovery through federated dataset catalogues
 
Hansen Metadata for Institutional Repositories
Hansen Metadata for Institutional RepositoriesHansen Metadata for Institutional Repositories
Hansen Metadata for Institutional Repositories
 
Managing your data paget
Managing your data pagetManaging your data paget
Managing your data paget
 
A theory of Metadata enriching & filtering
A theory of  Metadata enriching & filteringA theory of  Metadata enriching & filtering
A theory of Metadata enriching & filtering
 
Impact of Covid-19 on Learning and Education
Impact of Covid-19 on Learning and EducationImpact of Covid-19 on Learning and Education
Impact of Covid-19 on Learning and Education
 
Metadata mapping
Metadata mappingMetadata mapping
Metadata mapping
 
The Future of Semantics on the Web
The Future of Semantics on the WebThe Future of Semantics on the Web
The Future of Semantics on the Web
 
Semantics-enhanced Cyberinfrastructure for ICMSE : Interoperability, Analyti...
Semantics-enhanced Cyberinfrastructure for ICMSE :  Interoperability, Analyti...Semantics-enhanced Cyberinfrastructure for ICMSE :  Interoperability, Analyti...
Semantics-enhanced Cyberinfrastructure for ICMSE : Interoperability, Analyti...
 
MetadataTheory: Metadata Standards (6th of 10)
MetadataTheory: Metadata Standards (6th of 10)MetadataTheory: Metadata Standards (6th of 10)
MetadataTheory: Metadata Standards (6th of 10)
 
Data discovery through federated dataset catalogs
Data discovery through federated dataset catalogsData discovery through federated dataset catalogs
Data discovery through federated dataset catalogs
 

Recently uploaded

From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
Product School
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
DianaGray10
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
Cheryl Hung
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
Alison B. Lowndes
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
Paul Groth
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
RTTS
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Product School
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Inflectra
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Product School
 
How Portable Are the Metadata Standards for Scientific Data?

  • 1. How Portable Are the Metadata Standards for Scientific Data? Jian Qin Kai Li School of Information Studies Syracuse University Syracuse, NY, USA
  • 2. Why study metadata portability? Complex, very large metadata standards •  are “…unwieldy to apply…” •  are “difficult to understand and enact in its entirety…” •  require customization to tailor to specific needs •  costly in time and personnel
  • 3. Each standard has its own schema and tools… EPA Metadata Editor; Metavist2; Morpho; Protocol Registration System; AVM Tagging Tool …that lead to duplicated efforts and interoperability problems
  • 4. Metadata for scientific data is at the juncture of: Data-Driven Science; RDF, RDFa, ORCID, XML, OWL, DOI; Technical Standards; Infrastructure
  • 5. A few big questions • What action should and can we take at this juncture as a community of metadata practices? • How much do we know about metadata standards for scientific data? • How can we transform the current metadata standards into an infrastructure-driven service?
  • 6. An infrastructure perspective for metadata • Portable • Customizable • Extendable • Reusable • Easy to use
  • 7. An attempt to define “metadata infrastructure” Semantically: • metadata elements, vocabularies, entities, and other metadata artifacts as the underlying foundation to build tools, software and applications Technically: • “a data model for describing the resources, aspects of metadata encoding and storage formats, metadata for web services, metadata tools, usage, modification, transformation, interoperability, and metadata crosswalk” (CLARIN)
  • 8. Portability is the key Building blocks of metadata Metadata for data citation Metadata for data discovery Metadata for data archiving Metadata for data quality Metadata for data provenance Metadata for data management Metadata generation output
  • 9. How portable are metadata standards for scientific data? Two measures of metadata portability: • Co-occurrence of semantic elements: the number of times semantically identical elements are used in multiple standards • Degree of modularity: the degree of independence and self-descriptiveness of a sub-structure of a concept/entity in metadata standards
  • 10. Data • Element Collection: 5,800 elements from 16 scientific metadata standards • Element De-duplication: 4,434 semantically unique elements • Categorization: 9 categories based on functionalities of the elements
  • 11. Element distribution by standard NetCDF Climate and Forecast Metadata Convention (CF) 2427 elements Ecological Metadata Language
  • 12. Element distribution by standard and category
  • 15. Element co-occurrences across metadata standards Of 4,434 unique elements, 539 elements (12.16%) occurred in more than one standard (chart axes: frequency of co-occurrences; number of elements)
  • 16. Elements that most frequently co-occurred
  • 17. Modularity Two levels of modularity: • Level 1: having multiple XML schema files for the whole standard; • Level 2: having separate schemas for entities such as person/organization, dataset, study, instrument, and subject Of the 6 standards with schema files, all of them belong to Level 1 modularity.
  • 18. Discussion Portable metadata standards • Possible? • Feasible? • Advantages over the one-covers-all approach? A metadata infrastructure for scientific data • Bridge the gap between existing semantic and entity resources and metadata generation • Much to be researched…
  • 19. Further research More questions than answers from this study: • What should a metadata infrastructure constitute? • How can the gaps be filled or narrowed between the infrastructure resources and metadata applications? • Is it possible or is there a need to streamline the metadata scheme design practice toward a metadata infrastructure? • …and the list can go on
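
The co-occurrence measure from slides 9 and 15 can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure: the standard names and element lists are hypothetical toy data, and `normalize` is a crude stand-in for the semantic matching the study would have required (it only case-folds and strips separators). The last line re-derives the 12.16% share reported on slide 15 from the two counts given there.

```python
from collections import defaultdict


def normalize(name: str) -> str:
    """Crude stand-in for semantic matching (an assumption):
    lower-case the name and drop every non-alphanumeric character."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def co_occurrence(standards: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each normalized element to the set of standards that use it."""
    usage: dict[str, set[str]] = defaultdict(set)
    for standard, elements in standards.items():
        for element in elements:
            usage[normalize(element)].add(standard)
    return dict(usage)


def shared_share(usage: dict[str, set[str]]) -> tuple[int, int, float]:
    """Return (elements in more than one standard, unique elements, percent)."""
    shared = sum(1 for used_by in usage.values() if len(used_by) > 1)
    unique = len(usage)
    return shared, unique, 100.0 * shared / unique


# Hypothetical toy data; not taken from the 16 standards in the study.
toy_standards = {
    "StandardA": ["title", "creator", "geographic_coverage"],
    "StandardB": ["Title", "Creator", "instrument"],
    "StandardC": ["title", "units"],
}

usage = co_occurrence(toy_standards)
shared, unique, percent = shared_share(usage)
print(f"{shared} of {unique} unique elements are shared ({percent:.2f}%)")

# Re-deriving the share on slide 15 from its two reported counts.
reported_percent = round(100 * 539 / 4434, 2)
print(reported_percent)
```

Keeping the full set of standards per element, rather than only a count, makes it easy to list which standards share a given element, which is the information a crosswalk between standards would need.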