Jian Qin, Syracuse University

Jian Qin, Syracuse University; Alex Ball, UKOLN; Jane Greenberg, University of North Carolina at Chapel Hill: “Functional and Architectural Requirements for Metadata: Supporting Discovery and Management of Scientific Data”

Panel: Linked data and metadata (co-sponsored by the ASIS&T Digital Libraries SIG)
Research Data Access & Preservation Summit 2013
Baltimore, MD April 4, 2013 #rdap13

  • Although bibliographic and indexing databases already hold vast numbers of records, identity metadata is still created over and over because interoperability and infrastructure services are lacking
    1. RDAP2013, http://www.asis.org/rdap/ Functional and Architectural Requirements for Metadata: Supporting Discovery and Management of Scientific Data. Jian Qin, School of Information Studies, Syracuse University, USA; Alex Ball, Digital Curation Centre, UKOLN, University of Bath, UK; Jane Greenberg, School of Information and Library Science, University of North Carolina at Chapel Hill, USA. Research Data Access & Preservation Summit, Baltimore, MD, 2013
    2. Metadata standards for scientific data
       • Ecological Metadata Language (EML)
       • CSDGM
       • Access to Biological Collections Data (ABCD)
       • Darwin Core
    3. Many tools have been developed…
    4. Many tools have been developed…
    5. Many tools have been developed…
    6. Motivation for adoption
       • Standardize the format and terminology of metadata
       • Enable fast and effective discovery of datasets across different data repositories
       • Enable data sharing, reuse, and preservation
       • Provide information for obtaining datasets from the data owners
    7. Hindering factors
       • large numbers of elements
       • many layers in structure
         – “Unwieldy to apply”
       • created for manual data entry, rather than for automatic generation
       • steep learning curve
       • difficult to automate metadata generation
       • unnecessary duplicate data entry
       • high costs in time, resources, and personnel expertise
    8. Same entity data repeated in the same record…
       Seamless Daily Precipitation for the Conterminous United States
       Metadata:
       • Identification_Information
       • Data_Quality_Information
       • Spatial_Data_Organization_Information
       • Spatial_Reference_Information
       • Entity_and_Attribute_Information
       • Distribution_Information
       • Metadata_Reference_Information
    9. …and they are already in…
    10. Research questions
       • What functions do metadata standards for scientific data serve?
       • How should metadata standards for scientific data be modeled to support these functions by meeting the associated requirements?
    11. Functions expected
       • Resource discovery and use
       • Data interoperability
       • Automatic and semi-automatic metadata generation
       • Linking of publications and underlying datasets
       • Data/metadata quality control
       • Data security
    12. Metadata requirements for scientific data
    13.
    14. Architectural view
    15. Identity metadata
       • Person: researcherID, URI, FOAF, ORCID
       • Institution: ORCID, URI
       • Data object: DOI, Handle, URI
       • Associated publication: DOI
       • Name repositories
       • Linked data architecture
       • Customizable research group/community member name lists
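Reusing person identifiers such as ORCID across records only pays off if the stored values are well formed. The following is a minimal sketch, assuming the ISO 7064 MOD 11-2 check character that ORCID documents for its iDs; the function names are illustrative and do not come from the slides.

```python
def orcid_check_char(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True when a 16-character ORCID iD (hyphens optional) passes its checksum."""
    compact = orcid.replace("-", "").upper()
    base, check = compact[:15], compact[15:]
    if len(compact) != 16 or not (base.isascii() and base.isdigit()):
        return False
    return orcid_check_char(base) == check


# 0000-0002-1825-0097 is the iD ORCID uses for its fictitious example researcher.
print(is_valid_orcid("0000-0002-1825-0097"))  # True
print(is_valid_orcid("0000-0002-1825-0098"))  # False (wrong check character)
```

A check like this could run when a name list is loaded, so that malformed identifiers are caught once rather than in every metadata record that reuses them.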
    16. Semantic metadata
       • Large semantic resources available in linked data format, but usually not suitable for representing scientific data because they are designed for publications, especially books and journals (containers)
       • Format is contemporary but the content is far from it
       • Smaller, specialized semantic resources are necessary for automatic semantic metadata generation
    17. Contextual metadata
       • Provenance
         – Provenance data model (W3C, http://www.w3.org/TR/prov-primer/)
         – Provenance data represent the origins of digital objects and describe the entities and activities involved in producing and delivering or otherwise influencing a given object.
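The entities and activities named in the provenance description can be sketched as plain records with relations between them. This is an illustrative data structure only, not a W3C serialization; the relation names follow PROV terminology, while the class, the helper function, and the `ex:` identifiers are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ProvDocument:
    """Stores PROV-style statements as (relation, subject, object) triples."""
    relations: list = field(default_factory=list)

    def was_generated_by(self, entity: str, activity: str) -> None:
        self.relations.append(("wasGeneratedBy", entity, activity))

    def used(self, activity: str, entity: str) -> None:
        self.relations.append(("used", activity, entity))

    def was_associated_with(self, activity: str, agent: str) -> None:
        self.relations.append(("wasAssociatedWith", activity, agent))


def sources_of(doc: ProvDocument, entity: str) -> set:
    """Return every entity that the given entity directly or indirectly came from."""
    found = set()
    stack = [entity]
    while stack:
        current = stack.pop()
        # Find the activities that generated the current entity ...
        for rel, subj, activity in doc.relations:
            if rel != "wasGeneratedBy" or subj != current:
                continue
            # ... and the entities those activities used as input.
            for rel2, act2, source in doc.relations:
                if rel2 == "used" and act2 == activity and source not in found:
                    found.add(source)
                    stack.append(source)
    return found


doc = ProvDocument()
doc.used("ex:cleaning", "ex:raw-readings")
doc.was_generated_by("ex:clean-readings", "ex:cleaning")
doc.was_associated_with("ex:cleaning", "ex:field-team")
doc.used("ex:gridding", "ex:clean-readings")
doc.was_generated_by("ex:gridded-product", "ex:gridding")

print(sorted(sources_of(doc, "ex:gridded-product")))
# ['ex:clean-readings', 'ex:raw-readings']
```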
    18. Geospatial metadata (diagram of standards by domain)
       • Biological sciences: Ecological Metadata Language (EML), Darwin Core (DwC), Biological Data Profile; georeferencing elements
       • Climate: NetCDF Climate and Forecast (CF) Metadata Conventions
       • FGDC: CSDGM; CSDGM Profiles; Shoreline Metadata Profile
       • Astronomy: Astronomy Visualization Metadata Standard
       • ISO 19115:2003 Geographic information - Metadata
    19. Temporal metadata
       • Mean solar time
       • Civil time
       • GPS time
       • Terrestrial time
       • Atomic time
       • …
       • Geologic time
       • Different measurement systems result in different units and format
       • Conversion between systems
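Conversion between time systems can be as simple as a fixed offset for some pairs and table-driven for others. A minimal sketch, using the standard fixed offsets TAI = GPS + 19 s and TT = TAI + 32.184 s; conversion to UTC needs the TAI minus UTC offset in force at that date, which this sketch takes as a caller-supplied argument rather than looking it up. The function names are illustrative.

```python
from datetime import datetime, timedelta

# Fixed offsets between atomic-based time scales; these do not change over time.
TAI_MINUS_GPS = timedelta(seconds=19)
TT_MINUS_TAI = timedelta(seconds=32, milliseconds=184)


def gps_to_tai(t: datetime) -> datetime:
    """Convert a GPS time reading to International Atomic Time (TAI)."""
    return t + TAI_MINUS_GPS


def tai_to_tt(t: datetime) -> datetime:
    """Convert TAI to Terrestrial Time (TT)."""
    return t + TT_MINUS_TAI


def gps_to_utc(t: datetime, tai_minus_utc_seconds: int) -> datetime:
    """Convert GPS time to UTC given the TAI-UTC offset valid at that date.

    The offset depends on the leap seconds announced so far, so it must come
    from a leap-second table; it is an input here, not something computed.
    """
    return gps_to_tai(t) - timedelta(seconds=tai_minus_utc_seconds)


reading = datetime(2013, 4, 4, 12, 0, 16)      # a timestamp on the GPS scale
print(gps_to_tai(reading))                     # 2013-04-04 12:00:35
print(tai_to_tt(gps_to_tai(reading)))          # 2013-04-04 12:01:07.184000
print(gps_to_utc(reading, 35))                 # 2013-04-04 12:00:00 (TAI-UTC was 35 s in April 2013)
```

Recording which time scale a timestamp uses in the metadata is what makes a conversion like this possible later; the naive `datetime` values above carry no such label on their own.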
    20. • The least effort principle
       • The infrastructure service principle
       • The portable principle
    21. TABLE 1. Mapping data user tasks with metadata functions and architectural building blocks
       Data user task | Metadata function | Architectural building block
       Discover | Descriptive metadata | Identity and semantic metadata
       Identify | Descriptive metadata | Identity metadata
       Select | Descriptive, technical metadata | Identity, semantic, scientific context, geospatial, temporal, miscellany metadata
       Obtain | Descriptive metadata | Identity metadata
       Verify | Descriptive metadata | Scientific context metadata
       Analyze | | Scientific context, geospatial, and temporal metadata
       Manage | Descriptive, administrative, structural, and technical metadata | Identity, semantic, scientific context, geospatial, temporal, miscellany metadata
       Archive | Descriptive, administrative, structural, and technical metadata | Identity, semantic, scientific context, geospatial, temporal, miscellany metadata
       Publish | Descriptive metadata | Identity, semantic, scientific context, geospatial, and temporal metadata
       Cite | Descriptive metadata | Identity metadata
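Table 1 can be read as a lookup from user tasks to the building blocks a metadata service must supply. A minimal sketch of that lookup follows; it covers only the building-block column, reads "Identify metadata" in that column as "identity metadata" (an assumption about a typo), and the names `TASK_BLOCKS` and `blocks_for` are illustrative.

```python
ALL_BLOCKS = (
    "identity", "semantic", "scientific context",
    "geospatial", "temporal", "miscellany",
)

# Building blocks per data user task, transcribed from Table 1.
TASK_BLOCKS = {
    "discover": {"identity", "semantic"},
    "identify": {"identity"},
    "select": set(ALL_BLOCKS),
    "obtain": {"identity"},
    "verify": {"scientific context"},
    "analyze": {"scientific context", "geospatial", "temporal"},
    "manage": set(ALL_BLOCKS),
    "archive": set(ALL_BLOCKS),
    "publish": set(ALL_BLOCKS) - {"miscellany"},
    "cite": {"identity"},
}


def blocks_for(tasks):
    """Return the building blocks needed to support all the given tasks, in table order."""
    needed = set()
    for task in tasks:
        needed |= TASK_BLOCKS[task.lower()]
    return [block for block in ALL_BLOCKS if block in needed]


print(blocks_for(["Discover", "Cite"]))
# ['identity', 'semantic']
print(blocks_for(["Verify", "Analyze"]))
# ['scientific context', 'geospatial', 'temporal']
```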
    22. Development potentials
       • An infrastructure of metadata services
         – Entities as linked data
         – Tools for “slicing” members by research group, community, or institution to customize the entity set
         – Tools for grabbing entity data from existing resources through interoperability protocols
    23. Application scenarios
       • Cross-domain discovery and verification
       • Automatically populating entity information from customized slices of entities into metadata records
       • And more…
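The "slice, then populate" scenario can be sketched with an in-memory entity set. Everything in this example is hypothetical: the `Person` fields, the registry contents, the placeholder identifiers, and the record layout are invented for illustration, and a real service would pull entities from existing name repositories rather than a hard-coded list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str
    orcid: str            # placeholder strings below, not real identifiers
    institution: str
    groups: frozenset


# Hypothetical shared entity set maintained once by an infrastructure service.
REGISTRY = [
    Person("A. Researcher", "ORCID-PLACEHOLDER-1", "Example University", frozenset({"hydrology-lab"})),
    Person("B. Analyst", "ORCID-PLACEHOLDER-2", "Example University", frozenset({"hydrology-lab", "climate-group"})),
    Person("C. Curator", "ORCID-PLACEHOLDER-3", "Example Institute", frozenset({"climate-group"})),
]


def slice_by_group(registry, group):
    """Return only the entities that belong to the given research group."""
    return [person for person in registry if group in person.groups]


def populate_creators(record, entity_slice, names):
    """Return a copy of the record with creator details filled in from the slice."""
    by_name = {person.name: person for person in entity_slice}
    filled = dict(record)
    filled["creators"] = [
        {"name": n, "orcid": by_name[n].orcid, "institution": by_name[n].institution}
        for n in names
        if n in by_name  # names outside the slice are skipped, not guessed
    ]
    return filled


lab = slice_by_group(REGISTRY, "hydrology-lab")
record = populate_creators({"title": "Example dataset"}, lab, ["B. Analyst", "C. Curator"])
print(record["creators"])
# [{'name': 'B. Analyst', 'orcid': 'ORCID-PLACEHOLDER-2', 'institution': 'Example University'}]
```

The point of the design is that identity details are entered once in the shared entity set; each metadata record then references them instead of having them typed in again.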
    24. Conclusion
       • Scientific data are inherently complex and diverse
       • Functional metadata requirements should be translated into an effective and efficient architecture
         – Three principles for modeling metadata for scientific data
       • Metadata for scientific data (or other domains at large) should adopt an infrastructure service approach
       • Much to be explored, experimented, and evaluated
    25. Thank you! Questions?