Metadata : Concentrating on the data, not on the scheme


The LODE-BD Recommendations present a reference tool that assists bibliographic data providers in selecting appropriate encoding strategies according to their needs, in order to facilitate metadata exchange by, for example, constructing crosswalks between their local data formats and widely-used formats, or even a Linked Data representation. The LODE-BD Recommendations aim to address two questions: how to encode bibliographic data hosted by diverse open repositories for the purpose of exchanging data across data providers; and how to encode these data as Linked Open Data (LOD)-enabled bibliographic data.

The core component of the LODE-BD Recommendations report contains a set of recommended decision-making trees for common properties used in describing a bibliographic resource instance (article, monograph, thesis, conference paper, presentation material, research report, learning object, etc. - in print or electronic format). Each decision tree is delivered with various acting points and the matching encoding suggestions, usually with multiple options.

LODE-BD is a part of a series of LODE recommendations overarching a wide range of resource types including the encoding of value vocabularies used in describing agents, places, and topics in bibliographic data.

Published in Education
  • With Web advances paving the way to an era of more open and linked data, the traditional approach of sharing data within silos seems to have reached its end. From governments and international organizations to local cities and institutions, there is a widespread effort to open and link data according to the four principles of Linked Data. As a result, the Linked Open Data (LOD) cloud has grown dramatically in size and variety during the last two years, with a great number of datasets and RDF triples (statements).
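As a rough illustration of what an RDF triple (statement) is, the sketch below models statements as subject-predicate-object tuples in plain Python. The `example.org` URIs, the sample title, and the helper names are hypothetical; only the `dcterms` and FOAF namespace URIs are real published namespaces.

```python
from typing import NamedTuple

DCTERMS = "http://purl.org/dc/terms/"
FOAF = "http://xmlns.com/foaf/0.1/"


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical resource and agent URIs, used only for illustration.
doc = "http://example.org/doc/1"
author = "http://example.org/agent/42"

triples = [
    Triple(doc, DCTERMS + "title", '"A sample report"'),
    Triple(doc, DCTERMS + "creator", author),
    Triple(author, FOAF + "name", '"Jane Doe"'),
]


def describe(uri, statements):
    """Return every statement whose subject is the given URI."""
    return [t for t in statements if t.subject == uri]


def linked_resources(uri, statements):
    """Return objects of `uri` that are themselves described in the same set."""
    subjects = {t.subject for t in statements}
    return [t.obj for t in describe(uri, statements) if t.obj in subjects]
```

Because the creator is given as a URI rather than a string, `linked_resources(doc, triples)` can follow it to the statements about the agent, which is the kind of linking that silo-bound records do not offer.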
  • In the bibliographic universe there is a clear paradigm shift from fixed record formats to re-combinable metadata statements. For anyone who contributes to an open bibliographic data repository as a data provider or service provider, the processes and strategies of providing data as Linked Data are practical issues. Guidelines and recommendations on what standards to follow and how to prepare LOD-ready metadata are essential. There seems to be no one-size-fits-all approach, because during the last two decades a great number of metadata-related standards have been created by different communities for specific purposes: to guide the design, creation, and implementation of data structure, data value, data content, and data exchange in certain communities. The operational metadata standards for data structures form a whole spectrum, ranging from isolated ones (which do not reuse any metadata terms from a known namespace) to those that fully employ and incorporate existing metadata terms from other namespaces, such as many newly developed application profiles or ontologies. Decisions on what standard(s) to adopt will directly impact the degree of LOD-readiness of the bibliographic data.
  • The approach of employing well-accepted metadata element sets and value vocabularies has already shown great benefits and potential, both for information professional services and for the end-users of these services, in resource discovery, data reuse and sharing, and the creation of new content based on linked data. However, deciding to take this approach is only the first step for the data providers and service providers of an open bibliographic repository. In the context of producing LOD-enabled bibliographic data, data and service providers are likely to have many specific questions related to encoding strategies, for example: What metadata standard(s) should be followed in order to publish any data as Linked Data? What is the minimal set of properties that a dataset should include to be meaningful in data sharing? Is there any metadata model or application profile that can be directly adopted for producing data (especially from our local database)? If the controlled vocabularies we have used are available as Linked Data, what kind of value should we exchange through our repository: the literal forms representing a concept, or the URI identifying the concept? How should we encode our data in order to move from a local database to a Linked Data dataset?
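The literal-versus-URI question can be made concrete with a minimal sketch. It assumes, as an illustration rather than a rule quoted from the report, that a literal value goes under `dc:subject` and a concept URI under `dcterms:subject`. The function name and the `example.org` URIs are hypothetical.

```python
DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"


def subject_statement(resource_uri, label, concept_uri=None):
    """Build one N-Triples-style line for a subject value.

    Assumption for this sketch: use the concept URI when one is known,
    otherwise fall back to the literal label.
    """
    if concept_uri:
        return f"<{resource_uri}> <{DCTERMS}subject> <{concept_uri}> ."
    # Escape backslashes and quotes so the literal stays well-formed.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{resource_uri}> <{DC}subject> "{escaped}" .'


# Same subject, exchanged first as a literal and then as a URI.
as_literal = subject_statement("http://example.org/doc/1", "maize")
as_uri = subject_statement(
    "http://example.org/doc/1", "maize", "http://example.org/concept/maize"
)
```

The literal form is readable but ambiguous across languages and spellings; the URI form lets a consumer look up labels, translations, and related concepts from the vocabulary itself.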
  • In this context, and with the aim of providing content providers with a set of recommendations that will support the selection of appropriate encoding strategies for producing LOD-enabled data, the AIMS Team is preparing a series of LODE recommendations that cover a wide range of resource types, including the encoding of value vocabularies used in describing agents, places, and topics in bibliographic data.
  • In order to enhance the quality of interoperability and the effectiveness of information exchange, the recommendations are built on five key principles: to promote the use of well-established metadata standards as well as the emerging LOD-enabled vocabularies proposed in the Linked Data community; to encourage the use of authority data, controlled vocabularies, and syntax encoding standards whenever possible; to encourage the use of resource URIs as names for things for data values when they are available; to facilitate the decision-making process regarding data encoding for the purpose of exchange and reuse; and to provide a reference support that is open for suggestions of new properties and metadata terms according to the needs of the Linked Data community.
  • The first set of recommendations published on AIMS is LODE-BD (Linked Open Data (LOD)-enabled bibliographical data). The LODE-BD recommendations are applicable to structured data describing bibliographic resources (such as article, monograph, thesis, conference paper, presentation material, research report, learning object, etc., in print or electronic format), although in the future the recommendations may be extended to accommodate other kinds of information resources. The LODE-BD Recommendations present a reference tool that assists bibliographic data providers in selecting appropriate encoding strategies according to their needs, in order to facilitate metadata exchange by, for example, constructing crosswalks between their local data formats and widely-used formats, or even a Linked Data representation. The LODE-BD Recommendations aim to address two questions: how to encode bibliographic data hosted by diverse open repositories for the purpose of exchanging data across data providers; and how to encode these data as Linked Open Data (LOD)-enabled bibliographic data. LODE-BD uses flowcharts to present individualized decision trees for the properties included in each of the nine groups. Starting from the property that describes a resource instance, each flowchart presents decision points and gives a step-by-step solution to a given problem of metadata encoding. These flowcharts are designed to help data providers select the strategies appropriate to their situations, while all work towards the goal of data exchange and reuse.
  • What kinds of entities and relationships are involved in bibliographic resource descriptions? Defining a conceptual model helps to give an overall picture of the entities and relationships involved in bibliographic descriptions. In a broader context, the use of a similar conceptual model among content providers should also help to establish a common understanding of the data models involved. LODE-BD proposes a simple conceptual model based on three entities: resource, agent, and thema. The model should provide sufficient capabilities for data providers to present their content (such as document repositories and library catalogues) for sharing in the traditional environment or for transfer to the Linked Data environment. More at 4.1. Core entities and relationships for bibliographic resource description
  • The general concept model (Figure 2) provides a high level of abstraction focusing on the bibliographic resource entity (in short, resource). Major relations can be identified between a resource instance and the agent(s) responsible for the creation of the content and the dissemination of the resource, as well as the thema(s) (subjects or topics) that the resource's content is about. As a result, three core entities are presented in the model: resource, agent, and thema. The model presented in Figure 3 is based on the application of the general concept model to the LODE-BD case and provides examples of possible relationships between and among the instances of different entities.
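The three-entity model described above can be sketched as plain data classes. This is a minimal illustration, not the report's schema: the class and field names, the sample values, and the `example.org` URIs are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Agent:
    """A person or body responsible for creating or disseminating a resource."""
    name: str
    uri: str = ""  # left empty when no URI is available


@dataclass
class Thema:
    """A subject or topic that a resource's content is about."""
    label: str
    uri: str = ""


@dataclass
class Resource:
    """A bibliographic resource instance and its relations to agents and themas."""
    title: str
    creators: List[Agent] = field(default_factory=list)
    publishers: List[Agent] = field(default_factory=list)
    subjects: List[Thema] = field(default_factory=list)


# Hypothetical instance showing the resource-agent and resource-thema relations.
report = Resource(
    title="A sample report",
    creators=[Agent("Jane Doe")],
    publishers=[Agent("Example Press", "http://example.org/agent/press")],
    subjects=[Thema("maize", "http://example.org/concept/maize")],
)
```

Keeping a `uri` slot on agents and themas leaves room to exchange either a literal name or an identifier, depending on what the provider has available.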
  • What properties should be considered for publishing meaningful and useful LOD-ready bibliographic data? In the Linked Data context, any content provider can expose anything contained in its local database. However, in the case of bibliographic data, standardized types of information should be considered in order to maximize the impact of sharing and connecting the data. LODE-BD has identified nine groups of common properties for describing bibliographic resources. They form the backbone of the LODE-BD Recommendations and are the basis of Chapter 5. About two dozen properties used for describing a bibliographic resource, as well as two additional sets of properties for describing relations between bibliographic resources or between agents, are included with specific best practice recommendations. More at 4.2. Groups of Common Properties
  • What metadata standards should be used for preparing LOD-ready bibliographic data? Linked Open Vocabularies (LOV) offers a long list of vocabularies used by the datasets that are available in the Linked Data Cloud. LODE-BD has selected a number of well-accepted and widely-used metadata vocabularies and used their metadata terms in the recommendations. New metadata standards can be added to the list in the future, depending on the needs of the Linked Open Data community. More at 4.3. Metadata Standards
  • What metadata terms are appropriate in any given property for producing LOD-ready bibliographic data from a local database? Metadata terms from the DCMES (dc:) and DCMI Metadata Terms (dcterms:) namespaces are the foundation of the LODE-BD Recommendations. Metadata terms from other namespaces supplement them when additional needs are to be satisfied. LODE-BD has prepared a crosswalk table that includes all metadata terms used in the Recommendations. More at 4.4. Metadata Terms and the Crosswalk
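A crosswalk from a local database to `dc:` and `dcterms:` terms can be sketched as a simple lookup table. The local field names on the left are invented for this example and the mapping is not taken from the LODE-BD crosswalk table; only the two namespace URIs and the term names are real.

```python
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Hypothetical crosswalk: local field name -> prefixed metadata term.
CROSSWALK = {
    "doc_title": "dcterms:title",
    "author": "dcterms:creator",
    "pub_year": "dcterms:issued",
    "lang": "dcterms:language",
    "keywords": "dc:subject",
}


def expand(term):
    """Expand a prefixed term such as 'dcterms:title' to its full URI."""
    prefix, local = term.split(":", 1)
    return PREFIXES[prefix] + local


def apply_crosswalk(record):
    """Map a local record to term URIs; report fields that have no mapping."""
    mapped, unmapped = {}, []
    for field_name, value in record.items():
        term = CROSSWALK.get(field_name)
        if term is None:
            unmapped.append(field_name)
        else:
            mapped[expand(term)] = value
    return mapped, unmapped


mapped, unmapped = apply_crosswalk({"doc_title": "A sample report", "shelf_no": "A1"})
```

Returning the unmapped fields separately makes gaps in the crosswalk visible, which is where terms from other namespaces would be considered.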
  • This part of the LODE-BD report aims to assist in the metadata term selection process to be carried out by any bibliographic data provider. LODE-BD uses flowcharts to present individualized decision trees for the properties included in each of the nine groups (refer to the previous chapter). Starting from the property that describes a resource instance, each flowchart presents decision points and gives a step-by-step solution to a given problem of metadata encoding. These flowcharts are designed to help data providers select the strategies appropriate to their situations, while all work towards the goal of data exchange and reuse. At the end of each flowchart there are alternative sets of metadata terms for selection. Each chart is followed by text-based explanations corresponding to the flowchart, with notes, steps, and examples in tables whenever necessary.
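A flowchart of this kind can be sketched as a small binary decision tree that is walked with a provider's yes/no answers. The questions, the term options at the leaves, and all names below are hypothetical and are not copied from the report's Subject flowchart.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass
class Leaf:
    """End of a flowchart: the alternative metadata terms offered for selection."""
    options: List[str]


@dataclass
class Decision:
    """A decision point with one branch per answer."""
    question: str
    yes: Union["Decision", Leaf]
    no: Union["Decision", Leaf]


# Hypothetical tree for a "subject" property.
SUBJECT_TREE = Decision(
    "uses_controlled_vocabulary",
    yes=Decision(
        "concept_uris_available",
        yes=Leaf(["dcterms:subject (concept URI)"]),
        no=Leaf(["dc:subject (literal term from the vocabulary)"]),
    ),
    no=Leaf(["dc:subject (free-text literal)"]),
)


def walk(node, answers: Dict[str, bool]) -> Tuple[List[Tuple[str, bool]], List[str]]:
    """Follow the answers through the tree; return the path taken and the options."""
    path = []
    while isinstance(node, Decision):
        answer = answers[node.question]
        path.append((node.question, answer))
        node = node.yes if answer else node.no
    return path, node.options


path, options = walk(
    SUBJECT_TREE,
    {"uses_controlled_vocabulary": True, "concept_uris_available": True},
)
```

Returning the path along with the options mirrors the text-based explanations that follow each chart: a provider can see which decision points led to the suggested terms.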

Transcript

  • 1. Metadata : Concentrating on the data, not on the scheme Imma Subirats FAO of the United Nations Marcia Zeng Kent State University euroCRIS Meeting  Bologna (Italy) April 26-27, 2011
  • 2. FAO AIMS http://aims.fao.org/ Agricultural Information Management Standards
  • 3. Outline
    • Background
    • The first step for contributing to "LOD"
    • LODE Recommendations
      • LODE-BD introduction
      • The decision trees
  • 4. Background
  • 5. Linked Open Data
    • Refers to a set of best practices for publishing, sharing, and interlinking structured data on the Web
    • Key technologies
      • URIs for identifying entities or concepts in the world
      • RDF model for structuring and linking descriptions of things
      • HTTP for retrieving resources or descriptions of resources
  • 6. Breaking Silos, Linking Data! Linking Open Data cloud diagram as of 2010-09, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
  • 7. Metadata Standards
    • Guidelines and recommendations on what standards to follow and how to prepare LOD-ready metadata are essential
    • There is no one-size-fits-all approach
    • During the last two decades a great number of metadata-related standards have been created by different communities for specific purposes
    • Decisions on what standard(s) to adopt will directly impact the degree of LOD-readiness of the bibliographic data
  • 8. The first step for contributing to "LOD"  
  • 9. Data and service providers…
    • … are likely to have specific questions about encoding strategies, such as:
      • What metadata standard(s) should be used?
      • What is the minimal set of properties meaningful in data sharing?
      • Is there any metadata model or application profile that can be reused or followed?
      • What kind of value should we exchange: the literal forms representing a concept or the URI identifying the concept?
      • How should we encode our data?
  • 10. LODE Recommendations (LOD-Enabled)
  • 11. Objectives
    • To assist information professionals in deciding what metadata terms to use when encoding existing bibliographic data
    • To provide a set of recommendations that will support the selection of appropriate encoding strategies for producing LOD-enabled data
  • 12. 5 Principles
    • To promote the use of well-established metadata standards
    • To encourage the use of authority data, controlled vocabularies, and syntax encoding standards whenever possible
    • To encourage the use of resource URIs as names for things for data values when they are available
    • To facilitate the decision-making process regarding data encoding for the purpose of exchange and reuse
    • To provide a reference support that is open for suggestions of new properties and metadata terms
  • 13. The LODE-Bibliographical Data (LODE-BD): LOD-Enabled BD
  • 14. LODE-BD
    • A reference tool for selecting appropriate encoding strategies
      • Uses flowcharts to present individualized decision trees for metadata properties
    • Aims to address two questions
      • How to encode bibliographic data for the purpose of exchanging data; and
      • How to encode these data as Linked Open Data (LOD)-enabled bibliographic data
  • 15. Once a content provider has decided to publish a bibliographical database as Linked Open Data…
  • 16. What kinds of entities and relationships are involved in bibliographic resource description? Resource, Agent, Thema
  • 17.  
  • 18. What properties should be considered for publishing meaningful/useful LOD-ready bibliographic data?
    • 1. Title Information
    • 2. Responsible Body
    • 3. Physical Characteristics
    • 4. Location
    • 5. Subject
    • 6. Description of content
    • 7. Intellectual property
    • 8. Usage
    • 9. Relation between documents / agents
  • 19.  
  • 20. What metadata standards should be used for preparing LOD-ready metadata? Selected widely-used metadata standards and emerging LOD-enabled vocabularies:
    • dc: Dublin Core Metadata Element Set
    • dcterms: DCMI Metadata Terms
    • bibo: Bibliographic Ontology
    • agls: AGLS Metadata Standard (Australian Government Locator Service)
    • ags: AgMES (Agricultural Metadata Element Set)
    • eprint: Eprints Terms, UKOLN
    • marcrel: MARC Code List for Relators
  • 21. What metadata terms are appropriate in any given property for publishing LOD-ready metadata based on a local database? 
  • 22. Decision Trees
  • 23. Decision Trees
    • Assisting in the metadata term selection process
    • Flowcharts to present individualized decision trees for the properties included in each of the nine groups
      • Starting from the property that describes a resource instance, each flowchart presents decision points and gives a step-by-step solution to a given problem of metadata encoding
      • Designed to help data providers select the strategies appropriate to their situations
  • 24. Decision Trees
    • At the end of each flowchart there are alternative sets of metadata terms for selection
    • Each chart is followed by text-based explanations corresponding to the flowchart, with notes, steps, and examples in tables whenever necessary
  • 25. Decision Trees Subject
  • 26. Decision Trees Responsible Body. Creator
  • 27. Step Forward: References and Links
    • How to publish and consume Linked Data
    • Where to find Linked Data sets and Vocabularies
    • How to express metadata with different syntaxes: text, html, xml, rdf, and rdfa
    • Why publish bibliographic data as Linked Data
  • 28. http://aims.fao.org/lode/bd/ The AIMS Team, euroCRIS Meeting, Bologna (Italy), April 26-27, 2011