It's All About the Metadata


Updated and replaced with a new file in June 2014.

Simplified presentation on library metadata evolution, the perils of not curating the metadata properly, and how it's being used "in the wild".

But…it’s all on the internet and a keyword search will find it, right? Not exactly. There has been a massive change in library cataloging with the rise of the internet. Everything is connected, including our metadata. Catalogers are no longer isolated, and metadata management is no longer just an internal process. Everything we do now links to the wider world of metadata, pushing libraries to repurpose our long-standing work for the new frontiers of identity management and linked data.

Published in: Education
  1. IT’S ALL ABOUT THE METADATA
     • Shana L. McDanold, June 10, 2014
  2. WHY DOES METADATA MATTER? – GEORGE
     • Search: Koran
     • Search: Quran
  3. WHY DOES METADATA MATTER? – ONESEARCH
     • Search: Koran
     • Search: Quran
  4. WHY DOES METADATA MATTER? – GEORGE
     • Search: 9/11
     • Search: 9-11
  5. WHY DOES METADATA MATTER? – ONESEARCH
     • Search: 9/11
     • Search: 9-11
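The paired searches on slides 2 to 5 (Koran vs. Quran, 9/11 vs. 9-11) return different results because a plain keyword index treats each spelling as a separate string. A minimal Python sketch of the alternative, where variant forms resolve to one authorized heading, might look like the following. The heading table, the `normalize` rule, and the `resolve` function are all illustrative assumptions; they do not describe how George or OneSearch actually index records.

```python
# Hypothetical variant-to-heading table; entries are illustrative only.
AUTHORIZED_HEADINGS = {
    "Qur'an": ["Koran", "Quran", "Qur'an"],
    "September 11 Terrorist Attacks, 2001": ["9/11", "9-11", "September 11"],
}


def normalize(term: str) -> str:
    """Lowercase and keep only letters and digits, so '9/11' and '9-11' match."""
    return "".join(ch for ch in term.lower() if ch.isalnum())


# Index every normalized variant under its authorized heading.
VARIANT_INDEX = {
    normalize(variant): heading
    for heading, variants in AUTHORIZED_HEADINGS.items()
    for variant in variants
}


def resolve(query: str):
    """Return the authorized heading for a query, or None if no variant matches."""
    return VARIANT_INDEX.get(normalize(query))
```

With a table like this, `resolve("Koran")` and `resolve("Quran")` land on the same heading, so both searches can retrieve the same set of records. Real authority files carry far more variants and context than this toy table.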
  7. WHY DOES METADATA MATTER?
     • “This town built a memorial to the wrong guy”
     • Ottawa, Canada
     • “It’s the metadata, stupid: and it’s not just for your audience” (Joshua Lasky, posted 5/21/2014)
     • “To succeed in the digital age is to be able to easily aggregate all of your articles in the most meaningful way for each of your visitors. Competitors such as Circa actively use metadata to surface relevant content during breaking news events.”
  8. WHY DOES METADATA MATTER?
     • What are we trying to identify? OR What are people trying to find?
     • Works
     • Individuals
     • Places
     • Things/objects
     • Concepts
     • Discovery and discovery enhancement
     • Relationships
     • “On the fly” collections of resources
     • Users start elsewhere
  9. WHAT DO WE DO WHEN WE CURATE [CREATE] METADATA?
     • Create and enhance descriptive metadata
     • Apply controlled vocabularies
     • Disambiguation of works, authors, etc.
     • Unique identification of editions, works, etc.
     • Collocation of editions, works, etc.
     • Use agreed-upon standards for data elements to ensure consistent application/use
     • MARC
     • DigitalGeorgetown (Dublin Core)
     • RDF (Resource Description Framework)
  10. HOW DO WE EXPOSE “OUR” METADATA?
      • Controlled vocabulary and mapping
      • Genres
      • Subjects/Concepts
      • Classification
      • Identification:
      • People
      • Places/Geographic
      • Works
      • OWL (Web Ontology Language)
      • SKOS (Simple Knowledge Organization System)
      • Normalization
      • Indexing
  11. OWL: WEB ONTOLOGY LANGUAGE
      • Utilizes RDF (Resource Description Framework)
      • 5.2 Individual identity
      • Many languages have a so-called "unique names" assumption: different names refer to different things in the world. On the web, such an assumption is not possible. For example, the same person could be referred to in many different ways (i.e. with different URI references). For this reason OWL does not make this assumption. Unless an explicit statement is being made that two URI references refer to the same or to different individuals, OWL tools should in principle assume either situation is possible.
      • OWL provides three constructs for stating facts about the identity of individuals:
      • owl:sameAs is used to state that two URI references refer to the same individual.
      • owl:differentFrom is used to state that two URI references refer to different individuals
      • owl:AllDifferent provides an idiom for stating that a list of individuals are all different.
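The identity rules quoted on slide 11 can be sketched in plain Python: two URIs count as the same only after an explicit sameAs statement, as different only after an explicit differentFrom (or AllDifferent) statement, and are otherwise undetermined. The `IdentityStore` class and the example.org URIs used with it are hypothetical, chosen only for illustration. This is not an OWL reasoner, and it does not detect contradictory statements.

```python
from itertools import combinations


class IdentityStore:
    """Toy model of owl:sameAs / owl:differentFrom without a unique-names assumption."""

    def __init__(self):
        self._parent = {}       # union-find forest over URIs
        self._different = set() # explicitly stated (a, b) pairs

    def _find(self, uri):
        """Return the representative URI of the sameAs group containing uri."""
        self._parent.setdefault(uri, uri)
        while self._parent[uri] != uri:
            self._parent[uri] = self._parent[self._parent[uri]]  # path halving
            uri = self._parent[uri]
        return uri

    def same_as(self, a, b):
        """Record that a and b refer to the same individual."""
        self._parent[self._find(a)] = self._find(b)

    def different_from(self, a, b):
        """Record that a and b refer to different individuals."""
        self._different.add((a, b))

    def all_different(self, uris):
        """Record that every pair in uris is different (AllDifferent idiom)."""
        for a, b in combinations(uris, 2):
            self.different_from(a, b)

    def compare(self, a, b):
        """Return 'same', 'different', or 'unknown' for two URIs."""
        ra, rb = self._find(a), self._find(b)
        if ra == rb:
            return "same"
        for x, y in self._different:
            if {self._find(x), self._find(y)} == {ra, rb}:
                return "different"
        return "unknown"
```

The "unknown" result is the point of the sketch: two name URIs that nobody has linked or distinguished are neither merged nor kept apart by default, which is why explicit identity statements from catalogers matter.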
  12. SKOS: SIMPLE KNOWLEDGE ORGANIZATION SYSTEM
      • Utilizes RDF (Resource Description Framework)
      • 2.3 Semantic Relationships
      • In KOSs semantic relations play a crucial role for defining concepts. The meaning of a concept is defined not just by the natural-language words in its labels but also by its links to other concepts in the vocabulary. Mirroring the fundamental categories of relations that are used in vocabularies such as thesauri [ISO2788], SKOS supplies three standard properties:
      • skos:broader and skos:narrower enable the representation of hierarchical links, such as the relationship between one genre and its more specific species, or, depending on interpretations, the relationship between one whole and its parts;
      • skos:related enables the representation of associative (non-hierarchical) links, such as the relationship between one type of event and a category of entities which typically participate in it. Another use for skos:related is between two categories where neither is more general or more specific. Note that skos:related enables the representation of associative (non-hierarchical) links, which can also be used to represent part-whole links that are not meant as hierarchical relationships.
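The three SKOS properties quoted on slide 12 can be modeled with a few dictionaries. In the sketch below, the `ConceptScheme` class is a hypothetical illustration, not part of any SKOS library. It treats narrower as the inverse of broader and related as symmetric, as SKOS defines them. The `ancestors` method walks broader links upward, which corresponds to skos:broaderTransitive rather than skos:broader itself, since SKOS does not declare skos:broader transitive.

```python
from collections import defaultdict


class ConceptScheme:
    """Toy in-memory vocabulary with broader, narrower, and related links."""

    def __init__(self):
        self.broader = defaultdict(set)
        self.narrower = defaultdict(set)
        self.related = defaultdict(set)

    def add_broader(self, concept, broader_concept):
        """Record a hierarchical link; narrower is stored as the inverse."""
        self.broader[concept].add(broader_concept)
        self.narrower[broader_concept].add(concept)

    def add_related(self, a, b):
        """Record an associative (non-hierarchical) link in both directions."""
        self.related[a].add(b)
        self.related[b].add(a)

    def ancestors(self, concept):
        """Return every concept reachable by following broader links upward."""
        seen = set()
        stack = [concept]
        while stack:
            for parent in self.broader.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen
```

For example, after `add_broader("Tragedy", "Drama")` and `add_broader("Drama", "Literature")`, calling `ancestors("Tragedy")` yields both broader concepts, which is how a discovery layer could widen a search that is too specific. The concept labels here are made up for the example.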
  13. CURATED METADATA IN THE WILD – LIBRARY OF CONGRESS
      • Library of Congress data exposed as linked data
      • “The Library of Congress Linked Data Service enables both humans and machines to programmatically access authority data at the Library of Congress. This service is influenced by -- and implements -- the Linked Data movement's approach of exposing and inter-connecting data on the Web via dereferenceable URIs.”
  14. CURATED METADATA IN THE WILD - WORLDCAT
      • Bibliographic records
  15. CURATED METADATA IN THE WILD - WORLDCAT
      • Google searches!
  16. CURATED METADATA IN THE WILD - OTHERS
      • Wikipedia/dbpedia
      • WorldCat: links to WorldCat Identities
      • LCCN: links to LC National Authority File (NAF)
      • VIAF record
      • ISNI (International Standard Name Identifier) record
  17. CURATED METADATA IN THE WILD - OTHERS
      • Wikipedia/dbpedia
      • Disambiguation
      • guation_pages
      • Identity management:
      • John Smith
      • St. Mary’s Church
      • Georgetown
      • Hamlet
  18. CURATED METADATA IN THE WILD - OTHERS
      • “MARC 21 records for CONSER serials either cataloged or processed by LC or by CONSER (Cooperative Online Serials Program) participants. Also includes records with ISSN assignments and U.S. Newspaper Program cataloging. Records include all languages. Available in MARC 21 and MARCXML formats.”
      • eCIP CONSER
  19. BUILDING CURATED METADATA: OTHER OPTIONS
      • Crowd sourcing
      • Archives and Alumni
      • Identification of individuals for identity control
      • Penn Provenance project
      • “We are trying to identify former owners and virtually reunite dispersed collections, and we welcome any information you have about the images posted here.”
      • Incorporate data into records; establish identities
  20. CONCLUSION
      • All comes back to the basics of metadata work:
      • DESCRIPTION
      • COLLOCATION
      • DISAMBIGUATION (uniquely identifiable)
      • RELATIONSHIPS