Management of bibliographic metadata - Metadata management at the Leibniz Information Centre for Economics

Extended version of a presentation given as guest speaker at the seminar "Digital Libraries" at the Hamburg University of Applied Sciences, 15.05.2013

  • 1. Management of bibliographic metadata: Metadata management at the Leibniz Information Centre for Economics. Kirsten Jeude, 15.05.2013, HAW. The ZBW is a member of the Leibniz Association.
      Title-slide background: excerpts of bibliographic records in three encodings:
      • MODS: <mods:titleInfo><mods:title>Envy, guilt, and the Phillips curve</mods:title></mods:titleInfo><mods:name type="personal"><mods:namePart type="family">Ahrens</mods:namePart><mods:namePart type="given">Steffen</mods:namePart><mods:role><mods:roleTerm authority="marcrelator" type="code">aut</mods:roleTerm></mods:role></mods:name>
      • PICA+: 021A $aEnvy, guilt, and the Phillips curve$hby Steffen Ahrens and Dennis J. Snower; 028A $dSteffen$aAhrens$961036832X$8Ahrens, Steffen; 033A $pKiel$nUniv., Dep. of Economics; 034D $aOnline-Ressource (PDF-Datei: 34 S., 1,70 MB) […]
      • MARC21: 100 1 $aMerkl, Christian$0(DE-601)533312205$0(DE-588)133059545; 245 10$aEscaping the unemployment trap :$bthe case of East Germany /$cChristian Merkl; Dennis J. Snower […]
  • 2. Outline: Management of bibliographic metadata
      1. Metadata management at the ZBW
      2. Metadata and metadata standards
      3. Mappings
      4. Metadata for the Semantic Web
      5. Duties and responsibilities of metadata management
      6. Conclusion: Requirements for metadata managers
  • 3. Metadata management at the ZBW: a cross-sectional task. "We speak the languages of the data"
      • Support for the development and maintenance of the information systems of the information centre
      • Coordination of data delivery
      • Assistance on all aspects of metadata formats and standards
  • 4. Metadata management at the ZBW. "We speak the languages of the data"
      • Participation in cross-functional teams of the ZBW
      • Participation in externally funded projects (e.g. funded by DFG)
      • Close cooperation with the department "Innovative information systems & publication technologies"
      • Participation in cooperations and (international) working groups
  • 5. What is metadata? A variety of definitions
      • "… are (structured) data about data." Miller, Eric: An introduction to the resource description framework. In: D-Lib Magazine, (May) 1998; Foulonneau, Muriel; Riley, Jenn (2008): Metadata for Digital Resources; and many more
      • "Metadata ('data about data') are structured data that are used to describe an information resource and thereby make it easier to find." SUB Göttingen Metadata Server
      • "… structured, encoded data that describe characteristics of information-bearing entities to aid in their identification, discovery, assessment, and management of the described entities" American Library Association's Committee on Cataloguing
      • "… is any type of formal description of a resource, regardless of format" Mitchell, Nicole: Metadata Basics (The Southeastern Librarian, Fall 2006)
      • "… is cataloging done by men" Delsey, Tom, National Library of Canada
      • "… is machine understandable information about web resources or other things." Berners-Lee, Tim: Design Issues. Architectural and philosophical points, 6 January 1997
      • "An item of metadata is a relationship that someone claims to exist between two entities." Foulonneau, Muriel; Riley, Jenn (2008): Metadata for Digital Resources
      • "I like to think of metadata as data which removes from a user (human or machine) the need to have full advance knowledge of the existence or characteristics of things of potential interest in the environment" Lorcan Dempsey, cited in: Foulonneau, Muriel; Riley, Jenn (2008): Metadata for Digital Resources
  • 6. Metadata "is machine understandable information about web resources or other things." (Tim Berners-Lee). Web resources (slide images not reproduced)
  • 7. Metadata "is machine understandable information about web resources or other things." (Tim Berners-Lee). Other things: publications, objects, pictures, recordings, persons
  • 8. Metadata "is any type of formal description of a resource, regardless of format" (Nicole Mitchell)
      • Metadata separated from the object: index card in a card catalogue; dataset in a bibliographic database
      • Metadata contained in the object: CIP (Cataloging in Publication); header of an HTML page with metadata; microformat (COinS) in an HTML page
  • 9. Metadata "are (structured) data about data" (Eric Miller)
      Data: the book. "Data about data": the metadata, structured by a metadata schema:
      • Title: Metadata / Marcia Lei Zeng and Jian Qin
      • Creator: Zeng, Marcia Lei *1956-*
      • Contributor: Qin, Jian *1956-*
      • Published: London : Facet, 2008
      • Extent: xvii, 365 p. : ill. ; 23cm
      • Note: Includes bibliographical references and index
      • ISBN: 1-85604-655-9, 978-1-85604-655-8 (pbk)
      • Subject: Metadaten
  • 10. Metadata "are (structured) data about data" (Eric Miller). Designator of a property: human-readable vs. machine-readable (metadata standards). The same title in different standards:
      • PICA3: 4000 Suchen mithilfe semantischer Metadaten
      • PICA+: 021A $a Suchen mithilfe semantischer Metadaten
      • MAB: 331 Suchen mithilfe semantischer Metadaten
      • MARC21: 245 00 $a Suchen mithilfe semantischer Metadaten
      • MODS: <titleInfo><title>Suchen mithilfe semantischer Metadaten</title></titleInfo>
      In principle and pragmatic
  • 11. Metadata "are (structured) data about data". A metadata set consists of metadata elements (e.g. "Creator:") and element values (e.g. "Zeng, Marcia Lei") that follow rules (e.g. family name, comma, blank, given name):
      • ID: 571230652
      • Title: Metadata / Marcia Lei Zeng and Jian Qin
      • Creator: Zeng, Marcia Lei *1956-*
      • Contributor: Qin, Jian *1956-*
      • Published: London : Facet, c2008
      • Extent: xvii, 365 p. : ill. ; 23cm
      • Note: Includes bibliographical references and index
      • ISBN: 1-85604-655-9, 978-1-85604-655-8 (pbk) : £39.95
      • Subject: *Metadaten / *Metadata
      • Classification: Dewey Decimal Classification: 025.3
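The distinction between element, value and value rule on this slide can be sketched in a few lines of Python. This is only an illustration: the dictionary layout, the function name `follows_name_rule` and the regular expression are assumptions made for this sketch, not part of any cataloguing standard.

```python
import re

# A metadata set: element names mapped to element values (values from the slide).
record = {
    "ID": "571230652",
    "Title": "Metadata / Marcia Lei Zeng and Jian Qin",
    "Creator": "Zeng, Marcia Lei",
    "Published": "London : Facet, c2008",
}

# Assumed rule for this sketch: family name, comma, blank, given name(s).
NAME_RULE = re.compile(r"[^,\s][^,]*, [^,\s][^,]*")


def follows_name_rule(value: str) -> bool:
    """Return True if the value has the form 'Family name, Given name'."""
    return NAME_RULE.fullmatch(value) is not None


print(follows_name_rule(record["Creator"]))  # True
print(follows_name_rule("Marcia Lei Zeng"))  # False
```

A real cataloguing rule set covers far more cases (prefixes, compound names, life dates), so the pattern above only checks the basic shape.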
  • 12. Tasks of metadata (FRBR User Tasks). Metadata is "[…] to aid in their identification, discovery, assessment, and management of the described entities"
      1. to FIND entities that correspond to the user's stated search criteria: to locate either a single entity or a set of entities in a file or database as the result of a search using an attribute or relationship of the entity
      2. to IDENTIFY an entity: to confirm that the entity described corresponds to the entity sought, or to distinguish between two or more entities with similar characteristics
      3. to SELECT an entity that is appropriate to the user's needs: to choose an entity that meets the user's requirements with respect to content, physical format, etc., or to reject an entity as being inappropriate to the user's needs
      4. to OBTAIN access to the entity described: to acquire an entity through purchase, loan, etc., or to access an entity electronically through an online connection to a remote computer
      FRBR: Functional Requirements for Bibliographic Records
  • 13. Types of metadata. Metadata is "[…] to aid in their identification, discovery, assessment, and management of the described entities"
      • Descriptive metadata: describe a resource for the purpose of discovery, identification, selection and access
      • Structural metadata: describe the internal organization of a resource and how interconnected objects relate to each other, e.g. how pages must be arranged to form a chapter
      • Administrative metadata ("meta-metadata"): information which helps to manage a resource:
          • Technical metadata: contain information on the format and file type
          • Metadata for rights management: contain information that is used for access authorization and information on intellectual property
          • Provenance metadata: metadata about the origin of a resource, used to establish trust and to preserve usability
  • 14. Metadata standard vs. metadata schema
      • Metadata standard of a "community" / domain, e.g. PICA, Dublin Core, DDI etc.: PPN, Title, Creator, Contributor, Extent, ISBN, Subject, Footnotes, Related Resource, URL, DOI, Mode, Epoch, Format, Pixel, Organisation, Provenance, URN, Compression, Handle, Actor
      • Metadata schema of a certain application, e.g. a repository: Title, Creator, Contributor, Organisation, Published, Extent, Note, ISBN, Subject, URL
      • Metadata elements of a real information object: Title, Creator, Contributor, URL, Published, Extent
  • 15. Metadata standards and schemas define the allowed elements, their meaning (semantics) and their form (syntax):
      • Definition of the content of an element, sometimes with additional comments, which help to separate one element from the others within a schema or standard
      • Syntax encoding scheme: rules about the syntax of the content values; e.g. a date has to be entered in the form YYYY-MM-DD
      • Syntax encoding schemes include norms and standards (e.g. ISO 8601, W3C-DTF, …), URIs (URL, URN, DOI, …) and other identifiers (e.g. ISBN, ISSN, …)
      • Vocabulary encoding scheme: rules about the allowed content values, e.g. which controlled vocabulary has to be used
      • Vocabulary encoding schemes include classifications (e.g. DDC, UDC, BK, JEL, …), thesauri (e.g. STW, TGN, MeSH, AGROVOC, …), authority files (e.g. GND, LCSH, Rameau, VIAF) and other controlled vocabularies (e.g. DCMI Type)
      • Information about the repeatability and obligation of an element and how elements are related to each other within the standard
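As a rough illustration of the two kinds of encoding scheme, the following Python sketch checks a date value against the YYYY-MM-DD syntax and a type value against the DCMI Type vocabulary. The function names `valid_date` and `valid_type` are assumptions made for this example; a production validator would usually be driven by the schema documentation rather than hard-coded.

```python
import re
from datetime import datetime

# Vocabulary encoding scheme: the twelve terms of the DCMI Type Vocabulary.
DCMI_TYPES = {
    "Collection", "Dataset", "Event", "Image", "InteractiveResource",
    "MovingImage", "PhysicalObject", "Service", "Software", "Sound",
    "StillImage", "Text",
}


def valid_date(value: str) -> bool:
    """Syntax encoding scheme: accept only real calendar dates written as YYYY-MM-DD."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    return True


def valid_type(value: str) -> bool:
    """Vocabulary encoding scheme: accept only terms of the controlled vocabulary."""
    return value in DCMI_TYPES


print(valid_date("2013-05-15"), valid_date("15.05.2013"))  # True False
print(valid_type("Text"), valid_type("Book"))              # True False
```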
  • 16. Dublin Core
      • Dublin Core Metadata Initiative (DCMI)
      • Developed for the description of web resources
      • Dublin Core Metadata Element Set: 15 core elements (Title, Creator, Subject, Description, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language, Relation, Coverage, and Rights), all optional and repeatable
      • Dublin Core Metadata Terms: 55 elements; element refinements and encoding schemes = "Qualifier"
      Example:
      Title "Metadata Demystified"
      Creator "Brand, Amy"
      Creator "Daly, Frank"
      Creator "Meyers, Barbara"
      Subject "metadata"
      Description "Presents an overview of metadata conventions in publishing."
      Publisher "NISO Press"
      Publisher "The Sheridan Press"
      Date "2003-07"
      Type "Text"
      Format "application/pdf"
      Identifier ""
      Language "en"
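To show that the same Dublin Core statements can be serialized in XML, here is a minimal sketch using Python's standard library. The wrapper element name `record` and the function name `to_xml` are assumptions made for illustration; only the `dc` namespace URI is the one published by DCMI for the 15-element set.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Element/value pairs from the slide; repeatable elements simply occur several times.
STATEMENTS = [
    ("title", "Metadata Demystified"),
    ("creator", "Brand, Amy"),
    ("creator", "Daly, Frank"),
    ("creator", "Meyers, Barbara"),
    ("subject", "metadata"),
    ("publisher", "NISO Press"),
    ("date", "2003-07"),
    ("type", "Text"),
    ("format", "application/pdf"),
    ("language", "en"),
]


def to_xml(statements):
    """Serialize (element, value) pairs as children of a wrapper element."""
    root = ET.Element("record")  # hypothetical wrapper, not defined by Dublin Core
    for element, value in statements:
        child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


print(to_xml(STATEMENTS))
```

Because all fifteen elements are optional and repeatable, a flat list of pairs is enough to hold the record; no element needs special treatment.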
  • 17. Project of Integrated Catalogue Automation (PICA)
      • For bibliographic datasets in library catalogues
      • ca. 1300 information units: elements with subelements
      • Used by GBV, SWD, HEBIS, DNB, …
      PICA3 (view for cataloguing):
      4000 Escaping the unemployment trap : the case of East Germany / Christian Merkl; Dennis J. Snower
      PICA+ (internal format of the database):
      001@ $026$aU
      001A $00206:04-03-09
      001B $00206:10-03-09$t10:42:23.000
      001D $00206:04-03-09
      001U $0utf8
      001X $00
      002@ $0Asu
      003@ $0592906469
      010@ $aeng
      011@ $a2008
      021A $aEscaping the unemployment trap$dthe case of East Germany$hChristian Merkl; Dennis J. Snower
      027D/00 $aJournal of comparative economics$pAmsterdam$nElsevier$00147-5967$z7153508
      028A $dChristian$aMerkl$9533312205$8Christian@Merkl ; PND-ID: 133059545
      028B/01 $dDennis J.$aSnower$9366752960$8Dennis J.@Snower ; PND-ID: 124825109
      031A $d36$j2008$e4$c12$h542-556
      034M $agraph. Darst.
      039B $cIn$9130445541$8Journal of comparative economics. - Amsterdam : Elsevier$x200800000360004458
      039B $cIn$7!562253327!
      045D/00 $9091374367$8Langzeitarbeitslosigkeit
      045D/00 $9091347742$8Arbeitsproduktivität
      045D/00 $9091384966$8Qualifikation
      045D/49 $b49
      101B $004-03-09$t09:15:56.000
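The "elements with subelements" structure of PICA+ can be made visible by splitting one displayed field into its tag and subfields. This is a simplified sketch: the function name `parse_pica_field` is an assumption, and the code assumes the plain display form shown on the slide (tag, blank, subfields introduced by `$`), without handling escaped dollar signs or other serializations.

```python
def parse_pica_field(line: str):
    """Split a displayed PICA+ field into its tag and a list of (code, value) subfields."""
    tag, _, content = line.partition(" ")
    subfields = []
    # Everything before the first "$" is empty; each later chunk starts with the subfield code.
    for chunk in content.split("$")[1:]:
        if chunk:
            subfields.append((chunk[0], chunk[1:]))
    return tag, subfields


field = ("021A $aEscaping the unemployment trap"
         "$dthe case of East Germany$hChristian Merkl; Dennis J. Snower")
tag, subfields = parse_pica_field(field)
print(tag)
for code, value in subfields:
    print(code, "=", value)
```

In this field, subfield `a` carries the main title, `d` the title addition and `h` the statement of responsibility, which matches the PICA3 cataloguing view on the slide.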
  • 18. Machine-Readable Cataloging (MARC21)
      • International exchange and storage format
      • For bibliographic datasets in library catalogues
      • > 1000 elements with subelements and indicators
      Example:
      xxxxxnaa a22yyyyy c 4500
      001 592906469
      003 DE-601
      005 20090728100724.0
      008 090304s2008 000 0 eng d
      016 7 $a7153508$2DE-600
      040 $aGBVCP$bger$erakwb
      041 0 $aeng
      100 1 $aMerkl, Christian$0(DE-601)533312205$0(DE-588)133059545
      245 10$aEscaping the unemployment trap :$bthe case of East Germany /$cChristian Merkl; Dennis J. Snower
      300 $bgraph. Darst.
      650 7$81.1x$aLangzeitarbeitslosigkeit$0(DE-601)091374367$0(DE-STW)18124-6$2stw
      650 7$81.2x$aArbeitsproduktivität$0(DE-601)091347742$0(DE-STW)10440-1$2stw
      […]
      700 1 $aSnower, Dennis J.$eAuthor$4aut$0(DE-601)366752960$0(DE-588)124825109
      773 08$iIn: $tJournal of comparative economics$dAmsterdam : Elsevier$gVol. 36, No. 4 (2008), p. 542-556$q36:4<542-556$w(DE-601)130445541$x0147-5967
      773 08$iIn:
      900 $aGBV$bZBW Kiel <206>
      952 $d36$j2008$e4$c12$h542-556
      954 $a26$b923675930$c01$x0206
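The MARC21 data fields in this display add indicators between the tag and the subfields. The sketch below reads one such displayed line; it is an assumption-laden illustration (function names `parse_marc_line` and `main_title` are invented here), it only covers data fields in this textual display, not control fields or the binary .mrc exchange format, and stripping whitespace discards blank indicator positions.

```python
def parse_marc_line(line: str):
    """Split a displayed MARC21 data field into tag, indicators and (code, value) subfields."""
    tag = line[:3]
    head, _, rest = line[3:].partition("$")
    indicators = head.strip()  # blank indicator positions are lost by strip()
    subfields = [(chunk[0], chunk[1:]) for chunk in rest.split("$") if chunk]
    return tag, indicators, subfields


def main_title(line: str):
    """Return subfield a of a 245 field without trailing ISBD punctuation, else None."""
    tag, _, subfields = parse_marc_line(line)
    if tag != "245":
        return None
    for code, value in subfields:
        if code == "a":
            return value.rstrip(" :/")
    return None


line = ("245 10$aEscaping the unemployment trap :"
        "$bthe case of East Germany /$cChristian Merkl; Dennis J. Snower")
print(parse_marc_line(line)[1])  # 10
print(main_title(line))          # Escaping the unemployment trap
```

The trailing " :" and " /" in the subfields are ISBD punctuation stored as part of the data, which is one reason why MARC values often need cleaning before they are mapped to another standard.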
  • 19. Metadata standards using XML
      • MODS: Metadata Object Description Schema
      • Developed as a compromise between the complex MARC21 and the simple Dublin Core
      • For electronic resources
      • Expressed in XML
  • 20. Metadata formats (bibliographic / descriptive). "Metadata standard" is not the same as "metadata format": the metadata standard Dublin Core can be expressed in several metadata formats, e.g. DC-DS-XML:
      • as .txt file
      • as .html or .xhtml file
      • as .xml file
      • as .rdf statements
      In principle:
      • Metadata standards developed as text-based standards, with further representations added subsequently
      • Primarily XML standards, like MODS
      • Proprietary formats, e.g. .mrc for MARC21
  • 21. Seeing Standards: A Visualization of the Metadata Universe (Jenn Riley)
  • 22. Meet the metadata standards: documentation of a metadata standard
      Typical documents: "Uses and Features", "Usage Guidelines", "Implementation Guidelines", "Schemas and Outline", "Data Dictionary", "Specification", Encoding Guidelines, Code Lists, Example Datasets, Tutorials, Mailing Lists and Forums
      What they contain:
      • Domain and applications
      • Rules and frameworks
      • Listing and definition of the elements
      • Encoding rules
      • Encoding lists
      • Examples
      • Instruction manual
      • Exercises
      Example: 100 1 $aMerkl, Christian$0(DE-601)533312205$0(DE-588)133059545
  • 23. Metadata standards. Standards? "The nice thing about standards is that you have so many to choose from. Furthermore, if you do not like any of them, you can just wait for next year's model." Source: Andrew Tanenbaum, Computer Networks, 2nd edn., p. 254
      A metadata standard
      • ensures consistency of the metadata
      • improves usage of metadata
      • allows exchange of metadata
      → Semantic interoperability between applications that use the same standard
  • 24. What does "interoperability" mean? Interoperability is the ability of different systems (and content) to communicate and exchange information as efficiently as possible
      → Exchange of metadata between systems with little effort and a minimum loss of information
      Problem: exchange or integration of metadata sets that are expressed in different standards
      → Establish interoperability by mapping
  • 25. What is "mapping"? Task: integration of heterogeneous metadata in one system: EconBiz
      • Metadata from various sources, stored in various standards (e.g. standard Dublin Core (XML): Title = <title>; standard PICA3: Title = 4000)
      • Homogeneous search with the help of search engine technologies
      • Results can be refined by filters
  • 26. What is "mapping"?
      • "Translation" of the elements and rules from one standard or schema to another
      • Metadata "mapping" refers to a formal identification of equivalent or nearly equivalent metadata elements or groups of metadata elements from different metadata schemas, carried out in order to facilitate semantic interoperability (Getty Glossary)
      • Crosswalk: list of elements which relates the appropriate constituents of two different metadata standards or schemas to one another, including rules
      Example crosswalk:
      Element | DB2 - MAB: Designator | DB2 - MAB: Syntax | DB1 - PICA: Designator | DB1 - PICA: Syntax | Notes: Rules for transformation
      Creator | 100 | Family_name, Given_name | 3000 | Given_name@Family_name | In the target schema, write the value behind the commercial at (@) first, followed by comma and space, then the value in front of the commercial at […]
      […]
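The transformation rule in the Creator row of the crosswalk can be written down as a small function. This is a sketch of that single rule only, under the assumption that the PICA value contains at most one commercial at; the function name `pica_name_to_mab` is made up for this example.

```python
def pica_name_to_mab(value: str) -> str:
    """Turn 'Given_name@Family_name' (PICA 3000) into 'Family_name, Given_name' (MAB 100)."""
    given, separator, family = value.partition("@")
    if not separator:
        # No commercial at present: leave the value unchanged rather than guess.
        return value
    return f"{family.strip()}, {given.strip()}"


print(pica_name_to_mab("Christian@Merkl"))   # Merkl, Christian
print(pica_name_to_mab("Dennis J.@Snower"))  # Snower, Dennis J.
```

Values that do not match the expected source syntax are passed through untouched here, so they can be caught in a later quality check rather than being silently altered.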
  • 27. Crosswalk
  • 28. Crosswalk. A crosswalk is "a mapping of the elements, semantics, and syntax from one metadata scheme to those of another". Crosswalks are lateral mappings: they are one-way streets from one schema to another.
  • 29. What is "mapping"? Typical problems between source and target:
      • Mapping between two standards with different scope: missing elements in the target schema / more than one element in the target schema
      • Different features: obligation (mandatory elements vs. optional elements); repeatability (an element may be repeatable or not)
      • Different syntax encoding, e.g. A: "Given name", "Family name"; B: "Name" (family name, given name); C: "Name" (given name@family name)
      • Different vocabulary encoding: if different vocabularies are used, the values have to be "translated" too
  • 30. Mapping for the integration of heterogeneous metadata: summary
      • Prerequisite for a correct mapping is the clear and precise definition of the elements in the respective standards
      • The mapping should allow the creation of transformation rules that transform metadata from one schema completely into the other schema
      • Conversion of the metadata sets using programmed scripts
      • Mapping is the first step for the integration of heterogeneous metadata in one system
      • Mapping is an iterative process: data analysis → mapping → conversion → checking the datasets
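The conversion and checking steps of the iterative process can be sketched as a script that applies a crosswalk and reports what it could not map. Everything in this example is illustrative: the crosswalk covers only the two PICA3 designators mentioned in the slides (4000, 3000), the designator "9999" is a made-up placeholder for an unmapped field, and the function name `convert` is an assumption.

```python
# Hypothetical crosswalk from PICA3 designators to Dublin Core elements.
CROSSWALK = {
    "4000": "title",
    "3000": "creator",
}


def convert(source: dict, crosswalk: dict):
    """Apply the crosswalk; return the target record and the designators without a mapping."""
    target, unmapped = {}, []
    for designator, values in source.items():
        element = crosswalk.get(designator)
        if element is None:
            unmapped.append(designator)  # information that would be lost
            continue
        target.setdefault(element, []).extend(values)
    return target, unmapped


source_record = {
    "4000": ["Escaping the unemployment trap"],
    "3000": ["Christian@Merkl"],
    "9999": ["local note"],  # made-up designator to show the control step
}
target_record, unmapped = convert(source_record, CROSSWALK)
print(target_record)
print("not mapped:", unmapped)
```

The list of unmapped designators feeds back into the data analysis and mapping steps, which is what makes the process iterative.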
  • 31. Metadata for the Semantic Web. Problems of search engines today:
      1. Improper search results → query is ambiguous (homonymy / polysemy)
      2. Missing search results → synonyms are not taken into account: a search for "Futurology" will not find "Zukunftsforschung" or "future studies"
      Reason → lack of explicit semantics
      Solution → Semantic Web: information is linked on the level of semantics
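The synonym problem can be illustrated with a toy search that expands the query using a controlled vocabulary. This sketch only mimics the idea with a hard-coded dictionary; the document titles, the synonym table and the function names are invented for this example, and a real system would draw the equivalences from a thesaurus such as the STW.

```python
# Hypothetical synonym table: preferred term -> equivalent terms.
SYNONYMS = {
    "futurology": {"Zukunftsforschung", "future studies"},
}


def expand_query(term: str) -> set:
    """Return the search term together with its known synonyms."""
    return {term} | SYNONYMS.get(term.lower(), set())


def search(term: str, documents: list) -> list:
    """Return all documents that contain the term or one of its synonyms."""
    terms = {t.lower() for t in expand_query(term)}
    return [doc for doc in documents if any(t in doc.lower() for t in terms)]


titles = [
    "Einführung in die Zukunftsforschung",
    "Future studies: an overview",
    "Labour economics",
]
print(search("Futurology", titles))
```

Without the expansion step, a plain string search for "Futurology" would return none of these titles.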
  • 32. Metadata for the Semantic Web. Requirements:
      • Machine-interpretability of the information → RDF (Resource Description Framework)
      • Unambiguousness of the concepts used (a person, an organisation, a theme) → URIs (Uniform Resource Identifiers); in libraries: usage of authority data
      • Vocabulary for the relationships: (library) metadata standards, e.g. RDA, DC; example: URI, "has author" (dc:creator), URI
      RDF + language to model the relationships (e.g. SKOS, OWL) + URIs = Web of Data
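The "URI, dc:creator, URI" pattern on this slide corresponds to a single RDF statement. The following sketch writes such a statement in N-Triples syntax with plain string formatting; the two example.org URIs are placeholders invented for this illustration, while the predicate is the published URI of the Dublin Core element `creator`.

```python
# Predicate: the Dublin Core element "creator".
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"


def triple(subject: str, predicate: str, obj: str) -> str:
    """Format one RDF statement whose three parts are all URIs as an N-Triples line."""
    return f"<{subject}> <{predicate}> <{obj}> ."


# Placeholder URIs; a library would use persistent identifiers,
# e.g. from its catalogue and from an authority file.
publication = "http://example.org/publication/592906469"
person = "http://example.org/person/merkl-christian"

print(triple(publication, DC_CREATOR, person))
```

Because the author is identified by a URI rather than a name string, other datasets that use the same URI can be linked to this statement without ambiguity.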
  • 33. Linked Data in Libraries
      What?
      • Provision and use of controlled vocabularies and ontologies in RDF, e.g. Standard Thesaurus Economics; GND of the DNB
      • […] and use of bibliographic data in RDF, e.g. hbz; B3Kat (BSB, KOBV)
      • […] of tools, e.g. Culturegraph: platform for services and projects about data networking, persistent identifiers and Linked Open Data
      Why?
      • Maximize web visibility
      • Make it easier to find our library data / optimize search options
      • Ensure reusability, in particular by non-library domains
      • Enrich bibliographic references by interlinking with other information
  • 34. Duties and responsibilities of metadata management: metadata standards and mapping
      • Expertise regarding metadata standards and frameworks
      • Development of crosswalks
      • Development and enhancement of metadata schemas
      New systems:
      • Selection of appropriate metadata standards according to the requirements of the system; if necessary, a combination of several standards
      • Development of an interoperable metadata schema
      • Development of application profiles (documentation)
      Existing systems:
      • Continuous adaptation to changing requirements
  • 35. Duties and responsibilities of metadata management: coordination of data deliveries
      • Provision / analysis of test data and documentation
      • Documentation of incoming and outgoing deliveries
      • Contact for all questions regarding metadata format, cataloguing and provision / delivery
  • 36. Duties and responsibilities of metadata management: quality management and communication
      • Participation in system development: functions that rely on metadata, such as "More like this", exports to reference management software, …
      • Quality management: development of statistics and data analysis
      • Communication between library and IT
      Slide images: "I LOVE METADATA" and an excerpt from the RAK cataloguing rules: "Reference from the second component of a compound name (§ 319): If, according to the rules, a compound name is to be entered in the filing group of the family name, references are made from the second and all further main components of the compound name. The parts of the compound name passed over in the reference are placed at the end of the filing group of the given names (319,1)."
  • 37. Conclusion: Requirements for metadata managers. Knowledge of: metadata standards; guidelines and frameworks; vocabularies and norms; authority data; RAK-WB
  • 38. Conclusion: Requirements for metadata managers. Knowledge of: methods; best practices; markup languages; technologies
      • Analytical thinking
      • Enjoying constantly dealing with new challenges (standards, technologies)
      • and continuing to learn
      Metadata management → a highly specialized field in librarianship
  • 39. Management of bibliographic metadata. "Metadata is a love note to the future", Cea. Thanks for your attention! Questions?
  • 40. Bibliographical References
      • Berners-Lee, Tim: Design Issues. Architectural and philosophical points, 6 January 1997.
      • Brand, A., Daly, F., Meyers, B., & National Information Standards Organization (U.S.) (2003): Metadata demystified: A guide for publishers. Bethesda, Md: NISO Press.
      • Caplan, Priscilla (2003): Metadata Fundamentals for All Librarians. Chicago: ALA Editions.
      • Dublin Core Metadata Initiative: DCMI Glossary.
      • Foulonneau, Muriel; Riley, Jenn (2008): Metadata for digital resources: implementation, systems design and interoperability. Oxford: Chandos.
      • Greenberg, J. (2008): Dublin Core History and Basics. Tutorial ASIST DC 2008.
      • Harper, Corey (2010): Dublin Core Metadata Initiative: Beyond the Element Set. In: NISO Information Standards Quarterly, v. 22, no. 1, Winter 2010.
      • Hillmann, D. I., & Westbrooks, E. L. (2004): Metadata in Practice. Chicago: ALA Editions.
      • In: Baca, M., & Getty Research Institute (2008): Introduction to metadata. Los Angeles (Calif.): Getty Research Institute.
      • Miller, Eric (1998): An introduction to the resource description framework. In: D-Lib Magazine, Volume 4, Issue 5, May 1998.
      • Mitchell, Nicole (2006): Metadata Basics. In: The Southeastern Librarian, Vol. 54, Iss. 3, Article 6.
  • 41. Bibliographical References
      • National Information Standards Organization (U.S.) (2004): Understanding metadata. Bethesda, MD: NISO Press.
      • Pohl, Adrian (2011): Linked Data und die Bibliothekswelt. Hochschulbibliothekszentrum des Landes Nordrhein-Westfalen.
      • Riley, Jenn (2008-2010): Seeing Standards: A Visualization of the Metadata Universe.
      • Söllner, Konstanze (2008): "Academic Librarian of the Future" - Woher kommen die Spezialisten für die neuen Aufgaben in den Bibliotheken? 102. Deutscher Bibliothekartag <Leipzig, 2013>.
      • St. Pierre, Margaret; LaPlant, William P. (1998): Issues in Crosswalking Content Metadata Standards (NISO White Papers). Bethesda, MD: NISO.
      • […], S. (2002): Metadata Principles and Practicalities. In: D-Lib Magazine, Vol. 8, No. 4, April 2002.
      • Zeng, M.L.; Chan, L.M. (2006): Metadata Interoperability and Standardization: A Study of Methodology Part I: Achieving Interoperability at the Schema Level. In: D-Lib Magazine, Vol. 12, No. 6, June 2006.
      • Zeng, M.L.; Chan, L.M. (2006): Metadata Interoperability and Standardization: A Study of Methodology Part II: Achieving Interoperability at the Record and Repository Levels. In: D-Lib Magazine, Vol. 12, No. 6, June 2006.
      • Zeng, Marcia Lei; Qin, Jian (2008): Metadata. New York: Neal-Schuman Publishers.
  • 42. Web Resources
      • K.I.M. Kompetenzzentrum Interoperable Metadaten
      • […] Matters (blogger: Diane Hillmann)
      • […] InFormation (blogger: Karen Coyle)
      • […] Metadata (blogger: Laura Smart)
      • […] Discussion Group (Indiana University Libraries)
      • […] of library metadata. 2008. New York, NY: Haworth Press.
      • D-Lib Magazine (free)
  • 43. Web Resources
      Qualification:
      • K.I.M. Kompetenzzentrum Interoperable Metadaten
      • […] Library MOOC
      • […] (Hasso-Plattner-Institut für Informatik)
      • […] Core
      Tools for creation / development:
      • […], like MarcEdit
      • Free text editor for huge files; search and replace with regular expressions possible
      • Firefox add-ons: Dublin Core Viewer, Operator (microformat detection)
      • Chrome add-ons: Schema Explorer (Microdata), OpenLink Data Explorer
  • 44. Metadata standards
      • Dublin Core
      • Pica3/Pica+ (cataloguing guideline of the Gemeinsamer Bibliotheksverbund)
      • MARC21
      • MODS
      • METS
      • Library of Congress Standards
      • ONIX
      • TEI
      • DDI
      • BibTeX
      • RIS
      • COinS (KEV)