Metadata Workshop-Maastricht - November 6, 2008
 

  • Introduce selves and interest in topic. Please, please, please ask questions. No question is too basic (although some might be too complex for me, but I’ll do my best). I don’t want to see on the evaluation form at the end of the day, “pretty good, but it didn’t answer my question.”

Presentation Transcript

  • What’s New with AACR2? And the Rest of the Bibliographic Universe, Too (Amy Benson, 6 November 2008)
  • Overview
    • Terms and definitions
      • What do all the acronyms mean?
    • Categories of metadata schemes and tools
      • How do they relate to each other?
    • Uses and functions
      • What do you do with them?
    • It’s all about me
      • Which ones will affect my work?
  • Metadata Standards
    • Metadata format standards
      • XML
    • Metadata element sets
      • MARC, MODS, ONIX, DC, EAD, TEI, CSDGM
    • Metadata content and value standards
      • AACR, RDA, DACS, CCO
    • FRBR
    • Access
      • OAI, Next generation catalogs
  • Metadata - What is it?
    • Data about data
    • Information about any aspect of a resource - size, location, topic, origin, use, audience, creator, quality, access rights, reviews… the list is endless
    • An aid to the discovery, identification, assessment, and management of described entities
    • Captured or created
  • Types of Metadata
    • Descriptive/Discovery
      • What is it?
      • How can I find it?
      • How can I get to it?
    • Structural
      • What files comprise it?
      • Which file is page one?
    • Administrative
      • What do I need to know to manage it?
      • Can I use it?
      • How was it created?
      • What needs to be preserved?
  • Metadata - Who needs it?
    • Impact of metadata on collection access
      • Improves service to users
      • Provides the means for resource discovery, grouping, filtering, matching user needs
        • Drives functionality/capabilities
      • Keyword searching works only for resources that are text-based - excludes photographs, data sets, objects, maps, audio, video…
        • Google Image Labeler
    • Metadata decisions should reflect project goals
  • Interoperability
    • Interoperability allows different computer systems, networks, and software to work together
    • Usually achieved by following standards
    • Allows different systems to make use of same data
    • Generally, an increase in specialization results in a decrease in interoperability
    • Important feature of metadata in today’s world
  • Interoperability
    • Can increase awareness and use of collections
    • Reduces geographic and domain-specific isolation of collections
    • Likely to assist / promote the longevity of data and collections
    • One-stop access to the universe of online resources
  • eXtensible Markup Language (XML)
    • Developed by the World Wide Web Consortium (W3C)
    • Open, international standard
    • A structure for storing and tagging information, without prescribing how the information is displayed or used
    • Platform independent
    • A way to use the same data for many different purposes
    • Facilitates the sharing of data across institutions and projects
  • XML - Background
    • Markup languages
      • Label/tag information
      • Structure documents
        • TOC, Chapters, Index, etc.
      • Help computers “understand” the data
      • Use tags, similar to HTML
        • <title>Gone with the wind</title>
    • XML allows for coding of hierarchical relationships – often necessary for complex documents, 3-D objects, archives, etc.
    • XML is extensible – an important feature that allows tags to be created by users or a community of users
    • XML defines the syntax, but not the data elements that make up an XML document
  • XML - Elements Example
    • list (book+)
    • book (title, author+, date+, year, comment, code)
    • title <value>
    • author (aulast, aufirst)
    • aulast <value>
    • aufirst <value>
    • date (day, month)
    • day <value>
    • month <value>
    • year <value>
    • comment <value>
    • code <value>
  • XML Record Example
    • <book>
    • <title>Weaving the Web</title>
    • <author>
    • <aulast>Berners-Lee,</aulast>
    • <aufirst>Tim</aufirst>
    • </author>
    • <date>
    • <day>6</day>
    • <month>January</month>
    • </date>
    • <year>2002</year>
    • <comment>Interesting topic, but not too well written.</comment>
    • <code>nonfiction</code>
    • </book>
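As a rough illustration of how software "understands" a record like the one above, the following sketch parses it with Python's standard `xml.etree.ElementTree` module. The variable names are assumptions made for this example, and the record is the one from the slide.

```python
import xml.etree.ElementTree as ET

# The sample record from the slide, held in a string for the example.
RECORD = """<book>
  <title>Weaving the Web</title>
  <author>
    <aulast>Berners-Lee,</aulast>
    <aufirst>Tim</aufirst>
  </author>
  <date><day>6</day><month>January</month></date>
  <year>2002</year>
  <comment>Interesting topic, but not too well written.</comment>
  <code>nonfiction</code>
</book>"""

book = ET.fromstring(RECORD)

# Tags label the data, so values are looked up by name, not by position.
title = book.findtext("title")

# <author> may repeat, so collect every occurrence.
authors = []
for author in book.findall("author"):
    last = author.findtext("aulast").rstrip(",")  # drop any trailing comma
    first = author.findtext("aufirst")
    authors.append(f"{first} {last}")

print(title, authors)
```

Because the hierarchy is explicit, the nested author and date parts stay attached to their parent elements.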
  • XML – DTDs and Schemas
    • Tags, definitions, and requirements are set and adhered to by a community of users
      • MARC XML, RecipeML, EAD
    • Two methods of defining specific XML implementations
      • DTDs (Document Type Definition)
      • Schemas
        • Lay out the logical structure of the data
        • Establish rules about which elements a document may have, which are required, which can repeat, etc.
        • Establish a root element, parent and child elements, and where data can be placed within hierarchy
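To make the idea of "which elements are required, which can repeat" concrete, here is a small hand-rolled check based on the earlier book content model. It is only a sketch: it is not real DTD or XML Schema validation, it ignores element order, and the names `BOOK_RULES` and `check_book` are invented for this example.

```python
import xml.etree.ElementTree as ET

# (minimum, maximum) occurrences of each child of <book>; None means "may repeat".
# Taken from the slide's model: book (title, author+, date+, year, comment, code)
BOOK_RULES = {
    "title": (1, 1),
    "author": (1, None),
    "date": (1, None),
    "year": (1, 1),
    "comment": (1, 1),
    "code": (1, 1),
}

def check_book(book):
    """Return a list of rule violations for one <book> element."""
    counts = {}
    for child in book:
        counts[child.tag] = counts.get(child.tag, 0) + 1

    problems = []
    for tag in counts:
        if tag not in BOOK_RULES:
            problems.append(f"unexpected element <{tag}>")
    for tag, (low, high) in BOOK_RULES.items():
        n = counts.get(tag, 0)
        if n < low:
            problems.append(f"<{tag}> is required")
        elif high is not None and n > high:
            problems.append(f"<{tag}> may not repeat")
    return problems

bad = ET.fromstring("<book><title>A</title><title>B</title><isbn/></book>")
print(check_book(bad))
```

In practice a validating parser would do this work from the DTD or schema itself; the point here is only the kind of rule being enforced.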
  • RecipeML
  • XML – Ways to use XML
    • XML-encoded data can be reused in multiple contexts
    • Because XML is easy to parse, software can transform it in countless ways, thereby allowing:
        • Easy migration paths
        • Alternative displays
        • On-the-fly response to user needs
    • XML prescribes the structure of a document or record, but not its content or display
    • Transform XML for display via style sheets (XSL) and transformations (XSLT)
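The separation of data from display can be sketched without a stylesheet processor. The example below is plain Python standing in for XSL/XSLT (the Python standard library has no XSLT engine), and the function names are assumptions for illustration. It renders one record in two different ways.

```python
import xml.etree.ElementTree as ET
from html import escape

BOOK_XML = "<book><title>Weaving the Web</title><year>2002</year></book>"

def as_html(book_xml):
    """Render the record as an HTML fragment for a web page."""
    book = ET.fromstring(book_xml)
    title = escape(book.findtext("title", default=""))
    year = escape(book.findtext("year", default=""))
    return f"<p><cite>{title}</cite> ({year})</p>"

def as_brief_line(book_xml):
    """Render the same record as a one-line plain-text listing."""
    book = ET.fromstring(book_xml)
    return f"{book.findtext('title', default='')}, {book.findtext('year', default='')}"

html_view = as_html(BOOK_XML)
text_view = as_brief_line(BOOK_XML)
print(html_view)
print(text_view)
```

The stored XML never changes; only the transformation applied to it does, which is the role a stylesheet plays.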
  • XML File
  • XML File Transformation via XSL and XSLT
  • MARC
    • MAchine-Readable Cataloging (MARC)
    • Standard used to exchange, use, and interpret bibliographic information in libraries
    • Long-established, successful history
    • Large quantity of MARC data exists
    • Weak on rights information, etc.
    • Low extensibility
    • Highly interoperable within the library community, but not beyond
  • MARC Format
    • Leader / fixed field
      • Coded values
    • Tags / fields
      • Numeric labels for specific data elements
    • Indicators
      • Additional information about content in the field
    • Subfields
      • Segment data in fields into smaller units
  • MARC
    • Basic Tag Groups
    • 0XX Control information, numbers, codes
    • Example: 020 for ISBN
    • 1XX Main entry
    • Example: 100 for personal name
    • 2XX Titles, edition, imprint (in general, the title, statement of responsibility, edition, and publication information)
    • Example: 245 for title
    • 3XX Physical description, etc.
    • Example: 300 for extent
    • 4XX Series statements (as shown in the book)
    • Example: 490 for untraced series, or traced differently
  • MARC
    • Basic Tag Groups
    • 5XX Notes
    • Example: 520 for summary note
    • 6XX Subject added entries
    • Example: 650 for topical subject heading
    • 7XX Added entries other than subject or series
    • Example: 700 for added entry, personal name
    • 8XX Series added entries (other authoritative forms)
    • Example: 830 for series added entries in title form
    • 9XX Locally-defined uses
    • Example: 949 for barcode numbers
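Since the first digit of a tag determines its group, the two tables above reduce to a simple lookup. A minimal sketch, with the hypothetical helper name `tag_group` and the group labels taken from the slides:

```python
# Group labels keyed by the first digit of a three-digit MARC tag.
TAG_GROUPS = {
    "0": "Control information, numbers, codes",
    "1": "Main entry",
    "2": "Titles, edition, imprint",
    "3": "Physical description, etc.",
    "4": "Series statements",
    "5": "Notes",
    "6": "Subject added entries",
    "7": "Added entries other than subject or series",
    "8": "Series added entries",
    "9": "Locally-defined uses",
}

def tag_group(tag):
    """Return the basic tag group for a three-digit MARC tag such as '245'."""
    if len(tag) != 3 or not (tag.isascii() and tag.isdigit()):
        raise ValueError(f"not a three-digit MARC tag: {tag!r}")
    return TAG_GROUPS[tag[0]]

for tag in ("020", "245", "650", "949"):
    print(tag, "->", tag_group(tag))
```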
  • MARC Record
  • MARC XML
    • Future of MARC
      • Can it survive?
    • MARC XML developed by the Library of Congress (LC)
    • Allows representation of a complete MARC record in XML
    • LC has developed a schema, stylesheets, tools, and crosswalks
    • Will support new transformations for new uses of MARC data and into other standards such as MODS, EAD, ONIX, DC
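The sketch below shows roughly how MARC fields, indicators, and subfields carry over into MARC XML, using the `http://www.loc.gov/MARC21/slim` namespace of the LC schema. It is a partial record for illustration only: the leader and control fields are omitted, the field values are sample data, and the helper name `datafield` is an assumption.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"
ET.register_namespace("marc", MARC_NS)

def datafield(record, tag, ind1, ind2, subfields):
    """Append one variable data field with its indicators and subfields."""
    field = ET.SubElement(
        record, f"{{{MARC_NS}}}datafield", {"tag": tag, "ind1": ind1, "ind2": ind2}
    )
    for code, value in subfields:
        sub = ET.SubElement(field, f"{{{MARC_NS}}}subfield", {"code": code})
        sub.text = value
    return field

record = ET.Element(f"{{{MARC_NS}}}record")
datafield(record, "100", "1", " ", [("a", "Berners-Lee, Tim.")])   # main entry, personal name
datafield(record, "245", "1", "0", [("a", "Weaving the Web")])     # title

marcxml = ET.tostring(record, encoding="unicode")
print(marcxml)
```

Nothing from the MARC structure is lost: tags and indicators become attributes, and subfield codes become attributes on nested elements.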
  • MARCXML Example
  • Metadata Object Description Schema (MODS)
    • Set of 20 bibliographic elements - a subset of the MARC 21 Format for Bibliographic Data
    • XML-based standard
    • Alternative to MARC
    • Can be used for conversion of existing MARC records or to create new resource description records
    • Useful for library applications that want to go beyond the OPAC
  • MODS Elements
    • TitleInfo
    • Name
    • TypeOfResource
    • Genre
    • PublicationInfo
    • Language
    • PhysicalDescription
    • Abstract
    • TableOfContents
    • TargetAudience
    • Note
    • Cartographics
    • Subject
    • Classification
    • RelatedItem
    • Identifier
    • Location
    • AccessCondition
    • Extension
    • RecordInfo
  • MODS Elements
    • Elements can have sub-elements and attributes which provide refining detail for the element
    • Elements and sub-elements are repeatable, except in certain cases
    • Elements display in any order
    • More on MODS
      • http://www.loc.gov/standards/mods/
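For comparison with MARC's numeric tags, here is a minimal sketch of a MODS description built with word-based element names and a refining sub-element and attribute. It assumes the MODS version 3 namespace and its lowercase-initial element names (`titleInfo`, `name`, `typeOfResource`). The helper `q` and the sample values are invented for the example.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)

def q(name):
    """Qualify a MODS element name with the namespace."""
    return f"{{{MODS_NS}}}{name}"

mods = ET.Element(q("mods"))

# Element with a sub-element that carries the actual value.
title_info = ET.SubElement(mods, q("titleInfo"))
ET.SubElement(title_info, q("title")).text = "Weaving the Web"

# Element refined by an attribute as well as a sub-element.
name = ET.SubElement(mods, q("name"), {"type": "personal"})
ET.SubElement(name, q("namePart")).text = "Berners-Lee, Tim"

ET.SubElement(mods, q("typeOfResource")).text = "text"

mods_xml = ET.tostring(mods, encoding="unicode")
print(mods_xml)
```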
  • MODS Example
  • MODS Editor at Brown University
  • ONline Information eXchange (ONIX)
    • ONIX is the international standard for representing and communicating book industry product information in electronic form
    • Developed and maintained by EDItEUR and other international groups
    • XML-based
    • Focused on e-commerce of books
      • Synchronize the widely varying formats of major book wholesalers and retailers – interoperability
      • The need for richer book data online to improve sales
    • May appear in future library applications
  • ONIX
  • Other Metadata Standards
    • Encoded Archival Description (EAD)
      • Electronic Finding Aids
    • Data Documentation Initiative (DDI)
      • Data sets
    • Content Standard for Digital Geospatial Metadata (CSDGM)
      • Primary standard for geospatial metadata
    • Visual Resources Association (VRA) Core
      • Works of visual culture and the images that document them
  • Crosswalks
    • Crosswalks map an element from one scheme to its closest equivalent in another scheme
    • Convert data from one format to another - one that is potentially more widely accessible
    • Support cross-domain searching and interoperability of data
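A crosswalk is, at its simplest, a mapping table plus a loop. The sketch below maps a handful of MARC tags to Dublin Core elements. The mapping is an illustrative assumption for this example, not the official Library of Congress crosswalk, and the function name `crosswalk` is hypothetical.

```python
# Illustrative MARC tag -> Dublin Core element mapping (not an official crosswalk).
MARC_TO_DC = {
    "020": "identifier",
    "100": "creator",
    "245": "title",
    "520": "description",
    "650": "subject",
    "700": "contributor",
}

def crosswalk(marc_fields):
    """Map (tag, value) pairs to DC elements; report tags with no equivalent."""
    dc = {}
    unmapped = []
    for tag, value in marc_fields:
        element = MARC_TO_DC.get(tag)
        if element is None:
            unmapped.append(tag)          # no close equivalent in the target scheme
        else:
            dc.setdefault(element, []).append(value)
    return dc, unmapped

dc, unmapped = crosswalk([
    ("245", "Weaving the Web"),
    ("650", "World Wide Web"),
    ("650", "Internet"),
    ("949", "12345"),
])
print(dc)
print(unmapped)
```

The unmapped list shows the usual cost of a crosswalk: data with no counterpart in the simpler scheme is dropped or must be handled separately.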
  • Dublin Core (DC)
    • A method of describing resources intended to facilitate the discovery of electronic resources
    • Designed to allow simple description of resources by non-catalogers as well as specialists
    • National and International standard
      • ANSI/NISO standard Z39.85-2001
      • ISO standard 15836
    • Includes 15 “core” elements
  • Dublin Core Elements
    • Title
    • Creator
    • Subject
    • Description
    • Publisher
    • Contributor
    • Date
    • Type
    • Format
    • Identifier
    • Source
    • Language
    • Relation
    • Coverage
    • Rights
  • Dublin Core
    • All elements optional and repeatable
    • Authority control not required
    • Simple and Qualified DC
      • Simple
        • Less Rich
        • Lowest common denominator
      • Qualified
        • More precise
        • Less interoperable
    • Extensible and flexible
    • “Container” agnostic
  • Dublin Core Examples
    • Generic
      • Title=“The sound of music”
    • HTML
      • <meta name="DC.Title" content="The sound of music">
    • XML
      • <?xml version="1.0"?> <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
      • <dc:title>The Sound of Music</dc:title> </metadata>
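The XML form above can be generated rather than typed by hand. This sketch builds a simple Dublin Core record with the `http://purl.org/dc/elements/1.1/` namespace and shows that elements are optional and repeatable. The function name `dc_record` and the two subject values are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

def dc_record(fields):
    """Build a <metadata> wrapper holding one dc: element per (name, value) pair."""
    root = ET.Element("metadata")
    for name, value in fields:
        ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")

xml_text = dc_record([
    ("title", "The Sound of Music"),
    ("subject", "Musical films"),   # hypothetical value; elements may repeat
    ("subject", "Austria"),         # hypothetical value
])
print(xml_text)
```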
  • Systems for Metadata
    • Metadata has to be stored and maintained somewhere
    • Digital content management systems
      • Databases
      • Software tools
      • Administration
      • Access
  • DC Record in OCLC Connexion
  • C/W MARS Digital Treasures Hosted Repository Using CONTENTdm
  • Content Standards
    • AACR (Anglo-American Cataloguing Rules)
      • “The rules cover the description of, and the provision of access points for, all library materials commonly collected at the present time.”
      • The current text is the 2nd ed, 2002 Revision (with 2003, 2004, and 2005 updates)
      • The Joint Steering Committee for Revision of AACR (JSC) is working on a new code, “RDA: Resource Description and Access” scheduled for publication in 2009
  • RDA: Why?
    • Seen by many as an opportunity to simplify the code and establish it as a content standard for resource description for libraries and beyond
    • Intended to support the objectives of resource discovery and user tasks based on the FRBR model
    • Provide more consistency, less redundancy within the rules
    • Planned as Web-based product, but will also be available in print (somehow at some point)
  • RDA: Goals
    • Flexible framework for describing all types of resources – analog and digital
    • Simplify rules
    • Create data that is readily adaptable to new and emerging database structures
      • Encourage use beyond the library community
    • Create data that is compatible with existing records in online library catalogs
    • Generate records that contain data that is relevant and important to users
  • International Scope
    • Designed to be a multi-national content standard
    • Developed for use in English language communities, but can be used in other language communities
    • Independent of the format used to communicate information
    • Compatible with other standards for resource description and retrieval
  • RDA and FRBR
    • Functional Requirements for Bibliographic Records
      • A new view of the bibliographic universe
      • FRBR is part of the conceptual foundation for RDA and makes use of FRBR terminology
    • Result of a study undertaken from 1992-1997 by a group of experts and consultants under IFLA
    • A conceptual model that establishes entities in relationship within and among 3 basic categories
      • Works, Persons, Subjects
      • RDA will highlight FRBR relationships
  • FRBR Group 1 Entities
    • The FRBR model divides Group 1 (bibliographic entities) into four levels of representation - the building blocks of the FRBR model
      • Work
      • Expression
      • Manifestation
      • Item
  • FRBR Group 1 Entities
    • Work
      • A distinct intellectual or artistic creation, in the abstract
    • Expression
      • The intellectual or artistic realization of a work by an illustrator, translator, performer
    • Manifestation
      • The physical embodiment of an expression of a work - a published edition
    • Item
      • A single example of a manifestation (copy)
  • A Work
    • “A Work is an abstract entity; there is no single material object one can point to as the work. We recognize the Work through individual realizations, or Expressions of the Work, but the Work itself only exists in the commonality of content between and among the various Expressions of the work.” – FRBR Final Report
    • Abstract concept
    • What we mean when we say we’ve read A Tale of Two Cities, or Pride and Prejudice
  • An Expression
    • “The intellectual realization of a Work” in some form – FRBR Final Report
    • Abstract concept
    • Forms
      • Revisions, updates, abridgements, enlargements, translations, annotations, critical editions, etc.
  • A Manifestation
    • “The physical embodiment of an Expression of a Work” – FRBR Final Report
    • Set of objects – still somewhat abstract
      • Manuscripts, Books, Periodicals, Maps, Posters, Films, CD-ROMs, etc.
    • Manifestations of a Work take different forms
      • Example: E-text of Pride and Prejudice from Project Gutenberg versus the Oxford Illustrated edition
    • Level at which we traditionally catalog library materials
  • An Item
    • “A single exemplar of a Manifestation” – FRBR Final Report
    • One specific concrete physical object
    • Designates a copy, or the circulation level of a bibliographic entity
    • Items may vary where the variations are a result of actions external to the intent of the producer of the Manifestation
      • Houghton Library’s first edition of Gone with the Wind previously owned by Thomas Wolfe
  • Top two levels: abstract intellectual/artistic content
  • Lower two levels: physical recording of content
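One way to picture the four Group 1 levels is as nested containers, each holding the level below it. The classes here are a simplified sketch for illustration, not a schema from the FRBR report. The Pride and Prejudice manifestations come from the earlier slide, while the expression label and item labels are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Item:
    label: str                      # one concrete copy

@dataclass
class Manifestation:
    description: str                # a published edition or other embodiment
    items: list = field(default_factory=list)

@dataclass
class Expression:
    form: str                       # a realization, such as a translation
    manifestations: list = field(default_factory=list)

@dataclass
class Work:
    title: str                      # the abstract creation
    expressions: list = field(default_factory=list)

    def all_items(self):
        """Walk down all four levels and return every concrete copy."""
        return [
            item
            for expression in self.expressions
            for manifestation in expression.manifestations
            for item in manifestation.items
        ]

work = Work("Pride and Prejudice", [
    Expression("English text", [
        Manifestation("Project Gutenberg e-text", [Item("online file")]),
        Manifestation("Oxford Illustrated edition", [Item("copy 1")]),
    ]),
])
print([item.label for item in work.all_items()])
```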
  • Specific Example
  • Groups 2 and 3
    • Group 2: Actor Entities
      • Persons or corporate bodies “responsible for the intellectual or artistic content, the physical production and dissemination, or the custodianship of the entities in the first group”
    • Group 3: Subject Entities
      • What a Work may be about including
        • Groups 1 & 2 (Other Works, People, Corporate Bodies)
        • Concepts
        • Objects
        • Events
        • Places
  • Advantages of FRBR
    • Better logic and organization in the catalog
    • OPAC becomes simpler to navigate and understand
      • Easier to see all available Expressions/ Manifestations of a single Work
      • Easier to find desired resource when search results are grouped and related in meaningful ways
    • Display bibliographic entities within the context of related, established entities
      • Better understanding of relationships among related Works or Expressions
  • Advantages of FRBR
    • Ability to group individual records for Works to facilitate navigation and selection
    • Enable ILL holds at different levels depending on patron needs
      • Work, Expression, Manifestation, Item
    • Create catalog records once at the Work level, often including subject headings and classification, then use and expand on those records for Expressions/Manifestations
  • Traditional OPAC Display
  • OCLC WorldCat Analysis
    • Sample of 996 records (Manifestations)
    • Used computer algorithms to identify Works from within the group
    • 78% have only a single Manifestation
    • 99% of Works in WorldCat have 7 or fewer Manifestations
    • 1% of the whole benefits the most from FRBRization
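The general idea of identifying Works algorithmically can be sketched as grouping manifestation records under a normalized author/title key. This is a deliberate simplification and not OCLC's actual algorithm. The function names and the three sample records are assumptions for illustration.

```python
import re
from collections import defaultdict

def work_key(author, title):
    """Normalize author and title so trivially different records share one key."""
    def norm(text):
        return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()
    return (norm(author), norm(title))

def group_into_works(records):
    """Group manifestation records (dicts with 'author' and 'title') by work key."""
    works = defaultdict(list)
    for record in records:
        works[work_key(record["author"], record["title"])].append(record)
    return dict(works)

def share_with_single_manifestation(works):
    """Fraction of works that have exactly one manifestation."""
    singles = sum(1 for manifestations in works.values() if len(manifestations) == 1)
    return singles / len(works)

records = [
    {"author": "Austen, Jane", "title": "Pride and Prejudice"},
    {"author": "AUSTEN, JANE.", "title": "Pride and prejudice."},
    {"author": "Mitchell, Margaret", "title": "Gone with the Wind"},
]
works = group_into_works(records)
print(len(works), share_with_single_manifestation(works))
```

A statistic like the 78% figure above is the kind of number the last function produces when run over a real sample.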
  • FRBR Research Links
    • OCLC Research
      • http://www.oclc.org/research/projects/frbr/default.htm
    • Library of Congress
      • FRBR display tool
      • http://www.loc.gov/marc/marc-functional-analysis/frbr.html
  • Related Work: FRAD
    • FRAD = Functional Requirements for Authority Data
    • Provide a clearly defined, structured frame of reference for relating the data that are recorded in authority records to the needs of the users of those records
    • Deals with entities related to authority data
    • Expose the relationships between persons (or personas), names, and access points
    • Assist in an assessment of the potential for international sharing and use of authority data both within the library sector and beyond
      • Virtual International Authority File
    • Current draft dated April 1, 2007
  • RDA: Quick Overview
    • Designed for the digital environment
    • Description and access of all digital (and analog) resources
    • Usable by libraries and other metadata communities
  • Working with RDA
    • Product developed as a subscription-based, XML-driven, online web system
    • Will include sample workflows, a core element set, customized views, links to rules, local notes, saved search profiles
    • Take the cataloger through the various data elements to be included in the resource description
      • Describe the purpose and scope of each element
      • Where to look for that element
      • How to record it
      • Build cataloger’s judgement
  • RDA and MARC
    • RDA establishes a clear line of separation between the recording of data and the presentation of data
      • Content vs. display
      • ISBD punctuation will be one option in an appendix
    • AACR2 and MARC are separate standards
      • RDA will remain a separate standard
    • RDA assists with creating the content of the bibliographic record
    • MARC21 is one possible schema for encoding records created using RDA
      • RDA will be able to be used with any metadata standard such as MODS or Dublin Core
  • RDA: Impact on Libraries and Systems
    • The Joint Steering Committee is striving to minimize need for retrospective adjustments to pre-RDA records
    • RDA instructions are designed to be independent of the format, medium, or system used to store or communicate the data
    • ILS systems may implement RDA in different ways
    • Intended to be adaptable to newly-emerging database structures
    • RDA and FRBR would be better optimized in a relational database structure
  • RDA Timeline
    • Joint Steering Committee
      • http://www.collectionscanada.gc.ca/jsc/
    • Began as AACR3 in 2004
    • Renamed RDA in 2005
    • Reorganized in 2007 to follow FRBR
    • Scheduled for publication in 2009
    • Scheduled for implementation by the national libraries in 2010, pending evaluation
  • RDA Timeline: LC Plans
    • Oct. 2007 – CoP (Committee of Principals) issues statement on joint implementation of RDA
    • Jan. 2008 – report of the LC Working Group on the Future of Bibliographic Control released
      • http://www.loc.gov/bibliographic-future/
      • Recommends that LC “suspend work on RDA” until business case (return on investment) analyzed
    • May 2008 – LC, NLM, and NAL issue joint statement announcing their intention to evaluate RDA jointly to assist with implementation decision
    • June 2008 – LC’s official response to the LCWG report
  • RDA Timeline: Implementation
    • November 2008?? – first full draft of content to be released in online product for comment
    • Mid-January 2009 – comment period closes
    • Early March 2009 – JSC and CoP meet in Chicago. JSC finalizes review of comments received
    • Third quarter calendar 2009 – RDA is released
    • Last quarter calendar 2009–early 2010 – CoP national libraries evaluate RDA prior to implementation
  • RDA: Evaluation
    • LC, NLM, NAL plus 10-20 others
      • Selected libraries (PCC libraries, including small libraries in NACO funnel projects)
      • Library school
      • Archives
      • Non-MARC users
      • OCLC, Ex Libris, and other vendors
    • Criteria for evaluation under development
      • Usability, Technical, Financial criteria
  • Other Content Standards
    • International Standard Bibliographic Description (ISBD)
      • A family of standards to regularize the form and content of bibliographic descriptions
      • Available for different material types: monographs, computer files, etc.
      • Designed to promote record sharing and exchange
  • Other Content Standards
    • Book Industry Standards And Communications (BISAC)
      • Metadata Committee developed a Best Practices document
      • Intended as a response to the question, “I’ve downloaded the ONIX documentation. Now what?”
      • Its overriding purpose is to detail what data should be supplied and how it should be supplied
  • Other Content Standards
    • Describing Archives: A Content Standard (DACS)
      • Designed to facilitate consistent, appropriate, and self-explanatory description of archival materials and creators of archival materials
      • Replaces Archives, Personal Papers, and Manuscripts (APPM)
  • Other Content Standards
    • Cataloging Cultural Objects (CCO)
      • A guide to describing cultural works and their images
      • Provides guidelines for selecting, ordering, and formatting data used to populate catalog records
      • Designed to promote good descriptive cataloging, shared documentation, and enhanced end-user access
      • A project of the Visual Resources Association
  • Access: Open Archives Initiative (OAI)
    • A tool that supports interoperability among multiple databases
    • OAI goal: coarse-granularity resource discovery
    • Supports cross-database searching
    • Aggregates metadata from multiple community-specific repositories
    • Data providers expose (make available) the metadata for their collections
    • Service providers harvest the exposed metadata from data providers and aggregate it
  • OAI
    • OAI Protocol for Metadata Harvesting
      • Metadata content must be encoded in XML and have a corresponding XML schema for validation
      • Metadata must be supplied in unqualified Dublin Core format, at least
      • Other metadata formats are optional, but recommended
      • Metadata may optionally include a link to the actual content / resource
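A harvester asks a data provider for records with an ordinary HTTP request. The sketch below only builds such a request URL, using the protocol's `ListRecords` verb and the `oai_dc` prefix for unqualified Dublin Core. The base URL is a hypothetical placeholder, the function name is an assumption, and no network call is made.

```python
from urllib.parse import urlencode

def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL for a repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec        # optional: limit to one set
    if from_date:
        params["from"] = from_date      # optional: only records changed since this date
    return f"{base_url}?{urlencode(params)}"

# Hypothetical repository address, for illustration only.
url = list_records_url("http://repository.example.org/oai", from_date="2008-11-06")
print(url)
```

The response to such a request is an XML document containing the exposed metadata records, which the service provider then aggregates.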
  • OAI Infrastructure [diagram labels: repository (x4), harvester, service provider, DC]
  • OAI Infrastructure [diagram labels: user, search, harvested repository, original repository]
  • Book Details from U. of Chicago
  • Boston College
  • VuFind – Integrating Data Sources
  • OCLC’s WorldCat Local
  • OCLC’s Fiction Finder
  • Social Data
    • Libraries have some very useful data
    • When made available in standardized formats the data can be used in new ways
      • Wall of Books
      • iGoogle widget
    • Embrace data from users
    • Seek outside sources of information to bring in that might enhance the user experience
  • Wall of Books created by AADL Patron
  • xISBN
    • A Web Service that takes as input one ISBN and returns a list of other ISBNs of associated intellectual works – other expressions and manifestations
    • Search results on one specific ISBN can be misleading
    • Results intended for use by computer systems to generate new, more complete searches such as in an OPAC
  • xISBN Web Service Result
  • Library Lookup
  • Library Look Up
  • Metadata’s Ideal Profile
    • Metadata Characteristics
      • Standards-based
      • Consistent
      • Descriptive
      • Sharable
      • Contextual
      • Modular
      • Adjustable
      • Portable
  • Questions? Amy Benson, Librarian/Archivist for Digital Initiatives, Schlesinger Library, Radcliffe Institute for Advanced Study, Harvard University [email_address]